Aug. 3rd, 2002

alexpgp: (Default)
LJ friend [livejournal.com profile] vuzh comments:

It seems they are using some kind of Translation Memory program to pre-translate electronic files. The deal is that I get full freight for material that's not been pre-translated, but only 1/3 of my rate for "editing" the rest.

i've done a lot of this type of thing with spanish translations.
make them send the original text as well --
those translation programs screw the text up so much, it's often difficult to impossible to figure out what the original said.

next time, you'll probably want to insist on a higher rate for "editing".
maybe someday automatic translation programs will work well, but right now, they're crap.
Well, it's not as bad as all that.

Machine translation attempts to translate from source to target using a dictionary and some complex collection of rules that try to figure out what the original means. As you point out, the result is generally crappy.

Translation memory programs, on the other hand, rely on a database of previous (presumably human) translations.

For example, if - in a previous translation - the sentence "La plume de ma tante est ici" is rendered as "My aunt's pen is here," then upon encountering the same sentence, the program fetches the translation and inserts it into the target.

If the program encounters the sentence "La plume de mon oncle est ici," it will typically insert the previous translation ("My aunt's pen..."), highlight "mon oncle," and indicate that the translation is 80% (or whatever number it calculates) identical to a previously encountered sentence.

The translator then reviews the sentence, translates "mon oncle," makes any other required changes, and goes on to the next sentence. It's all pretty cut-and-dried, and the program makes no pretense of knowing more than you do about what needs doing. You can think of TM as a very sophisticated, but dumb, pattern-matcher.
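To make the "80% (or whatever)" idea concrete, here is a toy sketch of fuzzy matching against a translation memory. This is purely illustrative - real TM tools like Trados and Déjà Vu use their own proprietary similarity metrics, not Python's difflib - but the shape of the lookup is the same: compare the new sentence against every stored source sentence, and surface the best match if it clears a threshold.

```python
# Toy illustration of translation-memory fuzzy matching.
# (Assumption: difflib's ratio stands in for whatever metric a real TM uses.)
from difflib import SequenceMatcher

# A tiny "database" of previous (presumably human) translations.
memory = {
    "La plume de ma tante est ici": "My aunt's pen is here",
}

def best_match(source, tm, threshold=0.7):
    """Return (match %, stored source, stored translation), or None."""
    best = None
    for src, tgt in tm.items():
        score = SequenceMatcher(None, source, src).ratio()
        if best is None or score > best[0]:
            best = (score, src, tgt)
    if best and best[0] >= threshold:
        return (round(best[0] * 100), best[1], best[2])
    return None

# An exact repeat comes back as a 100% match...
print(best_match("La plume de ma tante est ici", memory))

# ...while "mon oncle" in place of "ma tante" still scores high enough
# to fetch the old translation, which the translator then fixes up.
print(best_match("La plume de mon oncle est ici", memory))
```

The threshold is the interesting knob: below it, the program treats the sentence as new text (full freight); above it, you get a pre-filled "fuzzy match" to tweak (partial pay, under arrangements like Client U's).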

TM programs are increasingly popular among translators, since they (a) help ensure consistency and (b) help make sure no sentences are omitted from the translation. They are also a tremendous boon when it comes to translating text with a lot of boilerplate, because once you translate such text, it is propagated throughout the document (and subsequent documents, too), and any small, subtle differences are highlighted, so your attention can focus on the differences instead of rereading the rest of the text.

The downside of TM programs is that agencies are increasingly using them, too. I used to participate on a translators' mailing list - LANTRA-L - until the noise-to-signal ratio shot through the roof, but I recall a number of discussions about how agencies would send freelancers databases of pre-translated text for TM programs such as Trados or Déjà Vu and basically impose the same kinds of conditions as I'm seeing with Client U (full pay for new text; partial pay for stuff that's been translated before).

Of course, everything you send back in as new text is subsequently added to the database, so that next time, there's a broader range of text that needs no translation. Technically, one might assume that eventually the translator's job will be eliminated, but that's not quite accurate.

What is accurate is that the quantity of text that needs to be verified goes up. Moreover, unless I miss my guess, what you do with such text is not so much "editing" as it is "tweaking" (but I suspect I'll be better qualified to address this point after I do this project).

Talking about doing this project... I should probably get back to work. :^)

Cheers...
alexpgp: (Default)
...and, of course, it's not something you get paid for.

My work for the day would invoice out at peanuts, because I decided to process client U's 8500-entry "glossary" into something actually usable. I need something usable, because the work approach proposed by my editor has some serious shortcomings.

For example: Why do editors assume I have nothing better to do than use Word's search feature to plow through their multi-megabyte files?

Feeding my mania for a useful glossary is the editor's insistence that I use the company's "glossary" while doing the translation - which, given that he expects me to manually translate about 7,000 words, is BEYOND ridiculous.

* * *
I remember the same issue came up when I was spearheading a drive to get NASA/JSC to sit down with both the company I worked for (i.e., me) and with the Russian side and come up with a definitive "lexicon" of program-critical terms.

You might think this was one of the first things that was attended to back when the program first started in 1993, but it wasn't. There was a predecessor to the Lexicon, as it came to be called, but it was incomplete, inconsistent, and arranged to suit the tastes of engineers and not translators.

The idea here was two-fold: identify and codify unique terms (i.e., terms that brook no dalliance with variety), and identify and codify terms that may have multiple critical meanings in different contexts. In the absence of such a guiding document, nobody could ever be sure what the heck an English translation was referring to, unless they were experts in a particular subject.

What finally put the project over the top was support from the Astronaut Office, which quickly came to the conclusion that, say, having a one-to-one correspondence between 'консервант' and 'pretreat' (a substance used to treat collected human waste) would make life much easier than having to read documents that referred to 'preservative', 'additive', 'conserving agent', and (occasionally) 'pretreat solution' where all the terms denoted the same thing.

One of my core goals, besides actually collecting and verifying terminology, was to keep the size of the Lexicon at or below 3,000 terms (once some folks at NASA got into the swing of things, they wanted to add every possible term to the document, which would have been a project unto itself, on a par with the compilation of a full-blown dictionary). The reason for the limitation was this: if you were going to hold a translator's feet to the fire and require these terms to be used - and that was the intent - the Lexicon had to be of a scope that could be grasped within a reasonable amount of time, say three to six months of daily use.

Let me draw an analogy.

If you've ever used a "style book" (e.g., the AP or UPI books, which prescribe how certain things are written for their respective shops), you'll notice the thing is reasonably compact: not more than 200 pages or so. If you read it a couple of times, you have a pretty good idea of how to deal with about 90% of the problems you're likely to run into while writing a story. If you forget something specific, you'll at least remember that it's covered in the book, and can look it up in jig time.

However, once you get to the level (and size) of something like the Chicago Manual of Style, you now start to lose a lot of users who instead will go with their best judgment - or simply guess - about how to deal with some style issue, rather than slog through a 546-page book (that's the count for the 12th edition, BTW, which resides on my reference shelf).

It's the same with glossaries. Most translators can deal with a list of a few hundred words pretty quickly. Larger glossaries require more time for familiarization and "imprinting." You could probably go above my 3,000-word limit if all your work is oriented that way, but even so, it would take some time to master, with the bulk of that time devoted to realizing what terms are in the glossary, and which aren't.

Unfortunately, my client assumes I have some kind of paranormal power that tells me which of the several thousand words in my assignment are in his glossary, so I can look them up and make sure I'm using the "right" term.

* * *
Reading Client U's "glossary" is not an option. If I were to spend a mere 10 seconds per entry, it would take me nearly 24 solid hours to read through my client's "glossary," and neither my recall nor my absorption is that good. (Neither is the glossary's layout, but that's another issue.)

So... the alternative has been to extract the Russian and English from the Word file and then run the result through some Perl and a text editor in preparation for importation into Déjà Vu. I am about 95% done. I will complete the edit in a few minutes (after this steam-releasing break), and then import the file into DV.
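The actual processing was Perl plus a text editor, but the gist of the conversion step can be sketched in a few lines. Everything here is an assumption about the file's shape - say, the Word "glossary" saved out as plain text, one entry per line, Russian and English columns separated by a tab or a run of spaces - since the real format is Client U's, not shown here:

```python
# Rough sketch of converting a plain-text glossary dump into clean
# tab-delimited rows for import into a TM tool. The one-pair-per-line,
# tab-or-multiple-spaces layout is an assumed format, not Client U's actual one.
import re

def to_tab_delimited(lines):
    """Yield 'russian<TAB>english' rows, skipping blank or malformed entries."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first tab or run of two-plus spaces between the columns.
        parts = re.split(r"\t| {2,}", line, maxsplit=1)
        if len(parts) != 2:
            continue  # no recognizable divider: skip rather than guess
        ru, en = (p.strip() for p in parts)
        if ru and en:
            yield f"{ru}\t{en}"

sample = ["консервант   pretreat", "", "a line with no divider"]
rows = list(to_tab_delimited(sample))
# rows == ["консервант\tpretreat"]
```

The point of skipping malformed entries instead of guessing is that a glossary you're contractually held to had better not contain silently mangled pairs; better to flag the leftovers and fix them in the text editor pass.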

The real work starts tomorrow... or - seeing how much rain we've had today (almost 5/8 of an inch over by Fred H.'s) - maybe I'll take an hour or two and see if any mycological fruiting bodies have come up for air.

Cheers...
