Nov. 30th, 2001

alexpgp: (Default)
Soon after the development of the electronic computer, various people set their sights on the solution of pie-in-the-sky type problems.

In 1948, Claude Shannon outlined a general algorithm that is, in fact, used today to implement chess-playing programs. It took computer scientists just over 20 years to produce a program (Mac Hack VI) that could draw one game out of six against real chess players, and about 50 years to produce a hardware/software complex that could defeat the World Champion in match play.

At about the same time, researchers in various parts of the world turned to analysis of a more direct, age-old problem: the translation of text from one language to another.

To the naive observer, this shouldn't be a terribly difficult problem. Get yourself a big bilingual dictionary, resolve the source text into individual words, look up their corresponding words in the target language, and then reconstitute the text.
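For the curious, here is that naive recipe as a minimal Python sketch. The four-word glossary is my own invention (rough single-word Russian picks, purely for illustration), and `naive_translate` is a hypothetical name, not anything taken from a real product.

```python
# Toy word-for-word "translator". The glossary is a made-up illustration,
# not a real bilingual dictionary.
GLOSSARY = {
    "out": "вне",
    "of": "из",
    "sight": "зрение",
    "mind": "ум",
}

def naive_translate(text, glossary):
    """Split text into words, look each one up, and glue the results together."""
    words = text.lower().replace(",", "").split()
    # Words missing from the glossary are passed through in brackets
    # so the gaps stay visible.
    return " ".join(glossary.get(word, f"[{word}]") for word in words)

print(naive_translate("Out of sight, out of mind", GLOSSARY))
# prints: вне из зрение вне из ум
```

The output is a string of dictionary words with no grammar and no sense of the idiom, which is the whole problem in miniature.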

It sounds simple, but the reality is much more complex, as anyone who has seriously undertaken a translation knows.

IBM had an exhibit at the 1964 New York World's Fair where visitors could type in sentences on a Selectric typewriter in English. The typewriters were connected as input terminals to a System 360, which was running a program that analyzed the sentence and then tried to output the same thought in Russian.

A very famous "funny" translation from that era was the rendering of "out of sight, out of mind" as "blind idiot" in Russian. The saying "the spirit is willing, but the flesh is weak" was dutifully rendered as "the liquor is good, but the meat is inferior."

Recently, when the Clinton-Lewinsky materials were subjected to this process of "machine translation," there were still an uncomfortable number of howlers in the rendered text (e.g., the "White House physician" became a house doctor who was white in, if memory serves, the German version).

If you want to drive a machine translation program to the wall (and maybe derive a chuckle if you read the target language) try the following sentence as input:

"Time flies like an arrow, but fruit flies like a banana."

Anyway, actually doing the translation is not what Déjà Vu or SDLX are all about. These applications fall into the realm of what are called "translation memory" products.

Both products take electronic source files and rip them apart into sentences, keeping track of formatting (bold, italic, superscript, etc.). The source sentences are then displayed in the left-hand side of the translator's screen and the target sentences (or blanks) are displayed on the right-hand side.
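A rough sketch of the "rip apart into sentences" step, in Python. This is only my guess at the general idea: the function names are hypothetical, the splitting rule is deliberately crude (an abbreviation such as "Mass." would fool it), and it ignores formatting entirely, which the real products track.

```python
import re

def split_into_segments(text):
    """Break text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

def build_grid(text):
    """Pair each source segment with an empty target cell, like the two-column screen."""
    return [(segment, "") for segment in split_into_segments(text)]

for source, target in build_grid(
    "Inspect compressor 1. Take air samples at four locations in the Lab Module."
):
    print(f"{source!r:70} | {target!r}")
```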

A "translation memory" is a database that is compiled using a source file that has been translated previously. Let's say, for example, that I have yesterday's Form 24 for the ISS crew, and its translation.

The memory file matches what the translator wrote with each source sentence. Yesterday, for example, the crew engaged in "утренний туалет" (literal translation: morning toilet, in the sense of washing, brushing teeth, shaving, dressing, etc.). The English for that (in NASA-speak) is "post-sleep activity."

This pair is stored in the translation memory. When the application opens today's file and sees "утренний туалет" in the original Russian, it goes ahead and fills in the English for that particular sentence for me. All very mechanical.
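To show just how mechanical, here is a minimal Python sketch of an exact-match memory. The class and function names are hypothetical, and the second Russian segment ("завтрак", breakfast) is my own filler; I am not claiming this is how Déjà Vu or SDLX store anything internally.

```python
class TranslationMemory:
    """A bare-bones store of source/target sentence pairs."""

    def __init__(self):
        self._pairs = {}

    @staticmethod
    def _normalize(segment):
        # Collapse runs of whitespace and ignore case so trivial
        # differences do not defeat an otherwise exact match.
        return " ".join(segment.split()).casefold()

    def store(self, source, target):
        self._pairs[self._normalize(source)] = target

    def lookup(self, source):
        """Return the stored translation, or None if the segment is new."""
        return self._pairs.get(self._normalize(source))

def pretranslate(segments, memory):
    """Fill in every segment the memory already knows; leave the rest blank."""
    return [(segment, memory.lookup(segment)) for segment in segments]

memory = TranslationMemory()
memory.store("утренний туалет", "post-sleep activity")  # yesterday's work

for source, target in pretranslate(["утренний туалет", "завтрак"], memory):
    print(source, "->", target)
```

Anything that comes back as None is a sentence the translator still has to handle by hand.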

Some pairs may match only partially. Yesterday's "Inspect compressor 1" may be today's "Inspect compressor 2." Last week's "Take air samples at four locations in the Lab Module" may be today's "Take water samples from EDVs in the FGB," which is not as exact a match, but still semantically close enough for the computer to grok it. The program can also take translator-supplied terminology and analyze upcoming sentences in order to flag those terms for the translator.
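Partial matching and term flagging can be sketched along the same lines. In the Python below I lean on the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; that is my assumption, not the method either product uses, and a character-level score like this knows nothing about meaning. The 0.75 threshold, the function names, and the placeholder strings are all hypothetical.

```python
from difflib import SequenceMatcher

def best_fuzzy_match(segment, memory, threshold=0.75):
    """Return (score, stored_source, stored_target) for the closest stored
    sentence, or None if nothing reaches the threshold."""
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best

def flag_terms(segment, glossary):
    """Return the glossary entries whose term appears in the segment."""
    lowered = segment.lower()
    return {term: note for term, note in glossary.items() if term.lower() in lowered}

# Placeholder data: the "translations" are dummies, not real renderings.
memory = {"Inspect compressor 1": "(stored translation of 'Inspect compressor 1')"}
glossary = {"EDV": "(approved rendering of EDV)", "Lab Module": "(approved rendering)"}

print(best_fuzzy_match("Inspect compressor 2", memory))
print(flag_terms("Take water samples from EDVs in the FGB", glossary))
```

A hit below 100% is only a suggestion: the translator still has to change the "1" to a "2" before accepting it.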

In effect, the "smarts" being applied to the translation are the translator's. All through the process, the computer provides unparalleled sentence tracking and pretty good terminology matching. Once all the sentences are translated (with or without the help of the memory file and terminology tagger) the application creates a target file with all of the formatting where it needs to go. The end result is a better translation from the point of view of completeness and consistency, and - at least in theory - a somewhat less stressed translator.

* * *
I opened the store again today, and got out around 11 am after doing two days' worth of postal reports. When I got home, I had to spend some time putting together a report for the AMPC that's due by the end of the month (which was today), so I was only able to start on my current job after noon.

The day seemed to whiz by quickly, and despite what I felt was a nice clip on the translation, I still have 6 pages to go, based on my scheduling effort of a couple of days ago.

In truth, I can probably quit for the night after two more pages (actually, I can probably put it up now, but I'll feel better after a couple more pages are done...), since the other job I have - which I figured would take two days - will likely take a day and a half (and may not take that long, as the pages are shorter than the ones I'm dealing with now).

Still, either way you slice it, unless I make some truly spectacular progress on these jobs tomorrow, I will probably spend the entire weekend working, which in and of itself is not awful, but I had wanted to go to Durango on Sunday with Galina and the kids and do a family portrait.

We'll see. For now, it's back to work!

Cheers...
alexpgp: (Default)
Near the beginning of the film Saving Private Ryan, while setting up the story of the film, the screenwriters place a minor character - a general - in a situation where he feels compelled to read a letter concerning a mother who had lost five sons in the Civil War.

He starts by reading the letter out loud for the benefit of his interlocutor (and us, the audience), and then slowly looks up with slightly glazed eyes and sits down as he finishes reciting the letter from memory. The general sums up by telling his listener(s) the author's name: Abraham Lincoln.

(It's a pretty strong scene, and reflects the kind of concern for human life that eventually was largely beaten out of the mind-set of "corporate" military officers under the leadership of vermin like Robert McNamara, and left largely to junior officers and NCOs, but I digress...)

The letter was written by Abraham Lincoln, in a time long before anyone heard of speechwriters, focus groups, or PR flacks. In an old poetry book of mine, I ran across a facsimile of the letter, the original of which is said to be on display at Brasenose College at Oxford University in England. The book describes it as "a model of purest English, rarely, if ever, surpassed."

Executive Mansion
Washington, Nov. 21, 1864

To Mrs. Bixby, Boston, Mass.

Dear Madam,

I have been shown in the files of the War Department a statement of the Adjutant General of Massachusetts that you are the mother of five sons who have died gloriously in the field of battle. I feel how weak and fruitless must be any word of mine which should attempt to beguile you from the grief of a loss so overwhelming. But I cannot refrain from tendering you the consolation that may be found in the thanks of the republic they died to save. I pray that our Heavenly Father may assuage the anguish of your bereavement, and leave you only the cherished memory of the loved and lost, and the solemn pride that must be yours to have laid so costly a sacrifice upon the altar of freedom.

Yours very sincerely and respectfully
A. Lincoln

I'm sure nothing like this would fly in today's cynical age, but nonetheless, it is a powerful set of words. Lincoln was very adept at extracting the greatest amount of oomph from the least number of words.

I wonder if the letter is still at Brasenose?

Cheers...
