Aug. 14th, 2012

alexpgp: (St Jerome a)
Every once in a while, your friendly, neighborhood translator is gobsmacked by an abbreviation that shows up in a document seemingly out of nowhere and, naturally, demands attention.

Take, for example, the Russian abbreviation "САР" (—please!) in a document I'm working on. To my credit, I realize there's a better than even chance that the last two letters stand for анализ риска (risk analysis), but without knowing what the first letter expands to, I may as well just transliterate the abbreviation (SAR) and move on, as it were.

That is, except for one sturdy little straw that's available for the grasping, involving a search using wildcards. Consider the following string:
[а-я]@
In Microsoft Word's variant of wildcard code, this means "one or more occurrences of any lower-case letter between 'а' and 'я'." If one tacks the character 'с' to the front, like this:
с[а-я]@
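followed by a space, performing a search will find every instance of a word of at least two letters whose first letter is 'с' (strictly speaking, it will also find the tail end of any longer word that contains 'с', unless the pattern is anchored to the start of a word with '<'). Continuing with this logic,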
с[а-я]@ а[а-я]@ р[а-я]@
will find three consecutive words, of two or more letters each, that begin with 'с', 'а', and 'р', respectively (I use lower case because Russian is generally pretty sparing when it comes to capitalizing words).
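For anyone who would rather run this kind of search outside of Word, here is a minimal sketch of the same idea in Python. It is an analog of the Word wildcard string rather than a translation of it: the function names and the sample sentence are made up for illustration, `\b` stands in for word boundaries, and 'ё' is added to the character class on the assumption that it should count as a letter (it falls outside the 'а' to 'я' range in Unicode).

```python
import re


def expansion_pattern(abbreviation: str) -> re.Pattern:
    """Build a regex that matches consecutive lower-case Russian words
    whose first letters spell out the abbreviation."""
    # One sub-pattern per letter: the letter itself, then one or more
    # lower-case Cyrillic letters, so each word has at least two letters.
    words = [rf"{re.escape(letter)}[а-яё]+" for letter in abbreviation.lower()]
    # \b keeps the first word from starting (and the last from ending)
    # in the middle of a longer word.
    return re.compile(r"\b" + r"\s+".join(words) + r"\b")


def find_expansions(abbreviation: str, text: str) -> list[str]:
    """Return every candidate expansion of the abbreviation found in text."""
    return [m.group(0) for m in expansion_pattern(abbreviation).finditer(text)]


# Hypothetical sample sentence, not taken from the actual document.
sample = "Здесь применяется системный анализ риска (САР) для оценки."
print(find_expansions("САР", sample))
```

As with the Word version, this only helps if the abbreviation is actually spelled out somewhere in the text being searched.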

I hit paydirt with the second successful "find":
системный анализ риска
or "system risk analysis."

There are times this technique will not work, but it's almost always worth a try when you're up against it.

Cheers...
alexpgp: (St Jerome a)
One of the figures in the document I was working on today underwent some kind of warp, resulting in the conversion of CP 1251 (single-byte Cyrillic) characters into CP 1252 (single-byte "Latin 1") characters.

I was now faced with the prospect of undoing the conversion, taking something like, um,
ñèñòåìíûé àíàëèç ðèñêà
and rendering it as something readable, i.e.
системный анализ риска
Fortunately, I not only covered this particular issue in my presentation titled Navigating the Cyrillic "Swamp" made at the 2002 ATA Conference in Atlanta (has it really been almost ten years?), but I also kept track of the associated PowerPoint presentation, which helped me "decode" the gibberish in the figure back into Russian.
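For the record, the same repair can be sketched in a couple of lines of Python. This is not the method from the presentation, just an illustration of the underlying idea: turn the gibberish back into the bytes it came from by encoding it as CP 1252, then read those bytes as CP 1251. The function name is made up for the example.

```python
def repair_cp1251_as_cp1252(garbled: str) -> str:
    """Undo text whose CP 1251 bytes were mistakenly read as CP 1252."""
    # Encoding as CP 1252 recovers the original byte values;
    # decoding those bytes as CP 1251 restores the Cyrillic text.
    return garbled.encode("cp1252").decode("cp1251")


print(repair_cp1251_as_cp1252("ñèñòåìíûé àíàëèç ðèñêà"))
```

This works cleanly for ordinary Russian letters. Text containing characters that have no counterpart in one of the two code pages will raise an error rather than convert, so a real cleanup job may need extra handling.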

Going after a little of that wow! factor, y'dig?

No turn left unstoned! :^)

Cheers...
