The Evil Translation



  • A [url=http://forums.thedailywtf.com/forums/p/7509/139904.aspx#139904]while[/url] ago I promised a story/rant about Microsoft translators. Well, here it is. I'm not sure if the issue I describe is known stuff, or whether it is appropriate to the forum, or if it's too long, so "AS IS", etc.

    You probably know that Microsoft distributes a lot of localized Windows (and not just Windows, obviously) versions. Of course, it is nice to see the strings translated to your native tongue, especially if you don't know English at all. But sometimes, translators go too far. A relatively benign example is that they translate program names. Well, "Звукозапись" (or "Фонограф", as it was in Win95) is somewhat more intelligible than "Sound Recorder", but "Проигрыватель Windows Media"? Why didn't they translate "Internet Explorer" (since "Explorer" is "Проводник"), or, for that matter, "Windows"? Still, that's only mildly amusing, and I can live with it.

    But sometimes the translators poke their noses where they really shouldn't. For example, the messages shown by the Windows 98 bootloader when something went wrong were "localized". Well, the Cyrillic font obviously isn't loaded at this stage, so all that came out was an unintelligible mess of accented characters and symbols for the artist-formerly-known-as-Prince. And the Windows XP bluescreens (at least in SP1) made the same mistake. Yeah, I'm very happy that ïðîèçîøëà êðèòè÷åñêàÿ îøèáêà. Anyway, bluescreens don't happen very often, and I can live with it.
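    For the curious, the garbage above is what Cyrillic text stored as Windows-1251 bytes looks like when each byte is drawn as a Western (Latin-1) character. The following is only a sketch of that effect, not of anything Windows does internally: the function name and the hand-encoded byte table are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Treat every input byte as a Latin-1 code point and write it out as UTF-8.
 * Returns the number of bytes written, not counting the terminating NUL.
 * Output is truncated if the buffer is too small.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t n, char *out, size_t outsz)
{
    size_t o = 0;
    assert(in != NULL && out != NULL && outsz > 0);
    for (size_t i = 0; i < n; i++) {
        unsigned char b = in[i];
        if (b < 0x80) {
            if (o + 1 >= outsz) break;          /* keep room for the NUL */
            out[o++] = (char)b;
        } else {
            if (o + 2 >= outsz) break;
            out[o++] = (char)(0xC0 | (b >> 6)); /* two-byte UTF-8 sequence */
            out[o++] = (char)(0x80 | (b & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}

/* The word "произошла", encoded by hand in Windows-1251 (illustrative data). */
const unsigned char PROIZOSHLA_CP1251[9] = {
    0xEF, 0xF0, 0xEE, 0xE8, 0xE7, 0xEE, 0xF8, 0xEB, 0xE0
};
```

    Feeding `PROIZOSHLA_CP1251` through `latin1_to_utf8` and printing the result on a UTF-8 terminal should give the first word of the bluescreen mess quoted above.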

    (That was just the foreword, the actual story follows. 8=])

    Circa 1996, I had a computer (wow!). And it had the übercool OS installed on it: Windows 95! And it was really cool (I mean, I was just a kid back then). Well, besides such cool things as Paint and Wordpad, I had a certain [url=http://en.wikipedia.org/wiki/War_Gods_%28video_game%29]game[/url]. And it was (you guessed it) cool. And that's about all about this period.

    But circa 2001, an upgrade happened. And one of the consequences was that the trusty Windows 95 was replaced by Windows 98. Which was cool, but not so much. Months passed and I remembered The Game. Anticipating hours of fun gameplay, I installed it, but when I tried to launch it, it popped up a strange dialog: "To play War Gods you must close all other CD applications and insert the game CD. Retry/Cancel?"

    I scratched my head and went on to verify that the CD was, in fact, in the cupholder, and that no other applications were accessing it, or, for that matter, running. After realizing the futility of repeatedly clicking the Retry button, I was reduced to watching the demo mode, which was entertaining, but not as entertaining (and cool), as, say, [i]playing[/i] the game. Afterwards, I uninstalled it and started exploring some other game. PowerPoint, probably.

    The upgrade to Windows Me, which happened soon, didn't help the situation. The upgrade to Windows XP, which happened not-so-soon, didn't either.

    Fast-forward to 2006. By this point I already had an Internet connection and my programming knowledge was growing exponentially. After I learned some x86 assembly, I suddenly thought: Hey! I'm a 1ee7 cr@xx0r now! I can look into the game disassembly and find the bug which prevents me from kicking ass! So I grabbed a [url=http://www.ollydbg.de/]debugger[/url] and plunged into the realm of teh machine codez. By setting a breakpoint on the MessageBox function call, I quickly located the CD binding code.

    Here begins the interesting part. The game talked to Windows using MCI ([url=http://msdn2.microsoft.com/en-us/library/ms709461(VS.85).aspx]Media Control Interface[/url]). Basically, it sent a string and received a string in return. The first command issued was <font face="Courier New">status cdaudio ready</font>, and the response was <font face="Courier New">true</font>, since the drive was indeed ready. The game was happy with it and proceeded to request another bit of detail: <font face="Courier New">status cdaudio media present</font>. The response was identical and the game displayed curiosity again with <font face="Courier New">status cdaudio number of tracks</font>. To my surprise, the answer was <font face="Courier New">15</font>! After some head-scratching, I opened Windows Media Player and realized that the game data was on the first track, while the rest were music from the game. The game agreed that 15 tracks is just enough and then entered an interesting loop for <font face="Courier New">i</font> = 1 to 15. In each iteration the value of <font face="Courier New">i</font> was substituted into <font face="Courier New">status cdaudio type track %u</font>, and subsequently executed.

    This command is [url=http://msdn2.microsoft.com/en-us/library/ms713277(VS.85).aspx]supposed[/url] to return either <font face="Courier New">audio</font> or <font face="Courier New">other</font>, depending on the track type. Imagine my surprise when it returned some garbage string instead! In the first iteration, this string was deemed unequal to "audio" and thus OK; however, the garbage of the second iteration was expected to be equal to "audio", and thus the check for authenticity failed.
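    The check described above can be sketched roughly like this. It is a guess reconstructed from the behaviour seen in the debugger, not the game's actual code: `mci_query_fn`, `disc_looks_genuine` and the two fake responders are hypothetical names, and the responders return canned strings instead of talking to MCI.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the MCI round trip: command string in, reply out. */
typedef const char *(*mci_query_fn)(const char *command);

/* Assumed rule: track 1 holds data, every later track must report "audio". */
int disc_looks_genuine(mci_query_fn query, unsigned track_count)
{
    char command[64];
    assert(query != NULL);
    for (unsigned i = 1; i <= track_count; i++) {
        snprintf(command, sizeof command, "status cdaudio type track %u", i);
        int is_audio = strcmp(query(command), "audio") == 0;
        if (i == 1 && is_audio) return 0;   /* data track reported as audio */
        if (i > 1 && !is_audio) return 0;   /* music track not "audio" */
    }
    return 1;
}

/* Canned replies in the documented, untranslated form. */
const char *english_windows(const char *command)
{
    return strcmp(command, "status cdaudio type track 1") == 0 ? "other" : "audio";
}

/* Canned replies the way a Russian Windows phrased them (UTF-8 here). */
const char *russian_windows(const char *command)
{
    return strcmp(command, "status cdaudio type track 1") == 0 ? "другой" : "аудио";
}
```

    With the English responder the check passes for a 15-track disc. With the Russian one, track 1 slips through (it is merely "not audio"), and track 2 is the first to fail, which matches what the debugger showed.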

    Fine, I found the culprit, but what now? Windows is buggy? Since you've read the intro, you probably guessed the problem already, but I was flabbergasted. Not knowing what to do, I hacked together a simple VB program to experiment with MCI commands. Since I possess supreme C skillz now, I present you a C equivalent:

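    #include <stdio.h>
    #include <wchar.h>
    #include <locale.h>
    
    /* UNICODE must be defined before <windows.h> to select the wide-character APIs. */
    #define UNICODE
    #include <windows.h>  /* the MCI functions live in winmm, so link with winmm.lib */
    
    #define BUFSIZE 1024
    
    wchar_t inbuf[BUFSIZE], outbuf[BUFSIZE];
    
    int main(void)
    {
      setlocale(LC_ALL, "");
    
      /* Read one MCI command per line; print the reply or the error text. */
      while (fgetws(inbuf, BUFSIZE, stdin))
      {
        MCIERROR err;
        size_t len = wcslen(inbuf);
        /* Strip the trailing newline, guarding against an empty string. */
        if (len > 0 && inbuf[len - 1] == L'\n') inbuf[len - 1] = L'\0';
    
        err = mciSendString(inbuf, outbuf, BUFSIZE, NULL);
    
        if (err == 0)
          fwprintf(stdout, L"%ls\n", outbuf);
        else if (mciGetErrorString(err, outbuf, BUFSIZE))
          fwprintf(stdout, L"%ls\n", outbuf);
        else
          fputws(L"WTF?\n", stderr);
      }
    
      return 0;
    }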
    

    I punched in the same commands as the game did:

    status cdaudio ready
    true
    status cdaudio media present
    true
    status cdaudio number of tracks
    15
    status cdaudio type track 1
    [b]другой[/b]
    status cdaudio type track 2
    [b]аудио[/b]
    

    [b]ZOMGWTFBBQROTFLMAOP&PIMP face'o'table!!1! [i]They translated the [u]MAGIC STRING[/u]?!?17!?!?!seventy one?!?!?!?[/i][/b]

    When the stream of profanity ran out, I opened the game executable and overwrote the magic string "audio" with the sorta-magic string "аудио", and the message box nagged me no more.
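    The fix works only because both strings occupy the same number of bytes, so nothing else in the file has to move. A minimal sketch of such a same-length overwrite follows; it assumes the game compares single-byte (Windows-1251) strings, and `patch_in_place` and the two byte tables are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Overwrite the first occurrence of old_bytes in image with new_bytes.
 * Both sequences have the same length, so no other offsets shift.
 * Returns the patched offset, or -1 if old_bytes was not found.
 */
long patch_in_place(unsigned char *image, size_t image_len,
                    const unsigned char *old_bytes,
                    const unsigned char *new_bytes, size_t len)
{
    assert(image != NULL && old_bytes != NULL && new_bytes != NULL);
    if (len == 0 || len > image_len) return -1;
    for (size_t off = 0; off + len <= image_len; off++) {
        if (memcmp(image + off, old_bytes, len) == 0) {
            memcpy(image + off, new_bytes, len);
            return (long)off;
        }
    }
    return -1;
}

/* "audio" in ASCII and "аудио" in Windows-1251: five bytes each. */
const unsigned char AUDIO_ASCII[5]  = { 'a', 'u', 'd', 'i', 'o' };
const unsigned char AUDIO_CP1251[5] = { 0xE0, 0xF3, 0xE4, 0xE8, 0xEE };
```

    On a real executable the buffer would be the file contents read into memory and written back afterwards; that part is left out here.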



  • Wow, that's like translating the words in the system variables.



  • Nice story, well written.

    Good job!



  •  Kind of like they translated all the function names in Excel. Although I really hope that's stored in a language-independent way and just displayed differently (optimistic, I know).



  • I knew it was coming, but was still flabbergasted when it happened.  

    omgwow.  Microsoft thought this was a good idea on what planet? 



  • *unofficially nominates for best-of-the-side-bar article*

    Also, I love your avatar. How did nobody else think of having an Avatar 0.1?



  • I hereby nominate this for Best of the Sidebar.



  • @rbowes said:

    *unofficially nominates for best-of-the-side-bar article*

    Also, I love your avatar. How did nobody else think of having an Avatar 0.1?

     

    Wow... you beat me to it. Well I second it at least then. 



  • Another annoying thing is that Microsoft insist on translating shortcut keys, and the manner in which they do it. It might make sense to change them to fit with the language, but then it should be done consistently.

    In Danish versions of Windows/Office, for example: Ctrl-B for bold becomes Ctrl-F for fed. That of course results in Ctrl-F for find having to be remapped... One would think that it would become Ctrl-S for søg - but no, instead they chose to just swap the shortcuts, thus turning Ctrl-F into Ctrl-B for ..... I haven't the faintest idea.

    For someone who switches between different localizations of Microsoft products this quickly becomes a huge annoyance. 



  • @rbowes said:

    unofficially nominates for best-of-the-side-bar article

    Well, to put it on the front page, Alex'll have to anonymize it, right? And it's kinda tough to anonymize this. 8=]

    @rbowes said:

    Also, I love your avatar. How did nobody else think of having an Avatar 0.1?

    It's more 0.1 than you think. Since I have no clue about image editing, here's what I did to produce it:

    1. Download the anonymous image, open it in MS Photo Editor, set the transparent color to gray;
    2. Google for wooden table texture, download, crop (in MS Paint);
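    3. [code]<img style="position absolute; top: 10; left: 10" src=wood-cropped.dib> <img style="position absolute; top: 10; left: 10; filter: alpha(opacity=55) glow(Color=#eeeeee, Strength=2)" src=anonymous-cropped.gif>[/code];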
    4. PrintScreen;
    5. WTF?


  • To me the most tiresome thing about non-English versions of Windows is navigating the Control Panel.  The Control Panel elements have irritatingly vague names to begin with, like "Computer Management".  They're translated to equally vague terms, they're positioned differently (since they're sorted in lexicographical order), and it's hard to figure out what was what, especially when you're trying to follow instructions written in English.



  • @Spectre said:


    3. <font face="Lucida Console" size="2"><img style="position absolute; top: 10; left: 10" src=wood-cropped.dib> <img style="position absolute; top: 10; left: 10; filter: alpha(opacity=55) glow(Color=#eeeeee, Strength=2)" src=anonymous-cropped.gif></font>;
    4. PrintScreen;

    Videoing-in the avatar, Swampy style! 



  • @Spectre said:

    @rbowes said:
    *unofficially nominates for best-of-the-side-bar article*
    Well, to put it on the front page, Alex'll have to anonymize it, right? And it's kinda tough to anonymize this. 8=]

    At this point, I think it would be pointless to anonymize it, because he would link it to this post, and then have to edit your OP for anonymization.  

    OR he could anonymize it for front page, but never link it, but the first comment would be a link to this thread, ridiculing Alex for trying to anonymize it.  MS WTFs are hardly a secret.

    </ramble> 



  • @MasterPlanSoftware said:

    @rbowes said:

    unofficially nominates for best-of-the-side-bar article

    Also, I love your avatar. How did nobody else think of having an Avatar 0.1?

     

    Wow... you beat me to it. Well I second it at least then. 

    By less than a minute, too. That's awesome!



  • @rbowes said:

    @MasterPlanSoftware said:

    @rbowes said:

    *unofficially nominates for best-of-the-side-bar article*

    Also, I love your avatar. How did nobody else think of having an Avatar 0.1?

     

    Wow... you beat me to it. Well I second it at least then. 

    By less than a minute, too. That's awesome!

     

    I am pretty sure this has to make the front page now... at the very least just for this.



  • @belgariontheking said:

    @Spectre said:

    @rbowes said:
    unofficially nominates for best-of-the-side-bar article

    Well, to put it on the front page, Alex'll have to anonymize it, right? And it's kinda tough to anonymize this. 8=]

    At this point, I think it would be pointless to anonymize it, because he would link it to this post, and then have to edit your OP for anonymization.  

    OR he could anonymize it for front page, but never link it, but the first comment would be a link to this thread, ridiculing Alex for trying to anonymize it.  MS WTFs are hardly a secret.

    </ramble> 

    Best-of-the-sidebar articles are just posted verbatim, as far as I know.



  • Amusingly enough, those strings seem to be translated in [b]all[/b] languages of Windows.
    A Google search indicates that the Japanese version of Windows uses "オーディオ" and "その他" (which translate as "audio" and "other"); while I couldn't find code examples for other languages, it's fairly likely that they are affected as well.



  • @Quietust said:

    Amusingly enough, those strings seem to be translated in all languages of Windows. A Google search indicates that the Japanese version of Windows uses "オーディオ" and "その他" (which translate as "audio" and "other"); while I couldn't find code examples for other languages, it's fairly likely that they are affected as well.

    I was lucky. An imaginary Japanese hacker wouldn't be able to patch the executable like I did, since その他 is longer than 5 bytes in every encoding I can think of. Although, he might be able to hack the Windows multimedia library instead 8=].



  • @witchdoctor said:

     Kind of like they translated all the function names in Excel. Although I really hope that's stored in a language-independent way and just displayed differently (optimistic, I know).

     

    Reminds me of an old version of MS Office (must be like seven years ago), where they actually translated the VBA keywords, such as if, while, function, ...! I am quite sure that this was not language-independent, so your English VBA scripts did not run in a German Office version.



  • @Spectre said:

    その他

    Err, I meant オーディオ. Damn editing limit.
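    To put numbers on it: a rough sketch with "オーディオ" encoded by hand in two encodings a Japanese Windows program might plausibly see. Which encoding a given game would actually compare against is an assumption, and `fits_in_slot` is just an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>

/* "オーディオ" encoded by hand; two bytes per character in Shift-JIS. */
const unsigned char ODIO_SHIFT_JIS[] = {
    0x83, 0x49,  0x81, 0x5B,  0x83, 0x66,  0x83, 0x42,  0x83, 0x49
};

/* The same word in UTF-8; three bytes per character. */
const unsigned char ODIO_UTF8[] = {
    0xE3, 0x82, 0xAA,  0xE3, 0x83, 0xBC,  0xE3, 0x83, 0x87,
    0xE3, 0x82, 0xA3,  0xE3, 0x82, 0xAA
};

/* An in-place overwrite is only safe if the new bytes fit the old slot. */
int fits_in_slot(size_t replacement_len, size_t slot_len)
{
    assert(slot_len > 0);
    return replacement_len <= slot_len;
}
```

    The slot left by "audio" is 5 bytes, so neither the 10-byte nor the 15-byte form fits.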



  • @Martin Dreier said:

    I am quite sure that this was not language-independent, so your English VBA scripts did not run in a German Office version.
    It was a vain attempt at preventing German and Russian hackers from spreading macro viruses.



  • @Spectre said:

    I was lucky. An imaginary Japanese hacker wouldn't be able to patch the executable like I did, since その他 is longer than 5 bytes in every encoding I can think of. Although, he might be able to hack the Windows multimedia library instead 8=].

    Actually, the code to do the string comparison would probably be large enough that you could overwrite it with code to just compare the first byte or word or whatever works best.



  •  W.T.F.



  • @The Real WTF said:

    @Spectre said:

    I was lucky. An imaginary Japanese hacker wouldn't be able to patch the executable like I did, since その他 is longer than 5 bytes in every encoding I can think of. Although, he might be able to hack the Windows multimedia library instead 8=].

    Actually, the code to do the string comparison would probably be large enough that you could overwrite it with code to just compare the first byte or word or whatever works best.

     

    Actually, the code to do the conditional jump after the comparison would be large enough that you could overwrite it with a nop or an unconditional jump so that your game will run whether you have the CD in the drive or not. 



  • Well, actually I've been working as a Localization Software Engineer at a $BIG_COMPANY_FROM_USA. I have to clarify all these things a bit.

    First of all: stop blaming translators :)

    I don't quite know how it is organized at MS, but typically translations are done by external translation vendors, and they translate the things they are told to. So if somebody tells them "don't translate Internet Explorer, it is a trademark" (which it in fact is), they won't. The guy who tells them what to translate is not necessarily a Localization Engineer. Large companies usually hire a dedicated person to do translation reviews. What that person says is sacred, no matter what other people think :)  Surely, sometimes it leads to funny translations, like the Hungarian translation of the word "account": "fiók" (which also means "drawer").

    > And the Windows XP bluescreens (at least in SP1) made the same

    > mistake. Yeah, I'm very happy that ïðîèçîøëà êðèòè÷åñêàÿ îøèáêà.

    This is a typical error with the Cyrillic (Windows-1251) encoding. Don't blame translators, blame Localization Engineers (or Software Developers).

     > They translated the MAGIC STRING?!?17!?!?!

    <voice type="Forrest Gump">It happens</>. The way L10n works leads to such errors. L10n usually sends out for translation every single string that can be found in the resources (be it an *.rc file or a compiled binary file). It is up to the Localization Engineer's experience to decide whether a given string should be translated or not. But you shouldn't blame LEs for that; it is clearly the Developer's fault. Strings that should never, ever be translated have to be hardcoded. From my experience I know that developers can leave such things as "-ERR" or "451 TLS is not available for unknown reasons" in the resources. And the QA team is not always able to find the functional bugs associated with that. (BTW, the "451 ..." will be displayed in MS OE, and the QA team will demand that it be translated. If you do it, you will violate the RFC...)
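    One way to keep such strings out of the translators' reach is to split the machine-readable token from the human-readable label. This is only a sketch of the idea with made-up names (`track_type_token`, `track_type_label`, `is_audio_token`); a real product would pull the labels from its resource files instead of a hardcoded table.

```c
#include <assert.h>
#include <string.h>

enum track_type { TRACK_AUDIO, TRACK_OTHER };
enum ui_lang    { LANG_EN, LANG_RU };

/* Token for programs: hardcoded, never sent out for translation. */
const char *track_type_token(enum track_type t)
{
    return t == TRACK_AUDIO ? "audio" : "other";
}

/* Label for humans: the only part a translator should ever touch. */
const char *track_type_label(enum track_type t, enum ui_lang lang)
{
    static const char *const labels[2][2] = {
        { "audio", "other"  },   /* LANG_EN */
        { "аудио", "другой" },   /* LANG_RU */
    };
    assert(lang == LANG_EN || lang == LANG_RU);
    return labels[lang][t == TRACK_AUDIO ? 0 : 1];
}

/* What a caller such as the game effectively does with the reply. */
int is_audio_token(const char *reply)
{
    return strcmp(reply, "audio") == 0;
}
```

    A caller that parses `track_type_token` keeps working on any localization, while one that parses `track_type_label` breaks as soon as the language changes.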



  • @gremlin said:

    Actually, the code to do the conditional jump after the comparison would be large enough that you could overwrite it with a nop or an unconditional jump so that your game will run whether you have the CD in the drive or not.

    Yes, but where's the fun in such a simple, practical solution?



  •  @Spectre said:

    @Quietust said:
    Amusingly enough, those strings seem to be translated in all languages of Windows. A Google search indicates that the Japanese version of Windows uses "オーディオ" and "その他" (which translate as "audio" and "other"); while I couldn't find code examples for other languages, it's fairly likely that they are affected as well.

    I was lucky. An imaginary Japanese hacker wouldn't be able to patch the executable like I did, since その他 is longer than 5 bytes in every encoding I can think of. Although, he might be able to hack the Windows multimedia library instead 8=].

    The imaginary Japanese hacker would probably resort to some variation of 90 90 90.

