Python Babel



  • I am currently building a medium-sized website using Python + Flask - most of it is really nicely done from an architecture point of view. But today I started to look into internationalization and my skin started to crawl - I am accustomed to .NET resource files, where most of the work is done "under the hood", but Flask uses Babel, which in turn uses PO files - and generating those files is a ton of work. At least three different console programs, cryptic as hell, funny editors for those files and constant head scratching.
    Small example:
    $ pybabel extract -F babel.cfg -o messages.pot .
    $ pybabel init -i messages.pot -d translations -l de
    $ pybabel compile -d translations
    And that throws all in one gigantic file, but I would like to have separate files for different parts of my website.

    I'm looking at defining some JSON structure and doing it all myself; at least the number of senseless commands will be smaller.
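
On the wish for separate files: gettext groups messages into named domains, and `pybabel init` and `pybabel compile` both accept `-D`/`--domain`, so one catalog per part of the site is possible. A minimal sketch with the standard-library `gettext` module, assuming hypothetical domain names `shop` and `admin` and the `translations` directory used in the commands above. With `fallback=True` the untranslated string comes back when no compiled catalog is found.

```python
import gettext

LOCALE_DIR = "translations"  # same directory as in the pybabel commands


def load_domain(domain, lang, localedir=LOCALE_DIR):
    """Load one catalog per site area.

    gettext looks for <localedir>/<lang>/LC_MESSAGES/<domain>.mo and, because
    of fallback=True, returns a pass-through translator if that file is missing.
    """
    return gettext.translation(
        domain, localedir=localedir, languages=[lang], fallback=True
    )


# "shop" and "admin" are hypothetical domain names chosen for illustration.
shop = load_domain("shop", "de")
admin = load_domain("admin", "de")

print(shop.gettext("Add to cart"))
print(admin.gettext("Delete user"))
```

Each domain then gets its own `.po`/`.mo` pair, so translators can work on one area without touching the others.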


  • FoxDev

    Wouldn't this be better in Coding Help?



  • Nope - the coding is rather straightforward, just the process is like crawling through a pipe, swimming 10 kilometres upstream and digging 10 meters deep just to translate some text.



  • There's no question. It's just a WTF.

    And the biggest WTF is: this stuff was probably all written after .NET already had resource files. Because you're using open source products, and God forbid they learn anything from any other implementations ever. There's no advancement in open source. No evolution. No maturing. Just the same old shit over and over again.





  • @blakeyrat said:

    There's no advancement in open source. No evolution. No maturing. Just the same old shit over and over again.

    There's no advancement in your commentary on approaches to software development. No evolution. No nuance. Just the same old shit over and over again.



  • It's the Unix philosophy: do one thing and do it well. Which means to do any common task you need a bunch of different commands. It sucks.



  • @anonymous234 said:

    to do any common task you need a bunch of different commands

    which it's trivially easy to save as a script.
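
A rough sketch of such a script, written in Python to match the project. It wraps the three commands from the first post and swaps `init` for `pybabel update` once a language already exists, so existing translations are kept. The function names and defaults are made up for illustration.

```python
import subprocess


def babel_commands(lang, pot="messages.pot", out_dir="translations",
                   cfg="babel.cfg", first_time=False):
    """Return the pybabel invocations for one language, in order."""
    extract = ["pybabel", "extract", "-F", cfg, "-o", pot, "."]
    # "init" creates a fresh catalog; "update" merges new strings into an
    # existing one without discarding finished translations.
    action = "init" if first_time else "update"
    sync = ["pybabel", action, "-i", pot, "-d", out_dir, "-l", lang]
    compile_all = ["pybabel", "compile", "-d", out_dir]
    return [extract, sync, compile_all]


def run_all(lang, **kwargs):
    """Run extract, init/update and compile; stop at the first failure."""
    for cmd in babel_commands(lang, **kwargs):
        subprocess.run(cmd, check=True)


# Typical use (requires pybabel on the PATH):
#   run_all("de", first_time=True)   # first run for a new language
#   run_all("de")                    # every later run
```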



  • Granted - but why do I have to invent that damn script again?
    That whole workflow screams "let's build an IDE for it!" - hell, you could even integrate it into something like PyCharm, which is otherwise quite nice.



  • @blakeyrat said:

    And the biggest WTF is: this stuff was probably all written after .NET already had resource files

    .NET didn't invent resource files :rolleyes:



  • @HdS said:

    That whole workflow screams "let's build an IDE for it!"

    Something like this?



  • And that changes what I said... how?


  • Winner of the 2016 Presidential Election

    I haven't used gettext in a while, but there are a few nice editors for .po files IIRC. Like https://poedit.net, which @TimeBandit mentioned.


  • Winner of the 2016 Presidential Election

    @HdS said:

    I'm looking at defining some JSON structure and doing it all myself; at least the number of senseless commands will be smaller.

    Those commands are not senseless. The first one is for extracting the translatable strings from the code, so that you know what you have to translate. You can manually create a template as well, but why would you do that?

    The second command creates a new translation file for one language from the template. The last one translates the text format (for easy translation) into a binary format which gettext uses at runtime. No unnecessary steps there.

    What exactly is your problem and why would you rather use a less elaborate system that doesn't have the ability to create translation templates automatically?

    Edit: There are also a bunch of tools which allow you to easily update all translation files and even work when the original string has been slightly altered. Gettext is actually pretty cool for translators.


  • I survived the hour long Uno hand

    @anonymous234 said:

    It's the Unix philosophy: do one thing and do it well.

    I've come to realize that a vast number of programmers don't know what "well" means.

    Or "one".


  • Winner of the 2016 Presidential Election

    @blakeyrat said:

    Because you're using open source products, and God forbid they learn anything from any other implementations ever.

    According to my Google search, .NET resource files offer absolutely no support for translating plural forms. I've literally never seen a single translation system except gettext that actually supports all features translators actually need to translate arbitrary messages properly. If you think another system is better, you've never actually tried to translate a large program into a language more complicated than English.

    So the joke's on you and Microsoft: They were literally too dumb to take a look at gettext and its features and learn from it.


  • FoxDev

    Or they looked at it, looked at the sorts of programs typically designed for Windows, and decided it wasn't worth the effort.


  • Winner of the 2016 Presidential Election

    @RaceProUK said:

    Or they looked at it, looked at the sorts of programs typically designed for Windows, and decided it wasn't worth the effort.

    Supporting the different plural forms of different languages is an extremely basic feature, and yet most translation systems simply ignore the problem. Take the following message:

    "You have %d new emails"

    Do you have any idea how difficult it is to translate that message correctly? A programmer from an English-speaking country would probably be smart enough to have two different strings:

    "You have %d new emails"
    "You have one new email"

    But that's not nearly enough, see the link I posted above. Without completely re-wording the message into something awkward, it's impossible to translate into many European languages and Arabic. Plus, you've already pissed off translators whose native language doesn't have plural forms at all, because they now have to translate the same shit twice.

    The only correct way to make that string translatable is using a translation system that handles the plural forms for different languages, creates the appropriate amount of strings for each language and selects the correct one automatically at runtime.


  • FoxDev

    And you can guarantee that the translation system will always get the right plural form for every possible word?


  • kills Dumbledore

    You have %d new sheeps
    You have %d new stadia


  • Winner of the 2016 Presidential Election

    No, but it knows how many plural forms the language has and can create the correct amount of translatable strings in each translation file.

    BTW: Never ever do what I did above and use format specifiers that the translator cannot re-order, unless you want to be murdered by your translators. And don't even think about:

    std::cout << translate("You have ") << d << translate(" new messages");

    Or I will personally come after you with a clue bat. And I cannot guarantee I'll stop beating you when you've realized your mistake.

    In case you haven't noticed: I have translated computer programs for money before. And I've learned to hate clueless programmers in the process. Programmers should be taught I18N basics before they're allowed to touch any code. The fact that most translation systems don't even support the most basic features translators absolutely need is proof that a lot of education is necessary.
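
To illustrate the point about re-orderable format specifiers in the thread's own language: Python's `%` formatting accepts named placeholders, which a translator can move freely. A small sketch; the German string is a made-up catalog entry, not taken from any real project.

```python
# Source string with named placeholders.
source = "Copied %(count)d files to %(folder)s"

# Hypothetical German translation: the folder moves to the front of the sentence.
translated = "Nach %(folder)s wurden %(count)d Dateien kopiert"

args = {"count": 3, "folder": "Backup"}
print(source % args)
print(translated % args)

# With positional specifiers the same reordering breaks, because the
# arguments are still passed in the English order (count, folder).
try:
    "Nach %s wurden %d Dateien kopiert" % (3, "Backup")
except TypeError as exc:
    reordering_error = exc
    print("positional placeholders failed:", exc)
```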


  • I survived the hour long Uno hand

    @asdf said:

    The only correct way to make that string translatable is using a translation system that handles the plural forms for different languages, creates the appropriate amount of strings for each language and selects the correct one automatically at runtime.

    We use a system where humans define what the appropriate text should be in each language and the system just yanks it out of the DB. We've sometimes had issues where text got too long and broke the CSS, but that's what testing is for.


  • FoxDev

    @Yamikuronue said:

    We use a system where humans define what the appropriate text should be in each language and the system just yanks it out of the DB. We've sometimes had issues where text got too long and broke the CSS, but that's what testing is for.

    That is an uncannily accurate description of the translation system we use 😄



  • @asdf said:

    I've literally never seen a single translation system except gettext that actually supports all features translators actually need to translate arbitrary messages properly.

    You mean like gender?


  • Winner of the 2016 Presidential Election

    @Yamikuronue said:

    We use a system where humans define what the appropriate text should be in each language and the system just yanks it out of the DB.

    Humans will always define the appropriate text. But you need some kind of system that does something like this for strings containing numbers (pseudo code):

    switch language {
        case English | ... {
            if number == 1 return singular_translation;
            else return plural_translation;
        }
        case Arabic {
            if number % 100 == 99 return translation1;
            ....
        }
        ....
    }
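
The pseudo code above could look roughly like this in Python. The English and Japanese rules are the usual two-form and one-form cases. The Arabic rule follows the six-form expression commonly listed for gettext's `Plural-Forms` header and is an assumption here, not something stated in this thread. All function names are invented for the sketch.

```python
def plural_index_english(n):
    # Two forms: singular for exactly 1, plural otherwise.
    return 0 if n == 1 else 1


def plural_index_japanese(n):
    # One form: no grammatical plural.
    return 0


def plural_index_arabic(n):
    # Six forms, as in the commonly cited gettext rule (assumed).
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 2
    if 3 <= n % 100 <= 10:
        return 3
    if n % 100 >= 11:
        return 4
    return 5


RULES = {
    "en": plural_index_english,
    "ja": plural_index_japanese,
    "ar": plural_index_arabic,
}


def pick_form(lang, n, forms):
    """Select the translated string whose index matches the count n."""
    return forms[RULES[lang](n)]


english = ["You have %d new email", "You have %d new emails"]
print(pick_form("en", 1, english) % 1)
print(pick_form("en", 7, english) % 7)
```

The translator still writes every form by hand; the rule only decides which one is shown for a given number.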
    

  • Winner of the 2016 Presidential Election

    @ben_lubar said:

    You mean like gender?

    My advice would be to avoid names in translatable strings altogether. Gender (which you may not know) is not the only problem: adjectives might have different forms depending on the relationship between two people, social status, …

    If you absolutely need to include a name in a translatable string, you'd better provide a lot of context.



  • Symfony framework does a decent job at pluralization.

    In its simplest form:

    email.new_emails_message: You have one new email|You have %num% new emails
    

    Or you can use their little DSL to specify the forms exactly based on item count:

    page.num_apples: {0} There are no apples|{1} There is one apple|]1,Inf[ There are %count% apples
    

    I used i18n in .NET just once, I don't remember how I handled pluralization then, or if that was even an issue. Node? Don't make me laugh.


  • FoxDev

    @asdf said:

    My advice would be to avoid names in translatable strings altogether.

    In many languages, nouns have gender; are you saying we should abandon the use of nouns?



  • Only the words that are substituted into the string need to be handled by the translation software. That does mean that gettext can't handle sentences with multiple numbers in them, for example.


  • Winner of the 2016 Presidential Election

    You have to do that for every string, though, right? Gettext allows you to specify the number of forms once per language.
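
For comparison, a sketch of the "once per language" part using only Python's standard library. The plural rule sits in the catalog's `Plural-Forms` header, and `ngettext` picks the form at runtime. To keep the example self-contained it packs a tiny `.mo` file in memory instead of running `pybabel compile`; the `build_mo` helper and the Polish strings are illustrative assumptions.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack {msgid: msgstr} byte pairs into the binary .mo layout (no hash table)."""
    keys = sorted(messages)
    ids = strs = b""
    spans = []
    for key in keys:
        spans.append((len(key), len(ids), len(messages[key]), len(strs)))
        ids += key + b"\0"
        strs += messages[key] + b"\0"
    key_table = 7 * 4                       # header is seven 32-bit words
    value_table = key_table + 8 * len(keys)
    ids_start = value_table + 8 * len(keys)
    strs_start = ids_start + len(ids)
    out = struct.pack("<7I", 0x950412DE, 0, len(keys),
                      key_table, value_table, 0, 0)
    for klen, koff, _, _ in spans:
        out += struct.pack("<2I", klen, koff + ids_start)
    for _, _, vlen, voff in spans:
        out += struct.pack("<2I", vlen, voff + strs_start)
    return out + ids + strs


# The plural rule is declared once, in the header of the Polish catalog.
HEADER = (
    "Content-Type: text/plain; charset=UTF-8\n"
    "Plural-Forms: nplurals=3; plural=(n==1 ? 0 : n%10>=2 && n%10<=4 "
    "&& (n%100<10 || n%100>=20) ? 1 : 2);\n"
)
SINGULAR = "You have %d new message"
PLURAL = "You have %d new messages"
FORMS = ["Masz %d nową wiadomość",
         "Masz %d nowe wiadomości",
         "Masz %d nowych wiadomości"]

catalog = {
    b"": HEADER.encode("utf-8"),
    # Plural entries join msgid/msgid_plural and the forms with NUL bytes.
    (SINGULAR + "\0" + PLURAL).encode("utf-8"): "\0".join(FORMS).encode("utf-8"),
}
polish = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))

for n in (1, 3, 5, 12, 22):
    print(polish.ngettext(SINGULAR, PLURAL, n) % n)
```

Every further plural message in the same catalog reuses the header rule; only the three translated forms have to be supplied.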



  • @asdf said:

    You have to do that for every string, though, right? Gettext allows you to specify the number of forms once per language.

    Yeah. Personally, I only used the first one/many form, where this isn't an issue. I guess that DSL would become a pain if you needed it for every sentence.


  • Winner of the 2016 Presidential Election

    @ben_lubar said:

    Only the words that are substituted into the string need to be handled by the translation software.

    Yeah, @RaceProUK seems to have misunderstood the problem. Also, I was assuming you were inserting a person's name into a string, not some other noun.


  • ♿ (Parody)

    @asdf said:

    Yeah, @RaceProUK seems to have misunderstood the problem.

    You mean the problem wasn't someone criticizing Micro$soft?


  • FoxDev

    You mean there's never a situation where you need to substitute nouns into a string?

    @boomzilla said:

    You mean the problem wasn't someone criticizing Micro$soft?

    The Status thread has at least half a dozen instances of me criticising Azure


  • Winner of the 2016 Presidential Election

    I didn't say that. But you should really avoid cases where you substitute a noun or name in the middle of a sentence anyway. Example:

    "You have a nice %s"

    Literally impossible to translate. Unless you want to ship dictionaries with your program and look the gender of every single possible noun up and guess the gender of words not found in the dictionary, which is unreasonable.

    Also impossible to translate (the genitive case might change the name or require different suffixes depending on the name):

    "I met %s's father yesterday."

    The following is hard, but not impossible to translate, you'd potentially need a lot of context (gender, social status, ...):

    "%s seems like a nice person."

    The following, however, is non-problematic.

    "Created database '%s'"

    You get the idea…


  • ♿ (Parody)

    @RaceProUK said:

    The Status thread has at least half a dozen instances of me criticising Azure

    Did you forget to post an inane reaction like this after you posted?

    @RaceProUK said:

    Or they looked at it, looked at the sorts of programs typically designed for Windows, and decided it wasn't worth the effort.


  • FoxDev

    My apologies for being able to judge individual products and decisions on their own merit



  • I keep seeing this thread as "Python Babe!" and hoping for the best :giggity:

    Edit: 'bold' is broken


  • Winner of the 2016 Presidential Election

    @asdf said:

    The following is hard, but not impossible to translate, you'd potentially need a lot of context (gender, social status, ...):

    BTW: I wonder how Facebook handles such cases and whether they actually do it correctly for every single language (which I very much doubt). Their translation code is probably horribly complicated.


  • BINNED

    @Helix said:

    broken

    Yes.


  • ♿ (Parody)

    @RaceProUK said:

    My apologies for being able to judge individual products and decisions on their own merit

    Don't apologize, you haven't done anything!


  • kills Dumbledore

    Don't poke the hedgehog. We all know it's easy, which takes the sport out of it, and the ragequits aren't really funny any more



  • That's the translator's job.

