Google Translate



  • Has anyone else noticed how Google Translate now has a feature to auto-detect the source language?

    It tends to get quite confused sometimes:

     

     Is that how they talk over there?



  • I love things that try to automatically detect language.

    Original text: Snoggle sindation manth effern giew selentuine knom thog

    Translation: English (automatically detected) » English
    Snoggle sindation manth effern giew selentuine knom thog

    I hear knom thogs are good this time of year.



  • "Goo goo" is detected as Finnish and "g'joob" is detected as Estonian.



    It detected Jabberwocky (http://en.wikipedia.org/wiki/Jabberwocky) as English though

    'Twas brillig, and the slithy toves
    Did gyre and gimble in the wabe:
    All mimsy were the borogoves,
    And the mome raths outgrabe.

    Just in case you need that in German ...

    'Twas brillig, und die slithy toves
    Hat gyre und gimble in der wabe:
    Alle mimsy waren die borogoves,
    Und die mome raths outgrabe.

     



  • @benjymous said:

    Has anyone else noticed how Google Translate now has a feature to auto-detect the source language?

    It tends to get quite confused sometimes:

     

    NEWS FLASH: Computer translation is hard and inaccurate.  Film at 11. 



  • I'm waiting for the feature to detect the output language as well.



  • @medialint said:

    It detected Jabberwocky as English though

     

     

    That was the first thing I tried after reading this thread - I was actually sort of disappointed that it worked right.



  • More Beatles ...

     

     



  • @medialint said:

    Just in case you need that in German ...

    'Twas brillig, und die slithy toves
    Hat gyre und gimble in der wabe:
    Alle mimsy waren die borogoves,
    Und die mome raths outgrabe.

     

     

    Hand-"translated" version

     



  • Fun with dead languages

    Go figure; they can recognize artificial languages easily, yet they have some serious trouble with actual dead languages:

    'Gallia est omnes divisa...' recognized as Italian

    OK, Latin, Italian... close enough, right? But then there's this:

    'Quis custodiet ipsos custodes?' recognized as Romanian

    Well, at least Romanian is a Romance language, OK. But this one, I just can't make sense of:

    'E Pluribus Unum' is Belarusian?

    I suppose if it were written "Е плурибус унум" it might be a possibility, but... :-p



  • TRWTF is machine translation. I have yet to see an example where it even allows you to get the gist of the foreign text. Occasionally it manages to help you identify the subject, but then you can generally do that by looking at the pictures anyway...



  • @lago said:

    TRWTF is machine translation. I have yet to see an example where it even allows you to get the gist of the foreign text. Occasionally it manages to help you identify the subject, but then you can generally do that by looking at the pictures anyway...

    Huh? TRWTF is you!


    Machine translation is more useful than you think. Just because it isn't 100% accurate doesn't mean it's garbage.



  • @codeman38 said:

    Go figure; they can recognize artificial languages easily, yet they have some serious trouble with actual dead languages:

    Language recognition usually works by looking at the relative frequencies of letter pairs. You're not giving it enough data to work with.
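
To make the letter-pair idea concrete, here is a minimal sketch in Python. It is not how Google does it; the reference texts, the function names, and the choice of cosine similarity as the comparison are all assumptions made for illustration.

```python
from collections import Counter
import math


def bigram_profile(text):
    """Return the relative frequency of each adjacent letter pair in text."""
    # Keep letters and spaces only, so punctuation does not create pairs.
    cleaned = "".join(c for c in text.lower() if c.isalpha() or c == " ")
    pairs = [cleaned[i:i + 2] for i in range(len(cleaned) - 1)]
    counts = Counter(pairs)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles (dicts)."""
    dot = sum(value * q.get(pair, 0.0) for pair, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def detect(text, profiles):
    """Pick the language whose profile is most similar to the text's profile."""
    sample = bigram_profile(text)
    return max(profiles, key=lambda lang: cosine_similarity(sample, profiles[lang]))


# Toy reference texts (assumptions for illustration, far too small for real use).
REFERENCE = {
    "English": "the quick brown fox jumps over the lazy dog and then the other thing",
    "German": "der schnelle braune fuchs springt ueber den faulen hund und dann noch etwas",
}
PROFILES = {lang: bigram_profile(text) for lang, text in REFERENCE.items()}

print(detect("the dog and the fox", PROFILES))
```

A three-word input like "E Pluribus Unum" yields only a handful of letter pairs, so its profile is mostly noise and can land near almost any language. That is the "not enough data" problem.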



  • There is probably also a good chance that the language detection favors more common languages. For instance, since people probably translate a hundred trillion Romanian pornos into English every day, Google is more likely to think something is Romanian than Latin, which people use to translate 5 homework questions a day.
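
If the detector does weight languages by how often they show up, the effect could look like the sketch below. This is speculation in code form: the `posterior` function and every number in it are made up for illustration, and nothing here is known about Google's actual system.

```python
def posterior(likelihoods, priors):
    """Combine per-language fit scores with prior probabilities (Bayes' rule)."""
    unnormalized = {lang: likelihoods[lang] * priors[lang] for lang in likelihoods}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all candidate languages have zero probability")
    return {lang: value / total for lang, value in unnormalized.items()}


# Hypothetical numbers: the text fits Latin somewhat better than Romanian...
likelihoods = {"Latin": 0.6, "Romanian": 0.4}
# ...but Romanian is assumed to be requested far more often than Latin.
priors = {"Latin": 0.01, "Romanian": 0.99}

scores = posterior(likelihoods, priors)
print(max(scores, key=scores.get))
```

With a heavy enough prior, the more popular language wins even when the text itself is a better match for the rarer one.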



  • I might be wrong, but I think that Google's translation works with a statistical translation engine based on the concept of tree-adjoining grammars. I don't think it learns from the stuff people ask it to translate, but I also have no idea what their source text is. Probably stuff like transcripts of UN meetings that are already translated into many languages.



  • Re: Fun with dead languages - and with live ones

    @codeman38 said:

    I suppose if it were written "Е плурибус унум" it might be a possibility, but... :-p

    Tried that... "We are not yet able to translate from Macedonian into English". So that's me tell't.

    It recognises the filthy outpourings of my cat-avatar though, from only three words. If I try the words individually, I get:

    (1) English for "skurvený" (er... OK, we'll leave that one obscure)
    (2) Slovak - can't do it (correctly identified language though)
    (3) Czech for "does not work" (but same word, and correct result).

    How come it doesn't identify the whole phrase as CZ and then translate that, when AFAIK it would be exactly the same as in SK, will remain a mystery. Perhaps I made it blush... :o)

     



  • @PerdidoPunk said:

    I might be wrong, but I think that Google's translation works with a statistical translation engine based on the concept of tree-adjoining grammars. I don't think it learns from the stuff people ask it to translate, but I also have no idea what their source text is. Probably stuff like transcripts of UN meetings that are already translated into many languages.

    You can click on a sentence and it gives you the original and a "suggest a better translation" option - I assume these are screened to make sure they're not pranks, but this does indicate that it learns from user-submitted content.


  • Discourse touched me in a no-no place

    @Random832 said:

    I assume these are screened to make sure they're not pranks
    This didn't use to be the case: http://forums.thedailywtf.com/forums/t/7736.aspx?View=Flat

