New captcha = no captcha, yes ranking



  • Apparently, reCAPTCHA has a new anti-bot system. And it's this:

    WTF?

    How does this stop the bots? Let's hear it from the horse's mouth.

    To counter this, last year we developed an Advanced Risk Analysis backend for reCAPTCHA that actively considers a user’s entire engagement with the CAPTCHA—before, during, and after—to determine whether that user is a human.

    So, it seems they are monitoring mouse movements and keyboard events from the moment you enter the page. Once you click the box thingy, they push the whole profile through some kind of algorithm. If it matches what they'd recorded as "human behavior", they let you in immediately. Otherwise, you have to solve the standard OCR puzzle.
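
    To make the guess above concrete, here is a rough sketch of what "push the whole profile through some kind of algorithm" could look like. This is emphatically not reCAPTCHA's real logic, which isn't public: the two features (path straightness and timing jitter), the function names, and the thresholds are all made up for illustration.

```javascript
// Hypothetical heuristic, NOT reCAPTCHA's actual algorithm.
// events: array of { x, y, t } pointer samples (t in milliseconds).
function pointerFeatures(events) {
  const gaps = [];
  let pathLength = 0;
  for (let i = 1; i < events.length; i++) {
    const dx = events[i].x - events[i - 1].x;
    const dy = events[i].y - events[i - 1].y;
    pathLength += Math.hypot(dx, dy);
    gaps.push(events[i].t - events[i - 1].t);
  }
  const first = events[0];
  const last = events[events.length - 1];
  const direct = Math.hypot(last.x - first.x, last.y - first.y);
  const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
  const variance = gaps.reduce((a, b) => a + (b - mean) ** 2, 0) / gaps.length;
  return {
    // About 1.0 means a perfectly straight path; hand-driven paths wander, so lower.
    straightness: pathLength === 0 ? 1 : direct / pathLength,
    // Standard deviation of the time between samples; naive scripted input gives 0.
    timingJitterMs: Math.sqrt(variance),
  };
}

// Made-up thresholds, for illustration only.
function looksHuman(events) {
  if (events.length < 3) return false; // too little data to judge
  const f = pointerFeatures(events);
  return f.straightness < 0.98 && f.timingJitterMs > 1;
}
```

    A script that moves in a straight line at fixed intervals fails this check, but one that adds a bit of wobble and random delay would pass it, which is exactly the worry about bots simulating the expected events.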

    Putting aside performance issues, how does it perform?

    Early adopters, like Snapchat, WordPress, Humble Bundle, and several others are already seeing great results with this new API. For example, in the last week, more than 60% of WordPress’ traffic and more than 80% of Humble Bundle’s traffic on reCAPTCHA encountered the No CAPTCHA experience—users got to these sites faster.

    Ok fair enough, the new system works.... for now. But how long will this last, before we see bots that just simulate similar sequences of events their fancy new toy is expecting to record? Spammers had solved the distorted letters OCR problem. This seems trivial in comparison.

    So unless someone sees a reason this might actually work long-term, I'm calling bullshit.



  • Shamus just has an "I am not a robot" checkbox on the page. I think it changes name/id every pageload. It seems to work just as well as an annoying-as-fuck captcha, and is much easier.
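
    For anyone curious how a checkbox whose name changes every pageload might be wired up, here is a minimal Node sketch. It is a guess at the general idea, not that site's actual code: the `SECRET` value, the `cb_` prefix, and all function names are hypothetical.

```javascript
const crypto = require('node:crypto');

// Hypothetical server secret; a real site would load this from config.
const SECRET = 'change-me';

// Derive the checkbox field name from a nonce, so the server can
// recompute it later without storing anything per visitor.
function fieldNameFor(nonce) {
  const mac = crypto.createHmac('sha256', SECRET).update(nonce).digest('hex');
  return 'cb_' + mac.slice(0, 12);
}

// Build the form fragment for one pageload: fresh nonce, fresh field name.
function issueCheckbox() {
  const nonce = crypto.randomBytes(8).toString('hex');
  const name = fieldNameFor(nonce);
  const html =
    `<input type="hidden" name="nonce" value="${nonce}">` +
    `<label><input type="checkbox" name="${name}"> I am not a robot</label>`;
  return { nonce, name, html };
}

// form: the parsed POST body as a plain object.
// Browsers submit a ticked checkbox without a value attribute as "on".
function checkboxTicked(form) {
  if (typeof form.nonce !== 'string') return false;
  return form[fieldNameFor(form.nonce)] === 'on';
}
```

    This only stops scripts that hard-code the field name. A bot that parses the form can still tick the box, and this sketch does nothing to stop a nonce being replayed.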



  • Homebrew captcha like that can work on a small-ish individual site. Make it just annoying enough so that spammers won't bother scripting for you specifically. That kind of strategy doesn't scale.



  • Two steps to building a better CAPTCHA:

    1. Be about as effective against bots as other systems
    2. Be much less painful to real users


  • @cartman82 said:

    This seems trivial in comparison.

    No kidding. I once worked at a company that specialized in comparative rating software for independent insurance agents. Basically, you go sit down with Joe Insurance, say, "I need to insure my cars," and give him your information. He'd enter it into our software, and it would spit out quotes from up to 100 different companies, all calculated inside our software (we got all the formulas and factors from the insurance companies). For added reliability, if you liked a specific quote, we had a plug-in that allowed the agent to click a button, and it would navigate through the selected insurance company's site, entering your data, clicking links, etc., until it got the quote direct from the company. This was all done in VB6, about 10-15 years ago. It would be extremely easy to slow down that process and pass the input as keystrokes and make the mouse move slowly to trick this process.



  • @cartman82 said:

    To counter this, last year we developed an Advanced Risk Analysis backend for reCAPTCHA that actively considers a user’s entire engagement with the CAPTCHA—before, during, and after—to determine whether that user is a human.

    In other words: we are Google, and we know if you are a bot because we track your every move,
    before you even have to solve a CAPTCHA, while you solve it, and after you solve it.

    The UI is technically not needed for them to tell whether you are a real user or not.
    They might use it as a minor factor.

    For a change we can use all the knowledge we collect about you to actually make your life easier.

    "Before" and "after" mean that Google can perform risk analysis on each individual based on previously recorded patterns, such as visits to any Google partner website (AdSense).

    Detecting mouse and keyboard patterns is the weakest link and the easiest to break, but you can't change your overall internet activity behavior without raising a flag or standing out. (Not saying it is not possible.)

    This little snippet, which you find on almost every website, provides much more information about a user than how the user moves the mouse.

      (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
      m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
      })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
    
      ga('create', 'UA-9122028-2', 'auto');
      ga('send', 'pageview');
    

    A machine might not be that good at AI, but it should be good at detecting AI, like we are good at detecting real people.
    The day AI is no longer able to detect AI is the day we will no longer be able to detect real people.
    ** one day there will be a law about this statement.



  • @Monarch said:

    like we are good at detecting real people.

    Are we? Or are we just really bad at acting like fake people?



  • What I don't understand is if a person acts too "roboty" they get the same old captcha test anyway. So how does this deter bots other than add a bit more complexity?



  • @Monarch said:

    like we are good at detecting real people.

    After ten years we still haven't figured out @Nagesh, so I'd say we aren't too good.

    @monkeyArms said:

    So how does this deter bots other than add a bit more complexity?

    Same as the old CAPTCHA; it only makes it a little less annoying for real people.



  • I've seen this in Humble Bundle. 2 out of 3 times it made me fill a standard captcha anyway.

    @cartman82 said:

    Ok fair enough, the new system works.... for now. But how long will this last, before we see bots that just simulate similar sequences of events their fancy new toy is expecting to record? Spammers had solved the distorted letters OCR problem. This seems trivial in comparison.

    So unless someone sees a reason this might actually work long-term, I'm calling bullshit.


    Well... it's a continuous arms race. If the reCaptcha team is willing to continuously analyze usage and find new ways to detect bots, it could work.



  • @Magus said:

    Are we? Or are we just really bad at acting like fake people?

    We can fake who we are but we can't fake what we are.

    @Maciejasjmj said:

    After ten years we still haven't figured out @Nagesh, so I'd say we aren't too good.

    Not sure who @Nagesh is, but think about it this way: You have a black & white stripped domesticated horse, and you can probably tell that it's a horse and not a zebra just by looking at it.
    Now someone takes your horse and releases it into the wild, into a herd of 100 zebras. You are going to have a hard time telling which one is your horse just by looking, due to the added noise, so you are going to have to rely on other senses.
    You can try riding it. True nature tends to reveal itself eventually.
    same goes for people ** particularly stripped whores.



  • This place seems different without @Nagesh



  • Reminds me of when Data's mother showed up and he tracked her blink rate like “she's using the same prng as me.”



  • @chubertdev said:

    This place seems different without @Nagesh

    This is not a bad thing. He was occasionally amusing, but more often annoying; at least I found him so.

    @Monarch said:

    Not sure who @Nagesh is

    Nagesh is, or was, a very active user who was either Indian (but living in London, IIRC), or pretending to be Indian; and either incredibly stupid, or a troll pretending to be incredibly stupid. My opinion always favored the former, but once in a great while he'd say something intelligent enough to suggest the latter.



  • Protip: Only trolls act incredibly stupid 100% of the time.



  • If they actually are using Google Analytics as a Turing test, I'd love to see how it works. Sounds fascinating.



  • @created_just_to_disl said:

    Protip: Only trolls act incredibly stupid 100% of the time.

    I don't know about that. Even a stopped chronometer is right once every 1000 years (Star Trek reference I think).



  • @Monarch said:

    This little snippet, which you find on almost every website, provides much more information about a user than how the user moves the mouse.

    Doesn't do squat on my LANs. Any HTTP request directed at *.google-analytics.com from inside any of the LANs I manage returns a 1x1 transparent GIF.
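
    The post doesn't say how the blackhole is actually set up, so the following is only one plausible shape for it: local DNS points the analytics hostnames at a box running a tiny HTTP server that answers everything with a 1x1 transparent GIF. A Node sketch, with `servePixel` and `createSinkhole` as made-up names:

```javascript
const http = require('node:http');

// The classic 42-byte 1x1 transparent GIF, base64-encoded.
const PIXEL = Buffer.from(
  'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7',
  'base64'
);

// Answer any request, whatever the path or method, with the pixel.
function servePixel(req, res) {
  res.writeHead(200, {
    'Content-Type': 'image/gif',
    'Content-Length': PIXEL.length,
    'Cache-Control': 'no-store',
  });
  res.end(PIXEL);
}

// Returns a server that is not yet listening; the caller picks the port.
function createSinkhole() {
  return http.createServer(servePixel);
}
```

    Something like `createSinkhole().listen(80)` on the sinkhole box would cover plain HTTP. HTTPS requests to the same names would simply fail, since the box has no valid certificate for them.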



  • what browser do you use?
    which search engine do you use?
    which website do you use to watch videos?
    which email service do you use?
    what phone do you use?
    what glasses do you use?
    which captcha service do you use?

    do you know this little icon on websites



  • @Monarch said:

    what browser do you use?

    Most often? Iceweasel.

    I know, I know. But at least the real www.google-analytics.com never gets any traffic from any of my browsers; that's got to put at least a little bit of grit in Google's gears.

    @Monarch said:

    do you know this little icon on websites

    Can't say I've noticed it. I don't have a G+ account, and I spend almost no time logged onto my Gmail account, if that's relevant. What domain does it usually get served from? I'll let you know if it's on my current blacklist.



  • I think it's
    apis.google.com

    And you don't need a Google+ account to see this; it is like the facebook like counter you see on websites.
    If you are logged in it helps; otherwise they can narrow it down to your IP.



  • @Monarch said:

    it is like the facebook like counter you see on websites

    Oh, right. No, my gateway blacklist doesn't block those. My Adblock Plus subscription to Fanboy's Annoyance List does though.


  • 🚽 Regular

    @flabdablet said:

    I know, I know.

    only **one in 475,643 browsers have the same fingerprint as yours.**

    I'm a precious snowflake ❄

    (like all the other people who are just as unique)



  • @anonymous234 said:

    I've seen this in Humble Bundle. 2 out of 3 times it made me fill a standard captcha anyway.

    Do you dream of electric sheep?



  • @cartman82 said:

    So, it seems they are monitoring mouse movements and keyboard events from the moment you enter the page.

    I wonder how it handles tablet users, screen reader users, old browser users...



  • @Zecc said:

    only one in 475,643 browsers have the same fingerprint as yours.

    I'm a precious snowflake

    [quote=Panopticlick]Your browser fingerprint appears to be unique among the 4,756,762 tested so far. Currently, we estimate that your browser has a fingerprint that conveys at least 22.18 bits of identifying information.[/quote]

    I have a precious snowflake's chance in hell.
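
    In case the "22.18 bits" looks arbitrary: it appears to be just the base-2 logarithm of the sample size, since a fingerprint shared by one in N browsers carries log2(N) bits, and being unique among N tested means "at least" that many. A quick check (the function name is mine, not Panopticlick's):

```javascript
// Bits of identifying information for a fingerprint shared by one in N browsers.
function identifyingBits(oneInN) {
  return Math.log2(oneInN);
}

// Unique among 4,756,762 tested browsers: at least this many bits.
console.log(identifyingBits(4756762).toFixed(2)); // 22.18

// One in 475,643 browsers share the fingerprint: fewer bits, about 18.86.
console.log(identifyingBits(475643).toFixed(2));
```

    Each extra bit halves the crowd you can hide in, so the gap between those two results is a bit more than a factor of ten.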


  • FoxDev

    @Zecc said:

    I'm a precious snowflake ❄

    Your browser fingerprint appears to be unique among the 4,756,988 tested so far.

    Currently, we estimate that your browser has a fingerprint that conveys at least 22.18 bits of identifying information.

    then i must be an awesome ☃ ;-)



  • You would trick the system, but it still wouldn't allow you to do it fast, and doing things fast is one of the objectives behind a bot.


  • ♿ (Parody)

    @flabdablet said:

    I know, I know.

    @Zecc said:

    only

    one in 475,643 browsers have the same fingerprint as yours.

    Hmm...

    Your browser fingerprint appears to be unique among the 4,757,109 tested so far.

    My most uncommon identifier:

    System Fonts, 22.18+ bits of identifying information, one in 4,757,109 browsers



  • @accalia said:

    then i must be an awesome

    Me too:

    Your browser fingerprint appears to be unique among the 4,757,190 tested so far.
    Currently, we estimate that your browser has a fingerprint that conveys at least 22.18 bits of identifying information.


  • Banned

    @Eldelshell said:

    You would trick the system, but it still wouldn't allow you to do it fast, and doing things fast is one of the objectives behind a bot.

    What's the practical difference between a million instances of bots, each of them spamming one wobsite in 5 seconds, and two million instances of bots, each of them spamming one wobsite in 10 seconds (because the process is longer, but less computation-heavy)?
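
    Spelling out the arithmetic behind that question, under the simplifying assumption that every instance runs fully in parallel and nothing else is the bottleneck (the function name is just for illustration):

```javascript
// Sites spammed per second by a whole fleet of independent bot instances.
function fleetThroughput(instances, secondsPerSite) {
  return instances / secondsPerSite;
}

const fastBots = fleetThroughput(1_000_000, 5);
const slowBots = fleetThroughput(2_000_000, 10);
console.log(fastBots, slowBots); // 200000 200000
```

    Doubling the time per site is cancelled out by doubling the fleet, so slowing each bot down only matters if running extra instances is expensive.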



  • In other words: we are Google, and we know if you are a bot because we track your every move,
    before you even have to solve a CAPTCHA, while you solve it, and after you solve it.

    Hmm, interesting. Yeah, it makes sense they'll plug in their entire spying apparatus instead of just monitoring a single page.

    Which means the main thing for spammers will be to build up realistic-looking user profiles for their bots.

    Interesting. As reCaptcha validates more and more users through this system, we will begin to see their regular captchas become even more annoying to solve (this is hinted at if you read the article carefully). So spammers will be squeezed between trying to improve OCR and trying to simulate humans.

    Or just paying Chinese people in warehouses to solve these for a few bucks / hour.



  • @Monarch said:

    You have a black & white stripped domesticated horse

    What the hell are you stripping off a horse? It's not like they wear any clothes.

    I know you meant striped, but I wanted to be a bit of a dick.


    Edit - PJH: Pedantry badge awarded


  • FoxDev

    @abarker said:

    I know you meant striped, but I wanted to be a bit of a dick.

    you know, i'm feeling mercurial today.

    have a pendant.



  • Thanks!



  • @accalia said:


    Rhetorical question; I don't expect you to know the answer, since I buttume you just grabbed that image from GIS, but why the heck is the quality marking on the front of that pendant? Is it not enough for our clothing to proclaim to the world the company that made it? Do we now have to openly display that "this pendant is genuine sterling silver, not the cheap imitation you wear, loser!"?



  • @abarker said:

    I know you meant striped, but I wanted to be a bit of a dick.

    @accalia said:

    have a pendant.

    ... and a flag.



  • @flabdablet said:

    Doesn't do squat on my LANs. Any HTTP request directed at *.google-analytics.com from inside any of the LANs I manage returns a 1x1 transparent GIF.

    Why?



  • Makes him feel almighty powerful?



  • I guess so. He likes being a dick to people he's not likely to actually meet face-to-face was my theory, but that's mostly the same thing.


  • ♿ (Parody)

    He has a real fear / distrust / hatred of advertising.



  • Dislike of advertising + distrust of Google? Yeah, works for me.



  • @blakeyrat said:

    Why?

    Because I have absolutely no desire to cooperate with the marketing industry in any way, and see absolutely no reason why I should feel compelled to do so. I wish to restrict the information it collects about my browsing habits to that which I can't avoid it getting.

    @boomzilla said:

    He has a real fear / distrust / hatred of advertising.

    Fear? No. Caution, sure. Distrust? Absolutely, and I think that's a reasonable thing to have for any industry so blatantly self-serving and socially destructive. Hatred? Not really. More contempt than hatred.

    I can see what the dumbfucks who choose to work in the industry do it for, and I can sympathise with their desire to turn a dishonest buck, but I certainly despise what they do.

    @blakeyrat said:

    He likes being a dick to people he's not likely to actually meet face-to-face

    Coming from you, that accusation is just priceless. Made my evening.



  • Analytics isn't advertising, that's why I'm confused.

    Not that I understand why some people are so rabidly against advertising anyway. But this isn't that.



  • @Eldelshell said:

    Makes him feel ~~almighty powerful?~~ as if some tiny shred of dignity can still be preserved while browsing the web, even in 2014.

    FTFY



  • @flabdablet said:

    Because I have absolutely no desire to cooperate with the marketing industry in any way, and see absolutely no reason why I should feel compelled to do so.

    What does web analytics have to do with the marketing industry?

    @flabdablet said:

    I wish to restrict the information it collects about my browsing habits to that which I can't avoid it getting.

    What about the other people who use the LANs you control? They don't get any say in the matter?

    @flabdablet said:

    I can see what the dumbfucks who choose to work in the industry do it for, and I can sympathise with their desire to turn a dishonest buck, but I certainly despise what they do.

    So when I worked in web analytics I was being "dishonest"? How? Explain?


  • ♿ (Parody)

    @blakeyrat said:

    Analytics isn't advertising, that's why I'm confused.

    I dunno....but isn't it used to support advertising in some of its guises?



  • @boomzilla said:

    I dunno....but isn't it used to support advertising in some of its guises?

    Well ok? I can't argue with that I guess.

    By that logic, why not ban the HTTP protocol? I think the HTTP protocol might be used to support advertising in "some of its guises."



  • @blakeyrat said:

    Well ok? I can't argue with that I guess.

    By that logic, why not ban the HTTP protocol? I think the HTTP protocol might be used to support advertising in "some of its guises."

    And capitalism. Capitalism definitely supports advertising. Better ban that.



  • @blakeyrat said:

    What does web analytics have to do with the marketing industry?

    [quote=LMWTFY]Web analytics is not just a tool for measuring web traffic but can be used as a tool for business and market research, and to assess and improve the effectiveness of a website. Web analytics applications can also help companies measure the results of traditional print or broadcast advertising campaigns. It helps one to estimate how traffic to a website changes after the launch of a new advertising campaign. Web analytics provides information about the number of visitors to a website and the number of page views. It helps gauge traffic and popularity trends which is useful for market research.[/quote]

    @blakeyrat said:

    What about the other people who use the LANs you control? They don't get any say in the matter?

    Now, blakey, I want you to be sitting down for this, because it's going to come as a bit of a shock to you, what with having spent so long as an insignificant cog in the marketing machine and all:

    Nobody I know personally actually likes mandatory advertisements.

    I control five LANs: one at home and four at school. The school principal is absolutely on the same page as me wrt web marketing and completely approves of the way I run his LANs. I also get frequent praise from the other staff for keeping the school network as clean as I do. And I've had nothing but positive feedback about the lack of popups and other assorted bullshit that family and visitors experience when connected to my house wifi.

    @blakeyrat said:

    So when I worked in web analytics I was being "dishonest"? How? Explain?

    Reading comprehension fail, blakey. I have nothing to go on apart from your general unpleasantness when evaluating your personal ethics. But anybody who works in any capacity for the modern marketing industry is being paid in money obtained mostly by deceiving, manipulating and/or telling barefaced lies to the industry's customers as well as its consumer livestock (hence "a dishonest buck").

    Filed under: and while we're on the subject

