Most of us run into spam-protection techniques all the time. Some are very easy, while some wicked ones ask you to solve proper calculus problems! They are a common part of day-to-day Internet use: whether you're creating an online account or posting a comment in a forum, you'll find them almost everywhere.

With the dot-com boom and the race for better page rank through SEO, spam became a powerful tool for Internet marketers. However useful it is to them, it is ultimately irritating to the end user, so techniques were devised to reduce it. One of the most popular and effective was the CAPTCHA: a picture of text or numbers, distorted so that human eyes can still read it but a program cannot.

However, you can always find clever minds on the other side, and soon OCR (optical character recognition) techniques were being used to crack 'em.
It was like going back to square one. Other methods were tried, but none delivered results close to what the CAPTCHA system offered. As a result, CAPTCHAs got uglier, and more algorithms were thrown at them just to make them invincible.

Around the same time, computer scientists at Carnegie Mellon University were looking at the other side of OCR. Digitization of books and manuscripts was the need of the hour, and even the best OCR output still required manual proofreading by human eyes. reCAPTCHA came out as a solution to this problem.

So, what's special about reCAPTCHA? If you inspect one, you'll find that a reCAPTCHA challenge is always made up of two parts, each a word or a string of digits. Notice also that one part looks uglier than the other. Out of curiosity, let's type only the legible part of the given code (THE).

Voilà! It lets you in even without the second part! So is it some kind of bug? No, it is not! reCAPTCHA, which Google acquired in 2009, doubles as a project for digitizing books (the manual way). One of the two parts is the actual CAPTCHA, whose answer is already known; the other is simply one of the countless words that OCR could not read, waiting in the queue for digitization!
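
The behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the real service's code: the function name `check_answer` and the simple case-insensitive matching are assumptions made for the example.

```python
from typing import Optional, Tuple


def check_answer(control_word: str, typed: str) -> Tuple[bool, Optional[str]]:
    """Return (passed, guess_for_unknown_word).

    Hypothetical sketch: only the control word (the part whose answer the
    server already knows) decides whether the user passes. Anything else
    the user typed is kept as a candidate reading of the unknown word.
    """
    words = typed.lower().split()
    control = control_word.lower()
    if control not in words:
        # The known word is missing or wrong, so the user fails.
        return False, None
    words.remove(control)
    # Whatever is left over (possibly nothing) is the guess for the other word.
    guess = " ".join(words) or None
    return True, guess
```

With this sketch, typing only the known word still passes, which matches the little experiment above: the second word is never actually checked, it is just collected.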

Summary: The trick is simple. Why waste all that human effort on CAPTCHAs? Why not put it to use for digitization? So every time you fill in a reCAPTCHA, remember that you're giving something back to the community. Every undigitized word is shown to multiple users, and their answers are compared, so as to arrive at a foolproof transcription.
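
The "multiple users" step can be pictured as a simple vote. The snippet below is a rough sketch under stated assumptions: the name `consensus` and the threshold of three matching answers (`min_votes`) are made up for illustration, and the real system's rules may well differ.

```python
from collections import Counter
from typing import Iterable, Optional


def consensus(guesses: Iterable[str], min_votes: int = 3) -> Optional[str]:
    """Return the agreed transcription once enough users typed the same word.

    Hypothetical sketch: min_votes=3 is an assumed threshold, not a
    documented value.
    """
    # Normalise case and whitespace so "Morning " and "morning" count together.
    counts = Counter(g.strip().lower() for g in guesses if g.strip())
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    # Accept the most popular answer only if enough users agree on it.
    return word if votes >= min_votes else None
```

Until enough users agree, the function returns `None` and the word would simply keep being shown to more people.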

To read more about reCAPTCHA you can refer here; there are public services that allow you to implement reCAPTCHA using APIs. Read more on how efficient digitization through reCAPTCHA is.

Indian Institute of Information Technology, Allahabad is currently playing a big role in digitization in India.