Anybody who’s used the Internet has encountered a CAPTCHA: it lets online services know that you’re human and not software. CAPTCHAs often ask for a declaration to that effect through a checkbox, or for proof – for example, by showing that you’re able to identify real-world objects and situations. They are, in essence, systems that filter non-humans out from humans: stewards of the Internet, and essential to the health of online networks.
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. A CAPTCHA is a question that demands an answer: a test used to determine the user’s awareness, and ascertain their humanness. This test is a means of verifying a user’s ability to identify information such as symbols, words, and equations, to understand the context in which it is shown, and to label it accordingly.
Without CAPTCHA systems, networks would be unable to tell the authentic from the inauthentic: to know the difference between an automatic action performed by a bot and a manual action performed by a real human.
A bot is a piece of software that has been set up to perform tasks automatically. Some bots are good: they are used, for example, to index a web page’s content for search engines. Other bots, however, are designed by malicious actors to break into user accounts or perform other harmful online activities.
Over time, bots have grown more sophisticated. With the advancement of ML and AI technologies, bots have become better at identifying shapes and patterns, requiring equivalent advancements in CAPTCHA systems and in the puzzles they present.
CAPTCHA is a means of protecting against bots. To unlock the protected service, a challenge must be answered, which, to prevent predictability and automation by bots, is different each time. CAPTCHAs are designed to be tests humans find relatively easy – the identification of a car on the street, for example – and bots find hard. A bot may be able to identify an object, but struggle to identify the object (or a part of the object) in different contexts.
Two images of dogs might be shown: one upright, and one upside down. A machine would struggle to tell the difference, whereas a human would not.
Humans are able to interpret meaning from relative chaos – the scrambling of images, numbers, and letters – whereas bots use pattern recognition software to identify the order, through familiar shapes, from the same information. Modern CAPTCHA systems prevent bots from deciphering order independent of meaning.
reCAPTCHA started as a research project at Carnegie Mellon University in 2007. Google purchased the company in 2009, and initially offered the service for free in exchange for the data produced, which Google uses to train its visual identification systems.
reCAPTCHA was popularly used by Cloudflare: an infrastructure provider that serves 25+ million websites. By 2020, however, reCAPTCHA was dropped by Cloudflare in favor of hCaptcha for its millions of websites due to concerns over privacy and cost, as well as its unavailability in certain countries.
hCaptcha is an alternative to reCAPTCHA, powered by HUMAN Protocol. hCaptcha is the only major CAPTCHA service not owned by an ad network; it protects user privacy, rewards websites for using it, and helps organizations produce more targeted datasets. A breakdown of the differences between reCAPTCHA and hCaptcha can be found on hCaptcha's blog.
hCaptcha rewards websites for the data labeled by visitors. In a blog post, Cloudflare describes its reasons for switching from reCAPTCHA to hCaptcha. The most significant among them is that hCaptcha does not sell personal user data.
hCaptcha functions as an Exchange on the HUMAN network, matching – and mutually benefitting – website owners, website visitors, tasks, and companies that need data to be labeled. Websites receive stronger bot protection and may be compensated for the work done by their visitors. Visitors get a better experience through reduced bot activity on their favorite online services, and the knowledge that their time spent completing a CAPTCHA task was purposeful.
Companies that need data to be labeled, meanwhile, receive results in the form of relevant datasets faster and at a lower cost. In addition, because the types of data and questions change frequently, it is harder for bots to successfully defeat this kind of system. hCaptcha is just one application of HUMAN Protocol, however, which is a general mechanism to request, organize, evaluate, and compensate a wide variety of work.
The HUMAN Protocol Foundation makes no representation, warranty, or undertaking, express or implied, as to the accuracy, reliability, completeness, or reasonableness of the information contained here. Any assumptions, opinions, and estimations expressed constitute the HUMAN Protocol Foundation’s judgment as of the time of publishing and are subject to change without notice. Any projection contained within the information presented here is based on a number of assumptions, and there can be no guarantee that any projected outcomes will be achieved.