CAPTCHA is the method used on websites to prevent automated linkspam on wikis and blogs, automated generation of tons of logins for free email, etc. It's usually a wavy or otherwise distorted picture of some random letters that you are supposed to type in the box. This is bad because:
It makes these sites inaccessible to the blind or visually impaired (who use screen readers).
It's inconvenient for those of us who can use it, because it's increasingly hard to see the letters exactly, and annoying when we have to try multiple times.
It uses a relatively large amount of bandwidth and a relatively large amount of processing power to generate the image.
It can be thwarted by recruiting people who just sit at computers all day typing them in.
My idea is to use simple English (or the local language) questions that are trivial to answer by a human, but not by a computer. Examples:
Enter the sum of two and three: ____
What is the opposite of "cold"? ____
Click the second and fourth check boxes: [] [] [] []
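A minimal sketch of the first example above, in Python. The names `NUMBER_WORDS`, `make_sum_question` and `check_answer` are made up for illustration and are not from any existing CAPTCHA library. It assumes single-digit operands and accepts the answer either as digits or as a number word.

```python
import random

# Hypothetical word list covering every possible sum of two single digits (0..18).
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen",
]

def make_sum_question(rng=random):
    """Return (question_text, expected_sum) for two single-digit operands."""
    a = rng.randrange(10)
    b = rng.randrange(10)
    question = f"Enter the sum of {NUMBER_WORDS[a]} and {NUMBER_WORDS[b]}:"
    return question, a + b

def check_answer(expected, reply):
    """Accept the reply written as digits ("5") or as a word ("five")."""
    text = reply.strip().lower()
    if text.isdecimal():
        return int(text) == expected
    return text in NUMBER_WORDS and NUMBER_WORDS.index(text) == expected

if __name__ == "__main__":
    question, expected = make_sum_question()
    print(question)
    print("correct" if check_answer(expected, input("> ")) else "try again")
```

With both operands limited to 0..9 there are only 10 x 10 = 100 distinct questions, so a generator this small is probably only good enough for a single low-traffic site.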
Basically, questions that a native speaker over the age of 7 could answer, but a computer could not. With this system:
Since it relies on text, contributing to such sites would be just as accessible as the sites themselves.
It would use fewer server resources and take less time to download.
It would not be as easy to run through an overseas human factory for cracking, since it requires knowledge of the local language instead of just the ability to recognize distorted characters. Yes, this makes editing the page less accessible to people who don't understand the language, but again, it's just as accessible as the page itself.-- omegatron, May 22 2006
baked as of 2008 http://textcaptcha.com/ Even has the "solved by a 7-year old" requirement. Perhaps Rob reads the bakery. [Loris, Jul 07 2014]
How vital it is to check the depth of snow in midsummer http://www.jma.go.j....html?elementCode=4...hmmm [not_morrison_rm, Jul 07 2014]
20 Questions http://www.20q.net/ [Skewed, Jul 07 2014]
XKCD's alternative to CAPTCHAs http://xkcd.com/810/ [scad mientist, Jul 07 2014]
reCAPTCHA http://www.google.c...ha/intro/index.html [scad mientist, Jul 07 2014]
Someone came up with the idea of 'click the puppy' where, in a grid of 3x3 (or NxN) pictures of kitties, the user had to click on the one containing the puppy (change imagery to suit audience). This sounds like a textual version of that. [+] I wonder what type of questions would provide appropriate answers. The problem is soliciting a single response that can only be determined by fully comprehending the question.
For example, the opposite of hot might be 'freezing', or 'not hot' or 'brassic' - I know it's not a great example, but you get the idea. I spent far too long typing in variations of "STAND ON THORIN'S SHOULDERS" not to know that typing in the same pattern of text that a designer/programmer expected me to is not always as easy as it at first sounds.-- zen_tom, May 22 2006
//Which transformer was omegatron? What did it change into/from?//
A play on Megatron? He turned from a bot into a handgun/blaster.
Nice idea [omegatron] btw.-- kuupuuluu, May 22 2006
How did Ophelia die? How many pennies in a million dollars? What is the atomic number of helium? What is decimal ten in base eight? In how many countries did the Da Vinci Code take place? Bun.-- SledDog, May 22 2006
Perhaps introduce the odd spelling error here or three, just to confuse computer programs even more.
[SledDog], I only know 3 of those 5 myself. Although, humans would have the advantage of search engines to aid them.-- hidden truths, May 22 2006
[SledDog]
Oh come on, ask some hard ones!-- Galbinus_Caeli, May 22 2006
It's not a hard problem to parse and answer some of these questions through machine logic, especially if everyone is using the same database of questions.
The workload in generating thousands of these questions would require a human factory. Once answered you have the key to unlock the website.
The big advantage of captcha is that they can very quickly generate thousands of images that are hard for OCR to read, and dispose of the old ones periodically. The question system is 10000000x the workload.-- lowbot, May 22 2006
//Someone came up with the idea of 'click the puppy'//
The main point of this was to make it usable by the visually impaired.
//Which transformer was omegatron? What did it change into/from?//
It wasn't one. Megatron is similar. I got the name from a vacuum tube.
//SledDog//
Damn. No way, dude. I only know three of those. Much simpler questions.
//lowbot//
True. I was actually just thinking of using something like this for my own site, which would work quite well because it's one of a kind. It *would* take some work to make a universal one that can run on multiple sites, though. Addition questions with both numbers less than 10 would only give you 100 permutations, for instance. Hmm... The "click in a certain pattern" idea might be more flexible.-- omegatron, May 22 2006
//The main point of this was to make it usable by the visually impaired.//
How does reading a question in text get any easier for the visually impaired than telling the difference between a kitten and a puppy?
I accept that a text-only interface is easier though.-- zen_tom, May 22 2006
[zen_tom] Text can be LISTENED to. Pictures cannot.-- Galbinus_Caeli, May 22 2006
Aha! Sorry, my reading-aid's been playing up.
Anyway, you could just make sure your pictures have the appropriate <alt> text; puppy.jpg, kitty.jpg etc ;)-- zen_tom, May 23 2006
//Why is everyone so worried about special web design for the blind?//
There were three other benefits I listed, too...
It's not just about designing so that pages are readable by the blind, either. It makes things accessible to people with monochrome, very small, or text-only displays, like PDAs and cell phones, for instance.-- omegatron, May 31 2006
This is now baked. I've seen it on http://forum.linux-ntfs.org, for instance.-- omegatron, Oct 26 2006
1º: Character recognition. 2º: Image recognition. ... last: emotion recognition.
EMOTIONAL CAPTCHA
www.emotionalcaptcha.com-- anllie, Jun 11 2009
"What is the air speed velocity of an unladen swallow?"-- coprocephalous, Jun 11 2009
//"African or European ?"//
Yes.-- pjd, Jun 11 2009
Wolfram Alpha is not sure what to do with your input.-- 4whom, Jun 11 2009
2b OR Not 2b =-- pashute, Jul 05 2014
Sp. "2b | ( ! (2b)) =="
lvalue required.-- 8th of 7, Jul 05 2014
Clearly since undefined is not the 2B state of the variable the answer is !2B
Const char TheQuestion = [];
if (TheQuestion=="2B"); { } else; { } endif;-- Voice, Jul 06 2014
Aren't these precisely the kind of questions some of the better chatbots & Watson etc. pick up answers to by conversation in chatrooms & with brute database searches if they have the right algorithms?
Or has someone already raised that point?-- Skewed, Jul 07 2014
//"STAND ON THORIN'S SHOULDERS"//
The Hobbit text-driven game, circa 1980-something, frustrating wasn't it (never did complete it, I think I may have discovered girls & lost interest, either that or I got hold of a copy of Elite).
Chatbots are a lot better than that now.
I think this idea was from before a computer could beat you at trivial pursuits? it would definitely crash & burn if you tried to use it now, the bots would be all over it.-- Skewed, Jul 07 2014
//I think this idea was from before a computer could beat you at trivial pursuits? it would definitely crash & burn if you tried to use it now, the bots would be all over it.//
I think that's true only if the same system is deployed at scale. Many small sites can greatly reduce their spam issues using trivial systems. Even a single fixed, domain-specific question will defeat most of the generic spambots currently crawling the web.-- Loris, Jul 07 2014
It's easy to invent an original question that requires basic intelligence to answer. Until a chatbot has such intelligence no amount of spoofing will make it possible to emulate a human.
"I like my coffee black. Do you think I used cream yesterday at breakfast?"
"Sally is a 250 pound diabetes patient who likes to watch telly and eat crisps. Tom is holding a party for winners of the recent marathon. Will Sally be the guest of honor?"
"My cousin was born nine years ago. Do you think I should buy him a shotgun for Christmas?"
Come to think of it this may be a good way of filtering out Borg as well.-- Voice, Jul 07 2014
All those questions are ones I think a bot with the right algorithms should be able to answer, especially if it's had long enough to build its database [Voice]
Would think you're right [Loris], was kinda my point, if this type of security was in common use the malware programmers would be all over it, as long as it remains niche there's no percentage in it for them (assuming I've got your meaning right).-- Skewed, Jul 07 2014
Hmm, there is a near infinite number of questions...for example, why is the Japanese Weather bod's website running a daily snow depth chart in July..see link if you really have nothing better to do..-- not_morrison_rm, Jul 07 2014
Let's not forget it's a computer (with a limited database of CAPTCHAs to present) trying to detect a computer, not a person trying to do the same, infinitely easier to spoof than people.
<Later Edit>
20 Questions is a good example of how a basic one might work (link from [not_morrison] posted on 'My first search engine, thanks').
<link>
The bots would notify (email?) failure to pass one of these question CAPTCHAs to their bot-master & he / she would then update its database with the appropriate response (it can also be left wild as a normal chatbot in online chat-rooms etc to build its database independently). Allow enough memory (& time) & its database will hold the answer to almost any question.
And most CAPTCHAs let you ask for a new question, so the bot just punches that button until one it knows pops up.
Those it doesn't just get added when it reports back next so new CAPTCHAs will be added to the bots as fast as they're devised.
I can see bots doing the same with visual CAPTCHAs (store each one you come across in the database & match with appropriate response) - just a matter of teaching it enough of them it has a good chance of finding one it knows if it punches the new-question button enough (wouldn't be surprised if some already do).-- Skewed, Jul 07 2014
//The big advantage of captcha is that they can generate thousands of hard to read by OCR images very quickly and dispose of the old ones periodically.//
Ah...
So the old / existing optical CAPTCHA system isn't that easy to defeat.
The question ones still are though; even throwing in deliberate spelling mistakes won't work if the bot has access to a basic spell-checker.-- Skewed, Jul 07 2014
//... if this type of security was in common use the malware programmers would be all over it, as long as it remains niche there's no percentage in it for them (assuming I've got your meaning right)//
I don't think you have. It's not as if the world ends if a bot manages to get through the protection.
That being the case, a single question may suffice for a small website. If a bot-herder takes the time to answer it for their bot, then there will be some spam link cleanup to do - but the question can be changed. If you're large enough a target that this happens repeatedly, then you do need a more secure system.
The link I gave claims 180 million questions (programmatically generated, of course). It's true that this can be beaten - it has already - but conversely I think it's pretty straightforward to make your own implementation with its own questions. This changes the economics for the bot-herder.
So in a sense, yes, each system has to be 'niche' enough to remain uneconomic to break. But that's okay, because there can be multiple such systems. There are quite a number of image CAPTCHA systems, and presumably that's at least in part because they're locked into an arms race with the spammers.-- Loris, Jul 07 2014
//not as if the world ends if a bot manages to get through the protection//
Clearly, it just gets a tiny bit more annoying for whoever it's spamming.
As for the rest, points taken on-board & being digested, but it's not as secure as image based ones (the niche status helps for now but bots are getting better & their databases bigger).-- Skewed, Jul 07 2014
If there's only one (or a small number) of questions, and some of the answers are simple (yes, no, a number less than 10, etc), then it shouldn't be hard for a bot net to brute force it.
Now if you create this system for one small web site, no one would bother to brute force it because it would be much easier for a human to just answer the question, but if you create a reusable system where each web site owner makes up one or two questions of their own, and it gets widely adopted, then the system might become a real target. While a primitive brute force attack can be easily detected and blocked, if there were enough sites using this method it could be worth the effort to create a distributed attack that is low speed as seen from any one web site being targeted. A web site owner could still improve their protection by having a very good question, but having a question that can be answered by any real person probably limits the answers to a pretty small dictionary in most cases.-- scad mientist, Jul 07 2014
Not directly related to this idea, but this idea had me looking into CAPTCHAs and I ran across reCAPTCHA. Google is digitizing books, and when a word fails to OCR, they use it as a CAPTCHA. Presumably they must use it with a couple web sites at once and accept the answer if two people agree. But what happens when there is a large bot net that sees the same CAPTCHA on several sites at the same time and gives the same wrong answer? I guess the bot gets in and the book is incorrectly digitized.-- scad mientist, Jul 07 2014
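To put rough numbers on the brute-force point about small answer dictionaries, here is a back-of-envelope sketch in Python. It assumes (my assumptions, not anything measured) that the right answer is one of `answer_space` equally likely candidates, that the bot never repeats a guess, and that the site does no lockout or rate limiting. The function names are hypothetical.

```python
from fractions import Fraction

def guess_success_probability(answer_space, attempts):
    """Chance that `attempts` distinct guesses include the right answer,
    assuming `answer_space` equally likely candidate answers."""
    if answer_space <= 0:
        raise ValueError("answer_space must be positive")
    tried = max(0, min(attempts, answer_space))
    return Fraction(tried, answer_space)

def expected_guesses(answer_space):
    """Average number of distinct guesses needed to hit the right answer."""
    if answer_space <= 0:
        raise ValueError("answer_space must be positive")
    return Fraction(answer_space + 1, 2)

if __name__ == "__main__":
    # yes/no question, a number less than 10, a hypothetical 1000-word dictionary
    for size in (2, 10, 1000):
        print(size, guess_success_probability(size, 3), expected_guesses(size))
```

Under those assumptions a yes/no question falls to a coin flip, and a "number less than 10" question falls in 5.5 guesses on average, which is cheap enough to spread thinly across many sites.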