h a l f b a k e r y
Not from concentrate.
|
|
|
CAPTCHA is the method used on websites to prevent automated linkspam on wikis and blogs, automated generation of tons of logins for free email, etc. It's usually a wavy or otherwise distorted picture of some random letters that you are supposed to type in the box. This is bad because:
It makes these sites inaccessible to the blind or visually impaired (who use screen readers).
It's inconvenient for those of us who can use it, because it's increasingly hard to see the letters exactly, and annoying when we have to try multiple times.
It uses a relatively large amount of bandwidth and a relatively large amount of processing power to generate the image.
It can be thwarted by recruiting people who just sit at computers all day typing them in.
My idea is to use simple questions in English (or the local language) that are trivial for a human to answer, but not for a computer. Examples:
Enter the sum of two and three: ____
What is the opposite of "cold"? ____
Click the second and fourth check boxes: [] [] [] []
Basically, questions that a native speaker over the age of 7 could answer, but a computer could not. With this system:
Since it relies on text, contributing to such sites would be just as accessible as the sites themselves.
It would use fewer server resources and take less time to download.
It would not be as easy to run through an overseas human factory for cracking, since it requires knowledge of the local language instead of just the ability to recognize distorted characters. Yes, this makes editing the page less accessible to people who don't understand the language, but again, it's just as accessible as the page itself.
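As a rough illustration of the first example question above, here is a minimal sketch of a generator for spelled-out addition questions. The names `NUMBER_WORDS`, `make_sum_question` and `check_answer` are made up for this sketch, and it assumes the answer is typed as digits; it is not taken from any existing CAPTCHA library.

```python
import random

# Spelled-out operands, so the question reads as plain language.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine"]

def make_sum_question(rng):
    """Return (question_text, expected_answer) for a spelled-out addition."""
    a, b = rng.randrange(10), rng.randrange(10)
    question = f"Enter the sum of {NUMBER_WORDS[a]} and {NUMBER_WORDS[b]}:"
    return question, str(a + b)

def check_answer(expected, submitted):
    """Accept the answer if it matches after trimming surrounding whitespace."""
    return submitted.strip() == expected

rng = random.Random()
question, expected = make_sum_question(rng)
print(question)
```

The server would keep `expected` in the session and send only the text of `question`, so nothing but a short string goes over the wire.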
baked as of 2008
http://textcaptcha.com/ Even has the "solved by a 7-year old" requirement. Perhaps Rob reads the bakery. [Loris, Jul 07 2014]
How vital it is to check the depth of snow in midsummer
http://www.jma.go.j....html?elementCode=4 ...hmmm [not_morrison_rm, Jul 07 2014]
20 Questions
http://www.20q.net/ [Skewed, Jul 07 2014]
XKCD's alternative to CAPTCHAs
http://xkcd.com/810/ [scad mientist, Jul 07 2014]
reCAPTCHA
http://www.google.c...ha/intro/index.html [scad mientist, Jul 07 2014]
|
|
Someone came up with the idea of 'click the puppy' where, in a grid of 3x3 (or NxN) pictures of kittens, the user had to click on the one containing the puppy (change imagery to suit audience). This sounds like a textual version of that. [+] I wonder what type of questions would provide appropriate answers. The problem is soliciting a single response that can only be determined by fully comprehending the question. |
|
|
For example, the opposite of hot, might be 'freezing', or 'not hot' or 'brassic' - I know it's not a great example, but you get the idea. I spent far too long typing in variations of "STAND ON THORIN'S SHOULDERS" to know that typing in the same pattern of text that a designer/programmer expected me to is not always as easy as it at first sounds. |
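One way to soften the exact-wording problem described here is to normalise the typed answer and compare it against a small set of accepted variants. The sketch below assumes a hand-written answer list; `ACCEPTED`, `normalize` and `is_correct` are hypothetical names, and the variants listed are only the ones mentioned in this annotation.

```python
import string

def normalize(text):
    """Lowercase, strip punctuation and collapse whitespace."""
    cleaned = text.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())

# Hypothetical answer list: each question maps to every variant the site accepts.
ACCEPTED = {
    'What is the opposite of "hot"?': {"cold", "freezing", "not hot"},
}

def is_correct(question, submitted):
    """True if the normalised answer is one of the accepted variants."""
    return normalize(submitted) in ACCEPTED.get(question, set())

print(is_correct('What is the opposite of "hot"?', " Freezing. "))
```

The longer the accepted list, the more forgiving the question is for people, but every extra variant is also one more string a guessing bot can hit.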
|
|
//Which transformer was omegatron? What did it change into/from?// |
|
|
A play on Megatron? He turned from a bot into a handgun/blaster. |
|
|
Nice idea [omegatron] btw. |
|
|
How did Ophelia die? How many pennies in a million dollars? What is the atomic number of helium? What is decimal ten in base eight? In how many countries did the Da Vinci Code take place? Bun. |
|
|
Perhaps introduce the odd spelling error here or three, just to confuse computer programs even more. |
|
|
[SledDog], I only know 3 of those 5 myself. Although, humans would have the advantage of search engines to aid them. |
|
|
Oh come on, ask some hard ones! |
|
|
It's not a hard problem to parse and answer some of these questions through machine logic, especially if everyone is using the same database of questions. |
|
|
The workload in generating thousands of these questions would require a human factory. Once answered you have the key to unlock the website. |
|
|
The big advantage of captcha is that they can generate thousands of hard to read by OCR images very quickly and dispose of the old ones periodically. The question system is 10000000x the workload. |
|
|
//Someone came up with the idea of 'click the puppy'// |
|
|
The main point of this was to make it usable by the visually impaired. |
|
|
//Which transformer was omegatron? What did it change into/from?// |
|
|
It wasn't one. Megatron is similar. I got the name from a vacuum tube. |
|
|
Damn. No way, dude. I only know three of those. Much simpler questions. |
|
|
True. I was actually just thinking of using something like this for my own site, which would work quite well because it's one of a kind. It *would* take some work to make a universal one that can run on multiple sites, though. Addition with both operands below 10 would only give you 100 questions, for instance. Hmm... The "click in a certain pattern" idea might be more flexible. |
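The arithmetic behind that estimate can be checked directly. This sketch counts the distinct single-digit addition questions and, for comparison, the number of distinct non-empty check-box patterns; both function names are invented here, and treating every non-empty subset of boxes as a valid pattern is an assumption.

```python
def addition_questions(max_operand):
    """Count ordered pairs (a, b) with 0 <= a, b < max_operand."""
    return max_operand ** 2

def checkbox_patterns(boxes):
    """Count the non-empty subsets of `boxes` check boxes that could be requested."""
    return 2 ** boxes - 1

print(addition_questions(10))  # 100
print(checkbox_patterns(4))    # 15
print(checkbox_patterns(10))   # 1023
```

The pattern count doubles with each extra box, which is one reading of "more flexible", although four boxes give fewer patterns than single-digit addition gives questions.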
|
|
//The main point of this was to make it usable by the visually impaired. // |
|
|
How does reading a question in text get any easier for the visually impaired than telling the difference between a kitten and a puppy? |
|
|
I accept that a text-only interface is easier though. |
|
|
[zen_tom] Text can be LISTENED to. Pictures cannot. |
|
|
Aha! Sorry, my reading-aid's been playing up. |
|
|
Anyway, you could just make sure your pictures have the appropriate <alt> text; puppy.jpg, kitty.jpg etc ;) |
|
|
//Why is everyone so worried about special web design for the blind?// |
|
|
There were three other benefits I listed, too... |
|
|
It's not just about designing so that pages are readable by the blind, either. It makes things accessible to people with monochrome, very small, or text-only displays, like PDAs and cell phones, for instance. |
|
|
This is now baked. I've seen it on http://forum.linux-ntfs.org, for instance. |
|
|
1º: Character recognition.
2º: Image recognition.
...
last: emotion recognition. |
|
|
"What is the air speed velocity of an unladen swallow?" |
|
|
//"African or European ?"// |
|
|
Wolfram Alpha is not sure what to do with your input. |
|
|
Clearly since undefined is not the 2B state of the
variable the answer is !2B |
|
|
Const char TheQuestion = []; |
|
|
if (TheQuestion=="2B");
{
}
else;
{
}
endif; |
|
|
Aren't these precisely the kind of questions some of the better chatbots & Watson etc. pick up answers to by conversation in chat rooms & with brute database searches if they have the right algorithms? |
|
|
Or has someone already raised that point? |
|
|
//"STAND ON THORIN'S SHOULDERS"// |
|
|
The hobbit text driven game, circa 1980-something, frustrating wasn't it (never did complete it, I think I may have discovered girls & lost interest, either that or I got hold of a copy of Elite). |
|
|
Chatbots are a lot better than that now. |
|
|
I think this idea was from before a computer could beat you at trivial pursuits? it would definitely crash & burn if you tried to use it now, the bots would be all over it. |
|
|
//I think this idea was from before a computer could beat you at trivial pursuits? it would definitely crash & burn if you tried to use it now, the bots would be all over it.// |
|
|
I think that's true only if the same system is deployed at scale. Many small sites can greatly reduce their spam issues using trivial systems. Even a single fixed, domain-specific question will defeat most of the generic spambots currently crawling the web. |
|
|
It's easy to invent an original question that requires
basic intelligence to answer. Until a chatbot has
such intelligence no amount of spoofing will make it
possible to emulate a human. |
|
|
"I like my coffee black. Do you think I used cream
yesterday at breakfast?" "Sally is a 250 pound
diabetes patient who likes to watch telly and eat
crisps. Tom is holding a party for winners of the
recent marathon. Will Sally be the guest of honor?"
"My cousin was born nine years ago. Do you think I
should buy him a shotgun for Christmas?" Come to
think of it this may be a good way of filtering out
Borg as well. |
|
|
All those questions are ones I think a bot with the right algorithms should be able to answer, especially if it's had long enough to build its database [Voice] |
|
|
Would think you're right [Loris], was kinda my point, if this type of security was in common use the malware programmers would be all over it, as long as it remains niche there's no percentage in it for them (assuming I've got your meaning right). |
|
|
Hmm, there is a near-infinite number of questions... for example, why is the Japanese Weather bod's website running a daily snow depth chart in July? See link if you really have nothing better to do. |
|
|
Let's not forget it's a computer (with a limited database of CAPTCHAs to present) trying to detect a computer, not a person trying to do the same, so it's infinitely easier to spoof than people. |
|
|
20 Questions is a good example of how a basic one might work (link from [not_morrison_rm] posted on 'My first search engine, thanks'). |
|
|
The bots would notify (email?) failure to pass one of these question CAPTCHAs to their bot-master & he / she would then update its database with the appropriate response (it can also be left wild as a normal chatbot in online chat-rooms etc to build its database independently). Allow enough memory (& time) & its database will hold the answer to almost any question. |
|
|
And most CAPTCHAs let you ask for a new question, so the bot just punches that button until one it knows pops up. |
|
|
Those it doesn't know just get added when it next reports back, so new CAPTCHAs will be added to the bots as fast as they're devised. |
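The "keep punching the new-question button" argument can be put into numbers. If the bot already knows `known` of the `pool` questions and each refresh draws one uniformly and independently (both are assumptions of this sketch, not facts about any real CAPTCHA), the chance of seeing at least one known question in `refreshes` draws is 1 - (1 - known/pool)^refreshes. The function name `pass_probability` is hypothetical.

```python
def pass_probability(known, pool, refreshes):
    """Chance that at least one of `refreshes` draws is a question the bot knows.

    Assumes every draw is uniform and independent over `pool` questions.
    """
    if pool <= 0 or not 0 <= known <= pool:
        raise ValueError("need pool > 0 and 0 <= known <= pool")
    miss = 1 - known / pool          # chance a single draw is unknown
    return 1 - miss ** refreshes     # complement of missing every time

print(round(pass_probability(50, 1000, 1), 3))    # 0.05
print(round(pass_probability(50, 1000, 20), 3))   # 0.642
```

Under these assumptions a bot that knows only 5% of the pool gets through about two times in three if it is allowed 20 refreshes, so capping refreshes matters as much as the size of the pool.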
|
|
I can see bots doing the same with visual CAPTCHAs (store each one you come across in the database & match with appropriate response) - just a matter of teaching it enough of them that it has a good chance of finding one it knows if it punches the new-question button enough (wouldn't be surprised if some already do). |
|
|
//The big advantage of captcha is that they can generate thousands of hard to read by OCR images very quickly and dispose of the old ones periodically.// |
|
|
So the old / existing optical CAPTCHA system isn't that easy to defeat. |
|
|
The question ones still are, though; even throwing in deliberate spelling mistakes won't work if the bot has access to a basic spell-checker. |
|
|
//... if this type of security was in common use the malware programmers would be all over it, as long as it remains niche there's no percentage in it for them (assuming I've got your meaning right)// |
|
|
I don't think you have.
It's not as if the world ends if a bot manages to get through the protection. |
|
|
That being the case, a single question may suffice for a small website. If a bot-herder takes the time to answer it for their bot, then there will be some spam link cleanup to do - but the question can be changed. If you're large enough a target that this happens repeatedly, then you do need a more secure system. |
|
|
The link I gave claims 180 million questions (programmatically generated, of course). It's true that this can be beaten - it has already - but conversely I think it's pretty straightforward to make your own implementation with its own questions. This changes the economics for the bot-herder. |
|
|
So in a sense, yes, each system has to be 'niche' enough to remain uneconomic to break. But that's okay, because there can be multiple such systems. There are quite a number of image CAPTCHA systems, and presumably that's at least in part because they're locked into an arms race with the spammers. |
|
|
//not as if the world ends if a bot manages to get through the protection// |
|
|
Clearly, it just gets a tiny bit more annoying for whoever it's spamming. |
|
|
As for the rest, points taken on board & being digested, but it's not as secure as image-based ones (the niche status helps for now but bots are getting better & their databases bigger). |
|
|
If there's only one question (or a small number of
them), and some of the answers are simple
(yes, no, a number less than 10, etc), then it
shouldn't be hard for a botnet to brute force it. |
|
|
Now if you create this system for one small web
site, no one would bother to brute force it
because it would be much easier for a human to
just answer the question, but if you create a
reusable system where each web site owner
makes up one or two questions of their own, and
it gets widely adopted, then the system might
become a real target. While a primitive brute
force attack can be easily detected and blocked, if
there were enough sites using this method it
could be worth the effort to create a distributed
attack that is low speed as seen from any one web
site being targeted. A web site owner could still
improve their protection by having a very good
question, but having a question that can be
answered by any real person probably limits the
answers to a pretty small dictionary in most cases. |
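The "primitive brute force attack can be easily detected and blocked" part can be sketched as a per-client failure counter. This is only an illustration: `AttemptLimiter`, its three-failure threshold and the use of a plain client identifier are all assumptions, not part of any system mentioned in this thread.

```python
class AttemptLimiter:
    """Block a client after too many wrong answers (hypothetical policy)."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = {}  # client_id -> consecutive wrong answers

    def allowed(self, client_id):
        """True while the client is still under the failure threshold."""
        return self.failures.get(client_id, 0) < self.max_failures

    def record(self, client_id, correct):
        """Reset the counter on a correct answer, otherwise increment it."""
        if correct:
            self.failures.pop(client_id, None)
        else:
            self.failures[client_id] = self.failures.get(client_id, 0) + 1

limiter = AttemptLimiter(max_failures=3)
for guess in ["yes", "no", "maybe"]:
    limiter.record("client-1", correct=False)
print(limiter.allowed("client-1"))  # False
print(limiter.allowed("client-2"))  # True
```

A counter like this only sees one client at a time, which is why the distributed, low-speed attack described above would slip past it: each participating machine stays under the threshold.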
|
|
Not directly related to this idea, but this idea had
me looking into CAPTCHAs and I ran across
reCAPTCHA. Google is digitizing books, and when
a word fails to OCR, they use it as a CAPTCHA.
Presumably they must use it with a couple web
sites at once and accept the answer if two people
agree. But what happens when there is a large
bot net that sees the same CAPTCHA on several
sites at the same time and gives the same wrong
answer? I guess the bot gets in and the book is
incorrectly digitized. |
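The agreement rule guessed at here ("accept the answer if two people agree") can be sketched as a simple vote count. This follows the annotation's presumption only; it is not a description of how reCAPTCHA actually works, and `agreed_transcription` and `min_votes` are hypothetical names.

```python
from collections import Counter

def agreed_transcription(answers, min_votes=2):
    """Return the most common answer if at least `min_votes` submissions match, else None."""
    counts = Counter(a.strip().lower() for a in answers)
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    return word if votes >= min_votes else None

print(agreed_transcription(["Upon", "upon ", "ypon"]))  # upon
print(agreed_transcription(["wrong", "wrong"]))         # wrong
```

The second call shows the weakness raised above: identical wrong answers from a botnet satisfy the same rule as two people who agree.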
|
| |