Consider this typo:
"ince"
Spelling correction suggestions could be "Inch, Inca, Since, Nice, Once, etc..."
I propose a keyboard-proximity filter (based on the system's currently selected keyboard layout) to sort them by how likely each one is to be the intended word. O is right next to I on the keyboard, so:
Since, Nice, Once are probably the most likely words.
Inca and Inch are much less likely to be typed by accident, and accidents form 90%+ of my spelling errors.
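A minimal sketch of that ordering in Python, for illustration only. It hard-codes an approximate US QWERTY layout instead of reading the system's layout, and the four weights are invented numbers, not measured ones. All names here (KEY_POS, substitution_cost, typo_cost, rank_suggestions) are hypothetical. The distance is a restricted Damerau-Levenshtein (optimal string alignment) edit distance in which a substitution is cheap when the two keys are neighbours.

```python
# Rough QWERTY geometry: each row is listed with a horizontal offset (in key
# widths) that approximates the stagger of a physical keyboard. Assumption:
# US QWERTY; a real filter would load the user's current layout instead.
QWERTY_ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]
KEY_POS = {
    key: (row, col + offset)
    for row, (keys, offset) in enumerate(QWERTY_ROWS)
    for col, key in enumerate(keys)
}

# Illustrative weights only (assumptions, not measured values).
ADJACENT_SUB = 0.3  # hit a neighbouring key
FAR_SUB = 1.0       # hit a distant key
INDEL = 0.6         # dropped or added a keystroke
TRANSPOSE = 0.5     # swapped two consecutive keystrokes


def substitution_cost(typed, meant):
    """Cost of having typed one letter where another was meant."""
    if typed == meant:
        return 0.0
    if typed not in KEY_POS or meant not in KEY_POS:
        return FAR_SUB
    (r1, c1), (r2, c2) = KEY_POS[typed], KEY_POS[meant]
    distance = ((r1 - r2) ** 2 + (c1 - c2) ** 2) ** 0.5
    # Keys less than 1.5 key widths apart are treated as neighbours.
    return ADJACENT_SUB if distance < 1.5 else FAR_SUB


def typo_cost(typed, candidate):
    """Weighted edit distance from what was typed to a candidate word."""
    typed, candidate = typed.lower(), candidate.lower()
    rows, cols = len(typed) + 1, len(candidate) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * INDEL
    for j in range(1, cols):
        d[0][j] = j * INDEL
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + INDEL,  # extra keystroke in the typed word
                d[i][j - 1] + INDEL,  # keystroke missing from the typed word
                d[i - 1][j - 1]
                + substitution_cost(typed[i - 1], candidate[j - 1]),
            )
            # Two consecutive letters typed in the wrong order.
            if (i > 1 and j > 1
                    and typed[i - 1] == candidate[j - 2]
                    and typed[i - 2] == candidate[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + TRANSPOSE)
    return d[-1][-1]


def rank_suggestions(typed, candidates):
    """Sort candidates so the cheapest (most plausible) typo comes first."""
    return sorted(candidates, key=lambda word: typo_cost(typed, word))


print(rank_suggestions("ince", ["Inch", "Inca", "Since", "Nice", "Once"]))
```

With these made-up weights, Once, Nice and Since sort ahead of Inch and Inca, which tie for last place. Different weights would reorder the first three, so treat the numbers as placeholders.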
Add some intra-sentence grammar checking and common-word tagging along the same lines, and spellchecking could be much more useful.
(How often does one really intend to use the words "tot he" in a sentence? It is more likely to be "to the", for example. Some systems already auto-correct that, though.) -- not_only_but_also, Aug 17 2009
US Patent 6,801,190 http://www.google.c...AAAAEBAJ&dq=6801190 [jutta, Aug 17 2009]
Context-sensitive spell check in Microsoft Office 2007 http://blogs.msdn.c...6/06/05/617653.aspx [jutta, Aug 17 2009]
Context-sensitive spell check in Google Wave http://googlesystem...-spell-checker.html [jutta, Aug 17 2009]
Wikipedia: Damerau-Levenshtein distance http://en.wikipedia...evenshtein_distance Edit distance with bells on. [jutta, Aug 17 2009]
Wikipedia: Needleman-Wunsch algorithm http://en.wikipedia...an-Wunsch_algorithm This very clearly needs to be worked into a popular "dance craze" song, "Do the Levenshtein-Damerau Needleman-Wunsch". [jutta, Aug 17 2009]
All good ideas, and patented and implemented in a few systems. A patent- or literature-search for "spell checking algorithms" might be in order.
The keyboard proximity thing is implemented, if one bothers with it, as a "confusion matrix" that, given two keys, tells you how likely they are to be confused. When computing the edit distance between two words (-> Levenshtein distance), instead of assigning equal probability to each substitution error, the confusion matrix is used to look up the probability of this specific error. -- jutta, Aug 17 2009
It seems intuitive. -- normzone, Aug 17 2009
//Consider this typo:
"ince"//
Well, duh... plainly obvious you mean Vince -- vincevincevince, Aug 17 2009
//Levenshtein// - so that's what it's called - I once wrote a program that was intended to act as an "engine" for ALL card games, from snap through Gin Rummy to any/all variants of Poker, with each ruleset defined as a (relatively easy to edit) xml file - the tricky part came during draw/replace scenarios, trying to get the machine to decide whether it had a good/bad enough hand to draw a card (and decide which one to burn in the process), and there are lots of routines that reference the Hamming distance between a given hand and a target one (e.g. four of a kind, or a series of hearts, or a numeric sequence) that the program might have "wanted" - I'm now going to have to go back and rename some of my methods to use the word "Levenshtein". -- zen_tom, Aug 17 2009
//confusion matrix// - I'm pretty sure I can implement that myself without any algorithms. -- wagster, Aug 17 2009
I no. Pathetik isn't it. -- wagster, Aug 17 2009
//I think spellcheckers should be programmed to deliberately fail every so many words, or even insert barely-noticeable typos whilst typing that won't show up on the finished-product spellcheck.//
I find that happens already, as some errors form another word.
Examples: your/you're, lose/loose, discrete/discreet
(The first to are quite common on the net, and widely reviled.) -- Loris, Aug 18 2009
One typo I often come across is a 'dyslexic' (no offense to dyslexic people) error - hitting the (theoretically) correct key with the wrong hand (e.g. putting 'k' when you needed 'd'). <Pet peeve> People getting 'than' and 'then' mixed up! Grrr!</pp> -- neutrinos_shadow, Aug 18 2009
//I can't believe editors, who used to have to earn their pay by proofreading, can cheat...//
Yeah!! And what about those lazy sailors who use GPS to navigate..? -- shudderprose, Aug 19 2009
//The first to are quite common//
Was that "to" intentional? Yeah. Must've been. -- theleopard, Aug 19 2009
I know someone who frequently misuses "of" - as in must of, could of, should of etc - I don't have the heart to tell them. -- zen_tom, Aug 19 2009
You must learn to Give In To Your Hate, [zen]. -- 8th of 7, Aug 19 2009
//Was that "to" intentional? Yeah. Must've been.//
It was now. -- Loris, Aug 19 2009