h a l f b a k e r y


                                   

Predictive Text at end of word

"Skip to the end!" - Prince Humperdinck
(+8)

Oft-times whilst sending short messages, I attempt to put in a longish word. If my auto-complete list contains a bunch of forms of said word, I end up typing most of the length of the word before the one I want pops into the suggestion area.

If I had a "skip to end" key, then 'h - a - p - (skip) - g' would get me to "happening" more quickly.

lurch, Apr 09 2012







       We initially read this as "Predictive Text at end of world" which struck us as something of an oxymoron ...
8th of 7, Apr 09 2012
  

       If I understand this right, having defined the start of the word, you want to work back from the end?
Loris, Apr 09 2012
  

       Basically, although spelling in reverse is not something I expect people to do without stopping and muttering about it. (Let's see: a-u-t-o-(skip)-n-o-i-t, no, I don't like it.)   

       So the "skip" key is switching you between two separate input strings, both proceeding as left-to-right input, and on each keypress, update the suggestions:   

       SELECT words FROM wordlist WHERE words LIKE s1 || '%' || s2   

       you just pick up the words which match both the head and tail partial strings.
lurch, Apr 09 2012
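
The head-and-tail lookup described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `suggest` and the tiny word list are made up here, and a real phone would query its own dictionary.

```python
def suggest(words, head, tail=""):
    """Return words that start with `head` and end with `tail`.

    The length check keeps the head and tail from overlapping,
    mirroring a LIKE pattern of the form head%tail.
    """
    return [
        w for w in words
        if len(w) >= len(head) + len(tail)
        and w.startswith(head)
        and w.endswith(tail)
    ]

# Hypothetical word list, for illustration only.
WORDS = ["happen", "happened", "happening", "happens", "happy", "harp"]

print(suggest(WORDS, "hap"))       # suggestions before the skip key is pressed
print(suggest(WORDS, "hap", "g"))  # h - a - p - (skip) - g
```

Each keypress would append to either the head or the tail string, depending on whether the skip key has been pressed, and the suggestion list would be recomputed from both.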
  

       This idea seems to be predicated on the idea that the end of the word has more predictive power than the early or middle bits. Is that true, though?   

       It strikes me that word endings ("ed", "s", "ly", "er", "ion") are less varied than the middle.
MaxwellBuchanan, Apr 09 2012
  

       Once you've gotten to a list of forms-of-the-same-word, then I think the predictive power of the end gets a lot better.   

       Plus, I think "skip to end" would be a lot easier for the user to visualize and use than "ok, we're now going to jump to someplace in the middle".
lurch, Apr 09 2012
  

       //easier for the user to visualize and use than "ok, we're now going to jump to someplace in the middle"//   

       No, I didn't mean "jump to the middle", I meant that it *might* be more effective to simply type the next letter than to skip to the end.   

       For example, suppose I want "exciting", and I've typed "ex". If I now add the last letter ("g"), it could predict "exciting", "exiting", "extrapolating", "existing", "extolling" and many more. However, if I add the "c" instead ("exc"), the choice is more restrictive.   

       I'm not saying that this is always the case, I'm just questioning whether the last letter has more predictive power than the next letter in the word.   

       There's one other drawback to your system, though. Most predictive software narrows the choice as you add more letters. If you jump forward and add the last letter, then you would have to work backwards (adding the penultimate letter etc) if you wanted to restrict the choices further.
MaxwellBuchanan, Apr 09 2012
  

       Well, yes, in your example, (checking on my phone here) I get "except", "example", "exactly", "extra", "express", "expect" - so there would be no reason for me to skip to the end.   

       Likewise, when I hit the "c", now I've got "except", "exchange", "excuse", "excellent", "excited"... still lots of variety.   

       However... when I put in the "i", the list looks like "excited", "exciting", "excitement", "excite", "excitation", "excise", "excitable", "excites", "excitedly" - the predictive value of the next letter just fell off a cliff. But: (skip)-g and you've nailed it.
lurch, Apr 09 2012
  

       Fair enough. Research trumps theory.
MaxwellBuchanan, Apr 09 2012
  

       But after typing "exc" your choices are still pretty broad, because you could be typing "exciting", "excited", "excitable", "excitation", "excites", and so on.   

       Luckily, this is easily testable. Just iterate over the alphabet, building two regexps per letter: one of the form /...a..*/ and another of the form /.....*a/. Essentially, you're looking for words of at least five letters where either the fourth or the last letter is specified.   

       In fact, I've taken the liberty of running this test against the Unix words file, and come up with 1,126,044 matches for the former regexp, and 1,135,799 for the latter, indicating overall only a marginally higher degree of specificity for typing the fourth letter compared to skipping to the last letter. So while it's a wash overall, there are likely some circumstances where it could be a significant timesaver.   

       Anyway, bun for essentially suggesting using a regexp for predictive text.
ytk, Apr 09 2012
  

       That's some fine geekery, [ytk]. Which reminds me, has anyone used the equivalent of UNIX tab completion in this context?
spidermother, Apr 10 2012
  

       Upon further reflection, my methodology is flawed. Not only did I oversimplify the problem, but the conclusion I reached should have been impossible, and I would have noticed that but for also doing a fairly impressive job of cocking up the regexps. (Apropos: "Some people, when confronted with a problem, think, 'I know, I'll use regular expressions.' Now they have two problems.")
ytk, Apr 10 2012
  

       After a bit of thought, I arrived at the following new methodology: Iterate over the strings "aaaa" to "zzzz", and for each one build two regexps of the format /^1234.+$/ and /^123.+4$/ (substituting each number with the character in that position). Run each pair of regexps against the dictionary, adding a count of the matches to a pair of arrays. For each array, I then removed all of the zeros, since they don't match any actual words. Finally, I took the mean and median of each array and used those for the comparison.   

       Given a word of at least five letters, typing four consecutive characters yields a mean of ~11.9 matches, and a median of 3 matches. Typing the first three and then skipping to the last one yields a mean of ~7.6 matches, and a median of 2 matches. Assuming this methodology is sound (which is a pretty big assumption), it would seem that typing three characters and skipping to the last one is significantly more efficient than typing four consecutive characters at the start of a word. You may be on to something here, [lurch].
ytk, Apr 10 2012
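
For anyone wanting to repeat the comparison, here is a rough Python sketch of the methodology as described. It is not [ytk]'s script: instead of running a regexp per four-letter string, it makes one pass over the word list and tallies each word under the pattern it would match, which should give the same non-zero counts. The names `compare` and `is_az` and the toy word list are assumptions for illustration; a real run would read a full dictionary file.

```python
from collections import Counter
from statistics import mean, median

def is_az(s):
    """True if every character of s is a lowercase letter a-z."""
    return all("a" <= c <= "z" for c in s)

def compare(words):
    """Return ((mean, median) for four typed letters,
               (mean, median) for three letters plus the last letter).

    prefix4 plays the role of /^1234.+$/ and head3_last of /^123.+4$/.
    Patterns that match no word never get an entry, which corresponds
    to dropping the zeros.
    """
    prefix4 = Counter()
    head3_last = Counter()
    for w in words:
        if len(w) < 5:
            continue  # both regexps need at least five characters
        if is_az(w[:4]):
            prefix4[w[:4]] += 1
        if is_az(w[:3] + w[-1]):
            head3_last[(w[:3], w[-1])] += 1
    p = list(prefix4.values())
    h = list(head3_last.values())
    return (mean(p), median(p)), (mean(h), median(h))

# Toy word list, for illustration only.
WORDS = ["exciting", "excited", "excites", "excise", "example",
         "exactly", "happy", "happen", "cat"]

print(compare(WORDS))
```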
  

       [ytk] Kudos for serious research. But the user has the option of either standard or skip-ahead text completion, and chooses, for any given word, whichever is more efficient. So the method's efficiency exceeds standard-only text completion by an even larger margin, no?   

       On the subject of methodology, the dictionary words you tested this against ought to be weighted by their frequency, i.e. if the method's less efficient for rarely-used words, and more efficient for common ones, then your method underestimates its efficiency in practice. Or vice versa. The large difference between mean and median in your results suggests very skewed distributions, so this effect may be large.
mouseposture, Apr 10 2012
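
The frequency weighting suggested above might look something like the following Python sketch. Note that it measures something slightly different from the per-pattern averages: for each word the user actually types, it takes the size of the candidate list that word would appear in, and averages those sizes weighted by usage. The function name `expected_candidates` and the usage counts are invented for illustration; real counts would have to come from a text corpus.

```python
from collections import Counter

def expected_candidates(freqs, key):
    """Frequency-weighted mean size of the candidate list.

    freqs maps word -> usage count; key(word) is what the user has
    typed for that word. Only words of five or more letters count.
    """
    eligible = {w: f for w, f in freqs.items() if len(w) >= 5}
    sizes = Counter(key(w) for w in eligible)
    total = sum(eligible.values())
    return sum(f * sizes[key(w)] for w, f in eligible.items()) / total

# Made-up usage counts, for illustration only.
FREQS = {"excited": 50, "exciting": 40, "excise": 1, "excites": 1, "example": 8}

four_typed = expected_candidates(FREQS, lambda w: w[:4])
skip_typed = expected_candidates(FREQS, lambda w: (w[:3], w[-1]))
print(four_typed, skip_typed)
```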
  

       Some shorthand systems use one-letter abbreviations for common suffixes, e.g. m for -ment, g for -ing, d for -ed, a for -ation. The phone should be programmed to test your typed string as a shorthand form as well as a real word, hence:   

       cong would predict both "conglomerate" and "continuing"   

       docm would predict "document" (using m as the suffix)   

       rena would predict "rename" and "renovation"
phundug, Apr 10 2012
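
A small Python sketch of the shorthand idea, under assumptions: the suffix table holds only the four abbreviations listed above, the last typed letter is the only one tried as a suffix code, and the function name `predict` and the word list are made up for illustration.

```python
# One-letter suffix codes from the annotation above.
SUFFIXES = {"m": "ment", "g": "ing", "d": "ed", "a": "ation"}

def predict(words, typed):
    """Return words matching `typed` as a plain prefix or as stem + suffix code."""
    stem = typed[:-1]
    suffix = SUFFIXES.get(typed[-1:])  # None if the last letter is not a code
    matches = []
    for w in words:
        as_prefix = w.startswith(typed)
        as_shorthand = (
            suffix is not None
            and len(w) >= len(stem) + len(suffix)
            and w.startswith(stem)
            and w.endswith(suffix)
        )
        if as_prefix or as_shorthand:
            matches.append(w)
    return matches

# Hypothetical word list, for illustration only.
WORDS = ["conglomerate", "continuing", "document", "rename", "renovation", "docile"]

for typed in ("cong", "docm", "rena"):
    print(typed, predict(WORDS, typed))
```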
  

       //predictive text at end of world//   

       "We apologize for the inconvenience..."
RayfordSteele, Apr 10 2012
  

       (that would be the voice-over for a Fatal Asteroid Collision (or other catastrophe) Song)
lurch, Apr 10 2012
  

       Predictable Textile Atherosclerosis Endometriosis Offal Worsted   

       Just saying...
UnaBubba, Apr 12 2012
  


 
