halfbakery: Number one on the no-fly list
Say you write
THE CAT
but it looks more like
TAE CHT
because something (perhaps a cat) distracted you while
you were writing.
Today's OCR would give: Tae cht
with a combo box allowing you to choose corrections. After
the correction the text will look like this: The cat
My proposal:
Why not have letters that can be something
between an A and an H, so that if the software cannot
decide, it shows that hybrid letter and leaves it to the reader
to decide what the letter is?
Since all computers will have these new hybrid letters,
there's no need to fix anything during copy and paste.
If you still want to fix the resulting text, nothing could be
easier. Find all the hybrids, and simply choose one
of the two or three candidates.
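For the curious, here is a rough sketch of how the hybrid idea might look in software. Everything in it is my own assumption for illustration, not part of any real OCR package: the per-letter confidence scores, the `margin` threshold, the `Hybrid` class, and the `[H|A]` bracket notation standing in for a real hybrid glyph.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Hybrid:
    """A glyph the recognizer could not decide on (hypothetical type)."""
    candidates: Tuple[str, ...]

    def __str__(self) -> str:
        # Bracket notation is a stand-in for an actual in-between letter shape.
        return "[" + "|".join(self.candidates) + "]"


Glyph = Union[str, Hybrid]


def to_glyph(scores: Dict[str, float], margin: float = 0.2,
             max_candidates: int = 3) -> Glyph:
    """Return a plain letter if one candidate clearly wins, else a Hybrid.

    `scores` maps candidate letters to assumed recognizer confidences.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_letter, best_score = ranked[0]
    # Keep at most two or three candidates that score close to the best one.
    close = [letter for letter, score in ranked[:max_candidates]
             if best_score - score <= margin]
    return best_letter if len(close) == 1 else Hybrid(tuple(close))


def render(glyphs: List[Glyph]) -> str:
    """Show the text with hybrids left undecided."""
    return "".join(str(g) for g in glyphs)


def resolve(glyphs: List[Glyph], choices: List[str]) -> str:
    """Replace each hybrid, in order, with the reader's pick."""
    picks = iter(choices)
    out = []
    for g in glyphs:
        if isinstance(g, Hybrid):
            pick = next(picks)
            if pick not in g.candidates:
                raise ValueError(f"{pick!r} is not one of {g.candidates}")
            out.append(pick)
        else:
            out.append(g)
    return "".join(out)


# Made-up scores for a sloppy handwritten "THE CAT".
scores_per_glyph = [
    {"T": 0.95}, {"H": 0.50, "A": 0.45}, {"E": 0.90}, {" ": 1.00},
    {"C": 0.90}, {"H": 0.50, "A": 0.48}, {"T": 0.95},
]
glyphs = [to_glyph(s) for s in scores_per_glyph]
print(render(glyphs))               # T[H|A]E C[H|A]T
print(resolve(glyphs, ["H", "A"]))  # THE CAT
```

The hybrids travel with the text, so copy and paste needs no fixing; `resolve` is the optional "find all the hybrids and choose" step.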
Do they not use context-based algorithms (thinking T9 or Apple's iType thingy) in order to help cleanse OCR results?
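A minimal sketch of what such a context-based cleanup could look like, assuming a plain word-list lookup in the spirit of dictionary-based predictive text. The tiny `WORDS` set, the tuple-of-candidates representation for an undecided letter, and the function names are all hypothetical, not any real product's method.

```python
from itertools import product
from typing import List, Optional, Sequence, Set, Tuple, Union

# A plain letter, or a tuple of candidate letters for an undecided glyph.
Glyph = Union[str, Tuple[str, ...]]


def expansions(word: Sequence[Glyph]) -> List[str]:
    """List every spelling the undecided letters allow."""
    options = [g if isinstance(g, tuple) else (g,) for g in word]
    return ["".join(combo) for combo in product(*options)]


def resolve_with_wordlist(word: Sequence[Glyph],
                          wordlist: Set[str]) -> Optional[str]:
    """Return the single dictionary match, or None if zero or several fit."""
    matches = [w for w in expansions(word) if w in wordlist]
    return matches[0] if len(matches) == 1 else None


WORDS = {"THE", "CAT", "CHAT", "HAT"}  # toy word list, for illustration only

the = ["T", ("H", "A"), "E"]
cat = ["C", ("H", "A"), "T"]
print(resolve_with_wordlist(the, WORDS))  # THE
print(resolve_with_wordlist(cat, WORDS))  # CAT
# Both HAT and CAT fit here, so the sketch declines to guess.
print(resolve_with_wordlist([("H", "C"), "A", "T"], WORDS))  # None
```

When the lookup returns `None`, the letter would simply stay undecided and be left to the reader.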
The trouble with using hybrids is that if you're not careful, everything starts looking like a hybrid of something or other.
This just ends up presenting the page of text as a JPG image.
Drat. I wanted this to be a font which reads normally to human eyes, but which yields obscenities or insults when OCR'd.