recognition
Consider hashing as recognition and vice versa
I was recently reading some psychology and some computer science stuff, and it seemed like hashing functions and the process of recognition (in humans and presumably animals) are very similar.
Basically, a hash function is very fast, and it gives a plausible index to look for a memory (previous encounter) with the object (key). It may not work, in which case you are confident you've never seen it before (important information to have), or it may give one or more plausible matches.
Recognition is very fast (and unconscious). When you don't recognize the object, you are sure you don't have the information (even though you have not searched your entire memory). Sometimes you misrecognize an object, but after the plausible match(es) pop into your mind, you can do an explicit comparison (this is conscious), and reject them.
The major difference that I can see is that whatever the recognition processes are, they are noise-tolerant. If you see someone from a different angle, you can still recognize them. If you see a new math problem, you can still recognize it as something suitable to be solved with induction.
Of course, the noise is not simple bit-flipping; it is 'real-world noise', meaning that certain patterns (features) are more likely to differ between encounters than other patterns (the meaningful ones).
I think that studying the math behind noise-tolerant hashing functions, for various models of noise, could be very useful in CS and psychology.
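The analogy can be made concrete with a toy chained hash table (my own illustration, not part of the original idea): hashing jumps straight to one bucket of plausible matches, an empty bucket means confident non-membership, and an explicit comparison then confirms or rejects the candidates.

```python
# Toy chained hash table: fast "recognition" step, then slow comparison.

NUM_BUCKETS = 8

def bucket_of(key):
    # Fast, "unconscious" step: jump straight to one bucket.
    return hash(key) % NUM_BUCKETS

table = [[] for _ in range(NUM_BUCKETS)]

def remember(key):
    table[bucket_of(key)].append(key)

def recognize(key):
    candidates = table[bucket_of(key)]          # plausible matches pop up
    if not candidates:
        return False                            # confident "never seen it"
    return any(c == key for c in candidates)    # explicit, "conscious" comparison

remember("alice")
remember("bob")
print(recognize("alice"))   # True
print(recognize("carol"))   # False, without scanning all memories
```

Even when an unfamiliar key happens to land in a non-empty bucket, the explicit comparison rejects the candidates, mirroring the misrecognize-then-reject case described above.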
The "noise tolerance" is a pretty fundamental difference between hashing and pattern recognition, I think. The properties that are desirable in a hash function are not usually desirable in a recognition algorithm, and vice versa.

Consider two patterns, Alice and Bob. Imagine you have a noise-tolerant hash function. It produces Halice and Hbob when applied to Alice and Bob. Now imagine that you can interpolate between Alice and Bob. Create a pattern which is 99% Alice and 1% Bob. Your noise-tolerant function should still produce Halice, not H(alice+noise). Gradually increase the Bob-to-Alice ratio until you have 99% Bob, and the hash function will be producing Hbob, right?

But somewhere in between it will have had to make a sudden changeover. (Possibly several, if there are intermediate hash values.) What if you meet another friend, Bertha, whose pattern is right on this changeover? The noise-tolerant hash won't be very noise-tolerant when hashing Bertha, since a small amount of noise can nudge Bertha's pattern into the Alice basin or the Bob basin. The only way around this is to have the "hash function" depend on the set of recognizable people, by which point it doesn't look much like a "hash function" any more; it looks like a "pattern recognition algorithm".

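The changeover argument above can be seen in a deliberately tiny model (my sketch, with Alice and Bob as the ends of a 1-D interpolation): hashing by rounding is tolerant of small noise near either end, but must flip abruptly somewhere in between, and a pattern sitting at the flip point gets no tolerance at all.

```python
# 1-D toy "noise-tolerant hash": values near 0 hash together (Halice),
# values near 1 hash together (Hbob).

def noisy_hash(x):
    return round(x)          # 0 -> "Halice", 1 -> "Hbob"

alice, bob = 0.0, 1.0

# The hash is stable under small noise around Alice or Bob...
assert noisy_hash(alice + 0.05) == noisy_hash(alice)
assert noisy_hash(bob - 0.05) == noisy_hash(bob)

# ...but flips abruptly near the midpoint, where "Bertha" would sit:
bertha = 0.5
print(noisy_hash(bertha - 0.01), noisy_hash(bertha + 0.01))  # 0 1
```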
To put this mathematically, the problem is that "similarity" isn't transitive.

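A minimal example of that non-transitivity (my illustration, using Hamming distance as the similarity measure): each adjacent pair is similar, but the ends are not.

```python
# "Similar" = Hamming distance at most 1. Similarity of (a,b) and (b,c)
# does not imply similarity of (a,c).

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def similar(a, b):
    return hamming(a, b) <= 1

a, b, c = "000", "001", "011"
print(similar(a, b), similar(b, c), similar(a, c))  # True True False
```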
From the standpoint of speech, handwriting, image, and other well-studied, discretely modelled statistical pattern-recognition problems, the general consensus is that improving the search strategy is only the smaller half of the game. Efforts to advance that half are often overtaken by rapidly improving basic computational resources.

The bigger half is the design of the feature set. Only recently have there been any attempts to use perceptual features in computer vision, let alone in image compression. The computer speech community has already spent tremendous effort arriving at sensible features that are robust to changing acoustic conditions but still yield enough diversity to separate language from noise.

What I do agree with is that most new advances in psychology and philosophy will arise out of CS disciplines.

To answer wiml: if you gradually blended Alice with Bob, there would not necessarily be a transition between Alice's basin of attraction and Bob's. There would be a certain amount of Bob that you could add to Alice and still recognize Alice; more than that and you would get 'I don't recognize that person'. Thus Bertha (if everything is working right) has some available space.

To say this a different way: the recognition function is allowed to return zero answers (if the objects are sufficiently different) or more than one (if objects share a basin).

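One way to read this annotation (my sketch, not the poster's code) is recognition as "return every stored pattern within a tolerance", which can yield zero matches (unfamiliar) or several (a shared basin).

```python
# Recognition as thresholded matching over stored patterns.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

memory = {"alice": "110010", "bob": "001101"}

def recognize(pattern, tolerance=1):
    # Every stored pattern within the tolerance is a plausible match.
    return [name for name, stored in memory.items()
            if hamming(pattern, stored) <= tolerance]

print(recognize("110011"))  # ['alice']  (small noise, still Alice)
print(recognize("111111"))  # []         (too different: unrecognized)
```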
Urog, do you know if there is a way to go from a model of noise to a feature set (or vice versa)? I suppose I should explain what I mean by a 'model of noise'.

If there is no noise, and all bitstrings are equally likely, then the hashing function should incorporate the information in each of the bits equally.

If all bitstrings are equally likely, but the first 2 bits are always replaced by static, then the hashing function should ignore the first two bits.

If all bitstrings are equally likely, but they can be rotated randomly, then the hashing function should be invariant under rotation.

If they can be arbitrarily permuted, then the hashing function should just count up the number of ones or zeros and hash on that (or use some other scheme that is invariant under permutation).

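The noisy cases above can be sketched directly (my own constructions; 'bitstrings' here are Python strings of '0'/'1'): ignore the static bits, canonicalize over rotations before hashing, and count ones for permutation invariance.

```python
# One invariant hash per noise model.

def hash_ignore_static(bits):
    # First two bits are noise: hash only the rest.
    return hash(bits[2:])

def hash_rotation_invariant(bits):
    # Hash the lexicographically smallest rotation, so all rotations
    # of a string collide by construction.
    rotations = [bits[i:] + bits[:i] for i in range(len(bits))]
    return hash(min(rotations))

def hash_permutation_invariant(bits):
    # Any permutation preserves only the number of ones.
    return bits.count("1")

assert hash_ignore_static("00" + "1011") == hash_ignore_static("11" + "1011")
assert hash_rotation_invariant("1011") == hash_rotation_invariant("1110")
assert hash_permutation_invariant("1011") == hash_permutation_invariant("1101")
```

In each case the hash is constant on the orbits of the noise (a fixed prefix scramble, cyclic shifts, or permutations), which is exactly the invariance the paragraph asks for.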
Noumos, if the function returns "I don't recognize that" for in-between inputs, then it has to depend on the entire set of recognizable patterns, which practically means it can't be small and fast like a hash function. I don't think recognition algorithms and hash functions are very similar at all, beyond the superficial fact that they both take a large input and produce a small output.

Hash functions normally return 'I don't recognize that'. That's what is so cool about them; you can get non-membership quickly, without searching all of your memory.

Isn't Soundex a noise-tolerant hash function?
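Soundex is indeed a good example of a hash tolerant of one specific noise model (English spelling and pronunciation variation). A sketch of the standard American Soundex rules, for illustration:

```python
# American Soundex: keep the first letter, code the consonants, collapse
# adjacent duplicate codes (h/w do not break a duplicate run), pad to 4.

CODES = {c: d for d, letters in
         enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1)
         for c in letters}

def soundex(name):
    name = name.lower()
    first = name[0].upper()
    digits = []
    prev = CODES.get(name[0])
    for c in name[1:]:
        if c in "hw":
            continue                     # h/w do not separate duplicates
        code = CODES.get(c)
        if code is not None and code != prev:
            digits.append(str(code))
        prev = code                      # vowels reset the duplicate run
    return (first + "".join(digits) + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(soundex("Ashcraft"))                   # A261
```

Spelling noise ("Robert" vs. "Rupert") collides by design, which is exactly the noise tolerance discussed above, and, as wiml's argument predicts, names right on a coding boundary are where Soundex misfiles.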