Some data types, such as floating-point numbers, reserve certain bit patterns for special meanings that do not correspond to ordinary numbers: 'NaN' (not a number), 'Inf' (infinity), '-Inf' (minus infinity), and so on. It is therefore always possible (if awkward) to pre-set a floating-point variable to one of these special values and test for it later. Other data types, such as integers, have no illegal value: every bit pattern corresponds to a valid number. So if you want to check that your program sets values before using them, it needs to keep an extra flag per number to show whether it has been set. Programs like 'Purify' do this, but they make your code run slowly.
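For concreteness, here is that asymmetry in C (a minimal sketch; NAN and isnan() are C99, and the variable names are just for illustration):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        /* A double can be pre-set to the reserved NaN pattern and tested later. */
        double price = NAN;                /* NAN and isnan() come from <math.h> */
        if (isnan(price))
            printf("price was never given a real value\n");

        /* An int has no reserved pattern: it must be forced to *some* value,
           and that value is indistinguishable from a deliberately stored one. */
        int count = 0;
        printf("count = %d (set? no way to tell)\n", count);
        return 0;
    }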
What we want is something like the parity bit on memory: one extra bit per byte (or whatever your smallest addressable unit is). The CPU would set the bit when the memory was written, and clear it when the memory was freed. It could then trap any attempt to read a value that had not previously been set. This extra bit would be invisible to the rest of the program, so the scheme would be completely compatible with existing software.
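A minimal software sketch of the same bookkeeping, roughly what the linked Valgrind does by emulation: one shadow "written" bit per byte, set on store, cleared on free, checked on load. The names (shadow_store, shadow_load, shadow_free) are hypothetical, and the "trap" is just an abort():

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define POOL_SIZE 4096

    static uint8_t pool[POOL_SIZE];        /* the "RAM" being tracked          */
    static uint8_t shadow[POOL_SIZE / 8];  /* one extra bit per byte, all zero */

    static void shadow_store(size_t addr, uint8_t value) {
        pool[addr] = value;
        shadow[addr / 8] |= (uint8_t)(1u << (addr % 8));   /* mark "written" */
    }

    static uint8_t shadow_load(size_t addr) {
        if (!(shadow[addr / 8] & (1u << (addr % 8)))) {    /* never written? */
            fprintf(stderr, "read of uninitialized byte %zu\n", addr);
            abort();                                       /* the "trap"     */
        }
        return pool[addr];
    }

    static void shadow_free(size_t addr, size_t len) {
        for (size_t i = addr; i < addr + len; i++)         /* back to "unset" */
            shadow[i / 8] &= (uint8_t)~(1u << (i % 8));
    }

    int main(void) {
        shadow_store(10, 42);
        printf("%d\n", shadow_load(10));  /* fine: byte 10 was written        */
        shadow_free(10, 1);
        shadow_load(11);                  /* traps: byte 11 was never written */
        return 0;
    }

The hardware version would do exactly this check in parallel with the access, at zero instruction cost; the software version pays for it on every load, which is where the slowdown comes from.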
valgrind
http://developer.kde.org/~sewardj/
Valgrind works by emulating (in software) almost exactly what is described here. As noted in the idea, it suffers the performance penalty. [egnor, Oct 04 2004]
Given that careful coding and regular debugging will catch uninitialized values in the languages that do not initialize them automatically, what is the advantage of your proposal?
Or one could simply switch to a more modern language that doesn't have this problem.
Oh come now, "just use a different language" is a non-answer.
There have actually been CPUs designed with such a "fault bit" on every value which causes a trap whenever anyone tries to read it. (I've worked on one myself.) This wasn't implemented for debugging but rather for fine-grained parallel processing or lazy evaluation. Results that haven't been computed yet are filled with the "fault bit"; whenever code tries to read it, the OS catches the fault and blocks until the value has actually been computed.
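For readers who haven't met the idiom: in software terms each such slot behaves like a future, where a read of a not-yet-computed value blocks until a producer fills it in. A rough pthreads sketch of the pattern (illustrative only; the hardware above does this per word, with the OS fielding the fault):

    #include <pthread.h>
    #include <stdio.h>

    typedef struct {
        int             value;
        int             ready;   /* the "fault bit", inverted */
        pthread_mutex_t lock;
        pthread_cond_t  filled;
    } future_int;

    static void future_init(future_int *f) {
        f->ready = 0;
        pthread_mutex_init(&f->lock, NULL);
        pthread_cond_init(&f->filled, NULL);
    }

    /* Producer: compute the value, then clear the "fault". */
    static void future_set(future_int *f, int v) {
        pthread_mutex_lock(&f->lock);
        f->value = v;
        f->ready = 1;
        pthread_cond_broadcast(&f->filled);
        pthread_mutex_unlock(&f->lock);
    }

    /* Consumer: block until the value exists, like trapping on the fault bit. */
    static int future_get(future_int *f) {
        pthread_mutex_lock(&f->lock);
        while (!f->ready)
            pthread_cond_wait(&f->filled, &f->lock);
        int v = f->value;
        pthread_mutex_unlock(&f->lock);
        return v;
    }

    static future_int result;

    static void *producer(void *arg) {
        (void)arg;
        future_set(&result, 6 * 7);            /* the slow computation  */
        return NULL;
    }

    int main(void) {
        pthread_t t;
        future_init(&result);
        pthread_create(&t, NULL, producer, NULL);
        printf("%d\n", future_get(&result));   /* blocks until computed */
        pthread_join(t, NULL);
        return 0;
    }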
Since such a bit is useful for a wide variety of purposes it doesn't seem like a bad idea to include it. But to take advantage of it you really need to change the way programs are built, and hardware that requires that (instead of just running existing programs faster, faster, faster) tends to languish and fall behind the inexorable progress of Intel. (See also LISP machines, etc.)
Conventional processors can do this for a 4K (or so) page, and this is used for debugging -- plenty of memory debuggers use the MMU to catch overruns -- but of course it's not as good.
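That MMU trick, in miniature: revoke all access to a page with PROT_NONE and let the fault handler report any touch. This is essentially what guard-page debuggers like Electric Fence do around each allocation (a POSIX-flavored sketch; the handler just reports and exits):

    #include <signal.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* One-shot fault handler. A real memory debugger would decode the
       faulting address and report the guilty code; we just report and exit.
       write() is used because it is async-signal-safe, unlike printf(). */
    static void on_fault(int sig) {
        (void)sig;
        static const char msg[] = "caught access to guarded page\n";
        (void)write(STDERR_FILENO, msg, sizeof msg - 1);
        _exit(1);
    }

    int main(void) {
        signal(SIGSEGV, on_fault);

        long page = sysconf(_SC_PAGESIZE);
        /* One anonymous page with all access revoked: a guard page. */
        char *p = mmap(NULL, (size_t)page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        p[0] = 'x';   /* any touch traps -- but only at page granularity */
        return 0;
    }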
Unfortunately, unless the economics of chip design change or Intel suddenly decides to make this the next cool feature to promote (I wonder what name they would give it), we'll have to live with emulation a la valgrind or Purify.
my cat's breath smells like cat-food...
Why implement this in hardware? It may be less of a performance hit, but you're stuck with it all the time, not just when you're debugging. (Am I right in thinking this is useless on an end-user machine?)
//This wasn't implemented for debugging but rather for fine-grained parallel processing or lazy evaluation. Results that haven't been computed yet are filled with the "fault bit"; whenever code tries to read it, the OS catches the fault and blocks until the value has actually been computed.//
Unfortunately, context switching is very expensive in terms of CPU time, and it continually gets more so. An 8x51-based lightweight task switcher can hop threads in less time than is required to execute 50 normal instructions. A context switch on a P4, by contrast, is apt to take longer than executing 5,000 normal instructions.
5000? You're making that up.
I do think this would make a lot more sense in software, especially since software bytecode is apparently the way of the future - Java, .NET and Parrot being the prime examples. Just add -dXffd37 onto your command line, and the VM runs in extra extra careful mode.
//5000? You're making that up.//
It's a WAG, to be sure, but a 2GHz machine can execute something over two billion instructions/second when running from cache, right? Main memory, however, can only be fetched at 166M octwords/sec. Every cache miss therefore ends up taking as much time to fetch eight bytes as the processor would have spent executing 12 instructions.
If it's necessary to context-switch to a very small routine which is 2Kbytes in size, the cache misses alone will take 12 cycles for each of its 256 octwords, or about 3,000 cycles--enough time to run some 3,000 instructions from cache. Since most routines among which the processor would be task switching are apt to be larger than 2K, the cache penalties would be correspondingly greater.
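Spelling out that arithmetic under the same assumed figures (2GHz core, 166M octwords/sec of memory bandwidth, a 2Kbyte working set):

    #include <stdio.h>

    int main(void) {
        double cpu_rate = 2e9;    /* instructions/sec when running from cache */
        double mem_rate = 166e6;  /* octwords (8 bytes each) fetched per sec  */
        double routine  = 2048;   /* bytes touched by the context switch      */

        double instr_per_miss = cpu_rate / mem_rate;      /* ~12 instructions */
        double misses         = routine / 8;              /* 256 octwords     */
        double lost_instr     = instr_per_miss * misses;  /* ~3,000           */

        printf("each miss costs ~%.0f instructions' time\n", instr_per_miss);
        printf("refilling 2K costs ~%.0f instructions' time\n", lost_instr);
        return 0;
    }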
You're wrong. Valgrind, which does everything specified here in software, "only" runs 25x-50x slower.
consumer advice needed: what happens when you leave a computer in the car overnight in 0-degree (way below freezing) weather? Nothing... I'm hoping...
Nothing, unless water condenses on it when you bring it inside. But surely you can find a better place to ask your question.