Users run an agent that periodically reports an inventory of the hardware in their system (disk drives, add-on cards, etc.), so there's a record of how long each component has been working. If equipment fails in a way the agent can't detect, the user reports that manually. The agent also records usage patterns, monitors case temperature (most computers have several temperature sensors), and so on.
A central server collects all this and builds a big index of how long various products last before failure, the effects of light or heavy usage and temperature on failure rate, whether certain equipment tends to cause other equipment to fail, which operating systems have a higher "mean keystrokes between crashes" figure, and various other statistics of note.
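As a rough illustration of the server side, here is a minimal sketch of how periodic reports might be rolled up per hardware model. Everything in it is an assumption made for illustration: the `Report` fields, the `summarize` function, and the 30-day silence threshold are hypothetical and not part of the original idea.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Report:
    """One check-in from an agent (hypothetical record layout)."""
    machine_id: str
    model: str            # hardware model being reported on
    day: int              # day number on which the report arrived
    failed: bool = False  # True when the user reports a failure manually


def summarize(reports, today, silence_days=30):
    """Per model: units seen, reported failures, silent units, mean days to failure."""
    first_seen, last_seen, failed_on = {}, {}, {}
    for r in sorted(reports, key=lambda r: r.day):
        key = (r.machine_id, r.model)
        first_seen.setdefault(key, r.day)   # earliest report for this unit
        last_seen[key] = r.day              # latest report for this unit
        if r.failed:
            failed_on.setdefault(key, r.day)

    stats = defaultdict(lambda: {"units": 0, "failures": 0, "silent": 0, "lifetimes": []})
    for key, start in first_seen.items():
        s = stats[key[1]]
        s["units"] += 1
        if key in failed_on:
            s["failures"] += 1
            s["lifetimes"].append(failed_on[key] - start)
        elif today - last_seen[key] > silence_days:
            # No failure report, but the unit stopped phoning in.
            s["silent"] += 1

    return {
        model: {
            "units": s["units"],
            "failures": s["failures"],
            "silent": s["silent"],
            "mean_days_to_failure": mean(s["lifetimes"]) if s["lifetimes"] else None,
        }
        for model, s in stats.items()
    }
```

Note that the mean here only covers units that were reported as failed, so it understates how long a model really lasts; units that are still working, or that simply went quiet, would need separate handling in a real index.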
That way you can shut up the "well *I've* been using a (model Foo) for two years and I've never had a problem" folks once and for all. It would also be a lot more useful than those lame uptime contests. -- egnor, Jan 22 2003
Unfortunately many of the most interesting data points will be forever trapped on computers that just died for some reason and therefore cannot "phone in". -- krelnik, Jan 22 2003
Well, the idea is that the user would phone in. Even if they don't, then the system would notice that the system had stopped reporting in... -- egnor, Jan 22 2003
A better MTBF stat? (Mean Time Between Failure) -- Shz, Jan 22 2003
Croissant, especially if I can drill down through the stats, find the people with the same hardware as me and see what works on their systems. -- st3f, Jan 23 2003
Hardware performance also depends on software. Shouldn't this kind of agent be integrated with antivirus..? -- Inyuki, Jan 23 2003
Kind-of baked, at least in the software world. Whenever a program crashes, Windows (XP) asks me if I'd like to send an error report. Of course, I have no idea what Microsoft does with the millions (billions?) of error reports they receive each day from failing windows programs. -- mgangemi, Jan 23 2003
mgangemi, the idea's about hardware. -- waugsqueke, Jan 23 2003
Not entirely: operating systems are mentioned. -- krelnik, Jan 23 2003
Software fault reporting (as done by Windows or Mozilla) is not irrelevant, though it's a different technique (and has its own failure problems; it won't catch the truly catastrophic failures, or any failure that impacts the network).
I know the Mozilla project uses its crash-report data to compile a list of "top places in the code where crashes occur" as an aid to developer effort. I presume Microsoft does something similar. It's not to report a problem that nobody else saw, it's to prioritize known problems. -- egnor, Jan 23 2003
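For what it's worth, the "top places where crashes occur" list boils down to a frequency count. A minimal sketch follows, assuming each crash report has already been reduced to a single location string; the `top_crash_sites` name and the sample strings are made up, and this is not a description of how Mozilla or Microsoft actually process their reports.

```python
from collections import Counter


def top_crash_sites(signatures, n=3):
    """Return the n most frequently reported crash locations with their counts."""
    return Counter(signatures).most_common(n)


# Hypothetical crash locations, one per incoming report.
sample = [
    "render.c:120", "net.c:45", "render.c:120",
    "gc.c:9", "render.c:120", "net.c:45",
]
print(top_crash_sites(sample, n=2))
```

The point of ranking rather than reading individual reports matches the comment above: the count tells developers which known problem to fix first.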