Computer: Programming
Zip/Unzip Bug   (+5, -2)  [vote for, against]
Evolutionary Computing as applied to a _very_ common application.

Using Genetic Algorithims (GA) researchers have turned up some suprisingly efficient, if incomprehensible, code. Applying this discovery to the field of commonly used apps should be some Comp-Sci Capitalist's best chance of making it big.

In this case, using GA to evolve small and robust programs to take the place of hand-coded zip/unzip algorithms should be pretty straight-forward. (so says the guy who's never programmed a thing in his life) Since the task is linear, straightforward, and can be performed quickly by non-GA software, breeding a universally "fit" program should merely be a matter of time.

Ultimately though, a more interesting application would be to attempt to "discover" new compression techniques. A world full of custom-designed encryption AND compression would be a very secure world, indeed.
-- bear, Apr 21 2000

EvoNet Flying Circus http://www.wi.leide...gusz/Flying_Circus/
What I'm talking about, a moderately understandable source of GA info. [bear, Apr 21 2000, last modified Oct 05 2004]

MM's GA book http://www-mitpress...tcl?isbn=0262631857
A high-quality introduction to Genetic Algorithms. [noumos, Apr 21 2000, last modified Oct 05 2004]

"Evolved" 'ware worries me a little. We seem to have trouble figuring out how to use organic solutions to GA problems (ie. rainforests), so what makes us think we could make evolved tools do exactly what we want? On the other hand, knowing the question the GA is trying to solve is a major advantage we don't have in the natural world, but I bet there's an art to asking the right question. And, if your evolved program doesn't quite do the job (and assuming you didn't just lose all your data) download one of its siblings or children.
-- centauri, Apr 21 2000


I guess I wasn't quite clear enough. "evolution" would only take place during the design phase of the project. The end-user would essentially work with standard (more-or-less) programs. You're quite right to not want a self-altering piece of software handling files on your system. What I intended was that a group or company try to develop a smaller/faster version of the classic zip/unzip app. using GA. What you download would be "dead" (non-reproducing or altering) code.
-- bear, Apr 25 2000


Sorry, I wasn't the one being clear (but what can you expect from an evolved system?).

What I was getting at is when you use the GA to create something, you end up with something that works but with no precise idea how or why. If it doesn't do exactly what you want you can't debug it (as we seem to want to do in genetic engineering), you have to let it evolve for another few generations. You can test the dickens out of it but how you be sure it has every really evolved to the proper state, or ever will?

I guess you can pretty much say the same thing about programmed software (how do we know Word won't somehow freak out if I type a certain phrase), but at least we can efficiently debug that and then test our problem case without fear that another has cropped up elsewhere.
-- centauri, Apr 25 2000


Wow, we're making this long... Has anyone suggested a "Fixed annotation system for Halfbakery" yet?

I see what you mean... In my head, I had the idea that during the "shakedown" process (the part where you're just trying to get as many varieties of program that _do_ the task, speed/efficiency be damned.) the researchers/programers/breeders(?) would use randomly generated clumps of data for each candidate and compare them to data gained from a "standard" zip program. Your point still stands, however, in that without the ready ability to debug a given chuck of code, things would be pretty hoary. My guess would be that the project would act similarly to many open-source and shareware projects, with constant work on the part of the originators and a continual stream of "new" and "improved" code coming out with varying degrees of effectiveness and stability. Hopefully, someone could work out a way to create a "safe" program and just release it like any other piece of software, but most likely, the users of the programs would be the standard group of beta-testing, bug-hunting neophiles who we all know and love.
-- bear, May 04 2000


Bumper-stickers of the future: "Save the endangered primary ZIP algorithm", "Save Joe's file format", etc.
-- jetckalz, Jun 23 2000


simple comment - you would want to breed it on real files; the only way you can get effective compression is to identify patterns in real-world data, not random stuff.

other comment - you almost certainly can design it to be provably lossless, taking a evolved lossy algorithm and proving bounds on it might be harder.
-- noumos, Feb 26 2001


Recreate the ZIP algorithm with GA? This would be like trying to recreate a particular species of higher life by starting from scratch on a new planet. Start with the original ZIP as a seed and then either of 1) improve its speed but don't change the algorithm or 2) improve on ZIP's compression? That is more possible, but the algorithm takes enough space to describe (much more than a few hundred bytes), you'd probably have a very long wait.

As for being sure your results are reliable, you can do that on a per file basis. Simply save a copy of original, then compress and test by decompressing and comparing.
-- ThotMouser, Apr 30 2002


Actually, it _would_ be possible to debug even the most convoluted piece of GA generated code.

All executable code can be taken straight back into assembler. Once the code is in assembler, it can be debugged, though you'll probably need some real good assembly programmers for the complex stuff. Once they understand what it's doing, they can figure out the error conditions.
-- Wulf, May 11 2002



random, halfbakery