halfbakery: The mutter of invention.
Congratulations, you've moved one step closer to Huffman coding. I suggest sticking with UTF-8 if you want to keep your data byte-aligned and somewhat more easily decoded, or else going for a fully optimized Huffman encoding to maximize compression. I don't see much use in this halfway thing. |
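For anyone who wants to see the "fully optimized" end of that trade-off, here is a minimal sketch of Huffman coding in Python. It builds the code from the character frequencies of the message itself, and the function names (`huffman_code`, `encode`, `decode`) are made up for illustration. The bit count it prints ignores the cost of shipping the code table, so it flatters Huffman somewhat.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return {symbol: bitstring} built from the symbol frequencies in text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tiebreak, {symbol: code so far}).
    # The unique tiebreak keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)  # two lightest subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def encode(text, code):
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    inverse = {c: s for s, c in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:  # prefix-free, so the first match is the symbol
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

sample = "the mutter of invention"
code = huffman_code(sample)
bits = encode(sample, code)
print(len(bits), "bits, versus", 8 * len(sample.encode("utf-8")), "bits as UTF-8")
```

The output is a bit string, not bytes, which is exactly the byte-alignment hassle mentioned above: a real encoder would also have to pad the last byte and record the true length.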
|
|
Or just compress your text before storage or
transmission. It'll provide the same additional hassle as
in this idea, plus far better storage efficiency to boot.
Or, given that a chip the size of my fingernail can store
the Encyclopedia Britannica dozens of times over, don't
even bother. [-] |
|
|
Edit: Yeah, pretty much what [scad mientist] posted
right before me. |
|
|
All caps (+ punctuation + numerals) fits into 6 bits rather nicely, and 7-bit was what Unix originally used for text. |
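Six bits per character gives 64 code points, which is enough for a space, 26 capitals, 10 digits and a couple of dozen punctuation marks. A rough sketch of packing such text four characters to every three bytes follows. The particular `ALPHABET` ordering and the `pack6`/`unpack6` names are invented here, not any standard character set.

```python
import string

# Hypothetical 6-bit alphabet: space, capitals, digits, some punctuation.
ALPHABET = " " + string.ascii_uppercase + string.digits + ".,;:!?'\"()-+*/=&%$#@<>[]_\n"
assert len(ALPHABET) <= 64 and len(set(ALPHABET)) == len(ALPHABET)
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def pack6(text):
    """Pack text into bytes at 6 bits per character, zero-padded at the end."""
    value = 0
    for ch in text:
        value = (value << 6) | INDEX[ch]  # KeyError for characters outside ALPHABET
    nbits = 6 * len(text)
    pad = (-nbits) % 8  # bits needed to reach a whole number of bytes
    return (value << pad).to_bytes((nbits + pad) // 8, "big")

def unpack6(data, count):
    """Recover count characters; count must be stored or agreed separately."""
    value = int.from_bytes(data, "big")
    value >>= 8 * len(data) - 6 * count  # drop the padding bits
    chars = []
    for shift in range(6 * (count - 1), -1, -6):
        chars.append(ALPHABET[(value >> shift) & 0x3F])
    return "".join(chars)

msg = "WE HEAR YOU ARE GETTING MARRIED. STOP."
packed = pack6(msg)
print(len(msg), "characters ->", len(packed), "bytes")
```

The character count has to travel with the data, because the zero padding is otherwise indistinguishable from trailing spaces in this alphabet.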
|
|
//Or just compress your text before storage or transmission// |
|
|
I did try that, but you have to be careful, as sometimes the O doesn't get completely flattened and it just looks like a D |
|
|
The trick is to compress your O around the middle until
it becomes an 8, then fold it in half so it looks like a º.
You can fit, like, six of those in the same space as the
original O. |
|
|
I'm still thinking the best compression would be to arrange the text at 90 degrees to the storing medium, then each character would only be 1 electron wide. |
|
|
[+] Anyway, the next generation of software
infrastructure will render all the current info useless,
much as the information in EBCDIC and anything
on magnetic tape became obsolete and practically
inaccessible. See the recovery attempts for
discovering the origin of :) :( |
|
|
I SEEM TO REMEMBER THAT TELETYPE MACHINES USED
7 BITS PER CHARACTER BUT HAD NO LOWERCASE AND
LIMITED PUNCTUATION. |
|
|
sp. "5" I think... wait no, I think that was telegrams. |
|
|
I wonder what would happen if we forced [po] to
communicate using an all-uppercase teletype
machine? Hmm - tellytypies. |
|
|
We sent a message to a friend who was getting married
overseas: "WE HEAR YOU ARE GETTING MARRIED. STOP.
YOU PLAN TO SPEND THE REST OF YOUR LIFE WITH THE
SAME WOMAN. STOP". He never commented on how it went
over. |
|
|
I would add to [n_m_r]'s suggestion about turning the text
sideways by pointing out that each sentence could be
stored end-on. |
|
|
Here's a real-world application: any medium where you can't fit more than a few bytes into a message, like 2D bar codes. There isn't enough space to fit a dictionary or Huffman tree alongside the message. |
|
|
Like [scad mientist] said, this is similar to Huffman coding with a predefined tree optimized for English text. |
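To make the "predefined tree" point concrete, here is a sketch in which both ends derive the same prefix code from a weight table agreed in advance, so no tree or dictionary travels with the message. The `WEIGHTS` values are rough illustrative guesses for upper-case English, not measured from any corpus, and the function names are invented for this example.

```python
import heapq

# Assumed, illustrative weights; both sender and receiver hard-code this table.
WEIGHTS = {
    " ": 18, "E": 10, "T": 7, "A": 7, "O": 6, "I": 6, "N": 6, "S": 5,
    "H": 5, "R": 5, "D": 3, "L": 3, "U": 2, "C": 2, "M": 2, "W": 2,
    "F": 2, "G": 2, "Y": 2, "P": 2, "B": 1, "V": 1, "K": 1, "J": 1,
    "X": 1, "Q": 1, "Z": 1, ".": 1,
}

def build_code(weights):
    """Deterministically build a Huffman code from a fixed weight table."""
    # Sorting plus a unique tiebreak means every party gets the identical code.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(weights.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in a.items()}
        merged.update({s: "1" + c for s, c in b.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

CODE = build_code(WEIGHTS)
INVERSE = {bits: sym for sym, bits in CODE.items()}

def encode(text):
    return "".join(CODE[ch] for ch in text)

def decode(bits):
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in INVERSE:  # prefix-free code: first match is a whole symbol
            out.append(INVERSE[buf])
            buf = ""
    return "".join(out)

message = "WE HEAR YOU ARE GETTING MARRIED. STOP."
bits = encode(message)
print(f"{len(message)} characters -> {len(bits)} bits (about {(len(bits) + 7) // 8} bytes)")
```

The catch is the usual one for a fixed table: text that doesn't look like the assumed English (lots of Q, Z or X, say) can come out longer than a flat fixed-width encoding.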
|
|
//There isn't enough space to fit a dictionary or
huffman tree.// You could compress the Huffman
tree using a Huffman shrub, then compress that
using a Huffman petunia, and compress that using a
Huffman club-moss. |
|
|
// the tree is more complex and built on chains. // |
|
|
Is that one of those tyre rope-swing things, then? |
|
|
Why don't you just come up to speed with the rest of the Universe and start using Trinary? |