Halfbakery: Colour box video compression



A pixel is defined by the intensity of each of its three subpixels (eg red = 190, green = 255, blue = 60) where each subpixel intensity is within a range (eg 0 to 255).

If we map these subpixel ranges onto three orthogonal axes, a pixel's colour is defined by a point in a box. Call this a colour box.

So an image on a screen can be described by a set of points in a horizontal array of colour boxes.

A video can be thought of as a stack of these horizontal arrays.

If the set of points (making up the images forming the video) can be described 'concisely' then this is a video compression algorithm.

Because video is generally smooth (ie it doesn't change much from one frame to the next) it can be compressed.

For example, DivX video compression selects 'key frames' (preferably the first frame in a scene), and, roughly speaking, the subsequent frames are described by their difference from the key frame.
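
The key-frame idea as described above can be sketched in a few lines. This is only the simplified picture given in the text, not how DivX actually works (real codecs add motion compensation and more). The function names are made up for illustration, and each frame is assumed to be a flat list of subpixel intensities.

```python
def encode_against_key_frame(frames):
    """Keep the first frame whole; store every later frame as its
    per-value difference from that key frame."""
    key = frames[0]
    diffs = [[p - k for p, k in zip(frame, key)] for frame in frames[1:]]
    return key, diffs


def decode_against_key_frame(key, diffs):
    """Rebuild all frames by adding each difference back onto the key frame."""
    return [list(key)] + [[k + d for k, d in zip(key, diff)] for diff in diffs]


frames = [
    [190, 255, 60, 10],  # key frame
    [190, 254, 60, 10],
    [191, 254, 61, 10],
]
key, diffs = encode_against_key_frame(frames)
print(diffs)  # [[0, -1, 0, 0], [1, -1, 1, 0]]
```

Because the video is smooth, the differences are mostly zeros and small numbers, which is what makes them cheaper to store than the frames themselves.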

My first idea for video compression using colour box arrays I'll call 'spaghettification'.

First consider a pixel that stays the same colour for several frames. The location in the colour box defining the pixel colour is therefore the same. With the colour boxes stacked vertically on top of each other, a vertical straight line passes through these points.

So rather than repetitively defining the pixel's colour:

frame 1 - r=190, g=255, b=60

frame 2 - r=190, g=255, b=60

. . . .

we can say

frame 1 - r=190, g=255, b=60

frames 2 to n = vertical line passing through this point with regular point intervals.
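
For a single pixel, this amounts to run-length encoding along the time axis. A minimal sketch, assuming a pixel's history is a list of (r, g, b) tuples, one per frame (the function names are hypothetical):

```python
def run_length_encode(colours):
    """Collapse consecutive identical colours into (colour, frame_count) runs."""
    runs = []
    for colour in colours:
        if runs and runs[-1][0] == colour:
            runs[-1][1] += 1          # same point in the colour box: extend the line
        else:
            runs.append([colour, 1])  # new point: start a new vertical line
    return [(colour, count) for colour, count in runs]


def run_length_decode(runs):
    """Expand (colour, frame_count) runs back into one colour per frame."""
    frames = []
    for colour, count in runs:
        frames.extend([colour] * count)
    return frames


pixel = [(190, 255, 60)] * 5 + [(0, 0, 0)] * 2
print(run_length_encode(pixel))  # [((190, 255, 60), 5), ((0, 0, 0), 2)]
```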

For a pixel that changes colour slightly from frame to frame, we could instead record how the angle and point interval of the line change, so the line is no longer strictly vertical.

A video compressed in this way would have a mass of wiggly vertical lines resembling spaghetti.
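
One way to read the tilted line is as a straight segment through the stacked colour boxes: a start colour, a constant per-frame step (standing in for the angle), and a frame count. The sketch below is only an assumption about how that might be encoded; it splits a pixel's history greedily into such segments and is exact (lossless) for integer colours.

```python
def encode_segments(colours):
    """Greedily split a pixel's history into straight segments.
    Each segment is (start_colour, per_frame_step, frame_count)."""
    segments = []
    i, n = 0, len(colours)
    while i < n:
        start = colours[i]
        if i + 1 == n:
            # A lone final frame: a segment of length 1 with no step.
            segments.append((start, (0, 0, 0), 1))
            break
        step = tuple(b - a for a, b in zip(start, colours[i + 1]))
        j = i + 1
        # Extend the segment while the colour keeps moving by the same step.
        while j + 1 < n and tuple(
            b - a for a, b in zip(colours[j], colours[j + 1])
        ) == step:
            j += 1
        segments.append((start, step, j - i + 1))
        i = j + 1
    return segments


def decode_segments(segments):
    """Walk each segment, adding the step once per frame."""
    colours = []
    for start, step, count in segments:
        for k in range(count):
            colours.append(tuple(s + k * d for s, d in zip(start, step)))
    return colours


pixel = [(190, 255, 60), (191, 254, 60), (192, 253, 60),
         (192, 253, 60), (192, 253, 60)]
print(encode_segments(pixel))
# [((190, 255, 60), (1, -1, 0), 3), ((192, 253, 60), (0, 0, 0), 2)]
```

A step of (0, 0, 0) is the plain vertical line from the unchanging-pixel case.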

Compressing video one pixel at a time is not efficient; groups of pixels need to be compressed together to achieve an adequate compression factor.

Alright, at this point I'm going to warn the reader that the idea is now moving into very hypothetical territory (i.e. I don't have much idea if it would be possible/practical and much less idea of how I would actually implement it).

So I'll sidetrack for a moment to Quasitiler.

Quasitiler is a fascinating applet that can tile a plane with every tiling imaginable (regular, irregular, periodic, aperiodic). It does this by taking a 2-dimensional slice of an n-dimensional lattice at a particular angle and position.

Now if you were to place a regular square grid over the top of the resultant tiling, you'd see that each tile corner falls inside a square of the grid.

So the tile corners each define a unique location in each square.

If we were using colour squares (2D) rather than colour boxes (3D), you can see that this defines a unique colour for each square in the grid.
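
To make the 2D case concrete, here is a rough sketch of reading colours off tile corners. It does not generate a tiling (Quasitiler's slicing is not reproduced here); it just takes a list of corner coordinates as given. The function name, the grid cell size and the 256 levels per axis are all assumptions, and if two corners land in the same square the first one wins.

```python
import math


def corners_to_colour_squares(corners, cell_size=1.0, levels=256):
    """Map each tile corner (x, y) to the grid square containing it and to a
    2-channel colour given by the corner's position inside that square."""
    squares = {}
    for x, y in corners:
        col = math.floor(x / cell_size)
        row = math.floor(y / cell_size)
        # Fractional position of the corner inside its square, in [0, 1).
        fx = x / cell_size - col
        fy = y / cell_size - row
        colour = (
            min(levels - 1, int(fx * levels)),
            min(levels - 1, int(fy * levels)),
        )
        # Assumption: one corner per square; keep the first if there are more.
        squares.setdefault((col, row), colour)
    return squares


corners = [(0.5, 0.25), (1.75, 0.0), (0.9, 0.9)]
print(corners_to_colour_squares(corners))
# {(0, 0): (128, 64), (1, 0): (192, 0)}
```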

Now by sheer conjecture, I reckon you could take a 3-dimensional slice of an n-dimensional lattice, thus making any three-dimensional tiling (ie space filling) imaginable. I admit that this is pretty abstract and mind-boggling.

So equivalently to before, the corner of each 3D tile would fall within a box of a 3D grid, hence uniquely defining a colour.

So if you chose the parameters carefully enough (the dimension, orientation and position of lattice), the tile corners would fall exactly where you wanted in the colour boxes in the colour box array, thus forming an image.

It would be impossible to do this with an entire image; you could only do this with a small section of the image (a group of pixels), and repeat the process for the other sections of the image.

A video is formed by varying the n-dimensional lattice parameters and taking slices for each frame (in the same way that I varied the angle and interval of the lines in the 'spaghettification' method).
-- xaviergisz, Jan 28 2006

Quasitiler http://www.geom.uiu...ler/about.html#tile
taking a 2 dimensional slice of an n-dimensional lattice to tile a plane... hmm, the applet doesn't seem to work - oh well, the explanation is still good. [xaviergisz, Jan 28 2006]

Hypercube (or tesseract) http://en.wikipedia.org/wiki/Hypercube
a four dimensional cube [xaviergisz, Jan 29 2006]

Sounds a little like Quadtree optimization, (with some 3D tesserae fitting algorithm thrown-in to keep it interesting) - Sorry, that's not fair.

Does seem wasteful (processor intensive) if this is done per-pixel.

(bun withheld till I've thought about the middle bit some more)
-- Dub, Jan 28 2006

Oh dude.

I agree with Dub, it *sounds* like you'll end up doing heaps more processing for a small increase in compression. But I can't be sure until I've worked out what //take a 3 dimensional slice of an n-dimensional lattice// means in video terms, which is unlikely to happen any time soon. I'm also unsure as to what benefit non-rectangular tiling brings you.

Neutral until either the explanation or my brain improves.
-- wagster, Jan 28 2006

Mpeg already encodes colour at a lower resolution than luminance.
-- wagster, Jan 28 2006

As does just about everything, including ccir601 and its 4:2:2 brethren.
-- bristolz, Jan 28 2006

Is there anything you don't know about [bris]?
-- wagster, Jan 28 2006

//Mpeg already encodes colour at a lower resolution than luminance// You don't even need to go digital to do this - both PAL and NTSC do also.
-- AbsintheWithoutLeave, Jan 28 2006

The essence of data compression is finding underlying patterns in the data. Video compression to date has exploited (among other things) the continuity of frames.

I am interested in exploring other kinds of underlying patterns that might be found in video. So my basic idea is to view video compression in a spatial manner.

I admit the “3D slice of an n-dimensional lattice” is kind of wacky. I was originally toying with the idea of 3D surfaces that passed through the colour box array (with local maxima/minima defining pixel colour), but this seemed fairly clumsy.

I have always been drawn to the idea of n-dimensional objects (probably due to reading too many science magazines/books as a kid). The idea that a single n-dimensional object could have a vast number of projections/representations in 3D space (depending on its orientation and position) I find fascinating. (see, for example, link on the hyper-cube)

A single n-dimensional object is easy to define, yet has a huge number of possible projections – and so I say to myself, could this be somehow used to compress data?

Anyway a “3D slice of an n-dimensional lattice” could be explained as the projection (to 3D space) of an array of hyper-cubes. It’s a shame the applet in my first link doesn’t work – it really gave a feel for n-dimensional projections.
-- xaviergisz, Jan 29 2006

I have no clue what I'm talking about, so here goes:

If I were to grab a strand of your spaghetti, and stretch it out horizontally in front of my eyes, I'd see a continuously varying rainbow of colors along it, right?

Now if I was to graph that spaghetti as time vs color (seconds vs 65535 colors), I'd probably see a long squiggly line with many flat spots and gentle curves.

Would it be possible to analyze this line such that large portions of it could be described by mathematical functions? When it becomes too chaotic for short periods of time, you defer to some simpler raw data compression mechanism. The sum of the chaotic data plus the data described by the numerous functions would equal the description of the pixel's color for the full length of the video.

My hope is that the memory occupied by the functions is less than the memory occupied by raw data.
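
A minimal sketch of that scheme, assuming the simplest possible "function" (a straight line fitted by least squares over a fixed window of frames) and a raw fallback whenever the fit misses any value by more than a tolerance. The window size, tolerance and function names are all made up for illustration, and the line chunks are lossy up to the tolerance.

```python
def fit_line(values):
    """Least-squares straight line through (0, v0), (1, v1), ...
    Returns (slope, intercept)."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    denom = sum((t - mean_t) ** 2 for t in range(n))
    if denom == 0:  # a single value: flat line
        return 0.0, mean_v
    slope = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values)) / denom
    return slope, mean_v - slope * mean_t


def encode_windows(values, window=4, tolerance=1.0):
    """Describe each window by a line if it fits well enough, else keep it raw."""
    chunks = []
    for start in range(0, len(values), window):
        block = values[start:start + window]
        slope, intercept = fit_line(block)
        error = max(abs(slope * t + intercept - v) for t, v in enumerate(block))
        if error <= tolerance:
            chunks.append(("line", slope, intercept, len(block)))
        else:
            chunks.append(("raw", list(block)))  # the chaotic stretch
    return chunks


def decode_windows(chunks):
    """Evaluate the line chunks and copy the raw chunks back out."""
    values = []
    for chunk in chunks:
        if chunk[0] == "line":
            _, slope, intercept, count = chunk
            values.extend(round(slope * t + intercept) for t in range(count))
        else:
            values.extend(chunk[1])
    return values


series = [10, 12, 14, 16, 3, 200, 7, 90]
chunks = encode_windows(series)
print([chunk[0] for chunk in chunks])  # ['line', 'raw']
```

Whether this saves any memory depends entirely on how much of the video is gentle curve and how much is chaos, which is the open question.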

Instead of selecting key frames, select key pixels, those with the simplest functions that run for the longest time, that could be used as a reference for neighboring pixels. The neighboring pixels could now use much simpler functions as you'd only need to describe the difference between them and the key pixel (and these neighbors often look very similar).
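
As a rough illustration of the key-pixel idea, the sketch below picks the pixel whose intensity changes least often over time (a crude stand-in for "simplest function") and stores each neighbor as its difference from that pixel. The names and the selection rule are assumptions, and each pixel is reduced to a single intensity per frame to keep things short.

```python
def count_changes(series):
    """Number of frame-to-frame changes in one pixel's intensity history."""
    return sum(1 for a, b in zip(series, series[1:]) if a != b)


def pick_key_pixel(pixels):
    """pixels maps a pixel name to its intensity history.
    Return the name of the pixel that changes least often."""
    return min(pixels, key=lambda name: count_changes(pixels[name]))


def residuals(key_series, other_series):
    """Describe a neighbor only by how it differs from the key pixel."""
    return [o - k for o, k in zip(other_series, key_series)]


def restore(key_series, residual_series):
    """Rebuild the neighbor from the key pixel plus its residuals."""
    return [k + r for k, r in zip(key_series, residual_series)]


pixels = {
    "a": [100, 100, 100, 104],
    "b": [101, 101, 100, 105],
    "c": [90, 95, 99, 120],
}
key_name = pick_key_pixel(pixels)
print(key_name)                                # a
print(residuals(pixels["a"], pixels["b"]))     # [1, 1, 0, 1]
```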

Depending on available computing power, clusters of spaghetti could be analyzed to form ever more complex functions that would describe pixels when the time variable is plugged in.

Again, I know nothing about video compression, so somebody please slap me if this, vector system?, is a nonsensical approach.
-- TIB, Jan 29 2006

You could use RGB toothpaste to print a video cube inside a square tube with time along the length axis. You could then watch a film appearing on your toothbrush over a period of about a month.
-- wagster, Jan 31 2006
