A pixel is defined by the intensity of each of its three subpixels (e.g. red = 190, green = 255, blue = 60), where each subpixel intensity lies within a fixed range (e.g. 0 to 255).
If we map these subpixel ranges onto three orthogonal axes, a pixel's colour is defined by a point in a box. Call this a colour box.
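To make that concrete, here's a minimal sketch in Python (the `Pixel` alias and the helper function are my own, purely for illustration):

```python
# A pixel is a point in the "colour box": the cube with one axis
# per subpixel, each axis running over the range 0 to 255.
Pixel = tuple[int, int, int]  # (red, green, blue)

def in_colour_box(p: Pixel) -> bool:
    """True if the pixel's point lies inside the colour box."""
    return all(0 <= channel <= 255 for channel in p)

pixel: Pixel = (190, 255, 60)  # the example pixel from above
assert in_colour_box(pixel)
```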
So an image on a screen can be described by a set of points in a horizontal array of colour boxes.
A video can be thought of as a stack of these horizontal arrays.
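In code, that stack is just a four-dimensional array. A sketch using NumPy, with placeholder dimensions of my own choosing:

```python
import numpy as np

n_frames, height, width = 100, 480, 640

# One image: a (height, width) grid of points in colour boxes,
# each point being three subpixel intensities.
image = np.zeros((height, width, 3), dtype=np.uint8)

# A video: the images stacked along a frame axis.
video = np.zeros((n_frames, height, width, 3), dtype=np.uint8)
```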
If the set of points (making up the images forming the video) can be described 'concisely', then that description is a video compression algorithm.
Because video is generally smooth (i.e. it doesn't change much from one frame to the next), it can be compressed.
For example, DivX video compression selects 'key frames' (preferably the first frame in a scene), and each subsequent frame is defined by its difference from the key frame.
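This isn't DivX's actual format, but a toy sketch of the key-frame idea: store the first frame in full and every later frame as its difference from it.

```python
import numpy as np

def encode(frames):
    """Key frame plus per-frame differences (toy version, no entropy coding)."""
    key = frames[0]
    # Widen to int16 so negative differences don't wrap around.
    deltas = [f.astype(np.int16) - key.astype(np.int16) for f in frames[1:]]
    return key, deltas

def decode(key, deltas):
    """Rebuild the original uint8 frames from the key frame and deltas."""
    return [key] + [(key.astype(np.int16) + d).astype(np.uint8) for d in deltas]
```

Because most deltas in a smooth video are zero or near zero, they compress far better than the raw frames would.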
My first idea for video compression using colour box arrays I'll call 'spaghettification'.
First consider a pixel that stays the same colour for several frames. The point in the colour box defining the pixel's colour is therefore the same in each frame. With the colour boxes stacked vertically on top of each other, a vertical straight line passes through these points.
So rather than repetitively defining the pixel's colour:
frame 1 - r=190, g=255, b=60
frame 2 - r=190, g=255, b=60
. . . .
. . . .
we can say
frame 1 - r=190, g=255, b=60
frames 2 to n - a vertical line passing through this point, with points at regular intervals.
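Here's a minimal sketch of that encoding for a single pixel (the function name is mine): a run of identical colours collapses to one (colour, run length) pair.

```python
def spaghettify_pixel(colours):
    """Encode one pixel's colour over the frames as (colour, run_length) pairs."""
    runs = []
    for c in colours:
        if runs and runs[-1][0] == c:
            runs[-1] = (c, runs[-1][1] + 1)  # extend the current 'vertical line'
        else:
            runs.append((c, 1))              # start a new line
    return runs

# A pixel constant for five frames collapses to a single vertical line:
print(spaghettify_pixel([(190, 255, 60)] * 5))  # [((190, 255, 60), 5)]
```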
For a pixel that changes colour slightly from frame to frame, we could instead define how the line's angle and point interval change.
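For instance, a steady drift is a straight but sloped line: a start colour plus a per-frame step vector. A minimal sketch, handling only exactly linear drift (a real codec would fit the line approximately):

```python
import numpy as np

def encode_line(colours):
    """colours: (n_frames, 3) array. Returns (start, step) if the pixel's
    drift is exactly linear from frame to frame, else None."""
    start = colours[0]
    step = colours[1] - start
    predicted = start + step * np.arange(len(colours))[:, None]
    return (start, step) if np.array_equal(predicted, colours) else None

# A pixel whose red channel brightens by 2 per frame:
drift = np.array([[190 + 2 * t, 255, 60] for t in range(5)])
print(encode_line(drift))  # (array([190, 255,  60]), array([2, 0, 0]))
```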
A video compressed in this way would be a mass of wiggly vertical lines resembling spaghetti.
Compressing video one pixel at a time is not efficient; groups of pixels would need to be compressed simultaneously to achieve an adequate compression factor.
Alright, at this point I'm going to warn the reader that the idea is now moving into very hypothetical territory (i.e. I don't have much idea if it would be possible/practical and much less idea of how I would actually implement it).
So I'll sidetrack for a moment to Quasitiler.
Quasitiler is a fascinating applet that can tile a plane with a remarkable variety of tilings (regular, irregular, periodic, aperiodic). It does this by taking a two-dimensional slice of an n-dimensional lattice at a particular angle and position.
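This slicing construction is usually called 'cut and project': keep the lattice points lying close to the slicing plane and project them onto it. A hedged sketch of the idea (I've used a crude distance cutoff in place of Quasitiler's exact acceptance window):

```python
import itertools
import numpy as np

n = 5  # lattice dimension; 5 gives Penrose-like tilings

# An orthonormal basis (u, v) for the slicing plane in R^n.
k = np.arange(n)
u = np.cos(2 * np.pi * k / n)
v = np.sin(2 * np.pi * k / n)
u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)

vertices = []
for p in itertools.product(range(-3, 4), repeat=n):
    p = np.array(p, dtype=float)
    para = np.array([p @ u, p @ v])       # component within the plane
    perp = p - para[0] * u - para[1] * v  # component off the plane
    if np.linalg.norm(perp) < 1.0:        # keep lattice points near the plane
        vertices.append(para)             # projected tile vertex
```

Varying the plane's angle and position (and the dimension n) is what produces the different tilings.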
Now if you were to place a regular square grid over the top of the resultant tiling, you'd see that each tile corner falls inside a square of the grid.
So each tile corner defines a unique location within its grid square.
If we were using colour squares (2D) rather than colour boxes (3D), you can see that this defines a unique colour for each square in the grid.
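A sketch of that colour assignment in the 2D case (the cell size and the scaling to 0-255 are my own choices): the corner's fractional offset inside its grid square is read directly as a two-channel colour.

```python
import numpy as np

def corner_to_colour(corner, cell_size=1.0):
    """Map a tile corner (x, y) to a two-channel colour via its
    offset within the grid square that contains it."""
    frac = (np.asarray(corner) / cell_size) % 1.0  # position in [0, 1)^2
    return tuple(int(c) for c in frac * 256)       # scale to subpixel range 0-255

print(corner_to_colour([3.7, 12.25]))  # (179, 64)
```

The `% 1.0` makes the colour depend only on where the corner sits within its square, exactly as described above.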
Now, by sheer conjecture, I reckon you could take a three-dimensional slice of an n-dimensional lattice, thus making any three-dimensional tiling (i.e. space filling) imaginable. I admit that this is pretty abstract and mind-boggling.
So, as before, the corner of each 3D tile would fall within a box of a 3D grid, hence uniquely defining a colour.
So if you chose the parameters carefully enough (the dimension, orientation and position of the lattice), the tile corners would fall exactly where you wanted in the colour boxes of the colour box array, thus forming an image.
It would be impossible to do this with an entire image; you could only do it with a small section of the image (a group of pixels), and then repeat the process for the other sections of the image.
A video would be formed by varying the n-dimensional lattice parameters and taking a slice for each frame (in the same way that I varied the angle and interval of the lines in the 'spaghettification' method).