The idea is a description detailed enough to recreate the video: it names the objects in the scene and links to similar images, audio clips, or other videos, which are kept as "master keynotes", anchors to start from or to end up at.
A single picture for each scene can be enough, and even less is needed if we have an image from another scene or another movie from which we can reconstruct the whole thing.
For example, the face of each actor in a play, as seen from different angles, could be stored once. We could then reconstruct the whole scene using only the description of the play and a few parameters: the location of the heads on the screen, the angle from which each was photographed, the words being spoken, the expressions, the position or actions of the hands and feet, etc. If needed, some memorable features could be added, such as the normal eyebrow position and how it changes during anger or surprise.
This would make a "script" file that humans can read and understand. Together with the linked images or video sections, it is enough to reconstruct the original image or movie in good detail. Enclosed with each image are layers of extra detail, each more specific than the last.
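The idea of storing each face once and pointing to it from the script can be sketched roughly as follows. This is only an illustration: the library layout, the file paths, the field names, and the functions `pick_face` and `describe` are all assumptions, not part of the encoding as described.

```python
from dataclasses import dataclass

# Hypothetical face library: each actor's face is stored once per viewing angle.
# Keys are (actor, angle in degrees); values are links to the stored images.
FACE_LIBRARY = {
    ("actor_a", 0): "faces/actor_a_front.png",
    ("actor_a", 45): "faces/actor_a_three_quarter.png",
    ("actor_a", 90): "faces/actor_a_profile.png",
}


@dataclass
class ActorState:
    """Per-scene parameters for one actor (field names are illustrative)."""
    actor: str
    head_xy: tuple      # head position on screen, as fractions of width and height
    angle_deg: float    # angle from which the face is seen
    expression: str
    words: str


def pick_face(actor: str, angle_deg: float) -> str:
    """Return the stored image whose angle is closest to the requested one."""
    candidates = [
        (abs(stored_angle - angle_deg), link)
        for (name, stored_angle), link in FACE_LIBRARY.items()
        if name == actor
    ]
    if not candidates:
        raise KeyError(f"no stored face for {actor}")
    return min(candidates)[1]


def describe(state: ActorState) -> str:
    """Render one human-readable script line with a link to the stored face."""
    link = pick_face(state.actor, state.angle_deg)
    return (
        f"{state.actor} ({state.expression}), head at {state.head_xy}, "
        f"seen from {state.angle_deg:g} degrees [{link}], says: \"{state.words}\""
    )


line = describe(ActorState("actor_a", (0.4, 0.3), 50, "surprised", "Who is there?"))
```

Here the script line carries only a handful of parameters, and the face itself comes from the nearest stored view rather than from the frame.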
Unlike a standard movie scene script, a Descriptive Cognitive Encoding script gives more detail about what we see and where. Instead of two-dimensional data on the display, it would save the perceived three-dimensional distance to the objects, what they are, where they are moving to, and where the camera is.
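As a minimal sketch of what storing distances, motion, and camera position could look like, the following keeps each object as a labeled point in three dimensions. The coordinate system, the units (meters and seconds), and the names `Camera`, `SceneObject`, `distance_to_camera`, and `position_after` are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Camera:
    position: tuple  # (x, y, z) in meters, in assumed scene coordinates


@dataclass
class SceneObject:
    label: str       # what the object is
    position: tuple  # where it is, (x, y, z) in meters
    velocity: tuple  # where it is moving to, in meters per second


def distance_to_camera(obj: SceneObject, cam: Camera) -> float:
    """Straight-line distance from the camera to the object."""
    return math.dist(obj.position, cam.position)


def position_after(obj: SceneObject, seconds: float) -> tuple:
    """Position of the object after moving at constant velocity."""
    return tuple(p + v * seconds for p, v in zip(obj.position, obj.velocity))


camera = Camera(position=(0.0, 0.0, 0.0))
chair = SceneObject("chair", position=(3.0, 4.0, 0.0), velocity=(0.0, 0.0, 1.0))

distance = distance_to_camera(chair, camera)
later = position_after(chair, 2.0)
```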
After a first encoding pass, the reconstruction could be compared with the original to assess what is missing, or at least what is noticeably missing, and the script could then be corrected automatically.
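The compare-and-correct pass can be illustrated with a toy example in which a "frame" is just a short list of numbers and the script is a coarse base value plus stored corrections. The representation, the threshold standing in for "noticeably missing", and the function names are all assumptions; a real system would compare images or video, not lists.

```python
def reconstruct(script: dict, length: int) -> list:
    """Rebuild the frame from the coarse description plus stored corrections."""
    frame = [script["base"]] * length
    for index, value in script["corrections"].items():
        frame[index] = value
    return frame


def refine(original: list, script: dict, threshold: float) -> int:
    """Compare the reconstruction with the original and store a correction
    wherever the difference is noticeable (larger than the threshold).
    Returns the number of corrections added in this pass."""
    rebuilt = reconstruct(script, len(original))
    added = 0
    for index, (want, got) in enumerate(zip(original, rebuilt)):
        if abs(want - got) > threshold:
            script["corrections"][index] = want
            added += 1
    return added


original = [10, 11, 9, 40, 10, 12]
script = {"base": 10, "corrections": {}}   # first, coarse encoding

first_pass = refine(original, script, threshold=3)
second_pass = refine(original, script, threshold=3)
result = reconstruct(script, len(original))
```

In this sketch only the one value that differs noticeably from the coarse description is stored; the small differences are left uncorrected, and a second pass finds nothing more to fix.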
As opposed to a 3D modeling file, it will be in natural language (with an equivalent in JSON format) that a human can read and understand, aided by links that can be opened and viewed.
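One way the natural-language form and its JSON equivalent might relate is sketched below. The schema (keys such as `what`, `distance_m`, `action`, `link`), the sample scene, and the function `to_sentences` are hypothetical; the text does not specify a format.

```python
import json

# Hypothetical scene record; every key and value here is illustrative.
scene = {
    "scene": 1,
    "location": "kitchen",
    "camera": "at the doorway, eye level",
    "objects": [
        {
            "what": "woman",
            "distance_m": 2.0,
            "action": "walking toward the window",
            "link": "faces/actor_a_front.png",
        },
        {"what": "table", "distance_m": 3.5, "action": None, "link": None},
    ],
}


def to_sentences(scene: dict) -> str:
    """Render the JSON-style record as plain sentences a person can read."""
    lines = [f"Scene {scene['scene']}: a {scene['location']}, camera {scene['camera']}."]
    for obj in scene["objects"]:
        line = f"A {obj['what']} is about {obj['distance_m']} m away"
        if obj["action"]:
            line += f", {obj['action']}"
        if obj["link"]:
            line += f" (see {obj['link']})"
        lines.append(line + ".")
    return "\n".join(lines)


as_json = json.dumps(scene, indent=2)   # machine-readable equivalent
as_text = to_sentences(scene)           # human-readable form
```

Both forms carry the same information; the JSON can be parsed back into the same record, while the text keeps the links visible for a reader to open.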
A similar audio file would be built from sound features such as samples of the musical instruments (taken from the movie itself) and the voices with their typical prosodies. Besides the text or the chords and notes, it would carry a bit of information about the way the sound was made, for example the temperament and accent, which would allow for a fairly good reconstruction of the original "from memory".
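A rough sketch of such an audio description follows. It assumes notes are written as MIDI note numbers and tuned in 12-tone equal temperament with A4 at 440 Hz; neither choice comes from the text, and the record layout, sample path, and function names are likewise hypothetical.

```python
def note_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Frequency of a MIDI note number under 12-tone equal temperament
    (an assumed tuning; MIDI note 69 is A4)."""
    return a4_hz * 2 ** ((midi_note - 69) / 12)


# Hypothetical audio script: one sampled instrument, one voice, a few notes.
audio_script = {
    "instrument": {"name": "piano", "sample": "samples/piano_c4.wav"},
    "voice": {"speaker": "actor_a", "accent": "unspecified", "prosody": "slow, falling"},
    "notes": [(60, 0.5), (64, 0.5), (67, 1.0)],  # (MIDI note, duration in seconds)
}


def pitches(script: dict) -> list:
    """Turn the stored notes back into (frequency in Hz, duration) pairs."""
    return [(round(note_to_hz(note), 2), duration) for note, duration in script["notes"]]


melody = pitches(audio_script)
```

The notes and durations say what was played, while the linked sample and the voice entry say how it sounded; a player would combine the two to rebuild the audio.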