h a l f b a k e r y"Look on my works, ye Mighty, and despair!"
2D23D
Computer program makes 3D movies out of 2D ones. | |
It is a computer. It is very smart, and it can map spatial terrain. It can also recognize humans and thousands of other objects from a series of images, using built-in databases and really top notch programs. It then takes the 2D movie and watches the whole thing, making judgements about spatial arrangements, people, objects, etc. It then compiles the movie into a 3D landscape, putting textures from the actual film on the appropriate shapes and filling in the new surfaces with the textures. It infers what the stereoscopic images should look like based on the final 3D map. It uses these images for the left- and right-handed polarized light, and you have a 3D version of whatever you want, yes!
Baked?
http://www.stereo3d...3dplus_software.htm This is a system that *claims* to do the above, [Aristotle, Mar 03 2009]
|
|
Yes, but those drugs are illegal. |
|
|
(For God's sake.) I can envisage a situation where the edges of shapes can be detected and sort of cut out to make sprites like DOOM, then what's behind them filled in, but i think it would probably involve people going through the film frame by frame doing that. If you could get hold of a version of a film where, for instance, wires on flying actors had yet to be removed, there would be a clue there. However, i can only see that this could be done for either flat shapes looking like cutouts or in a sort of cubist way, where for example someone's nose floats in front of the rest of their face as a separate sprite. I really don't think models could be practical. |
|
|
// It is a computer. It is very smart // |
|
|
It is an idea. It relies on magic. |
|
|
Would be fun to watch how it deals with LotR, where actors at different distances were meant to represent people of different height at the same distance. |
|
|
Creating a 3D model from 2D images is possible, in principle, but requires the camera to pan through the scene in a controlled way. I don't see how it could be done reliably if the camera is in a fixed position or if the scene is complex (e.g. a forest, or poorly lit). Just saying "really top notch programs" isn't enough - some sort of explanation would be helpful here. |
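For what it's worth, the controlled-pan case can be put in numbers. The following is a minimal sketch, assuming an idealised pinhole camera that slides sideways by a known distance between two frames; the function name and the example values are made up for illustration and do not come from any existing product.

```python
def depth_from_parallax(focal_length_px, camera_shift_m, disparity_px):
    """Depth of a point from how far it moves between two frames.

    Assumes a pinhole camera translated sideways by camera_shift_m
    (a hypothetical, perfectly known pan) and no rotation.
    """
    if disparity_px <= 0:
        # No measurable shift: a fixed camera, or a point at infinity.
        raise ValueError("no parallax, depth cannot be recovered")
    return focal_length_px * camera_shift_m / disparity_px


# Hypothetical numbers: 1000 px focal length, camera moved 10 cm.
near_depth = depth_from_parallax(1000, 0.1, 20)  # point that shifted 20 px
far_depth = depth_from_parallax(1000, 0.1, 2)    # point that shifted 2 px
```

Points that shift more between frames come out nearer, and a camera that does not move gives zero disparity everywhere, which is why the fixed-camera case is the hard one.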
|
|
wouldn't expect much out of the first-generation, but bear in mind that a normal (new) TV does quite a bit of interpolation to produce new frames, keep lines connected, etc... sadly this new technology still won't help you look down <actress>s top... no clue why you'd base it on object recognition, though. |
|
|
People have been attempting this very thing for quite some time. The first generation of such software has certainly already been and gone, I believe. See link. |
|
|
[loonquawl] points out one good example of scene
ambiguity. In fact everything you see is
ambiguous. |
|
|
Vision is a totally underspecified problem: since
something as basic as surface reflectance and
illumination are conflated in the stimulus (the
thing to which you have access), there's no
straightforward way to work out what is "out
there" in the real world from the image formed on
a camera (or retina). Put simply, a particular image
could be formed from an infinite number of object
configurations / object reflectance qualities /
scene illuminations. We don't know how animals
do this amazing feat and we sure as hell can't get
machines to do it yet. |
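The conflation of reflectance and illumination can be shown with a toy model. This is only a sketch, assuming the simplest possible image formation (pixel intensity equals surface reflectance times illumination); real scenes are far messier, and the names here are hypothetical.

```python
def observed_intensity(reflectance, illumination):
    """Toy image formation: the camera only ever sees the product."""
    return reflectance * illumination


# Three different (reflectance, illumination) pairs: a pale surface in
# dim light, a mid-grey surface in medium light, a dark surface in
# bright light.
scenes = [(0.8, 0.25), (0.4, 0.5), (0.2, 1.0)]
intensities = [observed_intensity(r, i) for r, i in scenes]
```

All three scenes produce the same pixel value, so the pixel alone cannot say which one is "out there".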
|
|
//it can also recognize humans and thousands of
other objects from a series of images. // |
|
|
Object recognition in natural images is kind of the
holy grail for computer vision algorithms. It can't
be done robustly partly for the reasons outlined
above, and partly because objects have a nasty
habit of looking different when you rotate them
or move your head. Nor is there a good way to
categorise them robustly. Solving this problem
would require you to solve many of the thornier AI
problems. |
|
|
//Creating a 3D model from 2D images is
possible// Is it? Occlusion will mean you'll always
have masses of information either missing or
guessed (e.g. i have no idea what the back of my
monitor looks like from here - and unless i
"recognise" it as a monitor, my guess will be poor). |
|
|
This technology would indeed be great. Sadly
none of the hard problems are solved yet. |
|
|
Thanks for the annos, they aren't telling me that everyone groaned at the magicianery of all this. Well, maybe [loonquawl] did, and some may have sighed. And an mfd may be in the works, but it's an interesting problem, the vision problem, and this idea is a good way to hash through it, I think. Thanks, [Aristotle], for that link. To be sure, people have been thinking about this for quite some time. |
|
|
I'd say magic. Humans are still way smarter than computers in this area, and if you close one eye I'll bet I can get you to run into any number of things if I try. My guess is this would work part way, and just enough to give me massive headaches looking at it as my brain tries in vain to rationalize the computer-generated irrationalities. (-) Also, as [hennessey] states, this generates extra data out of a set of data, and that is a tough trick in the best of circumstances. |
|
|
Whereas i think a relatively feasible way of doing this is by manually tracing the images in the frames, vision doesn't just work by parallax but also by other cues, one of which is focus. I'm watching a video right now and, depending on the camera angle, it isn't all sharply defined because the camera tends to focus on the actors or other subject of the shot in question. Whereas a sophisticated algorithm picking out different depths may not be easy, if there's a way of distinguishing between in-focus and out-of-focus objects, there could at least be two layers. Unfortunately, sometimes the depth of field can be quite deep and it might be impossible to distinguish between blurry foreground objects and blurry background ones, meaning that they would look like they were behind, say, birthday cake or puppy shaped holes in the actors. |
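The two-layer focus idea might look something like this. It is a rough sketch, assuming greyscale pixel rows and a hand-picked threshold; the sharpness measure (mean difference between neighbouring pixels) is just one simple choice, and the patch values are invented.

```python
def sharpness(row):
    """Mean absolute difference between neighbouring pixels in one row."""
    diffs = [abs(b - a) for a, b in zip(row, row[1:])]
    return sum(diffs) / len(diffs)


def split_layers(patches, threshold):
    """Label each patch as in focus or out of focus (two layers only)."""
    return ["in focus" if sharpness(p) >= threshold else "out of focus"
            for p in patches]


# Hypothetical patches: crisp detail on an actor, smooth blurred backdrop.
actor_patch = [10, 200, 30, 220, 15, 210]
backdrop_patch = [100, 105, 110, 114, 118, 121]
labels = split_layers([actor_patch, backdrop_patch], threshold=20)
```

Note that a blurry patch says nothing about whether it is in front of or behind the subject, which is exactly the foreground/background confusion described above.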
|
|
//no clue why you'd base it on object recognition, though// unless you mean create a fresh database with each film so you could meld a side profile from a previous scene with the front profile from the current, type of thing. |
|
|
The thing with sight, in my belief, is that the human brain is guessing from an already-built model of what is expected. A human brain will still fail to recognise something if there are not enough points to run through the whole memory and the object is totally out of place. |
|
|
//(e.g. a forest, or poorly lit)//
If the computer is not going to be 100% accurate to the 2D film, then there is wriggle room to build an approximate forest with the actors' features from the last clear scene. A plot-line script (storyboard) would help, wouldn't it? |
|
|
Unless the film is digitally animated, this is completely impossible. You are fighting against the brain's natural tendency to INFER depth from relationships by adding real depth based on similar inferences. To add depth to an image we need two images, each containing data that was not in the other. If you start with an image that has no depth, the extra data simply doesn't exist, and the edges of objects will have an obvious blurred zone where the image had Z overlap. The closer the closeup, the more Z overlap is needed, and on round objects near the center of vision (like a facial closeup) the difference between the L and R images is tremendous; a simple distortion is going to look TERRIBLE. So, to sum up, you now need two images for our two eyes, or the ability to construct a 3D model and project in 3D. AND you need to fake a large quantity of missing image information that is revealed by Z overlap (which basically requires that everything be fitted to a geometric model and placed in relationship to the rest of the model, relying on data which, because not everything is geometric, simply isn't in a conventional image). Saying that you want this to happen, without recognizing how much of the image would have to be essentially faked by the computer, is simply asking for a miracle. WIBNIfty. |
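The missing-data problem shows up even on a single row of pixels. Here is a minimal sketch, assuming a per-pixel disparity has somehow already been guessed (the hard part) and that near pixels slide left to make the right-eye view; the function name and the tiny scanline are invented for illustration.

```python
def shift_scanline(colors, disparities):
    """Forward-warp one scanline into a second-eye view.

    Each pixel moves left by its disparity; where two pixels land on
    the same spot, the nearer one (larger disparity) wins. Positions
    that nothing lands on stay None: those are the holes.
    """
    width = len(colors)
    out = [None] * width
    out_disp = [-1] * width
    for x, (color, disp) in enumerate(zip(colors, disparities)):
        target = x - disp
        if 0 <= target < width and disp > out_disp[target]:
            out[target] = color
            out_disp[target] = disp
    return out


# "B" = background (disparity 0), "F" = foreground object (disparity 2).
colors = list("BBBFFFBBBB")
disparities = [0, 0, 0, 2, 2, 2, 0, 0, 0, 0]
right_eye = shift_scanline(colors, disparities)
holes = [x for x, c in enumerate(right_eye) if c is None]
```

The holes sit just to one side of the foreground object, where background that was never filmed becomes visible, and something has to be invented to fill them.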
|
|
[WcW], it might be impossible for a forest, but for something where the action takes place on a handful of indoor stages, the computer would have the entire film to figure out what all the sides of a room/prop/person look like before starting to display the film, i.e. fill in a hat of which only the front is viewed in scene 3 with the back of the hat, which is glimpsed in scene 7. |
|
|
I wonder how often such a technique - were it possible -
would reveal previously unnoticed depth-errors in movies
shot without the expectation of 3D... |
|
|
//It is very smart//
I bet a creative AI could probably put stuff in, probably not what was actually there originally, but make a good enough creation that we can simply walk through the scene within our holo-decks and not know that central park never had a grand stadium in it... |
|
|
Scratch that, creative AI with a REALLY big cross-referenced database of the world, but for informative reasons? Infeasible at best. |
|
|
Since we can't even employ this on the scale of a single still image (randomly chosen) I highly doubt that the computing power exists today to even contemplate translating video. We are just at the level of producing good stereo sound from mono recordings, the stone age technological equivalent. |
|
|
Back when I was an inventor for Kodak I was familiar with the work of image scientists, who could decompose pictures into textured regions and were working on published scientific papers on 2d to 3d transformations. This was about 6 years ago and their papers and research will be in both the relevant literature and their patent applications. |
|
|
A problem with such work is that we have in-built facilities to spot errors in 3d work, which we use for spatial assessment. We are also very good at spotting incorrect facial details and posture of humans. |
|
|
However, this is an area that has been on the verge of passing from being Artificial Intelligence work to that of a specialised area of computing for some time. Nothing really commercial is out there, but this is not due to lack of baking. |
|
|
There are other cues to stereoscopic vision. It isn't just a question of parallax and angle of view, and these will turn up in video. There's the blur, there's light attenuation and there's increasing haziness. These could be applied to specific, specialised scenes. For instance, a scene in an underground carpark or a dense wood allows comparison between similar materials and their lighting, allowing different layers to be constructed, and a scene in fog, or a hazy mountain on the horizon, could similarly be detected. However, it could only really be flat-looking objects in a scene rather than genuinely three dimensional ones. |
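The haze cue can be sketched with the usual simple fog model, in which distant things fade towards the colour of the haze. This is a sketch only, assuming the true brightness of the material, the brightness of the haze itself and the haze density are all known (in practice they would have to be guessed by comparing similar materials in the scene); all names and values are hypothetical.

```python
import math


def hazy_intensity(true_intensity, airlight, beta, depth):
    """What the camera sees through haze of density beta at a given depth."""
    transmission = math.exp(-beta * depth)
    return true_intensity * transmission + airlight * (1 - transmission)


def depth_from_haze(observed, true_intensity, airlight, beta):
    """Invert the fog model to estimate depth from one observed value."""
    transmission = (observed - airlight) / (true_intensity - airlight)
    return -math.log(transmission) / beta


# Hypothetical scene: dark pillars (0.1) in pale haze (0.9), density 0.05/m.
seen_near = hazy_intensity(0.1, 0.9, 0.05, depth=5.0)
seen_far = hazy_intensity(0.1, 0.9, 0.05, depth=40.0)
estimated_near = depth_from_haze(seen_near, 0.1, 0.9, 0.05)
estimated_far = depth_from_haze(seen_far, 0.1, 0.9, 0.05)
```

This gives one depth per patch of known material, so it supports the layered, flat-looking result described above rather than true three-dimensional shapes.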
|
|
I am, however, convinced that what i describe could be done. |
|
|
WcW, I'm glad to see you haven't let go of impossibilities (You're a fine cynic). |
|
|
No, as a matter of fact i've noticed sort of the opposite. When colour TV was new, it tended to be too vivid. Everything looked as saturated as a peacock. Nowadays, the colours look a lot more muted to me. |
|
| |