h a l f b a k e r yBone to the bad.
add, search, annotate, link, view, overview, recent, by name, random
news, help, about, links, report a problem
browse anonymously,
or get an account
and write.
register,
|
|
|
Suppose you have some big file (or several) split into multiple smaller ones (for crude distributed storage or smth), and need to put them back together. Normally you'd either create a new file and copy the contents from each partial file into that, or append the others to the first part. But all that
copying is redundant - at least if you'd delete the parts anyway afterwards.
I'd rather have a program that stitches the parts together by reassigning the clusters taken by the split parts to another file in the correct order. Of course, the part sizes should be exact multiples of the cluster length or there would be garbage or zeros at the seams. But it's often the case that the split size is a nice round number.
Please log in.
If you're not logged in,
you can see what this page
looks like, but you will
not be able to add anything.
Annotation:
|
|
[+], but only if the program can do the opposite, as well. That is, split apart a big file into many smaller ones, without copying the clusters. |
|
|
Wouldn't this depend on the file system's method of indexing clusters? An offset method wouldn't be able to jump to a brand new cluster index. |
|
|
Okay, but is this really a problem for anyone? |
|
|
This general concept (though mostly in reverse--treating larger files as combinations of smaller ones) would mainly be useful if it could be combined with hard linking and copy-on-write semantics. In that circumstance, it could be very useful if file formats were set up to take advantage of it. |
|
|
For example, a person might have a number of video-editing projects which use portions of some large video files. If a project uses 5 minutes out of a 90-minute file, it would be wasteful to make a separate copy of the 5 minutes, but it would also be wasteful to have to haul around the whole 90-minute file. Best would be to have the system keep a link to the appropriate part of the 90-minute file, with the ability to grab just the appropriate part of the file when copying it to another medium. |
|
|
BTW, on a related note, it may be helpful to have a file format which stores a hash value with each chunk of the file, and have a system maintain an index of the hashes of chunks of data. In that way, when importing a large file, the system could check whether any of the chunks already exist and--if so--avoid storing them redundantly. |
|
|
Some Youtube videos the bit that changes most or rather changes significantly is the audio. If the video is not exactly in the right order from beginning to end it doesn't matter. If the video is: "How to paint your face to look like a comic book character" say, getting a few bits out of sequence is not important. If the file transfer protocol allows minor noise, the overall speed might improve a bit. |
|
|
Does copyright imply that a perfect copy is made ? |
|
|
Copyright covers derivative works as well as verbatim
copies. |
|
| |