Halfbakery: File system support for auto-breaking hard links

I'm hoping this is baked and I'm just using the wrong search terms.

Many commonly used file systems support hard links and/or soft links between files, allowing one copy of some data to be accessed from two different locations. Many also have snapshot features that allow all the files to be virtually copied without making a second physical copy unless an application tries to modify the original. This allows capturing the state of all files at one point in time, often for the purpose of performing a backup of all files at a certain time stamp while allowing the system to continue normal operation during the backup.

I think it would be useful for file systems to support a type of hard link that would automatically create a copy and unlink files as soon as any application tries to modify the file through any of the links.
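As a rough illustration of the behaviour I'm after, here is a minimal user-space sketch in Python. The helper name `open_for_modification` is made up for this example, and the big assumption is that every program would have to open files through it, which is exactly why I'd want the file system to do this instead. It checks the link count and, if the file is shared, swaps in a private copy before opening for writing.

```python
import os
import shutil
import tempfile


def open_for_modification(path, mode="r+b"):
    """Open `path` for writing, breaking any hard link first.

    Hypothetical helper: a real implementation would live in the
    file system and would not depend on applications cooperating.
    """
    if os.stat(path).st_nlink > 1:
        # The data is shared with at least one other name, so make a
        # private copy next to the file and swap it into place.
        directory = os.path.dirname(os.path.abspath(path))
        fd, private_copy = tempfile.mkstemp(dir=directory)
        os.close(fd)
        shutil.copy2(path, private_copy)  # copy data and metadata
        os.replace(private_copy, path)    # this name now has its own data
    return open(path, mode)
```

This sketch is not safe against two programs racing on the same file, and it copies the whole file up front rather than only the blocks that change.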

The times that I really want this feature are maybe somewhat specific to my job, but another application that I think would be useful to many people is in managing digital photos.

When I copy photos onto my computer, I organize all of the original files the way I like to archive them and never modify them. When I want to gather a collection of photos to send to someone, for example highlights of a vacation, or when I'm sorting through a bunch of photos to select the best (say, from a portrait session with my kids), I generally create a separate folder and put copies of a bunch of photos in that folder. Depending on what I'm doing, sometimes I'll end up making modifications to some of the files, which is why I make copies: to ensure I don't modify the originals. After I use these files I generally leave them in that folder so I can see what I did with them. Of course I never delete the originals. Hard drive space is pretty cheap, so I don't worry too much about that, but I do end up wasting a lot, and worse is the time spent waiting for files to copy when I copy a large number of photos to a folder so I can pick the one I want by process of elimination. If I had the ability to make hard link copies that would automatically split into real copies if either were modified, that would save hard drive space and copying time. When talking specifically about photos, some photo management programs implement some of these things, but I have other uses for this as well, and photo management software could take advantage of this feature if it were built into the file system.

Hard links almost implement this feature. If you create a hard link to a file in a separate folder, you can read it from both. If you overwrite the photo in one folder (replacing the file rather than editing it in place), the link will be broken and the photo in the other folder will be unchanged, but if you modify the photo in one folder, the photo in both folders will be modified since they share the same data. The technology to share data but make a copy if an attempt is made to modify the file is present in existing snapshot features like Windows Shadow Copy, but that particular feature can only be used on the entire volume at once, and the shadow copy is read-only. For some of the uses I'm interested in, there may be many copies of each file and many of those need to be writable. But if I ever modify one, I want to unlink and save it separately so I don't modify the others.
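The difference between modifying and overwriting can be seen in a small Python sketch. The function name and file names are invented for the example, and "overwrite" is assumed here to mean writing a new file and renaming it over the old name; a program that truncates and rewrites the existing file in place would behave like the "modify" case instead.

```python
import os
import tempfile


def demo_hard_link_behaviour():
    """Show what each name sees after an in-place edit and after a replace."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.jpg")
        linked = os.path.join(tmp, "linked.jpg")
        with open(original, "wb") as f:
            f.write(b"original pixels")
        os.link(original, linked)  # two names, one copy of the data

        # Modify in place through the link: the original changes too.
        with open(linked, "r+b") as f:
            f.write(b"EDITED")
        with open(original, "rb") as f:
            original_after_edit = f.read()

        # Overwrite by renaming a new file over the link: only the
        # link name changes, and the original keeps the old data.
        replacement = os.path.join(tmp, "replacement.tmp")
        with open(replacement, "wb") as f:
            f.write(b"brand new photo")
        os.replace(replacement, linked)
        with open(original, "rb") as f:
            original_after_replace = f.read()
        with open(linked, "rb") as f:
            linked_after_replace = f.read()

    return original_after_edit, original_after_replace, linked_after_replace
```

The first returned value shows the unwanted behaviour: an edit made through the link has leaked into the original.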

One other use of this would be in a hard disk optimizer. Someone could write a small program to scan through a hard drive and link duplicate files together. This could be basically transparent to the user because if any of the duplicate files were modified, it would create a separate copy of the modified one.
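The scanning half of such an optimizer might look something like the following Python sketch. The function name `find_duplicate_files` is hypothetical, and the sketch only reports groups of identical files; it deliberately does not link them, since doing that with ordinary hard links would cause exactly the shared-modification problem described above.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicate_files(root):
    """Return lists of paths under `root` whose contents are identical."""
    # Cheap first pass: only files of equal size can be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash the contents of the same-size candidates.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[(size, digest.hexdigest())].append(path)

    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Names that are already hard-linked to each other will also show up as duplicates here; a fuller version would probably compare inode numbers and skip those.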

Note that this should not be used for making back-up copies, but that's obvious because a backup copy on the same volume is not very useful anyway.

This could make normal back-ups less efficient unless the backup software is aware of this feature and there is a way to track linked files so they can be linked on the backup drive as well. Then again, if this is only used in cases where normal copies would have been used anyway, then the backup would be no less efficient than before.