h a l f b a k e r yThink of it as a spell checker that insults you, as well.
add, search, annotate, link, view, overview, recent, by name, random
news, help, about, links, report a problem
browse anonymously,
or get an account
and write.
register,
|
|
|
For some categories of information held within a relational database, it's very useful to be able to know not just what the current information is, but a history of what the information used to be.
We can accomplish this by holding multiple copies of the data with date markers and simply checking
which record to retrieve at any given time, but as this information becomes more widely required the design quickly becomes unwieldy as extra objects grow all over the database and the public interface changes significantly. In essence, the developer ends up building a separate database on top of the existing database platform, with its own commands and API. This is clearly inefficient.
Instead, the database platform should support this as an option on all tables at creation. When this option is selected, rather than a traditional file the database shall store a full change history with multiple versions of each record. Records won't be deleted but expired.
For reading, instead of running
SELECT * FROM Users
The programmer could then run
SELECT * FROM Users AT '2010-01-19 23:00'
and receive a full copy of the requested table as it was at that point rather than at the present moment, or
SELECT * FROM Users BETWEEN '2010-01-01 23:00' AND '2010-01-19 23:00' to see the changes within the time period.
[link]
|
|
A really excellent idea. I'm not sure one could discover the
prior art if he/she didn't happen to know the term "version
control." |
|
|
When I read the title I thought this was going to be a database of reviews on all the people you've dated. |
|
|
What you are talking about is something like saving the transaction log from SQL. The downside is that this can get BIG quickly. |
|
|
a) This _ISN'T_ database version control. That's about taking a copy of your database's structure and saving that into a version control system so the database structure can be tracked and compares with your VC tools. The purpose of this would be to allow you to build a table whose data could be read at different points in time through SQL syntax. Very useful for some applications but a major pain to implement by other means. |
|
|
b) Yes, it could need a lot of storage. That can be mitigated by storing deltas rather than full records and providing a purge function, but the chief point is to make this an option. Most tables don't need this and those that do already have the storage issue, plus lots of extra code to write which this would make unnecessary. |
|
|
something keeping you from putting a date field in when you write the schema ? |
|
|
Perhaps more usefully:
A lot of companies are now marketing database software
designed to comply with "CFR 21 part 11" (rules from the
USA's Federal Drug Administration regulating clinical trials
databases). Compliant software must be capable of
retrieving dated earlier versions of altered records. But
probably not with the ease & convenience of this idea,
since they have an audit in mind, not the sort of query you
describe. |
|
|
FlyingToaster - no, but that presumes you only want to know when an individual record was created / last altered / whatever. This would allow you to see what it used to look like and when, which for some applications is _very_ useful and can't just be done with a simple date field on an existing table. |
|
|
what? no. (okay I read what I wrote and that's what it says, but it's not what I meant) |
|
|
Include "date" as part of the unique key, so with not too much flimflammery you'd normally just pick up the latest record unless you specifically asked for another date/range.... umm, requires a small amount of programming. |
|
|
But incremental's better when you don't mind having to pick up field-change records from all over the file. |
|
|
I've seen this where people save an id, a version number,a timestamp telling whenever something changes, plus a status field stating whether something is Active or Historic. If you want to see the live situation, and see it quickly, simply recall everything that's Active (bonus, you can make this an indexed field too if you want) If you want to see/rewind the state of play as at X, select everything with a date less than X, grouping on the id and showing only those records who's X is equal to a Max(x) within id numbers - but that is pretty expensive performance wise. |
|
| |