Relational databases are generally the preferred choice for storing application data. They help organize, store, and retrieve the data in a very efficient manner. Since application data is stored in these relational databases, applications try to use these databases for tracking historical data as well. The most pervasive approach to storing historical data is to have a time-stamped history table for every table that stores important application entities. Updates made to the main table result in actions that push out the previous values of data to the history tables. This is either done through triggers or by the applications themselves.The thrust of the article can be found in this paragraph:
There are several issues with storing the historical information in history tables.The relational databases should still be the repository for storing and retrieving transactional data. They excel in managing these critical data assets. The shortcomings listed above are confined to storing multiple versions of data entities within the relational data store and tracking such entities over time.
- Relational databases and the relational data modeling concentrate on efficient data storage and retrieval. The history tables do not model naturally with relational databases.
- There is no support for versioning from the database. The application has to carefully store the entries into the history tables either through triggers or some other custom techniques.
- It's up to the application to determine what changed between versions. The retrieval of historical information from the history tables is also specific to the history table storing the data.
Applications can use a combination of relational databases and Subversion to satisfy data management and data tracking requirements. Any updates that are made to the critical data assets present in the relational database would be accompanied with a commit into Subversion. Subversion would be the primary data source for use-cases for tracking, while the relational database would be used for all other purposes. An additional advantage is that due to Subversion's copy-modify-merge concept, there is no requirement to lock an object every time its retrieved from the relational database.Other coders have tried (successfully?) to use engines such as Lucene for actual data storage as well. Lucene doesn't really apply version control for data (as far as I know, although it can be bolted on), but the idea of using relational databases solely for looking up references to externally-controlled data can be found in more and more places (such as SVN, as in this article, or possibly even in APIs such as the Java Content Repository.)
What do you think of the idea?