Envers - Easy entity versioning with a single annotation

News: Envers - Easy entity versioning with a single annotation

  1. Envers lets you version your JPA entities simply by annotating their properties with @Versioned. It works on top of Hibernate Entity Manager.

    You can version simple properties (strings, numbers, dates), embedded components consisting of the above, as well as one-to-one and one-to-many relations.

    All you need to do is annotate your entity; Envers will automatically generate the tables in the DB that hold the versioned data. Their structure is clear and simple to understand, so you'll be able to read the versioned data without much effort if you ever decide to stop using the library.

    Each transaction commit that changes versioned entities generates a new revision number, "capturing" a consistent state of the database.

    Suppose you have two versioned entities: Person and Address. Each Person has one Address and each Address can have many Persons (living there). Retrieving a historic version of an Address is as simple as:

    Address oldAddressVersion = versionsReader.find(Address.class, entityId, revisionNumber);

    Then, to get the persons who lived at this address at that time (revision), you only need to call the getter:

    oldAddressVersion.getPersons();

    The VersionsReader interface is very simple (see for yourself: http://www.jboss.org/files/envers/api/index.html), and it also lets you retrieve the revisions at which an entity was changed.

    For more information, see http://www.jboss.org/envers.
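To make the idea of one revision number per commit concrete, here is a minimal in-memory sketch in plain Java. It is not the Envers implementation or its API: the class `RevisionStore` and its methods `commit` and `find` are hypothetical names, and entity state is reduced to a string. It only illustrates how a lookup by revision can return the latest state written at or before that revision.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of revision-based versioning; not the Envers API.
class RevisionStore {
    private int lastRevision = 0;
    // entity id -> (revision at which a state was written -> that state)
    private final Map<String, TreeMap<Integer, String>> versions = new HashMap<>();

    // One commit produces one new global revision, shared by every entity changed in it.
    int commit(Map<String, String> changedEntities) {
        int revision = ++lastRevision;
        for (Map.Entry<String, String> e : changedEntities.entrySet()) {
            versions.computeIfAbsent(e.getKey(), k -> new TreeMap<>())
                    .put(revision, e.getValue());
        }
        return revision;
    }

    // State of an entity as of a revision: the latest version written at or before it.
    String find(String entityId, int revision) {
        TreeMap<Integer, String> history = versions.get(entityId);
        if (history == null) {
            return null;
        }
        Map.Entry<Integer, String> entry = history.floorEntry(revision);
        return entry == null ? null : entry.getValue();
    }
}
```

An entity that was not touched in a given commit simply has no row for that revision, so the lookup falls back to its most recent earlier row.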

    Threaded Messages (23)

  3. Hello,
    Interesting initiative.

    Is there any document detailing the design? I would be especially interested to read something regarding performance considerations.

    Isn't there a risk that the _revisions_info table becomes a kind of hotspot when every (mutating) transaction in the system generates an insert into this table?
  4. No, unfortunately such a document does not exist yet.

    Basically, every update of versioned data generates an insert into the appropriate versions table. This is of course a performance overhead, but you can't get something for free - you've got to store the old data :).

    As for the global _revisions_info table, you're right: if a lot of tables were versioned and they received a lot of update traffic, it would become a "hotspot". But in a normal setting, I don't think this should be a problem. However, I'd have to measure that, of course.

    Global revision numbers are essential for versioning one-to-many and one-to-one relations, to be able to get the non-owning side of the relation.

    Adam
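As a rough illustration of why a shared revision number helps with the non-owning side of a relation, consider the sketch below. It is an assumption-laden model, not how Envers stores data: `PersonHistory`, `record` and `personsAt` are hypothetical names. Each Person version row stores the address id (the owning side) together with the global revision, and the persons of an address at a revision are derived by scanning those rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: each Person version row keeps the owning side of the
// relation (the address id) plus the global revision at which it was written.
class PersonHistory {
    // person id -> (revision -> address id as of that revision)
    private final Map<Long, TreeMap<Integer, Long>> rows = new TreeMap<>();

    void record(long personId, int revision, long addressId) {
        rows.computeIfAbsent(personId, k -> new TreeMap<>()).put(revision, addressId);
    }

    // Non-owning side: every person whose latest version at or before
    // the given revision points at the given address.
    List<Long> personsAt(long addressId, int revision) {
        List<Long> result = new ArrayList<>();
        for (Map.Entry<Long, TreeMap<Integer, Long>> e : rows.entrySet()) {
            Map.Entry<Integer, Long> version = e.getValue().floorEntry(revision);
            if (version != null && version.getValue() == addressId) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

Because Person and Address rows share one revision sequence, the same revision number can be used to read both sides consistently.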
    Well, it takes one update and one insert, but you can limit the update to one column for all cases. The _revisions_info design breaks any chance of me ever using this. My requirement is to be able to track any change to a particular user and session (and time of session) and that can be implemented without any "global" locking.
  5. How can you limit the update to one column? (It's very possible that I just didn't think of some better way to implement this, so all opinions are valuable :) ).

    I also don't understand what you mean by tracking changes to a particular user and session.

    Anyway, if you have two entities in a relation, I don't see a way to version both of them and the relation without a "global" table binding their revisions.

    Adam
  6. How can you limit the update to one column? (It's very possible that I just didn't think of some better way to implement this, so all opinions are valuable :) ).
    In your design you split them into 2 tables, but that's not necessarily the best solution. If you keep the historical versions in the same table as the "in force" version, then you can make it one insert for the "new" state and an update of one column (i.e. the version) of the "old" state.
    Anyway, if you have two entities in a relation, I don't see a way to version both of them and the relation without a "global" table binding their revisions.
    I am saying that you don't need any "global" revision table at all. I thought your design used such a revision table, one that needs an insert for every change in the system... correct?
  7. In your design you split them into 2 tables, but that's not necessarily the best solution. If you keep the historical versions in the same table as the "in force" version, then you can make it one insert for the "new" state and an update of one column (i.e. the version) of the "old" state.
    Well, if you keep everything in one table, you need an update of the old "in force" version plus an insert of the new one. If you keep the data in two tables, you need an update of the "in force" version plus an insert into the other table, so there is no real difference here. Of course, choosing one table over two may have its advantages and disadvantages; two tables is just one option :)
    I am saying that you don't need any "global" revision table at all. I thought your design used such a revision table, one that needs an insert for every change in the system... correct?
    Yes, I think that you don't need it as long as you are not versioning relations. You need a global revision for each "strongly connected component" of the relations graph, am I not correct? For tables which are versioned without relations, maybe I'll add an option to version them without the global revisions. What do you think?
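The single-table alternative discussed above can be sketched as follows. This is only one possible reading of it, with hypothetical names (`VersionedRow`, `SingleTableHistory`, `endRevision`): current and historical rows share one list standing in for the table, a write inserts the new state, and the only update is to one column of the previously current row.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-table layout: a row is "in force" while endRevision is null.
class VersionedRow {
    final long entityId;
    final String data;
    final int startRevision;
    Integer endRevision; // the only column that is ever updated

    VersionedRow(long entityId, String data, int startRevision) {
        this.entityId = entityId;
        this.data = data;
        this.startRevision = startRevision;
    }
}

class SingleTableHistory {
    final List<VersionedRow> table = new ArrayList<>();

    void write(long entityId, String data, int revision) {
        for (VersionedRow row : table) {
            if (row.entityId == entityId && row.endRevision == null) {
                row.endRevision = revision; // one-column update of the old state
            }
        }
        table.add(new VersionedRow(entityId, data, revision)); // one insert of the new state
    }

    // The row whose [startRevision, endRevision) range contains the revision.
    String find(long entityId, int revision) {
        for (VersionedRow row : table) {
            boolean started = row.startRevision <= revision;
            boolean notEnded = row.endRevision == null || revision < row.endRevision;
            if (row.entityId == entityId && started && notEnded) {
                return row.data;
            }
        }
        return null;
    }
}
```

Either way, each change costs one insert and one update, which matches the point made in the reply; the layouts differ mainly in where the historical rows live.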
  8. Source?

    It's a very interesting project, but I can't find the source code. Is it going to be public? And under what license?
  9. Re: Source?

    Hello,
    It's a very interesting project, but I can't find the source code. Is it going to be public? And under what license?
    I've uploaded the source code here: http://www.jboss.org/envers/downloads
    The license is LGPL.

    Adam
  10. Temporal data?

    Very interesting. Would it be possible to extend this to handle temporal data, that is, data that is valid at a particular time? In the example with persons and addresses, I'd argue the address is really a piece of temporal data (i.e. the person lived at address XYZ from May '06 to June '07) rather than a versioned piece of data. It's possible to handle temporal data via versioning, but this requires you to get all versions and step through them to find the address that was valid at a certain time.
  11. Re: Temporal data?

    Well, as somebody wrote on the forum, a method returning the revision corresponding to a given date would be useful, and in fact very easy to implement too, so I'll add it soon. It's also possible that handling temporal data in the sense you describe would be in scope for another library; if I understand correctly, you'd like to retrieve the timeframe in which the Address of a given Person was set to a specific value? Adam
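A method that maps a date to a revision could look roughly like the sketch below. It assumes each revision records its commit time; `RevisionTimeline`, `record` and `revisionForDate` are hypothetical names rather than anything in the library, and timestamps are plain epoch milliseconds.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical lookup from a point in time to the revision in effect then.
class RevisionTimeline {
    // commit timestamp (epoch millis) -> revision number
    private final TreeMap<Long, Integer> revisionsByTime = new TreeMap<>();

    void record(long committedAtMillis, int revision) {
        revisionsByTime.put(committedAtMillis, revision);
    }

    // Latest revision committed at or before the given instant, or null if none.
    Integer revisionForDate(long atMillis) {
        Map.Entry<Long, Integer> entry = revisionsByTime.floorEntry(atMillis);
        return entry == null ? null : entry.getValue();
    }
}
```

The returned revision can then be passed to a lookup by revision, such as the `find` call shown in the announcement.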
  12. Re: Temporal data?

    No, as you say, get the address that was valid at a particular time, and preferably as part of a bulk query, i.e. get all the addresses that were valid at time T for a given set of persons, or all addresses someone lived at between time T and T+N. I guess these are pretty contrived examples, but the use cases for temporal data are vast: financial systems, accounting, CRM systems, etc. Simplistically, it comes down to tracking objects with start and end dates. There's a write-up on temporal data on Martin Fowler's website (there more specifically labelled Temporal Object / Effectivity pattern): http://martinfowler.com/ap2/timeNarrative.html
  13. Re: Temporal data?

    Simplistically, it comes down to tracking objects with start and end dates.
    Actually, it comes down to tracking what data was in effect in the system at any given time. Start date/end date is one of several alternative solutions.
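For the start and end date variant mentioned in this exchange, a minimal sketch might look like this. It is just one of the alternative designs, and every name in it (`Residence`, `ResidenceLog`, `addressesAt`) is hypothetical. The end date is treated as exclusive, and `null` means the entry is still in effect.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical temporal record: a person lived at an address during [from, to).
class Residence {
    final String person;
    final String address;
    final LocalDate from; // inclusive
    final LocalDate to;   // exclusive; null means still in effect

    Residence(String person, String address, LocalDate from, LocalDate to) {
        this.person = person;
        this.address = address;
        this.from = from;
        this.to = to;
    }

    boolean validAt(LocalDate day) {
        return !day.isBefore(from) && (to == null || day.isBefore(to));
    }
}

class ResidenceLog {
    private final List<Residence> entries = new ArrayList<>();

    void add(Residence residence) {
        entries.add(residence);
    }

    // Bulk query: addresses in effect on the given day for a set of persons.
    List<String> addressesAt(Set<String> persons, LocalDate day) {
        List<String> result = new ArrayList<>();
        for (Residence r : entries) {
            if (persons.contains(r.person) && r.validAt(day)) {
                result.add(r.address);
            }
        }
        return result;
    }
}
```

Unlike the revision-based lookup, this answers "what was valid at time T" directly, without stepping through versions.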
  14. @Version VS @Versioned

    Great stuff :) Any reason why you've chosen the name @Versioned instead of @Historic? @Versioned is a bit easy to confuse with @Version (for optimistic locking).
  15. Re: @Version VS @Versioned

    I agree it may be confusing, but @Historic just doesn't sound right - it doesn't say that an entity is versioned or that its history is recorded. And I didn't find any better name. But I hope people won't have trouble with that :) Adam
  16. Useful stuff! Congrats. When is the feature to store only the diffs between versions expected to be released? Ranjith
  17. Are there any plans to allow for HQL by revision? Or even by worktimestamp? Something like:

    HistoricQuery q = ...
    q.setWorktimestamp(threeYearsAgo);
    return q.getResultList();
  18. This would be a really nice feature, but quite hard to implement, I'm afraid. For now, I was thinking about providing a criteria-like interface for querying historic data. This would of course enable you to create only a limited set of queries. I also wrote about it on the forum: http://www.jboss.com/index.html?module=bb&op=viewtopic&t=134500 Adam
  19. Diffs

    I'm still thinking about the usability of this - because to retrieve the "real" version, you'd have to read all revisions up to the latest one. Anyway, it really depends on how much time I have (it's not my primary job); but I would expect 1-2 months. Adam
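The cost mentioned here can be seen in a small sketch of diff-only storage. It assumes forward diffs replayed from the first revision, which is only one way such a feature could work; `DiffHistory`, `append` and `stateAt` are hypothetical names, and properties are reduced to strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical diff-only storage: each revision keeps just the changed properties.
class DiffHistory {
    // diffs.get(i) holds the properties changed at revision i + 1
    private final List<Map<String, String>> diffs = new ArrayList<>();

    // Stores a diff and returns the revision number it was stored under.
    int append(Map<String, String> changedProperties) {
        diffs.add(new HashMap<>(changedProperties));
        return diffs.size();
    }

    // Rebuilds the full state at a revision by replaying every diff up to it.
    Map<String, String> stateAt(int revision) {
        Map<String, String> state = new HashMap<>();
        for (int i = 0; i < revision && i < diffs.size(); i++) {
            state.putAll(diffs.get(i));
        }
        return state;
    }
}
```

Storage shrinks because unchanged properties are not repeated, but a read has to touch a number of rows that grows with the length of the history.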
  20. Does Envers support handling bulk updates or native SQL update statements?
  21. Bulk updates - I'm almost sure that it does, though I'll have to write a test for it and check. Native updates - no, as I believe they don't trigger any Hibernate events, so I'm not able to catch them. I think you'd only be able to version data changed by native updates with some database trigger. Adam
  22. Please give some feedback after testing HQL based bulk updates.
  23. Bitemporal?

    Erwin Vervaet gave a presentation at The Spring Experience 2007 called 'Temporal Issues in a Rich Domain Model'. How does your solution compare with his? See:
    https://svn.ervacon.com/public/projects/bitemporal/trunk/doc/Temporal%20Issues%20in%20a%20Rich%20Domain%20Model%20-%20TSE%202007.pdf
    https://svn.ervacon.com/public/projects/bitemporal/trunk/readme.txt
  24. Re: Bitemporal?

    Very interesting, I'll have to read it carefully later. The example there is even the same as mine; though I haven't seen this before :) Thanks! Adam