Having read Mr Dea's RAX pattern here, I was motivated to record this pattern as an approach to resolving similar problems.
This pattern evolved over several iterations. I was working on a thick-client application that allowed both offline and online work using large graphs of DTOs. Pessimistic locking was used to prevent multiple updates (in particular for offline work).
The original authors of the system, under time pressure, had decided that the simplest approach to persistence was to delete all the 'old' records from the database and then recreate the data afresh from the graph of objects. We were dealing with graphs that touched up to 1000 records, so this approach placed unreasonable demands on the database: deletion and creation are far more expensive than updating (or leaving records alone). It also left no way to migrate to an optimistic locking approach.
To address this, the first-pass resolution was something not dissimilar to Mr Dea's RAX pattern: the Session Bean walked the graph of objects, comparing it with the database to determine what needed updating, deleting or creating. While this reduced the load on the database, there was still a lot of work done in fetching the original graph of objects and comparing it with the altered one.
The final solution was to make the DTOs slightly more intelligent carriers of data: they carried and maintained information about the state of the data they contained.
When a DTO was created and populated on the client, its state was 'NEW'; a DTO created and populated on the server from the persistence store had state 'UNCHANGED'. When a DTO that had been passed to the client was altered, its state became 'ALTERED'.
When the client decided to delete a DTO, if its state was 'NEW' it was simply discarded; otherwise its state became 'DELETED'.
From this you can see that a very simple state chart can be constructed for the typical lifecycle of a DTO.
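That state chart can be sketched as a small enum. This is a hypothetical illustration, not the original code; the name `DtoState` and the transition methods are my own:

```java
// Hypothetical sketch of the DTO lifecycle state chart described above.
public enum DtoState {
    NEW, UNCHANGED, ALTERED, DELETED;

    /** State after any setXXX() call: UNCHANGED becomes ALTERED; all others stay put. */
    public DtoState onModify() {
        return this == UNCHANGED ? ALTERED : this;
    }

    /** State after a delete request. A NEW DTO is simply discarded by the caller
        rather than transitioned, as described in the text. */
    public DtoState onDelete() {
        return DELETED;
    }
}
```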
When the graph of objects is persisted, it then becomes a very simple process to iterate over the graph and determine whether to create, update, delete or leave alone the record associated with the DTO in question.
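The persistence walk then reduces to a dispatch on each node's state. A minimal, self-contained sketch (the class and method names are hypothetical, and the real code would call a DAO rather than return a label):

```java
// Hypothetical sketch: deciding the persistence action for one DTO in the graph.
// The four states are those described in the text.
public class PersistenceWalk {
    static String actionFor(String state) {
        switch (state) {
            case "NEW":       return "INSERT"; // create the record
            case "ALTERED":   return "UPDATE"; // update the existing record
            case "DELETED":   return "DELETE"; // remove the record
            case "UNCHANGED": return "NONE";   // leave the record alone
            default: throw new IllegalArgumentException("Unknown state: " + state);
        }
    }
}
```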
The first pass of implementing this solution involved hand-coding the DTOs to maintain the state, so that when a setXXX() method was called the state would change appropriately (staying the same if 'NEW', 'ALTERED' or 'DELETED', changing to 'ALTERED' if the state was 'UNCHANGED').
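A hand-coded DTO along those lines might look like the following. This is my own reconstruction of the idea, with hypothetical names (`CustomerDto`, `markAltered`, `loadedFromStore`):

```java
// Hypothetical hand-coded DTO: each setter updates the state as described above.
public class CustomerDto implements java.io.Serializable {
    public static final String NEW = "NEW", UNCHANGED = "UNCHANGED",
                               ALTERED = "ALTERED", DELETED = "DELETED";

    private String state = NEW;  // client-side construction starts as NEW
    private String name;

    public void setName(String name) {
        this.name = name;
        markAltered();
    }

    /** Only UNCHANGED moves to ALTERED; NEW, ALTERED and DELETED stay as-is. */
    private void markAltered() {
        if (UNCHANGED.equals(state)) state = ALTERED;
    }

    public String getState() { return state; }

    /** Called by the server-side code that populates the DTO from the store. */
    void loadedFromStore() { state = UNCHANGED; }
}
```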
Since this solution was first implemented, code generation techniques have come into general use, so it has become trivial to alter the DTO template to generate code to support this approach.
Another approach that I have thought about, but have yet to implement, is the use of AOP to maintain the state, keeping the DTOs as 'vanilla' as possible and possibly allowing this approach to be retrofitted to an existing application.
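Since the AOP variant is unimplemented, here is one rough way the same effect could be sketched without an AOP framework: a JDK dynamic proxy that intercepts setter calls on an interface-based DTO and keeps the state outside the DTO itself. All names here (`StateTrackingProxy`, `OrderDto`) are hypothetical:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical sketch: the DTO stays 'vanilla'; state lives in the handler.
public class StateTrackingProxy implements InvocationHandler {

    public interface OrderDto {
        void setQuantity(int q);
        int getQuantity();
    }

    /** A plain DTO with no state-tracking code at all. */
    public static class PlainOrderDto implements OrderDto {
        private int quantity;
        public void setQuantity(int q) { quantity = q; }
        public int getQuantity() { return quantity; }
    }

    private final Object target;
    private String state = "UNCHANGED";  // assume the DTO was loaded from the store

    public StateTrackingProxy(Object target) { this.target = target; }

    public String getState() { return state; }

    @Override
    public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
        // Any setter call flips UNCHANGED to ALTERED, mirroring the setXXX() rule.
        if (m.getName().startsWith("set") && "UNCHANGED".equals(state)) {
            state = "ALTERED";
        }
        return m.invoke(target, args);
    }

    public static OrderDto wrap(StateTrackingProxy handler) {
        return (OrderDto) Proxy.newProxyInstance(
                OrderDto.class.getClassLoader(),
                new Class<?>[] { OrderDto.class }, handler);
    }
}
```

A real AOP solution (e.g. AspectJ weaving) would avoid the interface requirement, but the interception idea is the same.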
This technique has little or no use in thin-client applications, but since they generally do not work this way with large graphs of objects, that is no real loss.
If this pattern has been posted before I apologise, but I do not recall having seen it recorded.
We've used the same type of pattern for a number of years now and it is really useful. The only problems we've faced are granularity and the paths of the DTO graph.
The state data is only maintained at the level of the DTO itself. So you can determine whether a DTO node was modified, but not what within the DTO node was modified. For evaluations like 'attribute X value@pre <> attribute X value@post' we still need to retrieve the original data from the persistence store.
Another problem is the use of 'referential nodes' in the DTO graph, where the same DTO node may exist under a number of other nodes. As an example, you might have a number of DTO nodes representing orderlines, and under each of these a DTO node that contains general product data. When creating a new orderline you could reuse the general product data from another node. This means that you need to maintain the state of the path between the DTO nodes.
Thanks for your response, Renaat.
I agree entirely: the pattern as written here does not address all the concerns that you raised.
In part it was because I wanted to get to the essence of the pattern without confusing the reader. I managed to completely flummox the readers of the Access Transfer Object Pattern and still have to go back and explain myself better.
What you are addressing tends to come out of the specialised requirements of a particular business. It sounds like the environment you are working in requires finer granularity than the initial statement of the pattern specifies. Field-level meta-state, in my experience, is required only in a subset of applications. I have used it, but only when the added complexity is justified. I agree that it is a useful extension, but only when required.
I am interested in what you have to say about needing to retain path state. When the 'relationship' between objects requires state, I tend to model the 'stateful relationship' as a DTO itself; in an RDBMS my schema will usually have a relationship table, so the pattern as stated, when applied to the relationship DTO, addresses your concern.
This is essentially applying the "Dirty Marker" pattern mentioned in J2EE Core Design Patterns in conjunction with Composite Entities. This interface can be applied to each DTO in the graph of DTOs indicating the various states new, deleted, dirty, etc. You can track the state from calls to setter methods, etc as indicated but we used a different approach in a recent project.
Instead of triggering the status change from calls to setter methods, an object-hashing algorithm was applied when a call to isDirty() was made. The object was serialized and then "hashed" using an MD5 calculation, which was compared with a prior baseline. If the values differed, the object had changed. You have to be sure to mark member variables as transient if they are not really part of the DTO's attributes.
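A minimal sketch of that serialise-and-digest approach, assuming the DTO is `Serializable`. The class names here (`HashDirtyChecker`, `ProductDto`) are hypothetical, not from the project described:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.MessageDigest;
import java.util.Arrays;

// Hypothetical sketch of the MD5 dirty-check described above.
public class HashDirtyChecker {

    /** Example DTO; the transient field is excluded from serialisation,
        and therefore from the digest, as the text recommends. */
    static class ProductDto implements Serializable {
        String name;
        transient Object uiCache;  // not part of the DTO's real attributes
        ProductDto(String name) { this.name = name; }
    }

    static byte[] digestOf(Serializable dto) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(dto);  // transient fields are skipped automatically
        }
        return MessageDigest.getInstance("MD5").digest(bytes.toByteArray());
    }

    private final byte[] baseline;

    /** Baseline taken when the DTO leaves the persistence store. */
    HashDirtyChecker(Serializable dto) throws Exception {
        baseline = digestOf(dto);
    }

    boolean isDirty(Serializable dto) throws Exception {
        return !Arrays.equals(baseline, digestOf(dto));
    }
}
```

Note the trade-off discussed below: a digest comparison can, in principle, miss a change if two states collide, and serialisation has a per-call cost.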
I am mildly concerned by your solution of using hashcodes - how can you be sure that the DTO was not changed in such a way that the hashcode remained the same?
There is a common misconception wrt hashcodes that they are unique, but of course, as long as there are fewer possible hashcodes than there are combinations of the values being hashed, they can never be guaranteed to be unique.
I can only assume that you had an application whose fields changed frequently enough that it was worth the cost of a hashing algorithm to find changes, and that in the case where the hashcodes were identical you did enough of a field-by-field check to be sure?
On further thought, I would like to add:
Admittedly, using MD5 message digests the chances of identical digests are pretty small, but I would think that the cost of serialisation and digesting still makes this an undesirable strategy.