Transparent Distributed Lazy Loading and Persistence

J2EE patterns: Transparent Distributed Lazy Loading and Persistence

  1. Transparent Distributed Lazy Loading and Persistence (25 messages)

    This is currently a work in progress of mine, so not actually in use in a real project, but here's the idea:

    Value Objects are "traditionally" used in many J2EE systems to shield the client from the domain model, and to package up calls to reduce network traffic.

    Often one or more Value Objects are written per model class, plus all the extra code for conversion (although much of that can be generated by tools like XDoclet). This increases development time, increases code complexity, pollutes the domain model with framework-specific code and generally makes life more complicated than it should be.

    Also in many applications Entity beans are often overkill and require the creation and maintenance of many extra classes on top of your equivalent POJOs (Plain Old Java Objects).

    My framework/pattern will allow you to do things like the following:

    long myId = 23;
    Person p = (Person) ObjectManager.load(myId);
    Address a = p.getAddress();
    a.setTown("London");
    ObjectManager.save(a);

    The code will work equally well on the client side (say in a servlet engine), as on the server side (in the ejb container), without ANY changes, even though the database is remote from the client side.

    How does this work?

    I "advise" (Aspect Oriented Programming terminology) the POJOs on the client side so that calls to any of the setX(...) or getX() methods are intercepted by my framework.

    For instance, when I call p.getAddress() from some code in my web action, the call is transparently intercepted. The framework code then checks whether the "address" field of the Person object has already been loaded: if so, it simply returns the value; if not, it makes a remote call to a stateless session EJB, which invokes the equivalent "load" method on the O/R mapping tool. The loaded object is returned to the servlet engine, where the attribute is set on the Person object and returned to the caller.
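
    As an illustration only, the interception can be sketched with a plain JDK dynamic proxy. Note the actual framework advises classes directly via byte code, so no interface would be required; all names here, and the flattening of the field value to a String, are invented for the sketch:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: intercept getX() calls and "fault in" fields
// that have not been loaded yet, caching them for later calls.
public class LazyLoadSketch {

    public interface Person {
        String getAddress();
    }

    // Stands in for the remote call to the stateless session EJB,
    // which would delegate to the O/R mapping tool on the server.
    static String remoteLoad(String field) {
        return "London";
    }

    public static Person advise() {
        final Map loaded = new HashMap(); // fields already faulted in
        return (Person) Proxy.newProxyInstance(
            Person.class.getClassLoader(),
            new Class[] { Person.class },
            new InvocationHandler() {
                public Object invoke(Object proxy, Method m, Object[] args) {
                    // derive the field name from the getter name
                    String field = m.getName().substring(3).toLowerCase();
                    // only the FIRST access triggers the remote call
                    if (!loaded.containsKey(field)) {
                        loaded.put(field, remoteLoad(field));
                    }
                    return loaded.get(field);
                }
            });
    }
}
```

    On the server side, the same getAddress() call would instead be dispatched straight to the O/R mapper, with no remote hop.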

    The entire process is entirely transparent to the developer writing the code, and enables the same POJOs to be used on both the client and server side thus removing the need for value objects. (You could of course still use them if you like for session bean methods that return rowset style tables of data - although there is no particular advantage in doing so).

    If the same code is executed on the server side, from inside, say, a session bean, then the calls are deferred to the O/R mapping tool.

    Either way, it is completely transparent to the application developer.

    Clearly, when the code is run on the client side, if the framework is making a remote call for every getX(...) invocation, then this could be a real performance killer.

    Consequently, the data can be preloaded by the framework on the initial calls. For instance, if I know I shall be traversing the object graph to access 5 different objects, the framework can be configured to return all objects in a single remote call.

    These "fault groups" can be specified either in code or in configuration files, for complete transparency.
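
    A hypothetical config entry for such a fault group might look like the following. The element names and format here are invented purely for illustration; the real framework's format is unspecified:

```xml
<!-- Hypothetical fault-group definition: when a Person is loaded for
     this use case, fetch these related objects in the same remote call. -->
<fault-group name="editCustomer" class="com.example.Person">
  <preload field="address"/>
  <preload field="orders"/>
  <preload field="orders.items"/>
</fault-group>
```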

    An alternative way of pre-loading data is possible by the framework "learning" the application behaviour.

    In "learning mode", the application initially runs with no preloading and keeps track of how much data is typically loaded in each transaction/use case. The next time that use case runs, the framework uses its recorded figures from the previous run to determine how much data to preload.
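
    A minimal sketch of what the learning-mode bookkeeping might look like. All names are hypothetical, and a real implementation would likely record data volumes per transaction rather than just field names:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "learning mode": record which fields each
// use case touched, then preload that set on the next run.
public class AccessRecorder {
    private final Map history = new HashMap(); // use case -> Set of field names

    // Called by the interception layer whenever a field is faulted in.
    public void recordAccess(String useCase, String field) {
        Set fields = (Set) history.get(useCase);
        if (fields == null) {
            fields = new HashSet();
            history.put(useCase, fields);
        }
        fields.add(field);
    }

    // Which fields to preload the next time this use case runs.
    public Set preloadSet(String useCase) {
        Set fields = (Set) history.get(useCase);
        return fields == null ? new HashSet() : fields;
    }
}
```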

    In other words the system self-optimises for performance.

    The inspiration for this has come mainly from experience of the frustrations of managing the complexity of large J2EE projects using the "classic pattern": Struts + Value Object + Session facade + Entity bean/DAO.

    Other sources of inspiration have been Rickard Oberg's thoughts on lazy loading and various AOP works in progress, eg Nanning aspects and JBoss AOP - although I differ from Rickard's approach since I do not require an Interface/Implementation separation which I think is messy.

    Note that the framework is not reliant on any AOP tool such as aspectj/jboss/nanning, and the idea is that it should work on any app server/servlet engine.

    As I say, this is by no means complete yet and has not been tried on a real project, but I hope to open source the project when it gets closer to that stage.

    Threaded Messages (25)

  2. Tim,

    This sounds like it might be a nice idea; I'll have to see something more concrete to form an opinion. However, I think that this is most definitely not a design pattern. It's a particular software framework. Like all software frameworks, it can be used in a variety of designs. But that doesn't make it a design pattern.

    I don't mean to pester about minor details, but I really do pay more attention (as does most of the community, I think) to design pattern posts because they are generally more useful to me. Interesting projects pop up all the time; you can never keep track of them all. Real design patterns rarely come up, and I'm much more interested in knowing about them.

    Gal
  3. I agree. The Patterns part of TheServerside has become just a place where one can express and discuss some ideas. I think there should be strict requirements for posts in this area, at least a requirement of following the GoF pattern definition. Otherwise it's an ordinary online discussion forum, one of zillions...
  4. Not a pattern discussion???

    I'm just mentioning several alternative implementations of the pattern you mentioned. Although TOPLink (and JDO) do many things, one of the things they do is distributed lazy loading. Hopefully, discussion of the practical implementation of patterns also falls into the general domain of discussion of patterns.

    If not, I don't see the point of discussing patterns at all. After all, who actually believes that architecture is just combining a bunch of patterns together? Good practice drives patterns, not the other way around! The GoF book in the hands of an inept architect lends undue credibility, does it not?

    I'm surprised that the second half of my post on REST got no comment. Please read up on REST and then think about implementing distributed lazy loading through REST and XLink. I'm just not going to give it away for free at this point by spelling it out in GoF detail. Suffice it to say I am in the process of implementing it.
  5. Hi Tim:

    This is definitely something worth testing. I think it is pretty similar to what Rickard is doing (or has done). However, what makes you think that interface/implementation is messy? AFAIK, this is really clean for a Java implementation. Unless you define something like a 'dynamic interface' as in Ruby, which is another approach I am pursuing. Much like an event-driven AOP.
    <Tieying>
     However, what makes you think that interface/implementation is messy? AFAIK, this is really clean for java implementation.
    </Tieying>

    Hi Tieying - maybe messy is the wrong word.
    AFAIK - Nanning/Rickard aspects require you to maintain a separate interface/impl hierarchy - ie more lines of code to maintain.
    Also making it hard to retrofit to existing object models where you may not want to/be able to change the code.
    Also they only allow advising of public methods.
    I guess there's nothing wrong with Nanning/Rickard aspects - they're a great idea, just IMO not so flexible as they could be.
  7. Value Objects

    I don't see how your scheme alleviates the need to make value objects. You still need them, in some form. In your instance, you have Person and Address. In the "classic" way of doing it, those would be value objects, and both server and client would use them.

    I also think that the notion of preloading or learning what is needed is orthogonal to the persistence/access mechanism. There's nothing stopping you from doing that using Session Facade, Value Objects, and Entity Beans (or DAOS, or any other O/R mapping mechanism).

    Additionally, I think it is potentially dangerous to have code that has non-obvious side effects. In your example, an invocation of a get method on a plain java object can actually initiate remote access and database access. This is not terribly obvious from the code. Furthermore, if it _WAS_ obvious to someone, you still cannot tell when that code would return a value and when it would not. Now, I realize that that is the main point, but a system like this would be significantly more difficult to debug and tune, in my opinion, and since most systems spend most of their life being debugged and maintained, this might not be desirable.
  8. Value Objects

    <dave c>
    I don't see how your scheme alleviates the need to make value objects
    </dave c>
    What I mean is you don't have to maintain separate hierarchies of "entity" and "value object", whether you call them "entity" or "value object" is academic.

    <dave c>
    I also think that the notion of preloading or learning what is needed is orthogonal to the persistence/access mechanism. There's nothing stopping you from doing that using Session Facade, Value Objects, and Entity Beans </dave c>
    True, and there's nothing stopping us writing complex applications in assembler - we don't because it's more complex. The idea here is to reduce complexity. Of course the same result could be obtained in many ways.

    <dave c>
    Additionally, I think it is potentially dangerous to have code that has non-obvious side effects
    </dave c>

    Dave - I think this could be seen as a criticism of any AOP tool/framework, nothing specific to this technique.
    Remember though, that not every POJO is affected, only those for which there are relevant aspects.

    <dave c>
    but a system like this would be significantly more difficult to debug
    </dave c>
    Again, this is a valid criticism of any current AOP tool/framework, nothing specific to this. Remember AOP is still in its infancy, so nothing's perfect yet.
  9. What about...

    This seems too simplistic to work in a real application.

    A framework cannot cache data from a database unless it knows that it is the only process touching that database, otherwise some other process can update values that are already cached in memory. This might work OK if the client is guaranteed to run only once in one single JVM, but I don't see how you can support multiple clients without a complicated synchronization mechanism, which is one of the things entity EJBs will buy you.

    I'm also not seeing how you demarcate transactions? Most applications would want to combine several updates, potentially on different objects, into a transaction, so you need a way to begin, commit, and roll back.

    I've been down this road a lot of times, and there's no easy way to avoid database access in a distributed environment. A database is designed to be the authoritative source of your data. You can put something in front of it, but whatever that something is, is going to have most of the same problems as the database (i.e. it'll be hard to distribute).
  10. What about...

    Frank-

    I think you're overcomplicating things. This is not a distributed cache I am talking about. I agree that would be complicated.

    All caching, transactions, etc. on the server side are handled by the J2EE container + O/R mapper (no caching is done in the framework).

    The framework does not cache data any more than a value object "caches" data on the client side.

    The transaction and caching issues are exactly the same as you would have if you were using value objects.

    Updates from the client to the server are batched up and executed in one server transaction.

    Simple optimistic locking checks are applied for concurrent updates.
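
    A minimal sketch of such an optimistic locking check, assuming a simple per-object version number (all names hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the optimistic locking check applied when a
// batched client update arrives at the session facade: the client
// carries the version it read; the update fails if the row has moved on.
public class OptimisticStore {
    private final Map versions = new HashMap(); // object id -> Long version

    public void put(Long id, long version) {
        versions.put(id, Long.valueOf(version));
    }

    // Returns true (and bumps the version) only if the client read the
    // current version; otherwise the update is rejected as stale.
    public boolean update(Long id, long versionRead) {
        Long current = (Long) versions.get(id);
        if (current == null || current.longValue() != versionRead) {
            return false; // concurrent update detected
        }
        versions.put(id, Long.valueOf(versionRead + 1));
        return true;
    }
}
```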

    So basically it follows the same pattern as your classic value object + session facade + domain object pattern.

    As I say, this is not a distributed caching framework!! If you want that there are several products on the market already.

    This is simply an AOP framework - one application of which is to remove (or massively reduce) the need for value objects in the classic servlet engine + value objects + session facade + entity bean/dao pattern.

    So your comment
    <quote>
    This seems too simplistic to work in a real application.
    </quote>
    would apply to anyone that has used value objects in a J2EE application - try telling that to all the J2EE architects!
  11. I have been using TOPLink

    http://www.oracle.com/ip/deploy/ias/gs/index.html?oracle9iastoplink.html

    for years and it does something like this. Also, I have implemented distributed lazy loading using stateless session bean handles and the exact pattern you described.

    It seems to me that using REST Web Services

    http://internet.conveyor.com/RESTwiki/moin.cgi/RestResources

    along with XLink for resolving links in XML

    http://www.w3.org/XML/Linking

    could provide a distributed persistence interface (similar to JDO) which is scalable and language (and platform) independent.

    What seems to be missing right now is an XML parser which will transparently resolve XLinks, and an open source programming framework for mapping resources (as defined by REST) to a relational or other database.
  12. Hi Matt-
    <quote>
    Also, I have implemented distributed lazy loading using stateless session bean handles and the exact pattern you described.
    </quote>
    How are you instrumenting the classes to accomplish this? (XDoclet, own code generation, own AOP tool or other?) I'm assuming they're POJOs? Or maybe you're just using Java2 dynamic proxies a la the Nanning approach?
    I'd be interested to know.
  13. Stateless session bean handles

    Yes, my own code. I just used a DAO pattern, then hid the call to the stateless session bean handle inside my bean. This does hide a "side effect" of calling the stateless session bean but isn't that the purpose of OO? ;) Just document the side effect in the JavaDoc...

    The really cool thing about using stateless session bean handles is that they are serializable and don't ever expire. Of course, an issue occurs when a list you are getting back gets too large or needs to be sorted differently, etc. Then I just use DAO. This pattern could even be an extension of DAO... a way to handle lazy loading with DAO.
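
    The lazy accessor described above might be sketched like this in plain Java, with the session bean handle call replaced by a stub (all names hypothetical):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a DAO whose relationship accessor hides the
// remote fetch (a stub here, standing in for the call made through the
// stateless session bean handle) and caches the result.
public class OrderDao {
    private List orders; // null until first access

    // Stand-in for invoking the stateless session bean via its handle.
    protected List fetchRemotely() {
        return Arrays.asList(new Object[] { "order-1", "order-2" });
    }

    // Documented side effect: the first call goes remote, later calls
    // return the cached list.
    public List getOrders() {
        if (orders == null) {
            orders = fetchRemotely();
        }
        return orders;
    }
}
```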
  14. http://cglib.sf.net was designed for exactly this kind of use case.
     Proxy, Decorator and Interceptor design pattern implementations use it.
    cglib generates byte code to extend classes and implement interfaces, and uses callbacks to intercept methods.
    I like frameworks based on dynamic code generation; I use this approach to solve most of the problems in my projects. http://voruta.sf.net is one such use of dynamic code generation.
    I think it is a good idea to make this design pattern implementation public. Open source is a very good way to innovate and experiment.
  15. The framework described here, and much more, is available in Toplink. Deferred loading, caching (several caching mechanisms), optimistic locking, all kinds of fancy mappings, etc., have been available in Toplink for years. One of the highlights of the product is perhaps the Unit of Work concept, where one can have transactions purely at the object level. Another aspect I like is that Toplink is pretty much non-intrusive on your domain objects.

    I know of quite a few large scale deployments around the world using Toplink. Toplink is now owned by Oracle.

    BTW, I don't see how you really eliminate the need for Value Objects. You may not call them that, but you still have them in some form.

    Jerry
    Not an Oracle Employee.
  16. Sigh... You're right.

    I shouldn't have posted this here, it's not a pattern.
    Not sure why everyone thinks it's trying to be an O/R tool or a distributed cache. It's not Toplink, nor is it supposed to be.

    If you think that, you're definitely missing the point.

    <quote>
    BTW, I don't see how you really eliminate the need for Value Objects. You may not call them that, but you still have them in some form
    </quote>

    Really?? Where?? If you think that, you're DEFINITELY missing the point. That's the crux of the matter.
  17. it certainly works

    I also use this kind of framework to access databases. It eases development very much. Persistence is not an issue any more; I just write the Java object and create a table with the same field names. It works nicely on the server, below session EJBs demarcating transactions. So it's faster & lots easier than local entity beans hidden behind a session bean. It's a kind of JDO. The cache problem (POJOs are not notified of updates coming from below) shouldn't be handled at this level. You can just use this as a high-level Java-oriented API to your specific databases.
    I don't think you can make this work on a client, batching updates etc.; that would require an entire framework of its own.
  18. How to distinguish client/server?

    I generally understand the immense value of the Interceptor/Proxy pattern and how it works. ("Generally" because I haven't done it myself, I *think* my knowledge is clear.) What I don't understand is how objects passed from the server to the client will be advised correctly.

    For instance, on the server we have an advice on the Customer object for the method 'getOrders()' so that it instantiates a Local SessionBean, fetches the Order objects related to the Customer, and puts them in a private variable on the Customer so they're accessible on future calls. Cool.

    The client also has a method 'getOrders()' except it connects to a Remote SessionBean on the EJB server which fetches the related Order objects and returns them in a Collection. Still ok.

    Say the Order objects created on the server used an ObjectFactory that associated each POJO created from the data in the database with a Proxy that handled relationships in much the same manner as above (instantiate Local SessionBean, call method on it, etc.)

    But now we're passing these Order objects (which are actually proxies) to the client. What happens now? We've attached additional meaning to these seemingly harmless POJOs: these proxies are meant to run in the server, not the client. And when we call one of the proxied relationship methods (say, 'getOrderItems()') the proxy will try to instantiate a Local SessionBean and blow up because we're not running in the server process.

    What am I missing?

    Thanks!
  19. How to distinguish client/server?

    > But now we're passing these Order objects (which are actually proxies) to the client. What happens now? We've attached additional meaning to these seemingly harmless POJOs: these proxies are meant to run in the server, not the client. And when we call one of the proxied relationship methods (say, 'getOrderItems()') the proxy will try to instantiate a Local SessionBean and blow up because we're not running in the server process.
    >
    > What am I missing?

    Ok. When passing objects by serialisation, only the DATA of the object is passed. The actual classes on the client and server are loaded by different classloaders which advise the class in different ways. When you pass an object by serialization, the class itself is not passed. Hence any advice that you added on the server would never appear on the client.
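
    This is easy to demonstrate with a plain serialisation round trip: only the field data crosses the wire, and the receiving side instantiates whatever class its own classloader provides (names here invented for the sketch):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch: a serialisation round trip copies only the object's state.
// The "receiving side" loads its own (possibly differently advised)
// class via its own classloader; no class bytes are transferred.
public class SerializationDemo {
    public static class Address implements Serializable {
        private String town;
        public Address(String town) { this.town = town; }
        public String getTown() { return town; }
    }

    public static Address roundTrip(Address a) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(a);  // only the field data is written
            out.close();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return (Address) in.readObject(); // class comes from the local classloader
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```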
  20. Proxy/advice mismatch

    I understand that only data is passed. My mixup was as a result of confusing dynamic proxies and advice weaved in at the class level. Thanks for the clarification. Looks like it's time to play with nanning :-)
  21. I see this as useful for small applications for which using EJBs doesn't make much sense anyway. For enterprise scale I see some drawbacks.

    One would be that it encourages implementing business logic in the front end, since it makes the domain (entities) visible, and this does not promote business reusability among applications (unless it is reused at the presentation level).

    Another disadvantage is high network traffic when realizing a use case that involves modifying several entities in one transaction (for reading them, caching solves the problem). Moreover, if the domain attaches policies (business rules) to the entities, this logic has to be transported along with the entities, which, except for the actual persistence, basically duplicates the domain in the front end.

    Also, this approach moves the same objects across 3 layers (integration, business and presentation) and creates dependencies among them – which is bad software architecture.

    Not all applications require the same level of detail, so moving, for instance, the whole user data when the application only needs a first and last name to say "Hello" is also resource consuming.
  22. Hi Razvan - thanks for your comments.


    > One would be that it encourages implementing business logic in the front end, since it makes the domain (entities) visible, and this does not promote business reusability among applications (unless it is reused at the presentation level).
    >

    It doesn't implement business logic in the front end, or even the back end for that matter. The key point here is that "location" of the business logic is completely transparent. It lives everywhere and nowhere.
    AFAIK business reusability is not compromised at all, since the domain logic is not locked into ANY layer, be it the front, middle or back.
     
    > Another disadvantage is high network traffic when realizing a use case that involves modifying several entities in one transaction (for reading them caching solves the problem).

    The updates can be packaged by the framework into a single network call sending only the required data thus optimising network traffic.
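
    A minimal sketch of such batching, with the network call replaced by an in-memory stub (all names hypothetical):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: intercepted setX() calls on the client record
// dirty fields locally, and flush() ships only the changed data to the
// server in a single batched call.
public class UpdateBatcher {
    private final Map dirty = new HashMap();   // field -> new value
    private final List sent = new ArrayList(); // stands in for the network

    // Called by the interception layer on every setX(); no network yet.
    public void fieldChanged(String field, Object value) {
        dirty.put(field, value);
    }

    // One remote call carrying only the changed fields; returns the
    // number of fields shipped.
    public int flush() {
        sent.add(new HashMap(dirty));
        int count = dirty.size();
        dirty.clear();
        return count;
    }
}
```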


    >More, if the domain attaches policies (business rules) to the entities this logic has to be transported along with the entities which, except for the actual persistence, will basically duplicate the domain in the front end.
    >

    See first answer
     
    > Also, this approach moves the same objects across 3 layers (integration, business and presentation) and creates dependencies among them – which is bad software architecture.

    Not sure if I understand your point. Can you elucidate?

    >
    > Not all applications require the same level of detail, so moving, for instance, the whole user data when the application only needs a first and last name to say "Hello" is also resource consuming.

    As previously explained the whole user data is NOT moved when just a few fields are required.
  23. > I see this as useful for small applications for which using EJBs doesn't make much sense anyway. For enterprise scale I see some drawbacks.

    >
    > One would be that it encourages implementing business logic in the front end, since it makes the domain (entities) visible, and this does not promote business reusability among applications (unless it is reused at the presentation level).
    >
     
    > Also, this approach moves the same objects across 3 layers (integration, business and presentation) and creates dependencies among them – which is bad software architecture.


    This is a common confusion based on people not working on OO fundamentals as we rush forward with J2EE and other such "Enterprise" architectures. Your domain logic should be in domain objects. These objects are then shared across all layers, and that is NOT bad software architecture. It is in fact the only way to get once-and-only-once, which is good software design. Your domain objects should not live in a tier but should be usable across all tiers.
    This doesn't create dependencies among the tiers - only dependencies of all tiers on your domain - which is as it should be. How can you have an order entry screen that doesn't know what an order is? (This of course doesn't prevent you from having presentation layer pieces that know nothing about business objects, but the areas that interface with the domain might as well do it with the official domain objects as with any value object surrogate.)

    The reason the J2EE picture is drawn the way it is is that people thought the predominant web design would be applets making RMI/CORBA calls. Once everything on the presentation tier moved to the server side, a lot of what comes out of J2EE as "best practices" is really nonsense.
  24. You really nailed it !

    I think many of the so called design patterns described in heavy J2EE publications (i. e. Core J2EE Patterns) simply try to patch the bugs in the specification.

    In this book there are five main patterns in the persistence territory: Value Object, Composite Entity, Value Object Assembler, Value List Handler and Data Access Object. These patterns are the result of the misconception of Entity Beans, which IMHO caused big damage to Java's reputation. Patterns that force me to duplicate (or triplicate...) my domain object hierarchy are implicit anti-patterns because they violate the (also IMHO) most important principle in software engineering: don't repeat yourself (DRY, named by the great Pragmatic Programmers). I have seen a lot of messy Java code created by those "everything-that-concerns-our-domain-objects-has-to-happen-on-the-server" purists; loads of classes implemented for a single purpose: to get those damn domain objects to the client (I'm talking about rich clients, BTW).

    I'm actually working on something I call late materialization, which means keeping the objects in their raw DB-oriented representation as long as possible. I use a combination of a popular O/R mapper on the client and a self-developed, mapper-agnostic JDBC-Tunnel. The latter "tunnels" the JDBC calls of the O/R mapper to a server component in an efficient manner (RMI, EJB, HTTP(S)... the transport protocol really doesn't matter). ResultSets are not returned in one big load but split into RowPackets that are "streamed" to the client. I think this "pattern" really works for small to medium sized applications where the usage of database resources is not as critical as in applications with many concurrent users and transactions (like eBay).
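
    The RowPacket splitting can be sketched as follows (names hypothetical; a real implementation would stream packets lazily from an open ResultSet rather than splitting an in-memory list):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the RowPacket idea: instead of returning one
// big result, split the rows into fixed-size packets for streaming.
public class RowPackets {
    public static List split(List rows, int packetSize) {
        List packets = new ArrayList();
        for (int i = 0; i < rows.size(); i += packetSize) {
            // each packet covers rows [i, i + packetSize)
            packets.add(rows.subList(i, Math.min(i + packetSize, rows.size())));
        }
        return packets;
    }
}
```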
  25. Or you could use a Virtual Proxy as explained in http://www.anands.net/articles/lazyLoad.html to load the data. This can get the relevant attributes in one go.
  26. Nice idea.
    Only trouble is that the Java Dynamic proxy only works with interfaces so you'd have to provide interface and implementation pairs for each object, which could be overkill.
    Advantage of byte code advising is that it can do the same without requiring an interface.