In the latest Hard Core Tech Talk, we interview Craig Russell, Specification Lead for the controversial Java Data Objects spec. In the interview, Craig talks about JDO, how it can be used with EJBs, the similarities and differences it has with entity beans, how it can be used to create a distributed shared object cache across a cluster, and other technical issues.
Watch Craig Russell's interview here
One thing I really, really dislike about streaming video is that there appear to be no functional standards as far as usability is concerned.
What I mean is that every time I try to watch a streamed video, I seem to have to troubleshoot the software first just to get it working. So I usually don't bother. This time everything *seemed* to work correctly and I got a fascinating video of Mr. Russell moving his lips in complete silence. Fascinating, but not illuminating, I must confess.
Well, I'm disappointed!
This is for two reasons:
- My company's firewall is configured to block streaming media (for good reasons, as I know).
- I tried at home, but with a 56k modem, and not being a native English speaker, I had no chance of getting the information out of the video (I always had to wait for several seconds and then heard about half a second of what Craig said...)
So, are there any good reasons not to publish the interview in plain text? After all, I don't want to listen to Craig because of his voice, but because of what he has to say.
Why cannot TSS do its readers a service by posting text transcripts of Hard Core Tech Talks? It would seem to be such an easy and trivial task...
Here are the first 11 points... Anyone willing to write the missing ones? Excuse the typos or mistakes, I'm not a native English speaker.
1- So Craig, what do you do at Sun?
My position at Sun is architect, and I work on the transparent persistence component of the Forte for Java product, which is an integrated development environment; the transparent persistence component does the object-to-relational mapping. My other major activity here is specification lead for Java Data Objects.
2- What is it like being a JSR Spec lead?
It's really interesting. This is like a dream team from all around the world, people who have been working on databases and objects for years. We probably have 150 person-years of experience on the team. We have folks from Russia, from Germany, England, France and Italy. They're all part of the team. They're all really excited and really contributing a lot to the development of the specification.
3- What is JDO?
Java Data Objects is a new specification that deals with the storage and retrieval of Java objects in transactional data stores, transactional meaning they could be big databases on mainframes, they could be small local databases on pocket computers. All of them share the same characteristics: you want to take data that is stored persistently and present that information as Java objects.
4- What is the advantage of writing JDO application programs compared to JDBC, EJB?
The main thing is simplicity. What you deal with in JDO is your object model and the data that you put into it. How it gets stored persistently is basically somebody else's problem: the JDO vendor's, as we call them. So if you have a JDO vendor who provides an implementation for your platform, you can worry about the Java object model and let the vendor worry about how to actually make it persistent.
5- What kind of databases will JDO support?
We've got implementations right now, preview implementations, that support mainframe-type databases like DB2, Oracle and SQL Server; there's a mapping component in there. The reference implementation has a total footprint of about 300 KBytes, so it's suitable for smaller applications where you want to deal directly with their file storage. There's also the anticipated ability to cut the API down a bit and put it into a telephone or a pocket PDA, one of those small embedded devices. The API will be the same, maybe as a subset.
6- How does JDO help speed application development?
It has to do with removing the impedance mismatch and the dual coding that you normally have to worry about. Normally a database-style persistent application has two components: one is you write your business logic in Java, and then you have to worry about how do I persist this, what do I have to do to map the Java into the data storage. Now I don't have to go through the mechanics of translating the Java field by field, relationship by relationship, figure out whether there are foreign keys, figure out how to map the foreign keys. All those things are essentially delegated to somebody else's job, usually somebody who's got expertise in the mapping components. So you have one person, or a very small part of the team, that deals with the mapping. Everybody else on the team sees Java objects directly.
7- How hard is it to learn JDO?
You can be productive with Java Data Objects in less than a day. There's a very small number of APIs required. There's one method, for example, that makes an object persistent: you take a normal object that you've developed, take an instance of it, and make it persistent. That's one API. There's another API that deletes that instance from the data store, and there are three APIs that deal with transaction completion: begin, commit and rollback. And you don't need to deal with statements and executing statements, or any of those nasty parts of dealing with a database, because all that has been encapsulated by the layer the implementation provides.
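To make the small API surface Craig describes concrete, here is a sketch of the calling pattern. The `Transaction` and `PersistenceManager` interfaces below are simplified stand-ins for the corresponding `javax.jdo` types, and the in-memory "data store" is purely illustrative (a real JDO vendor supplies the implementation); the point is that the domain class itself contains nothing persistence-specific.

```java
import java.util.HashMap;
import java.util.Map;

// A plain domain class -- nothing JDO-specific in it.
class Account {
    String owner;
    double balance;
    Account(String owner, double balance) { this.owner = owner; this.balance = balance; }
}

// Simplified stand-ins for the javax.jdo interfaces the interview mentions.
interface Transaction {
    void begin();
    void commit();
    void rollback();
}

interface PersistenceManager {
    void makePersistent(Object obj);    // the single call that makes an instance persistent
    void deletePersistent(Object obj);  // removes the instance from the data store
    Transaction currentTransaction();
}

// Toy in-memory "data store" just to make the sketch runnable.
class InMemoryPM implements PersistenceManager {
    final Map<Object, Object> store = new HashMap<>();
    private final Map<Object, Object> pending = new HashMap<>();

    public void makePersistent(Object obj) { pending.put(obj, obj); }
    public void deletePersistent(Object obj) { pending.remove(obj); store.remove(obj); }
    public Transaction currentTransaction() {
        return new Transaction() {
            public void begin() { pending.clear(); }
            public void commit() { store.putAll(pending); pending.clear(); }
            public void rollback() { pending.clear(); }
        };
    }
}

public class JdoSketch {
    public static void main(String[] args) {
        InMemoryPM pm = new InMemoryPM();
        Transaction tx = pm.currentTransaction();

        tx.begin();
        Account a = new Account("alice", 100.0);
        pm.makePersistent(a);                 // one API call; no SQL statements anywhere
        tx.commit();

        System.out.println(pm.store.size());  // 1
    }
}
```

Note how the application code touches only `makePersistent` and begin/commit/rollback, which is the whole "learnable in a day" claim.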
8- What's the difference between JDO and serialization for storing complex object graphs?
The difference between JDO and serialization when you're storing complex object graphs? First let me describe the similarities. In both of them you identify an object that you want to store, and you can call that a root object or a basic object. It's usually a business object in any event. With serialization you say, I want to store that object, and you give it a stream to store it into. JDO is very similar in that you take that object and you say, I want to store that object and I want to use JDO to store it. In both cases there's a single API, and the entire closure of the object graph gets stored into the database or into the data store. The difference is that in serialization, the entire object graph is stored as one thing, while in JDO, the object graph is stored as a number of distinct things that can individually be retrieved later and operated on under transaction control. So it's a matter of granularity. With serialization you get one big thing, and if you have two different objects that share some components, serialization does not really give you a chance to split them out: you've got to store the entire thing as one blob, if you will.
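The granularity point can be demonstrated with nothing but the JDK: when two roots share a component and are serialized separately, each blob gets its own private copy of the shared object. The `Order`/`Part` classes here are made up for the example; a JDO implementation would instead store `Order` and `Part` as distinct, individually retrievable instances.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

// Two orders sharing a Part instance, to show graph granularity.
class Part implements Serializable {
    String name;
    Part(String name) { this.name = name; }
}

class Order implements Serializable {
    String id;
    List<Part> parts = new ArrayList<>();
    Order(String id) { this.id = id; }
}

public class SerializationGranularity {
    // Serialization: one call stores the entire closure of the graph as one blob.
    static byte[] store(Object root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Part shared = new Part("widget");
        Order o1 = new Order("o1");
        o1.parts.add(shared);
        Order o2 = new Order("o2");
        o2.parts.add(shared);

        // Each order becomes its own blob; the shared Part is duplicated inside both.
        ObjectInputStream in1 = new ObjectInputStream(new ByteArrayInputStream(store(o1)));
        Order copy1 = (Order) in1.readObject();
        ObjectInputStream in2 = new ObjectInputStream(new ByteArrayInputStream(store(o2)));
        Order copy2 = (Order) in2.readObject();

        // The two restored orders no longer share one Part instance.
        System.out.println(copy1.parts.get(0) != copy2.parts.get(0));  // true
    }
}
```

This is exactly the "one big thing" behaviour Craig describes: sharing is preserved only within a single blob, never across separately stored graphs.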
9- Using JDO, how does one make a Java object persistent?
There are several different techniques. One is to take the class file and run it through a post-processor, or enhancer, and that's the technique that is going to ship with the Java Data Objects reference implementation. Another technique would be to take your Java source file and run it through a pre-processor, then do a normal compile, and what results is a class that can be used as a persistence-capable class. A third alternative is to use a framework, a tool, that lets you define your object model and your business objects, and that generates the persistence-capable class directly out of the tool. So you really have a lot of flexibility based on what your tool vendor wants to do for you. In fact I know one tool, the tool that I work on, the transparent persistence mapping: it allows you to take a schema from a relational database and generate the persistence-capable classes directly from that.
10- When would you choose post-compilation of JDO versus the other options?
Often it depends on what your objectives are in the project. If your objective is to take a relational database, or a database from a mainframe environment or something like that, and present the data from that database as Java objects, you'll typically do a mapping from that database and your tool will provide the persistence-capable classes for you directly. On the other hand, if you're developing from a Java model that you've developed yourself, or developed from some kind of non-persistent framework, you typically have the Java classes and source files you've developed, and you can choose whichever technique is more suitable. So you really have a lot of flexibility in terms of what you want to do with your classes. The technique of taking the class file and running it through a post-processor allows you to take pretty much any application domain class and store those instances persistently in the database.
11- What's the difference between a Java Data Object and an entity bean?
Java Data Objects and entity beans are similar in that they both represent specific instances of application-type things in the data source. The difference is that Java Data Objects are designed to work in many tiers of the architecture, whereas entity beans are really focused on the application server, so they don't really run anywhere else. Another big difference is how we decide to use them. With entity beans, there are now six classes or interfaces that are part of the specification, including the home interface, the remote interface, the local interface, the actual bean itself, and then you've got the deployment descriptor, which has to be in sync with all the other components. So it's a fairly heavyweight piece of equipment to deal with. With Java Data Objects, on the other hand, the classes are just classes. There's nothing else, except for one piece of information that you have to give your tool: I want this class to be persistence-capable. That's the only raw metadata that is required. Everything else that you might want to annotate the class with for persistence is optional, and it's usually for performance purposes.
A pure copy of ADO (1997 Microsoft technology)
I think Sun will try to replace this slow layer called JDBC with another one, Access Data Object (yes, I own a 2GHz now!)
Of course, since J is everywhere and means nothing, replace it with A and you obtain Access Data Object.
Not very creative stuff! Anything else?
It's been a while since I've done any ADO, but unless I've forgotten a lot, ADO just provides an OO interface that looks vaguely like JDBC: Connection = Connection, Recordset = ResultSet, Command = Statement, etc.
JDO is a step up from that in that it allows you to take your domain-specific objects and persist them (almost) transparently. This means that you only have to deal with objects relevant to the problem at hand.
Are you sure you know what those MS technologies are? I am, as I had to migrate some Access files recently. Both DAO (this is the 1997 tech) and ADO are merely connection APIs similar to JDBC.
JDO is, as you can read (1 Gig of THX, Yann), a persistence layer that just needs to "be switched on". Perhaps Sun's naming is not very original, but it's much closer to the point than MS ADO (those are not really data objects).
The JDO concept looks very interesting. But I still don't get one thing: does JDO also support/manage object caching (just like entity beans do), besides the persistence scheme?
As a footnote:
ADO is more than just another DAO, etc. ADO is "IDispatch"-able which is important in the MS world. IDispatch types are usable in the scripting (late-binding) realm (as in ASP). It allows for the easy transport of complex types across processes.
And here's the end of it...
12- How can JDO be used from EJB?
The relationship between entity beans and Java Data Objects is that you could use Java Data Objects as an implementation strategy for entity beans. That is, there will be some vendors who use Java Data Objects as the implementation for container-managed persistence, and it's always an option to use Java Data Objects as an implementation for bean-managed persistence, because the definition of bean-managed persistence is that the bean itself takes care of how to store the information in the database. In that perspective, we've got a delegation model where the entity bean delegates to the JDO instance to actually do the work of storing instances in the data store. Another important point is that with Java Data Objects, not every instance needs to have a remote interface, needs to be remoted. So in a typical complex application, it might be an order processing application, the entity bean which represents the order might delegate to the JDO instance, and that instance might do some very complex object interactions with other helper objects that are all defined as part of the JDO specification, and just the results get returned back to the entity bean. So you can have your choice of a very lightweight JDO implementation and the more heavyweight remote interface that you get with entity beans, and with a combination of bean-managed persistence entity beans and JDO, you can get the best of the two worlds. Another way of using JDOs in a container environment is to use JDOs as the implementation for session beans, so you can basically avoid using entity beans entirely if your application is [...] to that kind of implementation; you can hide the implementation of using JDO behind a session bean facade.
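The session-facade idea Craig closes with can be sketched in plain Java. The classes below are hypothetical stand-ins: `PersistenceStub` plays the role of a JDO persistence manager holding plain objects by identity, and `OrderFacade` plays the role of the session bean, exposing one coarse-grained business method while all fine-grained object work happens behind it.

```java
import java.util.HashMap;
import java.util.Map;

// A plain domain object the client never sees directly.
class OrderRecord {
    final String id;
    double total;
    OrderRecord(String id, double total) { this.id = id; this.total = total; }
}

// Stand-in for a JDO-style persistence layer holding objects by identity.
class PersistenceStub {
    private final Map<String, OrderRecord> store = new HashMap<>();
    void makePersistent(OrderRecord o) { store.put(o.id, o); }
    OrderRecord getObjectById(String id) { return store.get(id); }
}

// The "session bean facade": coarse-grained methods hide persistence details.
class OrderFacade {
    private final PersistenceStub pm;
    OrderFacade(PersistenceStub pm) { this.pm = pm; }

    void placeOrder(String id, double total) {
        pm.makePersistent(new OrderRecord(id, total));  // delegate to the persistence layer
    }

    double lookupTotal(String id) {
        return pm.getObjectById(id).total;              // only the result crosses the facade
    }
}

public class FacadeDemo {
    public static void main(String[] args) {
        OrderFacade facade = new OrderFacade(new PersistenceStub());
        facade.placeOrder("o-1", 42.50);
        System.out.println(facade.lookupTotal("o-1"));  // 42.5
    }
}
```

The design point is the one from the interview: the complex object interactions stay local behind the facade, and only results are returned to the (remote) client, so no entity bean is required at all.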
13- How does the process of designing complex datamodels in JDO and entity beans compare?
With JDO, you start with a Java object model and it can be as complex as you like: you can use inheritance freely, and you can use any other modeling construct, sets, collections, lists, arrays, vectors, any of these kinds of relationship modeling concepts. With entity beans, you have a much more constrained model to work with. As far as ease of use is concerned, part of the challenge with entity beans is getting all the components to actually be self-consistent, and it's a ready-made opportunity for a tool to help the user develop these things and keep them in sync. With JDO, as long as the classes compile with the regular compiler, you can run them through the post-processor and generate components that can be stored in the data store, so it's a much simpler process.
14- How can JDO be used to implement pass by value from an EJB remote method?
JDO objects can themselves be used as value objects, so you can pass them by serialization techniques across the remote interface, and assuming that the class has a definition on the receiving side, you can instantiate and use them as value objects directly. The trick is in passing them through that serialization interface of the bean environment, across RMI/CORBA, into the receiving client, and of course one of the biggest challenges there is to figure out how connected you want the objects to be. If you send, for example, an order object, you probably want to send the line items along with it as values, and you may or may not want to include the part objects that correspond to each of the line items, and you may or may not include the inventory control associated with the parts. So the challenge is really figuring out where to cut the graph of connected objects and send them as components or as objects across.
15- When should someone choose JDO over entity beans?
The choice of whether to use Java Data Objects or entity beans used to be a lot simpler when entity beans only had a remote interface: you could very easily say, if your object requires remoteness then use an entity bean, and you can still implement the behaviour of that entity bean by delegation to the data object. Now in 2.0 it's a little bit trickier, because there are local objects, and the local entity beans have part of the characteristics of the JDOs, except for some of the life cycle events that are a little bit more intrusive on the application programming style of the entity bean programmer. But to the same extent that you can use JDOs as implementations behind session bean facades, you can use the local entity beans as implementations, so it really has to do with how complex you want your object graph to be and whether you want to support inheritance directly; it's really more of a complexity issue. Another factor in choosing whether to use JDOs or entity beans is the complexity of the development cycle. With JDO, you basically start with a Java object model, and you can test that collection of classes without any persistence at all. The next stage of testing is that you can take the persistence manager and try storing them to the data store, again without any entity beans or session beans in the picture at all. You're just testing your business logic. The third stage is, once you've got your business logic working the way you want it to, you can introduce your session facade and delegate some of the business methods from the session beans into the already tested code in the JDO. As far as I know, the process of developing entity beans is that you have to run the entity beans in the environment they're going to be deployed in; that is, there is a container that has the appropriate callbacks and that simulates the entity bean, and that's the environment you have to test in. You can't test the entity bean outside of the operational environment.
16- Can JDO be used to automatically create a distributed shared object cache across a cluster?
JDO can be used in the implementation of a distributed cache across a cluster of application servers. The technique that we use in JDO is called the non-transactional read; that is to say, you have a cache of instances managed by a persistence manager, and you have a separate feed that is used to update that cache. The separate feed typically contains the identity of the objects in the cache and the new values for specific fields, and you are basically trying to update the cache to reflect the reality of what the database has, as opposed to the more traditional approach of using the cache to actually update the database. So in this case we're using the cache as a hot repository of information that may be a little bit out of date, but is known to be correct and is known to come from the database; we don't even go back to the database with this information, we're assuming the information is correct when it is put into the cache. Now, what's really interesting is what you can do with the information in the cache. You can use it as a hot way to navigate from one instance to another, for example to go from an order to its line items, and once you have the line items you can go to the parts. Let's say you want to update some part of that graph: you then use a different persistence manager and a different connection, and because you know the identity of the objects in that hot cache, you can get a version of the instance to update and, in a different transaction, make that update and then commit the change to the database. Now, in the fullness of time, the change you committed is going to come around and will be part of the feed that updates the non-transactional data cache.
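The feed-updated cache Craig describes can be reduced to a toy sketch: the cache holds field values keyed by object identity, and a stream of (id, field, new value) updates from the database refreshes it; readers navigate the cache without touching the database. All names here are illustrative, not JDO API.

```java
import java.util.HashMap;
import java.util.Map;

public class FeedCache {
    // id -> (field -> value): a hot, read-mostly view of database state.
    static final Map<String, Map<String, Object>> cache = new HashMap<>();

    // Apply one entry from the separate update feed coming from the database.
    static void applyFeed(String id, String field, Object newValue) {
        cache.computeIfAbsent(id, k -> new HashMap<>()).put(field, newValue);
    }

    public static void main(String[] args) {
        // Initial load of an order into the cache.
        applyFeed("order:7", "status", "OPEN");

        // Later, the feed reports that the database changed the status.
        applyFeed("order:7", "status", "SHIPPED");

        // Readers navigate the cache without going back to the database;
        // the value may lag slightly but is known to have come from the db.
        System.out.println(cache.get("order:7").get("status"));  // SHIPPED
    }
}
```

Actual updates would go through a separate persistence manager and transaction, as in the interview; the cache itself is never written back to the database.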
17- Will every JDO Persistence manager support a distributed shared object cache?
This is an optional feature so not every JDO implementation will support it but it's expected that most of the implementations for the larger scale application servers will implement this non-transactional feature.
18- How does JDO solve optimistic concurrency problems in standalone or clustered environments?
JDO addresses optimistic locking as a core part of the technology, but we don't try to do something that the data store refuses to do. So in environments where the data store, for example Oracle with serializable isolation, already has a notion of conflicting concurrent transactions, JDO operates very well just by using the data store's transactions. Other data stores that take a lock and hold it for the duration of the transaction can do very well with the JDO version of optimistic transactions. The way JDO defines it is that there are no data store locks held on the instances until transaction commit. At transaction commit time, the transactional instances in the cache are compared against the instances in the data store to make sure that nothing has changed in the interval, and if everything is ok then the changes are committed to the data store. So we don't try to do something that the data store does not want to do; we cooperate with the data store. JDO is an API that attempts to present the view of the data store and does not try to override the policies of the data store.
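The compare-at-commit scheme described here is just optimistic concurrency control, and the core of it fits in a few lines. This is an illustration of the idea with a made-up key-value "store", not the JDO API: no lock is held while the transaction works; at commit, the value read at the start is checked against the store, and the commit fails if another writer changed it in the interval.

```java
import java.util.HashMap;
import java.util.Map;

public class OptimisticDemo {
    static final Map<String, Integer> store = new HashMap<>();

    // Returns true if the commit succeeded, false on a concurrent change.
    static boolean commit(String key, int expectedOld, int newValue) {
        Integer current = store.get(key);
        if (current == null || current != expectedOld) {
            return false;              // verification failed: someone else wrote
        }
        store.put(key, newValue);      // nothing changed in the interval: apply
        return true;
    }

    public static void main(String[] args) {
        store.put("qty", 10);

        int read = store.get("qty");   // optimistic transaction reads, holds no lock

        // A concurrent transaction sneaks in and commits first.
        commit("qty", 10, 7);

        // Our commit now fails the compare step instead of silently overwriting.
        System.out.println(commit("qty", read, 9));  // false
        System.out.println(store.get("qty"));        // 7
    }
}
```

A real implementation would compare whole instances or version columns rather than a single integer, but the commit-time verification step is the same.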
19- What is the state of JDO support today?
We're basically implementing the reference implementation and the test compatibility kit that will be required for vendors to claim JDO compliance. There actually have been a number of preview implementations. A number of vendors have announced support for JDO; some of them are shipping preview versions that are available for download, some of them require you to sign some paper saying you're not going to use it in production, but so far there is no production-scale implementation yet.
20- What are some of the obstacles to widespread JDO adoption?
The biggest obstacle right now is getting the standard itself written and approved, the test compatibility kit written and the reference implementation completed and shipping. It seems that the community is really anxiously waiting for this technology. There are a number of vendors and customers who are really demanding it and who are just waiting for the specification to be completed before we can go full out.
Thanks for the transcript - Yann. That was very helpful and much appreciated.
I have been crawling through O/R and JDO for the last 3 days now, and it seems not so easy to choose the right approach.
Of course I want to do what most people want to do: make my objects persistent in an RDBMS. And yes, because the design approaches of RDBMS and OO are totally different, there are problems which each implementation has to handle.
Right now I am evaluating ObjectMatter's product VBSF. Are there any comments on this product? Is there a JDO interface planned?
Please reformat the interview clips to enable us to listen to the entire interview continuously. It's very time-consuming this way.
The next batch of interviews filmed at Java One this year will be formatted in both clips and in continuous format. Unfortunately, we can't re-format the old interviews.
I was told that there was a BOF at JavaOne this year where the EJB spec leads pretty much trashed JDO.
Did anyone attend and take notes of exactly what their main points were? I was wasting my time at one of those sessions on "patterns" which were mostly book advertisements.
I attended the Enterprise Java Beans Architecture BOF-1654, Wed 8:00pm. I didn't take notes, but here is my recollection:
They did not trash JDO. They described upcoming features and asked only very narrow questions about people's needs. For example: do you need inheritance for CMP? This, for me, was the closest to a question that separated those who design their business objects using the full power of OOP from those who design their business objects so they can easily be stored in an RDB. My reading of the nods of heads was that 1/4 to 1/3 of the audience do not constrain their business objects. Linda DeMichiel said that in an earlier presentation she asked the same question and no one raised their hand. (Again, this is a very narrow question, so I don't know what to conclude from the answer of "no hands raised".)
Mark Hapner said that the J2EE standards could change to meet the needs of programmers. I assumed he meant that anything that became widely accepted would be pulled into the standard. I think this comment was in response to a JDO question.
I agree with the other two posters. I would like to hear what Craig Russell has to say. But my home Internet connection is too slow, and streaming video is frowned upon at most of my work locations. Why can't these be printed in a text format that can be read instead of listened to?
I would also like to see the text transcript. I'm able to listen to what Craig says, but English is not my native language, so it would be far more comfortable reading a transcript than concentrating on extracting value from hearing...
You have made your case, and we plan to offer text answers to the questions that you can link to from the same UI as the video. We apologize for your inconvenience at home; we thought that the 56k version would be usable by home users.
Thank you, Yann, for providing your own transcript for the other members. :)
My pleasure, Floyd.
I've been remotely following JDOs for a year and a half now so I have some sheer interest in actually getting Craig Russell's view.
Yann - Thanks a zillion! I just finished reading the first half of the text post made by you. I wanted to be the first one to thank you, and here I am! I'm sure that there are many poor souls like me who can access only a text version. I guess I don't have to explain the problems involved one more time. Thankfully, Floyd seems to understand the problem (this time around)! :)
why don't you guys just discuss the JDO now... enough of this cribbing about text and having no modem and all that.
I watched the video with Mr. Russell with great interest. Especially the comparison with entity beans excited me. Having worked with CMP/BMP for a while now, and knowing all the loopholes and anguish of the Entity part of the EJB spec, with its large dependency on different descriptors/interfaces etc., the statements made in the interview just make me wonder if this is simply a way of replacing entity beans? After all, he says that stateless session beans (the really good part of EJB) can be used to facade JDO, plus that you can implement JDO in a clustered application server environment....
For your information, there was a flame war between Ward Mullins (Thought Inc - CocoBase O/R tool maker - CTO) and Craig Russell in a different thread (http://www.theserverside.com/discussion/thread.jsp?thread_id=771) a few months ago that I found very interesting for the context, and certainly technically challenging if you are interested in the merits and flaws of JDO in its current version. FWIW.
Now I have a remark. I may be wrong, but it seems to me that with EJB 2.0 the added value of pure O/R mapping tools is really decreasing due to the (yet incomplete) EJB-QL and object graph support with CMP relationships. Is it right to assume that they amount to:
1- An EJB-QL generator;
2- A caching and persistence engine (including synchronization);
3- An advanced mappings enabler?
IDEs now fully cover EJB-QL so the real added value would be in caching/persistence and synchronisation algorithms as well as exotic mappings. Also, some O/R mapping tool vendors like WebGain with TopLink 4.x already support JDO and allow for a mix with entity beans as advocated by Craig Russell. These tools are *expensive* and sometimes cost more than the most expensive application servers around. So my first question is about whether such tools are still justified as far as real ROI is concerned.
I have another question that has to do with ODBMSs or object-oriented approaches to manage relational data. Relational database management systems manage data: therefore they are best qualified to optimize requests that deal with data using query pre-compilers, advanced indexing and caching. SQL is an advanced, simple and standard query language that is heavily optimized. Why is there so much energy devoted to turn RDBMSs into ODBMSs ? From this perspective, why would I want entity beans ? I know the controversy about whether to use entity beans at all has already been discussed before but I could not find any answer to this specific point. The usual arguments are about network latency, portability and maintainability of code which are all arguable (especially for the maintenance of the O/R mapping layer). It looks to me as if people only want to have entity beans for the sake of "bringing the object paradigm to SQL" or, more down to earth, to create a potentially artificial need in order to sell products to fulfil that need.
I'm honestly not trying to troll the thread, I'm genuinely asking myself those questions and lack the knowledge to objectively provide satisfying answers. I have worked with entity beans and data access objects, used O/R tools, read a few books, read the Sun specifications carefully and I am still unable to say whether the costs in time and money were justified. On the other hand, in several projects I've been involved in, we chose not to use entity beans nor O/R tools, and if I was to architect another system at this time, I would still not do it. In any case, I just do not find it satisfactory to use entity beans just because Sun advocates it. The same will apply to JDOs.
I agree with you mostly. When it comes to writing applications that need to optimize performance (database I/O is a major factor) and move large amounts of data, I would always resort to hand-written SQL, and package my Java objects from that by hand. My second choice, and the first one for applications that do not have complex data access requirements, would be a nice (and affordable!) O/R tool. If I can't avoid it, EJB (2.0!) will do, although it's not really objects: no inheritance, instances are mostly restricted to the EJB container, all that redundant interface hacking, etc...
JDO fits in well for me, provided there is a good and affordable implementation.
I am one of those who are very eager to lay hands on JDO. I worked with EJBs right through EJB 1.0 and never was satisfied with entity beans... I always lose my object model when I use entity beans... it is so frustrating for me....
In my last projects, I used DAO with session facades.... And right now I am in the process of architecting a new system and am very eager to use JDO. I am looking for a good JDO implementation that is available today.... I tried the TP that comes with Forte and could never make it work.... but I am still hoping to use it....
just my $.02 thought...
ObjectFrontier recently released the beta version of FrontierSuite for JDO at JavaOne. FrontierSuite supports J2SE, J2EE (EJB 1.1 and EJB 2.0) and now JDO. FrontierSuite is a mature persistence framework, and I would advise you to have a look at our JDO tool (http://www.objectfrontier.com/Products/FrontierSuite/fsjdo.html) to see if it meets your requirements. There is a free download, and we are offering the production copy at $109.00/lic for development for a limited time only. FrontierSuite provides a design, development and deployment environment, and also provides client, process and JMS based distributed caching.
For the question "Why JDO/CMP instead of JDBC/SQL?":
Developing a persistence layer that is reliable and maintainable is not trivial.
+ There is no compile-time checking of your SQL - no type safety.
+ It is tedious, repetitive stuff - crying out for generation.
+ It is sensitive to DB schema change (no matter whether you use stored procs or not - the schema-assuming SQL must reside somewhere).
+ The relational model is arguably not as intuitive as the OO paradigm (obviously most DBAs or experienced SQL developers would vehemently deny this).
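The first bullet is easy to demonstrate: SQL lives in strings, so a typo in a column name compiles cleanly and only fails at runtime against the database. The table and column names here are made up for the example.

```java
public class SqlTypoDemo {
    public static void main(String[] args) {
        // "custmer_name" is misspelled, but the compiler cannot know that --
        // it is just a String. A mapped-field access in JDO/CMP would instead
        // be caught as a compile error.
        String sql = "SELECT custmer_name FROM orders WHERE id = ?";

        System.out.println(sql.contains("customer_name"));  // false: typo undetected
    }
}
```

Nothing flags the mistake until the statement reaches a real database, which is exactly the "no type safety" problem the list describes.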
Having said that, it's obviously possible to do. It has been done (with varying degrees of success) for many years, by many people.
Basically, for me it comes down to a question of who is the better persistence layer programmer: someone at Borland or BEA or WebGain (TopLink) or Thought Inc (CocoBase), whose *business* is all about persistence... or John Doe, programmer for Acme eBusiness Corp, who is writing a bonds trading system.
My money is not on John Doe...
Should a valuable (expensive) developer with intimate domain knowledge be wasting their time writing/debugging/maintaining/supporting something that can be generated or managed by an O-R tool?
My opinion is that buying this stuff is better than building it.
When using an O-R tool (be it CMP or whatever), there *may* be performance hot-spots where the tool doesn't suit the operation (usually, however, the performance is adequate).
However, rather than throw out the tool entirely, along with its productivity/reliability/maintainability savings, it is better to just bypass it for those limited cases where there is a performance hot-spot (apply the 80-20 rule).
My advice is to start with CMP/JDO. Once a performance problem is identified and *quantified*, then examine the options. The problem may be solved by tuning. It may be solved by turning to an experienced SQL/JDBC developer and hand-writing an optimised solution. But only solve the problem when one has been identified.
For the issue "JDO vs CMP2.0?":
Here I think the issues are less clear. It is obvious that in the enterprise arena, these two technologies compete. Both are a bit too new for many people to have a good understanding of, and experience with, both - and consequently there is a lot of debate (which I am keenly interested in).
In general, I am seeing some misconceptions about CMP2.0 and JDO that are skewing the debate and perhaps creating some false expectations of JDO.
Most people have had a hard time with CMP. And, to be fair, CMP is far from perfect. However, I do find that often a lot of these people are actually referring to CMP1.1 (when the capability and performance clearly did not exist). CMP2.0 is a huge change for developers and container vendors alike.
My feeling also is that in general, the tools for CMP have been rather crap - or non-existent. Better tools that managed the O-R mapping, DDs, etc. would make everyone's life easier with CMP.
It's also true that the containers themselves have not supported the advanced O-R features (or the performance) that the likes of TopLink/CocoBase have. With CMP2.0, this is changing - but there is a way to go yet.
(E.g. WLS7, on paper, supports a lot (but, by no means all) of the performance features that Toplink does - whether the implementation quality is the same is obviously debatable).
Some of the common judgements:
JDO is simpler:
To a large extent this is true. There are no extra interfaces, DDs, no lifecycle rules to follow...
However, for most non-trivial applications, you will hide your business object implementations behind interfaces. Instantly, you have an interface (component interface), a factory (Home interface) in addition to your implementation class (Bean). The resemblance to CMP is obvious.
Also, the deployment descriptors are similar - the O-R mapping must be specified and stored somewhere (though, obviously there will be fewer files than CMP2.0). Here is where the tools have so far let CMP down. (But this is changing).
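The point about interfaces re-creating the CMP trio can be sketched as follows (all names are hypothetical); in a JDO version, the factory is where the implementation would typically hand the new object to the persistence layer:

```java
// Hiding a plain business object behind an interface reproduces the
// CMP shape: component interface, Home-like factory, bean-like impl.
interface Account {                       // ~ component interface
    double getBalance();
}

class AccountImpl implements Account {    // ~ bean implementation
    private final double balance;
    AccountImpl(double balance) { this.balance = balance; }
    public double getBalance() { return balance; }
}

class AccountFactory {                    // ~ Home interface
    Account create(double openingBalance) {
        // A JDO version would also call something like
        // pm.makePersistent(...) here before returning.
        return new AccountImpl(openingBalance);
    }
}
```

Clients only ever see `Account` and `AccountFactory`, which is exactly the component-interface/Home split that CMP imposes by rule.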
To a large extent, inheritance and polymorphism are easier in JDO - assuming that you have no interfaces protecting your business object implementations.
If you develop to interfaces, then you must have separate factories (Homes), interface inheritance, and implementation inheritance.
Again, the resemblance to CMP is obvious.
Keeping the same PK, different Homes, and inheriting the component interface and bean implementation gives you inheritance in CMP. Multi-table mapping is also supported by more and more containers. (The only thing that is missing is the mapping inheritance - however, WSAD now automates this nicely. Support in the standard would be nice.)
While it will be true that perhaps most of the O-R vendors will have a JDO implementation, it is a little optimistic to assume that any old JDO implementation will compete with the likes of the mature O-R solutions for performance. Caching and tuned database operations are a must.
In general, CMP2.0 has made performance a lot more achievable for the average container vendor. What's more, there is nothing stopping me switching in TopLink as a CMP engine if my app server's CMP container isn't cutting it (only true for WebSphere/WebLogic as far as I know, with TopLink). If you are already forking out for an app server, then your persistence comes for free to some extent - you can choose to spend more money if your problem demands and your budget allows.
Life Cycle, Threading Rules:
It's true that EJB in general places a lot of restrictions on what you can and cannot do in a bean. Certainly JDO has little of this baggage. However, these rules are there for a reason. They clearly define a lifecycle contract that allows the container to manage its resources effectively (essential for scalability) and the developer to do what they have to do. While JDO does not define this, the implementation will need to do something to manage resources (passivating and instantiating objects) - there is no way that instantiating the whole database contents in the JDO cache is viable. If the lifecycle is not standard, then to some extent you are leaning on the particular JDO implementation.
To a large extent, I think CMP2.0 still has a bad name hanging over it from CMP1.1. The poor tool quality hasn't helped manage the complexity either. JDO also has had its share of controversy. It's taken ages to complete - and there have been some very public, bitter arguments between the spec team and the O-R vendors over what the spec should look like.
I hope that this competition between standards is resolved one way or the other. One of the key things that made Java attractive over other technologies was the absence of duplication in the standard. There is usually only one right way to do things. (Anyone who remembers the number of different ways to implement thread locking on Win32 will know what I mean.)
Anyway, I hope that the debate in forums like this will help minimise the confusion and dispel the doubts surrounding both technologies.
I agree with most of your post or at least I understand your position. I don't mean to throw O/R tools away though. I'm by no means an O/R specialist and I'm certain people with this expertise know what they are doing when it comes down to synchronisation, serialization, clusters, caching, replication and all. I'm sure all of that works with a minimal amount of bugs.
However, I remain to be convinced as to the concrete benefits you can get from a high-end persistence tool for a vast majority of projects. I've seen such tools chosen on previous projects to cover features that were not possible in CMP 1.1 (and god knows there were many) and for expected performance reasons that were not really needed.
You mention that "once a performance problem is identified and quantified, then examine the options". I could not agree more, and that holds true for criteria other than performance.
More often than not, the session bean facade with unit-tested DAOs will be satisfactory, easier to implement and far less expensive than O/R persistence, less resource-consuming, and arguably as maintainable as an O/R mapping and persistence layer. In short, it will suit a vast majority of enterprise Internet applications. In some cases, when performance is critical, I fully understand that the caching and synchronisation features of O/R persistence tools may be the solution, and that would be the kind of case where I would recommend such an architecture. Note that it does not automatically translate into resorting to entity beans. However, you'd better be sure that you really need those tools because, as I said, they are awfully expensive (at least for the big players). Their learning curve is far from negligible, even for seasoned developers.
Now CMP 2.0, despite a much better performance and quality of specification, is still very inferior to SQL in terms of functionality due to the limitations of EJB-QL (even though that is supposed to evolve for EJB 2.1): in many situations, one can write SQL statements that do the job with only one network call where entity beans will require many, and I'm not even using stored procedures (which I don't recommend, by the way, for maintainability reasons). BMPs are just not really an option due to the limited optimisation possibilities.
I agree, though: entity beans work and may be satisfactory in many situations. My gut feeling is that they are less performant and maintainable than DAOs. But I guess there are as many different opinions out there regarding those issues as there are architects. After all, the only requirement behind any architecture is that it fulfils the customer's needs.
So who really cares if it uses CMPs, BMPs, stored procedures or DAOs? All I'm saying is that I'm not going to use entity beans just because it's recommended in the specs and has a life-cycle, nor because it's object-oriented.
Thank you for your rich reply by the way.
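As a concrete (entirely hypothetical) illustration of the session-facade-plus-unit-tested-DAO approach discussed above: the DAO sits behind an interface, so unit tests can run against an in-memory fake instead of a database.

```java
import java.util.*;

// DAO contract: the facade only ever sees this interface.
interface OrderDao {
    List<String> findOrderIdsFor(String customerId);
}

// In-memory fake for unit tests; the production implementation
// would hold the actual JDBC/SQL code.
class InMemoryOrderDao implements OrderDao {
    private final Map<String, List<String>> data = new HashMap<>();
    void add(String customerId, String orderId) {
        data.computeIfAbsent(customerId, k -> new ArrayList<>()).add(orderId);
    }
    public List<String> findOrderIdsFor(String customerId) {
        return data.getOrDefault(customerId, Collections.emptyList());
    }
}

// Stand-in for the session bean facade; the real one would look the
// DAO up via JNDI and run inside a container-managed transaction.
class OrderFacade {
    private final OrderDao dao;
    OrderFacade(OrderDao dao) { this.dao = dao; }
    int countOrders(String customerId) {
        return dao.findOrderIdsFor(customerId).size();
    }
}
```

The design choice here is that only the DAO implementation knows SQL; swapping the data store, or testing the facade, never touches the business code.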
I think JDO is great, especially after working with entity beans. I have always been looking for a TopLink-like solution, which makes far more sense when dealing with objects that represent your back-end data.
I think JDO is a TopLink-like solution, but what it would do is provide a standard API for all these tools to implement - and thus I don't have to learn ten different O/R mapping tools.
Kapil, I think that the assumption that JDO will be a "standardised" version of TopLink is perhaps one of the false hopes that exists about JDO. I don't think that JDO defines enough (yet) to be a vendor-neutral O-R spec. However, I'm hoping to change my mind on this.
Out of interest, what in particular has made you feel JDO is great after using Entity Beans?
I too agree with most of your points - especially regarding the cost of high-end O-R tools, BMP, stored procs, the limitations of EJB-QL (up to a point) and the overkill nature of caching O-R solutions for some apps.
However, do you really think that DAO's are as quick to develop and maintain as CMP2 Beans?
I guess it depends on what you end up doing with your data... but the fact that you have no persistence code at all to debug and maintain is a big plus, in my mind. (You don't have to worry about non-typesafe ResultSets, nor the lack of compile-time checking of SQL, nor the added worry of maintaining database vendor independence on top of everything else. If the schema changes, you just change your mapping, and you won't find that, months into production, some rarely executed code path throws a spanner in the database because of a dormant SQL bug.)
In terms of CMP vs DAO performance, a lot of the DAOs I have seen look remarkably like entity beans (methods like load, store, create, etc.). Obviously the performance issues are going to be similar to BMP here. Otherwise, in order to optimise certain operations, the DAOs swell to accommodate each optimised use-case.
Whereas, using CMP, you use the container's optimisations and you have the flexibility to deploy the same bean using a number of different strategies (write once, deploy many) to satisfy different caching or transaction isolation requirements. If you have a session bean facade, then you already have a CMP container at your disposal for no extra cost, so I don't see entity beans as a particularly expensive solution.
You may want to check out this tool
(perhaps you already have...). I had a quick play with it - you can see my comments. It's probably good enough for a lot of simple applications.
Hi guys! I think there is some confusion around about what JDO is really about (and is not). JDO is in no way an O/R mapping technology; indeed, JDO is not focused on any particular persistence store. For JDO to be a standardised O/R mapping technology with TopLink-like capability, the way objects are mapped to tables in the RDBMS would have to be standardised as well. JDO allows you to define in a standard way WHAT to persist, not HOW to persist it.
As a result, each vendor will come up with their own solution when it comes to doing "the real work" (the object persistence stuff), with the side effect that moving from one JDO implementation to another will require re-defining all the O/R mappings. I hope you are not all disappointed (I am!). Or maybe it is a first step, and the most widely used persistence stores will be addressed in the future.
However, JDO took more than two years to come out (I remember talks at JavaOne 2000) and I don't think that Java can wait another two years for a standard O/R mapping solution. Persistence in relational databases represents more than 95% of the object persistence needs of developers. So I don't understand why Sun is developing a standard for "any" persistence store if, in the nominal case, developers don't have an end-to-end standard solution (i.e. both standard interfaces and classes to be used by application programmers when using classes whose instances are to be stored in persistent storage AND a standard way to specify object persistence when it comes to relational databases).
Chapter "Persistence Best Practices" in Ed Roman's book, "Mastering Enterprise JavaBeans" supports what you're saying. Note that my definition of a DAO is that of an object that translates business methods into SQL so it would probably be better called SQL proxy and has certainly nothing to do with BMPs. Also note that I talk about unit-tested DAOs.
I guess it's a trade-off between maintainability - CMPs are probably better in that respect after all and I'm not going to contradict Ed Roman!! :) - and SQL ability to do powerful operations in just one statement and one network call.
I haven't used the tool you mentioned before. The quantity of tools there are is just too much for me to cope with in an objective fashion. So I'll stick to IDEs for now. :)
<However, do you really think that DAO's are as quick to develop and maintain as CMP2 Beans? >
I think CMP2 beans can be developed much more quickly if you don't run into any problems. However, I usually do, and it takes at least one or two days to get an answer from the forums; sometimes it takes longer, only to find out that the problem is not solvable. For example, how can you define an EJB QL query that only retrieves the first 10 records, and how can you count the number of records in a table?
The above problems might be solvable, but it is really frustrating to develop software this way, and that's why I don't like using entity beans - and I think this could be the reason why other people don't like entity beans either.
Just my 2c :->
BTW, I have only used J2EE/EJB for about a year, so other experts on TSS may not have encountered so many problems during development. But I just wanted to point out the reasons why I think so many people don't like entity beans.
Just my 2c again!
Counting records is one of the things that EJB-QL cannot do at this stage. I am quite confident, though, that a function like count(*) will be implemented in future releases of the EJB specification. But as for the other request (retrieving the first 10 records of a result set), I doubt it ever will be, as I don't think it is part of the SQL standard. That's where you can use DAOs, BMPs or O/R mapping tools.
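For cases like these two, the SQL can live in a small DAO. A sketch with hypothetical table names; note that the top-N clause is vendor-specific (shown here in the LIMIT dialect of MySQL/PostgreSQL; Oracle uses ROWNUM and SQL Server uses TOP):

```java
// Queries that EJB-QL (as of EJB 2.0) cannot express, kept in one
// place so the vendor-specific SQL is easy to find and change.
class AccountQueries {
    static String countAll() {
        return "SELECT COUNT(*) FROM account";
    }
    static String firstN(int n) {
        // Vendor-specific top-N: LIMIT here; ROWNUM/TOP elsewhere.
        return "SELECT account_id FROM account ORDER BY account_id LIMIT " + n;
    }
}
```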
I was really pretty excited about O-R mapping layers and JDO in particular. After years of dissatisfaction with EJB I was optimistic. So I downloaded a copy of Castor and ran its examples against my Oracle instance. All was well. Then I tried to tackle my first real-world problem with JDO: creating the values in a drop-down listbox. That should be nothing for an O-R persistence layer, right? Nope. Filling the drop-down required a three-table join, pretty common stuff: I had to get a list of brokers, validate against an employee table that they still worked here, and join with a region table to get only the brokers that worked in the state I was interested in. Castor fell apart with that first real-world implementation because it was designed under the simplistic notion that an object would map to a table and I wouldn't be putting anything all that complex in the where clause. The other option was to write embedded SQL in the OQL, which in my opinion negates the whole reason I want to use a mapping layer. For this simple drop-down fill query, by using an O-R mapper I would incur the following, which will kill performance:
1) Interpret the OQL
2) Generating SQL
3) Creating a statement and sending it to the database
4) Parsing the ResultSet against the XML mapping file
5) The database will be executing a statement rather than a much faster prepared statement.
Now for something really hard: opening a new customer account. This operation touches around 15 tables. We have to perform different operations on the database depending on the type of account the person is opening, the state where the account is opened, and whether they are an existing customer in a different line of business - plus joins with the rep table and inserts so the reps and the people that referred the customer to us all get credit and commission.
These are the kinds of things that we have to do in the real world and I don't see JDO or EJB-CMP actually working in any but the most simplistic applications. My preferred route is going to be hand written JDBC abstracted from the web facing front end and if remoting is needed using an EJB session bean.
I'd appreciate any comments, pro or con.
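For reference, the drop-down query described above can be written as the hand-coded JDBC the poster prefers (the schema names are guesses reconstructed from the description); binding the state as a parameter lets the database reuse the prepared plan, avoiding the interpret-and-generate overhead listed in the post.

```java
import java.sql.*;

class BrokerDropdownQuery {
    // Hypothetical schema: brokers, validated against employees,
    // filtered by region/state.
    static final String SQL =
        "SELECT b.broker_id, b.broker_name"
        + " FROM broker b"
        + " JOIN employee e ON e.employee_id = b.employee_id"
        + " JOIN region r ON r.region_id = b.region_id"
        + " WHERE e.active = 1 AND r.state = ?";

    static PreparedStatement prepare(Connection con, String state)
            throws SQLException {
        PreparedStatement ps = con.prepareStatement(SQL);
        ps.setString(1, state);  // bound parameter, not concatenated
        return ps;
    }
}
```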
I think your posting highlights the great difference between *update* requirements and *enquiry* requirements for a system. Filling the drop list and the 15-odd things you mention for the customer account are enquiries, and often need joins, filtering, ordering, and complex caching policy. On the other hand, enquiry does not need concurrency control (optimistic/pessimistic) or transactions. Once you have all the pieces for display to the user, then the update is usually to one or several tables, in some sort of master-detail relationship. This is what an update product, such as JDO or EJB entity beans, should do well. For the enquiry part you would build your own product, based usually on static SQL. The complexity of enquiry precludes using an update product such as the ones that are currently available. The interactive use pattern is (i) get the master-detail objects through, e.g., JDO, (ii) fill in all the drop lists and human-readable descriptions of surrogate-key data using the enquiry system, (iii) let the user edit the records, (iv) validate using, again, the enquiry system, and then (v) save back the master-detail objects through JDO.
I agree with your first point. For many who have already invested in a J2EE 1.3 server, the case for also buying an O/R mapper is weaker, thanks to the capabilities of the EJB 2.0 CMP spec.
As for your second question, I don't think the need to turn an RDBMS into an OODBMS was 'created' by entity beans; this need simply comes from people wanting to use the more maintainable object-oriented programming approach while still keeping their data persistent in an industry-standard data store (i.e. an RDBMS). Entity beans and JDO are simply the latest attempts to fulfill that need. Before entity beans, object databases and O/R mappers tried to fill it.
It is good that you are choosing the right tool for the right job; we should all do it that way. However, if you want the benefits of an OO business model, then entity beans, JDO and similar technologies are there to fill that need, not to 'create' it.
What are the benefits of an OO business model when it comes to RDBMS persistence?
This sounds like an obvious question but... I have the impression that in essence the approach is somewhat flawed.
Hi, Floyd & Yann
JDO seems to be a hot topic in several publications. I am in the middle of reading an article on the topic in the Feb. JDJ (it seems to be a "real-world" case study).
I understand the benefits, mainly by this I mean saving the time of writing a data access layer. The entity EJB layer adds the benefit of remote access, as well as distribution / clustering, distributed transactions etc.
What I fail to see is a solid and viable methodology. What is the methodology when using JDO to analyse/design/implement/maintain a given project?
For example, when using a mixed environment, you do the normal analysis, come up with some use cases, and start the design on each layer, using the experts for each layer. This approach results in a solid data-layer design with all the benefits of relational technology, a solid business logic layer built on proper OO techniques, and a mapping layer fetching or storing data from/to the data management system.
You have a clear data model (mostly independent of the application), you have a clean business model (calling interfaces to access the data model) and a "glue" layer. Each layer is optimized for the technology it's using. There is a blurred area, which is exactly this data access layer.
So, what is a proper methodology when working with JDO or other O/R tools? The data model is completely out of the control of the database management system, there is a mess regarding relational integrity, and the resulting database can be accessed only via this particular layer; no other tool can use it or the integrity will be compromised. Or is that exactly where the tool vendors differentiate?
Your question stems from the fact that the technology is new and the different concepts are not quite clear. As Craig Russell said, there are at least three ways to use JDOs:
- As the container vendor's implementation of your CMP entity beans: this is transparent to you and should not appear in your design, only CMPs will be mentioned;
- As your implementation of your BMPs: in such case, the JDOs can be thought of as proxy objects and will appear on your design;
- As your implementation of your session beans: the session beans will be used as a facade for JDOs, no entity beans are required, JDOs sort of replace them and appear on your design.
Then of course, if your application does not use EJBs, JDOs will still be called from your business methods. You have to think of JDOs as delegation objects that deal with persistence and transactions but without the need for a container (someone correct me if I'm wrong). The way O/R mapping will be done behind the scenes is not specified by JDO but of course will be done. At this stage, vendors are free to choose their implementation, which may not be the case in the future specification.
In short, JDOs and other O/R mapping tools are wrappers around the database layer: think of them as proxies.
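To make the "proxy/delegation" description concrete, here is what the session-facade style looks like against the javax.jdo 1.0 interfaces. Treat this as a sketch of the call sequence only: it will not run without a vendor's JDO implementation, a persistence-capable (enhanced) Customer class - hypothetical here - and the usual deployment metadata.

```java
import javax.jdo.*;
import java.util.Collection;

class CustomerFacade {  // in EJB terms: sits behind a stateless session bean
    private final PersistenceManagerFactory pmf;
    CustomerFacade(PersistenceManagerFactory pmf) { this.pmf = pmf; }

    // Read path: a JDOQL query, no explicit SQL.
    Collection findByState(String state) {
        PersistenceManager pm = pmf.getPersistenceManager();
        try {
            Query q = pm.newQuery(Customer.class, "state == s");
            q.declareParameters("String s");
            return (Collection) q.execute(state);
        } finally {
            pm.close();
        }
    }

    // Write path: changes to a managed instance are flushed on commit;
    // there is no explicit save() call.
    void rename(Object customerOid, String newName) {
        PersistenceManager pm = pmf.getPersistenceManager();
        Transaction tx = pm.currentTransaction();
        try {
            tx.begin();
            Customer c = (Customer) pm.getObjectById(customerOid, true);
            c.setName(newName);
            tx.commit();
        } finally {
            if (tx.isActive()) tx.rollback();
            pm.close();
        }
    }
}
```

This is exactly the "wrapper around the database layer" shape described above: the facade delegates persistence and transaction handling to the PersistenceManager, and the O/R mapping itself lives in vendor-specific metadata.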
Thanks for your reply. Indeed this is how I understood JDO as well.
The point I was trying to make is that this implies basically a methodology that does not involve almost any data modelling in the process. You stop at the object model level, you mark some objects and fields as "persistent", together with the relationships between them.
My point in this case is that how feasible is such a complete data management layer agnosticism for an extendable and maintainable system. In the real world of continuously changing business requirements, the most important asset of many firms is their data. The fact that it is stored in a universally accessible data management platform shields the businesses from the faster-changing front-end technologies.
As many people mentioned, many JDO tools from the different vendors don't enforce any relational constraints and/or make highly inefficient O/R mappings.
My question still is, what is a viable project methodology when developing with a totally abstracted persistence layer? When a new release of the mapping tool comes out, it will have to come with sophisticated data and schema migration tools, and it will not be easy to access the database with other generic tools which do not use the same mapping technology, or there is a risk of messing up the data.
I honestly think that any methodology is able to deal with that. Developing with an O/R mapping tool is not different from developing with any other kind of persistence layer. I would personally advocate the unified process with reasonably small iterations but XP or any waterfall methodologies are certainly fine. Actually, the methodology has nothing to do with it, it is much more a question of design. With object-oriented technology, go for OOAD. Upgrading the persistence tools is roughly the same issue: if it happens during the development, include it in the scope of your design (in an iteration if you're going through iterative development), if it happens on its own, measure the impact and modify what needs to be modified in your application.
Now your other issue is about whether your O/R mapping tool should be the single entry point to your data store. It depends upon your strategy and upon the persistence implementation. Note that this issue already exists with entity beans. Some vendors like BEA provide deployment descriptor settings such as 'db_is_shared', which inform the persistence tool that data may or may not be modified by another application, using another persistence technology, that accesses the same data store - in order to avoid data corruption and allow for certain optimisations, notably with respect to caching. So I can only advise you to refer to the documentation of your persistence layer vendor to find out how they deal with that.
Yes, indeed I noticed it has the same problem when using EJBs. What was interesting in the interview is that it's possible to do the mapping in reverse order: generate the data object layer starting from the data layer. My worry is that once you start using a transparent mapping tool, you have to treat your data model almost as a black-box. If for example you start building a backend reporting system based on different tools, when upgrading the mapping layer you might have to completely re-visit the reporting tool. The relational model's beauty was that it was aiming at isolating the data from the application ("data independence principle"), which is exactly the conflict point with the OO models, which merge data with code.
I think the debate over what's best is open. I have heard of a situation where an entity-EJB layer was attempted for a reporting application with huge data amounts and it failed badly. As you mentioned previously, I believe that when architecting any system, you have to have all the options open and use the best suitable technologies from case to case. It just does not make sense to read the entire database in the memory to make a count of the objects ;-)
There's no silver bullet. Upgrading or migrating a data model or an O/R mapping tool can never be 100% transparent, simply because your data represents business information. There is risk associated with all technologies. Developers, specification writers and O/R tool vendors are just trying to mitigate it, each in their own way, each way having its own set of advantages and drawbacks. Some are easier to maintain, others are more performant, others are more scalable. As you said, choose the technology depending upon what you or your client think the important levers are. In any case, it is good to have more choice with JDO.
By the way, I'm not sure you should think in terms of "what's best" but rather "what fulfils the requirements", which is a completely different approach and can save huge amounts of time, energy and money. :)
Why do we think JDO is the solution for everything? It is simply an O/R mapping tool which may be very helpful for applications that use HTML-Servlet-JDO(JDBC)-Database. We can also use it for a BMP-JDO(JDBC)-Database pattern. As far as caching is concerned, look for a JDO implementation that fulfills your caching requirements. Also, with time, JDO is going to mature.
I think Microsoft is going to have a similar solution: ObjectSpaces. There is a public newsgroup, microsoft.public.objectspaces, at msnews.microsoft.com. There is also a presentation, "Advanced ADO.NET" (2002/02/11). The ObjectSpaces part starts at about 32:20, and at about 43:00 there is a code example.
Can someone *please* tell me how I can watch this interview under Linux? Why is theserverside.com offering up Windows Media presentations anyway?
I came across this problem under Linux when they interviewed Craig McClanahan (Sun/Tomcat/Struts). I looked into it and there is a media player project - I think it's called MPlayer? - but I didn't ultimately have the time (read: was too soft) to apply the patches to a CVS checkout and jump through hoops of fire to get it to work.
I think TheServerSide should offer the videos in some other format. I don't know, hmmm, perhaps something a little more cross-platform. You can't even play them on NT because MS haven't released a v7 of Media Player for NT.
Linux is anti-american anyway, so nobody should be using it. ;)
I have a machine with two operating systems installed on it. It seems that I can't watch this video on either Linux or Windows NT.
Windows Media Player 7.0 is therefore not going to make it onto my hardware, and unless TheServerSide rethink their streaming media format support, neither will the interviews.
I suppose if I really liked seeing video buffering I would buy a TheServerSide sponsored operating system like win2k, use media player 7, then click my way through dozens of tiny video slices, each of which run marginally longer than the time required to buffer them (on 4 megabits of bandwidth I might add) but then again, perhaps not.
Would anyone else join in my request for an alternative?
This comment is not specific to this interview!
I like the way the interview is split into slices which I can click on to get a specific answer to a specific question...
BUT please give us the option to watch the interview straight through from start to finish as well!