J2EE patterns: Middleware data and object persistence.

  1. Middleware data and object persistence. (28 messages)

    Though EJB is the central feature of the J2EE architecture, it is also at odds with J2EE's OO concept in one big way: its persistence.
    Commonly you interact with your database directly, using JDBC to provide persistence for your BMP beans. Thus most of your work is hardcoding the mapping between the EJB object's attributes and relational SQL rows. So the proverbial slogan "Enterprise beans allow the application developer to concentrate on the business logic itself, while the complexities of the underlying architecture are handled by the EJB server" does NOT work here.
    CMP solves this problem only partially, because the mapping can't be shared between several beans and is done implicitly.
    So the answer is to differentiate between middleware and back-end data and store the former in an ODBMS, while the latter remains in the warehouse DBMS. The next sections outline three main reasons for doing this.

    1. Incompatibility - Do not use JDBC at all?

    The main problem lies at the junction of EJB's object-oriented model and the relational model of your underlying database. Let's look at the root of this problem and outline two basic OO concepts on which your distributed Java applications are built:

    1) The OO paradigm uses complex structures: classes, which may encapsulate data (as base types or user-defined structures) and methods.

    2) Classes themselves make up a hierarchy of any complexity.

    The 1st problem can be resolved by the traditional relational model with a few enhancements (support for methods as stored procedures and custom data types). This is what the object-relational databases (Oracle 8i, IBM DB2 7.1, and others) are doing. But the second problem can't be resolved within the two-dimensional relational model, whose relationships are only one-to-one/one-to-many. At the same time an ODBMS, being built on the OO model, 'understands' complex relationships such as sets, maps, trees and lists, as well as any other.

    If you use BMP you always have to hardcode this mapping through JDBC.

    This method is labor intensive, error prone, requires SQL expertise, and does not have caching capability. So it will be much more appropriate to use an ODBMS as storage for Java components. Thus you avoid the problem of object-relational mapping (O/R mapping) and get both rapid development and performance gains. That’s because lots of the information stored in the ODBMS does not need to be mapped into the RDBMS. The next section explains why.


    2. Object (EJB/CORBA/Java) persistence - RDBMS is an inapplicable solution?

    There is always lots of data, like client session management (a shopping cart, for example), that is middleware in nature. This means that the data makes sense only to the middle tier, and its persistence is temporary (it persists while the client is being served), so there is no need to write it into the back-end database.
    To be more concrete, let's consider client shopping information. The back-end database should hold only corporate-level information, such as information about sales that have been made, about delivering products to clients, about corporate partners and so on, but NOT the client's shopping cart and personalization information. A component is an entity that represents a business action/process, not its result data, which is well represented by the relational model. Thus for many of your EJBs an ODBMS in the middle tier becomes the 'back-end' database. An ODBMS, being a DBMS itself, brings all its features, including concurrency control, recovery and high availability, to the middle tier. So if some client transactions fail, recovery occurs only on those ODBMSs which were coupled to the application servers participating in these transactions. This protects your back-end RDBMS against frequent recovery caused by external client transactions, and speeds up your system (see next section).
    Thus the ODBMS provides a level of isolation between the EJBs' business logic and the corporate information stored in the back-end DBMS (traditionally an RDBMS). You can also work with this embedded ODBMS as you would with an RDBMS, using SQL; furthermore, you can use OQL extensions, which are more natural for Java objects.
    But it's important to understand that, though there could still be objects that should store their state in a relational database, you don't always need to hardcode the mapping through JDBC manually. This is what O/R mapping tools are made for. Using them, you always pay in performance for rapid development. The next section shows the performance advantages of using an ODBMS over an RDBMS in the middle tier.

    3. Performance - Use ODBMS as your middle tier database.

    An ODBMS located in the middle tier is much faster at maintaining an object's state than an RDBMS could be.
    In the RDBMS case you store information in different tables, and to retrieve a complete object's (EJB's) state the RDBMS engine will use several retrievals and JOINs on tables.

    Consider an example where you have information about employees: employee name, status and the employee's immediate superior. In an RDBMS it will look like this:

    TABLE EMPLOYEES
    Name   Status   Superior
    John   210      Bill
    Bill   300      Bob
    Bob    150      Joey

    When you need to know all of John's superiors, your query will result in a retrieval based on the primary key for every row. So in this example there will be 3 retrievals. The same example using an ODBMS:

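    // "persistent" stands for the ODBMS vendor's language extension; it is not standard C++
    persistent class TEmployee {
      char *m_pszName;        // employee name
      int m_iStatus;          // employee status
      TEmployee *m_pChief;    // direct reference to the immediate superior
    public:
      // Parameter names differ from the members so they no longer shadow them
      TEmployee(char *pszName, int iStatus, TEmployee *pChief)
        : m_pszName(pszName), m_iStatus(iStatus), m_pChief(pChief)
      {
      }
    };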

    In this case there will be no keyed retrieval at all, because all references are stored as pointers to objects.
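To make the comparison concrete, here is a minimal in-memory sketch in Java of the employee example above. It is only an illustration: `EmployeeTable`, `EmployeeRow` and the lookup counter are hypothetical names invented here, a `HashMap` stands in for the primary-key index, and Joey's status (not given in the table) is an arbitrary placeholder. It does not model the cost of an ODBMS loading objects from disk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One row of the EMPLOYEES table; the superior is stored as a key, not a reference.
final class EmployeeRow {
    final String name;
    final int status;
    final String superior;

    EmployeeRow(String name, int status, String superior) {
        this.name = name;
        this.status = status;
        this.superior = superior;
    }
}

// Relational view: every hop up the hierarchy costs one keyed lookup.
final class EmployeeTable {
    private final Map<String, EmployeeRow> rowsByName = new HashMap<>();
    int lookups = 0; // number of simulated primary-key retrievals

    void insert(String name, int status, String superior) {
        rowsByName.put(name, new EmployeeRow(name, status, superior));
    }

    EmployeeRow findByName(String name) {
        lookups++;
        return rowsByName.get(name);
    }

    List<String> superiorsOf(String name) {
        List<String> chain = new ArrayList<>();
        EmployeeRow row = findByName(name);
        while (row != null && row.superior != null) {
            chain.add(row.superior);
            row = findByName(row.superior);
        }
        return chain;
    }
}

// Object view: the superior is a direct reference, so traversal is pointer chasing.
final class Employee {
    final String name;
    final int status;
    final Employee chief; // null at the top of the hierarchy

    Employee(String name, int status, Employee chief) {
        this.name = name;
        this.status = status;
        this.chief = chief;
    }

    List<String> superiors() {
        List<String> chain = new ArrayList<>();
        for (Employee e = chief; e != null; e = e.chief) {
            chain.add(e.name);
        }
        return chain;
    }
}
```

With the three rows from the table, `superiorsOf("John")` finds the rows for John, Bill and Bob and then misses on Joey, who has no row of his own, while `superiors()` on the object graph follows references without any keyed lookup.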
    Imagine the situation when your server is handling thousands of transactions per minute. Commonly your application server is distributed across several machines for load balancing, and the back-end RDBMS sits on a machine of its own. So all these servers share one DBMS, and network traffic becomes the main bottleneck for your server's overall throughput. Using several DBMSs closely coupled to your application servers is the way out. This solution reduces the number of transactions between an application and the network, and also allows a data caching mechanism, which greatly reduces the number of transactions between an application and the database.


    Summary:
    Remember, an additional abstraction level (one more degree of freedom) yields many advantages to any architecture, which are not all apparent at once (remember stubs in CORBA and DCOM, which brought local/remote transparency to objects).
    There are only two questions about it:
    1) Could YOUR architecture benefit from it?
    2) Will there be any performance or development/maintenance drawbacks as a result of this abstraction?
    In case of J2EE server and middleware ODBMS the answers are:
    1) Yes, they are numerous.
    2) No, quite the contrary - there will be significant improvements to both.

    P.S.
    I plan to evaluate RDBMSs (Oracle 8i, NCR Teradata) with JDBC and O/R mapping tools (CocoBase, TopLink) against ODBMSs (Poet's or Versant's), using a J2EE server (WebLogic, Iona's iPortal) and a raw CORBA ORB (VisiBroker, Orbix), and to publish the results myself.

    Any thoughts?

    Threaded Messages (28)

  2. Hi,

    I just want to comment that I saw your conclusions and immediately after that you say you plan to evaluate.

    It's a strong belief of mine that unless you talk numbers, there's no way you can draw conclusions, especially in dilemmas like "one is better than the other".

    Consider that an architecture cannot benefit from any technology. That remains the privilege of the end user.

    So, even if I've been doing Java for 4 years, I cannot say to somebody else that J2EE is better than MS$ or is better than Apache + mod_perl. Or that ODBMS is better than RDBMS.

    I simply don't have arguments good enough and at this point in time nobody has, not even industry experts.
    So, I prefer to say this is what I'm doing better, and I know exactly what and how, and you can expect me to do the job.
  3. Maybe and maybe not. Everything depends on how you access the data etc. If you think back, you may remember IBM products called Team Connection and Flowmark. Both of these used ObjectStore as their database. They had serious performance problems. This is not to say that all the blame was at ObjectStore's doorstep, but when they rewrote the products using DB2 they scaled a lot better. Maybe they understood RDBMS tuning better than OODBMS? I know some of the individuals involved and they are pretty sharp. I believe they did find problems.
    I always compare an RDBMS and an ODBMS to a saloon car and a formula one car. You can get a sports saloon and it can be very fast and you can still go shopping in it. Your mother (no disrespect mom) can drive it away and you're confident she'll be back. But a formula one car is only good on a track. You need to be an expert to even take it out of the drive. An RDBMS can be safely used with little risk with most applications and you'll get a good result. This can't be said for object databases: for the right application you may get serious performance advantages over an RDBMS, but I don't believe this is true for most applications. I also don't believe that you can trust the tuning of one to 'most' people. People who really know these products well are few and far between and don't come cheap.
    An issue you did not address was schema migration. When version two of your product rolls out, you may find that migrating from one schema to another is more difficult with an ODBMS than with an RDBMS.
    Extending a schema may also be more difficult when you need to do bug fixes etc. on a production system.
    I don't pretend to be an expert on the Versant and ObjectStore products currently shipping, but around 3-4 years ago there were significant problems in these areas.
    As an aside, Versant have an interesting product shipping, their VEC product I think. It basically acts as a middle tier database that also handles cache coherency in a cluster and handles fetching data into the middle tier from an RDBMS back-end database. Never used it, never downloaded it. But it's an approach that I was seriously evaluating about 3 years ago.
  4. Costin,
    When I said that I "plan" to evaluate, I meant a full evaluation according to appropriate specifications. My assumptions about ODBMS performance advantages were based on my own application, which had a rather sophisticated data model. It wasn't built in compliance with any existing benchmarking specification, so I can only affirm that my application, with its complex relationship model, really benefited from an ODBMS in the terms that I explained, and maybe it's not an exception.

    Billy,
    When I was talking about the advantages of ODBMS over RDBMS, I meant the advantages of the OO model over the relational model itself. In practice it all depends on the concrete solution.

    Today the ODBMS market is still far from being a threat to RDBMSs due to its comparative immaturity. The ODBMS market is only getting $1B, while the RDBMS market is 1000 times bigger. Nevertheless the ODBMS market is growing rapidly: a year-to-year growth rate of 50% is spectacular (read the IDC reports).

    Most business applications today are well suited to the relational model because their data model is rather simple. And as for performance, I can say that, of course, brands like Oracle, IBM, NCR and some others will give higher performance for these types of applications.

    There are a lot of comparison reports on the internet, like Barry & Associates, Kelvin-dick-Associates, which offer benchmarks on ODBMS and RDBMS for $$$ or even $$$$. Nevertheless none of the major ODBMS vendors has a TPC membership, so there is no official information on benchmarks.
    Here is an opinion from Objectivity regarding TPC:
    "Given any benchmark definition, any vendor can go to great lengths to optimize the implementation of that benchmark, even to change the kernel of his DBMS appropriately. Some of this has happened with the well-known TPC benchmarks which, in any case, lack measurements relevant to many ODBMS applications."

    You can also read (for free) a white paper from Objectivity about how their DB outperformed Oracle. To be fair, this report is far from being "Objectivity": the tests were made on a Pentium 150 processor with 32MB of memory (on NT, of course). So Oracle just couldn't achieve its real performance (my application was running on a P3 900MHz, 256 RAM). Nevertheless it's worth reading.
  5. I've worked with a pure ODBMS approach before (GemStone), and a pure BMP approach with WebLogic.

    Billy's analogy works quite well. With an ODBMS you can FLY like a formula one race car. The problem is that:
    - You can't do ad hoc querying easily
    - You need to know how to write / tweak your own indexes
    - Your tool to analyze your data is *Java*. You don't get to use the wonderful data admin tools out there (like SQL Navigator, DB Artisan, etc.)
    - Your performance settings / analysis tools are limited.

    So if you know (or are willing to learn) what you're doing, an ODBMS works well. I had to learn how to write my own tree-based and hash-based indexes. I had to learn how to write bitmap indexes. I had to learn how to write my own query engine. etc.

    Now Gemstone/J 4.x finally has rudimentary support for the above things so you don't have to write your own. Versant does as well. BUT -- when your requirements get complicated, you have to do your own thing.

    For instance, over-indexing your domain model can create transaction conflict nightmares. How do you reduce conflicts? You can't use the default indexes -- time to write your own pre-written identity set buckets and/or B-Trees with a specialized conflict management policy.

    What if you need to do a query through a 1:N or N:N relationship (i.e. collection)? You can't use the built-in query engine in Gemstone. (Maybe in Versant). In an RDBMS a join would do this just fine. But for our case, we'd have to write our own nested-loop collection lookup, with some sort of keyed-attribute lookup per business object (using reflection on accessors or a pre-made attribute table in the root business object). For indexed cases, we'd have to use our identity set buckets (similar to an RDBMS hashed-join).
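As a rough illustration of the two approaches just described, here is a Java sketch of a query through a 1:N relationship: first as a hand-written nested loop, then through a hand-built index whose buckets are identity sets. The `Customer`/`Order` model and all names are hypothetical, invented for this sketch; it is not GemStone's or Versant's API, and it ignores transactions and conflict management entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class Order {
    final String product;

    Order(String product) {
        this.product = product;
    }
}

final class Customer {
    final String name;
    final List<Order> orders = new ArrayList<>(); // the 1:N relationship

    Customer(String name, String... products) {
        this.name = name;
        for (String p : products) {
            orders.add(new Order(p));
        }
    }
}

final class CollectionQuery {
    // Nested loop: scan every customer, then every order in its collection.
    static List<Customer> nestedLoop(List<Customer> customers, String product) {
        List<Customer> result = new ArrayList<>();
        for (Customer c : customers) {
            for (Order o : c.orders) {
                if (o.product.equals(product)) {
                    result.add(c);
                    break; // one matching order is enough for this customer
                }
            }
        }
        return result;
    }

    // Hand-built index: product -> identity set of customers holding such an order.
    // Membership is by object identity, roughly like the buckets of a hashed join.
    static Map<String, Set<Customer>> buildIndex(List<Customer> customers) {
        Map<String, Set<Customer>> index = new HashMap<>();
        for (Customer c : customers) {
            for (Order o : c.orders) {
                index.computeIfAbsent(o.product,
                        k -> Collections.newSetFromMap(new IdentityHashMap<Customer, Boolean>()))
                     .add(c);
            }
        }
        return index;
    }
}
```

The nested loop needs no maintenance but touches every order on every query; the index answers with a single map lookup but has to be kept up to date on every change to an `orders` collection, which is where the conflict-management problems mentioned above come in.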

    All of this stuff takes time to learn & write, and the ODBMS vendors haven't been responsive with meeting these needs because it is *HARD* to create a universal query / indexing framework for all systems because you're given *so* much room to play with in objects. With an RDBMS you have much less room to play with -- SQL hides everything.

    I think the trick is to leverage an ODBMS if you truly have complex data configurations. Otherwise leverage an O/R mapper with cache, like CocoBase+Gemstone, Persistence PowerTier or TopLink+WebLogic. That way, your read-mostly data can be stored in the cache, and your transactions will be propagated to your back-end data store. This assumes, of course, that your DB doesn't have shared access. If it does, you can drop the cache and just use the plain old O/R mapper -- since you'll have to refresh your entities on each transaction begin.
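The caching idea can be sketched in a few lines of Java. This is a generic read-through/write-through cache written for illustration only, not how CocoBase, PowerTier or TopLink actually work; `BackEndStore` and `CachingMapper` are hypothetical names, and a `HashMap` stands in for the back-end database.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the back-end data store (a real one would be an RDBMS).
final class BackEndStore {
    private final Map<String, String> rows = new HashMap<>();
    int loads = 0; // counts read round trips to the back end

    String load(String key) {
        loads++;
        return rows.get(key);
    }

    void store(String key, String value) {
        rows.put(key, value);
    }
}

// Reads are served from the cache when possible; writes go to the back end first.
final class CachingMapper {
    private final BackEndStore backEnd;
    private final Map<String, String> cache = new HashMap<>();

    CachingMapper(BackEndStore backEnd) {
        this.backEnd = backEnd;
    }

    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: no round trip
        }
        String loaded = backEnd.load(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    void write(String key, String value) {
        backEnd.store(key, value); // propagate to the data store
        cache.put(key, value);     // keep the cached copy in step
    }

    // Needed when other applications can change the database behind our back.
    void refresh(String key) {
        cache.remove(key);
    }
}
```

If the database has shared access, a cached value can go stale without the mapper noticing, which is why in that case you end up calling something like `refresh` at the start of every transaction and the cache stops paying for itself.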

    Writing a performant BMP system is just too freaking tedious to do unless you write your own framework to hide the guck. This has its own set of problems that are almost as difficult as the ones encountered with an ODBMS (i.e. you have to keep track of updates for efficient updating, you need to determine what dependent objects to batch-load vs. lazy load, you need to determine your isolation levels and what impacts they may have on your business methods, etc.)

    Trade-offs are the name of the game.
  6. I think you may find that many issues that existed in the past related to querying, database admin, schema evolution etc. have been addressed, as all of the companies with true OODBMSs have had to address the needs of OO developers. Most companies have targeted the "middleware" market due to the overwhelming dominance of RDBMS. We are staying focused on being the data repository for OO apps while still addressing some of our clients' need for a "transactional" database that interfaces with an OO app and multiple back-end legacy RDBMSs. There is a developing need for true federation as applications are becoming more and more complex. True distributed processing where there is a single logical view will be in more demand as the sophistication of OO developers increases in response to competitive pressure.

    Abraham
    Objectivity
  7. Stu,

    I think the trick is indeed to leverage an ODBMS if you truly have complex data configurations. Otherwise leverage an O/R mapper with cache, like CocoBase+Gemstone, Persistence PowerTier or TopLink+WebLogic. And I think that the Persistence PowerTier transactional (as well as distributed synchronised) cache is quite well suited to handling heavy loads.

    However, another interesting variation could be Versant's approach.
    That is, O-R mapping (for utilising the RDBMS core facilities) for certain data items, as well as persisting the intermediary data generated in an object store.

    And then using a light ODBMS cache for both persistence approaches.

    Regards
    Himanshu Jain


  8. I enjoyed your thoughts and interest in Middleware data and object persistence. I also noticed that you mentioned Versant in your posting.

    Have you gone to www.Versant.com lately and looked at Versant enJin (WebSphere and BEA WebLogic Containers)? This speaks directly to your point. "The ability to apply the object paradigm to middle tier processing and data and content management [within an application server environment] is, in IDC's view, an important element of building effective, intelligent eBusiness solutions."

    This is not a RDBMS vs. ODBMS. . . the view is that they work together within the ebusiness / application server paradigm. . .

    You have the ability to download enJin. . .

    Let me know if this interests you. . .

    Tom
  10. Basil:

    The ODBMS market is actually worth ~US$265 Million from figures from IDC. You should read the very good article by Neal Leavitt in IEEE Computer from August last year: "Whatever Happened to Object-Oriented Databases?". I would expect numbers to tail off and decline as the OODB vendors continue to shift positioning to EJB servers, XML servers, etc.

    On the performance side, I spent many years doing performance benchmarks and also worked with industry. OODBs do offer better performance in some cases, but the total cost of ownership is very high. In other words, finding people with the know-how and retaining them, amongst other things, are real issues to contend with.

    If you are interested in performance benchmarks, I maintain a web page that originally started life as part of my PhD bibliography:

    http://www.soi.city.ac.uk/~akmal/html.dir/benchmarks.html

    KR

    akmal

    --
    [ ---- akmal at soi.city.ac.uk ---- ]
    [ http://www.soi.city.ac.uk/~akmal/ ]
  11. I think the problem is very often put in the wrong terms:
      - what database will better satisfy our persistence needs.

    That is, you model your information in OO terms, create a huge class graph, and then you come to the conclusion that an OO database that will persist your objects almost transparently would be perfect. And it probably should offer better performance for this particular task (persisting objects).

    But if you consider various ways of solving the same problem (i.e. a set of use cases, not the implementation of an OO design), you might be in for some surprises.

    Of course, there could be domains (such as storing engineering information to support CAD/CAE systems) that are apparently better handled by OO databases.

    However, my belief is that for general, "common-purpose" database-related applications, RDBs are better equipped.

    And I wouldn't express my belief in this discussion if there were any significant figure at www.tpc.org posted by an OODB vendor.
  12. Costin:

    I agree that in many cases, one should adopt the "horses for courses"/"pick the right tool for the job" argument. However, there are genuine hurdles that people face in using OODBs which have limited their more widespread use. From my personal experiences, I have summarised some of the reasons why I think they have not done so well in a number of presentations. You can download them from my home page (URL below). In particular, both JSIG talks. OODBs have done well in a number of domains - originally engineering (where the technology has its roots), but in recent years, Telecomms and Finance. If you would like to see more examples of their use please check-out the two books I have co-edited that contain many OODB case studies:

    1. Object Databases in Practice, Prentice-Hall, 1998.
    and
    2. Succeeding with Object Databases, John Wiley, 2001.

    On the issue of TPC numbers, OLTP is not appropriate for OODBs, (which is why no OODB vendor has published TPC numbers and people rarely use OODBs for OLTP in my experience). However, some years ago, as part of my PhD work, I ran the City OLTP Benchmark on a number of OODB products and found that one in particular provided very good throughput. The results were published in the TAPOS journal just over a year ago.

    KR

    akmal

    --
    [ ---- akmal at soi.city.ac.uk ---- ]
    [ http://www.soi.city.ac.uk/~akmal/ ]
  13. Although I initially disagreed with Basil, I have to say upon a better read of what he was saying that he was probably right.

    So, if I got it correctly this time, he was saying that instead of using entity EJBs backed by an RDBMS you're better off writing against an ODBMS as your persistence store with no EJB at all.

    I tentatively have to agree, although I don't have real experience with ODBMSs (leaving aside my modest do-it-yourself toys in Turbo Pascal, and later Delphi).

    What I disagreed with was using an ODBMS to back your EJBs.

    Why do I agree without testing any ODBMS?
    Because I came to the conclusion that EJBs take the worst from the two worlds (relational and OO), so a well-written ODBMS should be able to outperform any kind of faithful J2EE implementation.

    To make myself clear, I'll name the evils EJBs have combined from the two domains.

    From relational model:
           - they don't support inheritance and abstraction.
           At least that's not the part of the model, although you can always find workarounds.

    From OO model:
           - information has to be addressed only as part of full blown entities, thus reducing flexibility and efficiency
           - you cannot define operations on sets
             (My favorite example: try to implement the following SQL in EJB:
               UPDATE <Table> SET _Field_ = <something> WHERE <condition>, and see what monster you get)
           - you get yourself back into procedural lock-in (3rd generation) where in many cases you should be able to take advantage of higher-level (4th generation) languages such as SQL. In plain English, with SQL you define your intended results, which is of course easier, while in 3rd-generation languages you have to specify an algorithm that leads to those results
           - last but not least, they lack a flexible query language to give you access to information.
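The set-update point can be shown with a small Java sketch. It uses plain objects rather than real entity beans, and the `accounts` table, its columns and the `Account` class are hypothetical names chosen for the illustration; with real entity beans each touched object would additionally cost a finder call plus a load and a store.

```java
import java.util.List;

final class Account {
    final String region;
    double rate;

    Account(String region, double rate) {
        this.region = region;
        this.rate = rate;
    }
}

final class BulkUpdate {
    // Declarative form: one statement, and the database decides how to apply it.
    static final String SET_BASED_SQL = "UPDATE accounts SET rate = ? WHERE region = ?";

    // Entity-at-a-time form: the algorithm is spelled out and every
    // matching object is visited and changed individually.
    static int updateRate(List<Account> accounts, String region, double newRate) {
        int touched = 0;
        for (Account a : accounts) {
            if (a.region.equals(region)) {
                a.rate = newRate;
                touched++;
            }
        }
        return touched;
    }
}
```

The loop itself is short; the "monster" appears once each iteration has to go through the container, because the cost then grows with the number of matching entities instead of being a single statement sent to the database.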

    Still, I have to maintain my point that ODBMSs are generally less efficient and less flexible than RDBMSs.

    Probably there could be a model to combine the better of the two worlds, but the big forces in the industry have no incentive to push towards such a model.
  14. Costin, you got it right!
    I think that Java object + ODBMS is sometimes better than EJB + RDBMS.
    If you really need a distributed component model for your application, you use CORBA objects instead. As for CMP, an ODBMS's persistence is beyond comparison.
    And as for BMP, your UPDATE <Table> SET _Field_ = <something> WHERE <condition> monster is a perfect illustration here.
    Most of the EJB advantages that you pay for are the declarative transactions and the security model. Generally I think it's not worth exchanging the flexibility of pure Java and CORBA for these things (and all those evils that you outlined).
  15. > I think that Java object + ODBMS is sometimes better than EJB + RDBMS.


    What about JDO? One reference implementation is available now (file-based) and another is planned for SQL. One can use Java objects and still have an RDB.

    akmal
  16. > Probably there could be a model to combine the better of the two worlds, but the big forces in the industry have no incentive to push towards such a model.


    If it is a database you are talking about, it's called Object-Relational and there are implementations around and available now.

    See, for example, Paul Brown's new book at Amazon.

    akmal
  17. > If it is a database you are talking about, it's called
    > Object-Relational and there are implementations around
    > and available now.

    Well, I certainly was aware of the Object-Relational.
    The book you recommended is subtitled "A plumber's guide" :). This doesn't sound too good.
    As to the Object-Relational "model" itself ... maybe, but maybe better not.

    As regards JDO itself, we can safely say it will eventually be a lesser evil than Entity EJB.

    > One can use Java objects and still have an RDB.

    Of course everybody who programs Java necessarily uses Java objects, and many of us also use an RDB :)
    But that doesn't mean we have to wait for the wise spec designers from Sun to tell us how to :)

    And when those designers interfere with our capability of writing good software, like they already did with EJB, that's just ridiculous.

    When I say interfere I mean: their work is used by marketing departments for making noise, vaporware, hype, persuading customers, project managers, IT journalists and so on.
    Otherwise, they are probably nice guys, I only wish they had to develop real projects against their specs.
  18. > Probably there could be a model to combine the better of the two worlds.


    >> If it is a database you are talking about, it's called Object-Relational and there are implementations around and available now.

     IMHO Object-Relational EXTENSIONS to the relational model (the kernel still remains relational), as they are implemented in all major O-R DBs (IBM DB2 UDB 7.1, Oracle 8i, 9i,..), are only marketing tricks (this is not evolution, but degradation). These extensions are QUITE USELESS in practice, and they lack a theoretical foundation. And what is more, most modern Java extensions to these relational DBs (e.g. Java stored procedures) are the same evil. I know no one who has benefited from them in REAL projects. Instead of making press releases and heaping up object extensions, most RDBMS vendors would do better to implement the full classical DOMAIN model and provide an OR mapping layer such as TopLink for OO support. It is well known that OR mapping tools are really useful (much more than CMP and BMP Entity beans).

     There is another thing that should be considered: products like Javlin, enJin, Jasmine, Ja....
    Indeed they are just the same ODBMSs, but pared down, designed for the middle tier and integrated with J2EE servers.
    It seems that most ODBMS vendors find this marketing move effective for promoting their products (ODBMSs) in new, attractive packaging. That's because most ODBMS vendors are aiming to capture the middleware data niche, I think.
  21. Sorry, but I have to disagree with what you said
    *) O/R mappers are slower than native JDBC: this is a myth. As I often state, people who make these tools _know_ what they are doing, and they spend a lot of time on optimizing performance. Thus, in general, these tools are a lot faster than your own "proprietary" mapping, unless you are willing to really spend a lot of time on optimizing your database calls. Guess how many servers you can buy for that development time?

    Second, an ODBMS is sometimes superior to RDBMS, sometimes not, the simple rule is: the more complex your model is (i.e. the more relations there are between entities) the more the OODBMS will outperform the RDBMS. The flatter your tables are and the more records you have the more will the RDBMS outperform the ODBMS. Or, put another way, the more you navigate between your entities the better the ODBMS will do, the more you search and query the worse it will do. And, conclusion of this: if you design your application carefully, the ODBMS will do great, if you need to add a lot of querying while developing or if your design is bad it will be very slow.
    And, a last thing: Never ever use an ODBMS for simply storing a lot of customers and their orders. This is a waste of time and money.
    IMO it is best to use a combination of RDBMS and ODBMS, but of course this is only applicable if you do not care for the money, and it is AFAIK not possible with the current appservers at all.

    Messi

    As always, this is my opinion, and I just left out IMO in every sentence because it is not very readable :-)
  23. I'm one of the original architects of the ObjectStore commercial
    OODBMS, and one of founders (1988) of the company formerly known as
    Object Design Inc. and now known as eXcelon Corp. I'd like to comment
    on some of the key points in this thread. These are my own personal
    opinions, not by any means the official corporate voice of my
    employer.

    Basil's overall points are very good. In fact, quite a lot of
    ObjectStore customers these days are doing exactly what Basil is
    talking about: using the OODBMS in the middle tier as a fast cache,
    with a back-tier relational DBMS as the "database of record". In
    fact, we have a product called Javlin that works with ObjectStore for
    exactly this kind of configuration. ObjectStore's focus on being
    language-transparent is particularly useful when you're writing
    applications and want to directly manipulate persistent objects as if
    they were ordinary programming-language objects. Our customers have
    had a lot of success with this kind of architecture. Basil, if you'd
    like to follow up, send me email at dlw at exceloncorp dot com and I can put
    you in touch with someone who knows a lot about building this kind of
    system using ObjectStore and Javlin.
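
    The middle-tier arrangement described above (an object store acting as a fast cache in front of a relational "database of record") can be sketched roughly as follows. This is only an illustration under loud assumptions: the class name MiddleTierCache is made up, a plain HashMap stands in for the object database, and a Function stands in for the back-tier RDBMS lookup. It does not use or imitate the ObjectStore or Javlin APIs, and it ignores invalidation, write-back and transactions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class MiddleTierCache<K, V> {
    // Stand-in for the middle-tier object store (assumption: a real system
    // would keep these objects in a persistent object database).
    private final Map<K, V> cache = new HashMap<>();

    // Stand-in for a lookup against the back-tier "database of record".
    private final Function<K, V> databaseOfRecord;

    private int backEndReads = 0;

    public MiddleTierCache(Function<K, V> databaseOfRecord) {
        this.databaseOfRecord = databaseOfRecord;
    }

    // Read-through: serve from the cache, fall back to the back end on a miss.
    public V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = databaseOfRecord.apply(key);
            backEndReads++;
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // How many times the back end was actually consulted.
    public int backEndReads() {
        return backEndReads;
    }

    public static void main(String[] args) {
        MiddleTierCache<Integer, String> customers =
                new MiddleTierCache<>(id -> "customer-" + id);
        customers.get(7);   // miss: goes to the back end
        customers.get(7);   // hit: served from the middle tier
        System.out.println(customers.backEndReads());
    }
}
```

    Repeated reads of the same key reach the back end only once; everything after that is served from the middle tier, which is the whole point of the configuration.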

    So why have object databases not taken over the world? It's a long
    story, but I think the most severe problem is what economists call
    "network effects". If you make a new kind of car that works on Fuel
    X, nobody will buy it because there isn't a network of Fuel X gas
    stations, and nobody will open a Fuel X gas station because nobody has
    any of the new cars.

    Indeed, as Akmal Chaudhri says, it is harder to find and hire people
    who know all about ObjectStore than to find and hire people who know
    all about Oracle. It is harder to find third-party software that runs
    with object DBMS's than third-party software that runs with relational
    DBMS's. Both of these effects naturally encourage customers to buy
    relational DBMS's, and the same kind of positive feedback loop
    happens. I'm not saying that this is unfair -- this is the way the
    world works, for perfectly rational reasons.

    Right now I am at home using a Dell PC running Windows 98. It's not
    because I think Windows 98 is the most technologically superior
    operating system available. It's because there's so much software
    available for it, including the office applications my wife needs and
    the games my son likes. (Of course there are lots of other reasons,
    but many are along the same lines. I'm just trying to draw a broad
    analogy, not start a discussion about Linux.)

    A lot of my friends used to work at Apple on a new programming
    language called Dylan, which was technically a wonderful language.
    But I never saw how they could get past the network effects and gain a
    substantial body of users. Java used a variety of powerful phenomena
    to force itself on the scene: if Netscape had not made it possible to
    execute Java in their browser, Java might have never managed to punch
    through the network-effect barrier. (There were other special
    circumstances too, but I'll try to stick to the subject.)

    Also, migrating from one kind of DBMS to another is harder than
    migrating from, say, one Web browser to another. Once a company has
    set up its corporate information infrastructure based on a relational
    DBMS, they almost always have no interest whatsoever in considering
    uprooting it and losing a huge investment in past work. One can
    hardly blame them.

    But even more important: even when we first started Object Design, we
    never really thought we'd displace relational databases. Our idea was
    to produce a DBMS suitable for applications for which the relational
    DBMS's were not so well suited, such as CAD, CASE, and so on. However, the
    market didn't work out quite that way; CAD was a respectable but not
    dominant area for sales of our product. Unfortunately, there hasn't
    been any single dominant application area, so we have had to extend
    the system in many ways, and try to adapt it to new application areas
    for which it was not originally designed. We've done a lot of that
    over the years, and ObjectStore is a lot more flexible now than it
    used to be, but it's a never-ending job. In the future I think you
    can expect to see us focusing on doing fewer things, better.

    Meanwhile I agree with the quotation from Objectivity that Basil cited
    regarding the TPC benchmarks: we have never been trying to compete in
    that space so those benchmarks are inappropriate. ObjectStore is not
    trying to be a better relational DBMS.

    Another problem has been lack of standardization, which more than
    anything else is due to major differences in technical philosophy
    between the companies in the business. The different OODBMS's are
    more technically disparate in their overall conception than the
    different RDBMS's.

    These days our company has two divisions: the Object Design division
    in charge mainly of the OODBMS itself and Javlin, and the eXcelon
    division which does XML and B2B products. These latter products use
    the OODBMS inside, but users of the product don't see it directly.
    The XML world provides a useful set of standards that we can make good
    use of in presenting a product whose overall concept is easier to
    explain, and fits in a more obvious way with other software systems.

    Regarding some of the specific points discussed in this thread: I
    don't remember much any more about the IBM Team Connection
    application, but I do remember IBM Flowmark. Yes, they did have
    problems with ObjectStore, but many of them were because they were
    using ObjectStore not as it was intended to be used. (For example,
    Flowmark would create very large numbers of "segments", which created
    a lot of trouble with the release of ObjectStore that they were using
    then.)

    This sort of thing is partly a consequence of there being fewer people
    around who know "how ObjectStore is intended to be used" (as I was
    saying above about the network effects), and partly due to the complex
    and weird inter-corporate relationship between IBM and us at the time.
    These days we make a greater effort to make sure our customers get
    training and/or extra help (systems engineers, consultants, whatever)
    when making these architectural decisions, so that applications can
    be designed from the start to work harmoniously with the OODBMS.

    A lot of what Stu Charlton says is fair, but specifically regarding
    writing your own indexing software: you don't have to write your own
    indexes with ObjectStore, unless you want to implement an exotic kind
    (say, R-trees for doing queries about rectangular sections of
    two-dimensional spaces). We really do have extensible-hash and B-tree
    indexes, built in, and the query processor automatically uses whatever
    indexes exist that apply to your query. Yes, we do have a query
    processor and query optimization; yes, it's not strong on very
    semantically complex queries. But if you actually do need R-trees,
    you're a whole lot better off with ObjectStore than you'd be with any
    conventional relational DBMS.

  24. Hi Daniel,

    I'm very grateful that you shared with us some insights from an OODBMS insider.
    My profile is mainly as a Java developer, though I have studied the inner workings of relational databases extensively (I even got a DBA certification for fun a couple of years ago), so I'm eager to discuss this a little further with you, if you like.

    1. My understanding is that OO query languages are semantically less powerful (expressive) than SQL, is this right?

    2. If you have an O/R persistence tier or middleware cache, whatever you call it, in front of an RDBMS, don't you risk duplicating the inner workings of the RDBMS (concurrency control and transaction management)?
    You can see that with EJB app servers, which say a lot about handling concurrency control for entity beans, while it is clear that, because of the ejbLoad and ejbStore calls at the beginning and at the end of a transaction, the RDBMS will do that anyway.

    3. There's a fundamental problem with OODBMS's:
    while a relational database lets you manipulate chunks of information (individual attributes) and sets of entities (operations on sets), and lets you regroup information into result sets of almost arbitrary structure (like the joins you do in reports), the OODBMS, because of its OO nature, lets you manipulate only instances (full-blown entities), for which you have a defined type with a defined set of operations. Thus certain types of operations are more expensive because of this inflexibility.
    So how is this problem solved? Is it possible to have a "dynamically typed" OODBMS, or an OODBMS that exposes the static view of object data to a more powerful query language like SQL?

      
  25. >> 1. My understanding is that OO query languages are
    >> semantically less powerful (expressive) than SQL,
    >> is this right?

    Costin,
    This is one of the myths about ODBMS. There is an ODMG standard query language for object databases: OQL.
    It is not less powerful than base SQL, simply because it is a superset of it. But only one of the major ODBMS vendors, POET, supports it. (There was also Ardent's O2, but since Informix bought Ardent, they dropped support for it.)
    Nevertheless, each of the ODBMS vendors that do not support OQL supports its own QL (eXcelon, Objectivity, Versant). But since all these vendors are members of the ODMG, they are the ones making standards like OQL.
    Contradiction?
    As for the query optimizer, yes, it is much less effective than those you have in an RDBMS. But this is the price that you pay for OO freedom. Most optimization is up to you.
    Another point to consider here is complex queries.
    Using Teradata, for example, I can do this:
    (This query reports the amount of discount sales revenue that was received in each discount category)

    select
    sum(case when l_discount = 0.01
            then (l_extendedprice * l_discount)
            else 0 end (decimal (18,2)))(named DISC_1_PCT),
    sum(case when l_discount = 0.02
            then (l_extendedprice * l_discount)
            else 0 end (decimal (18,2))) (named DISC_2_PCT),
    ...
    from item
    where l_comment not like '%Ignore Discount%';

    AFAIK you can't do such things as you do with RDBMS, but you can achieve the same result with ODBMS if YOU build the MODEL not the QUERY in appropriate way.
    Nevertheless, RDBMSs completely dominate the data warehousing and data mining realm. That is why there are no MISes built on top of an ODBMS.
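
    As a rough illustration of "building the model, not the query", the same report can be computed by walking the objects in application code. This is a minimal sketch, not any vendor's API: the LineItem class and the revenueByDiscount method are hypothetical, their fields simply mirror the columns used in the Teradata query, and the result is grouped by whatever discount values occur instead of being pivoted into fixed DISC_n_PCT columns.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DiscountRevenue {
    // Hypothetical persistent class mirroring the columns of the query above.
    static class LineItem {
        final BigDecimal extendedPrice;
        final BigDecimal discount;
        final String comment;

        LineItem(String extendedPrice, String discount, String comment) {
            this.extendedPrice = new BigDecimal(extendedPrice);
            this.discount = new BigDecimal(discount);
            this.comment = comment;
        }
    }

    // Sums extendedPrice * discount per discount value, skipping items whose
    // comment contains "Ignore Discount" (a case-sensitive match here).
    static Map<BigDecimal, BigDecimal> revenueByDiscount(List<LineItem> items) {
        // TreeMap orders keys with compareTo, so 0.01 and 0.010 share one entry.
        Map<BigDecimal, BigDecimal> totals = new TreeMap<>();
        for (LineItem item : items) {
            if (item.comment.contains("Ignore Discount")) {
                continue;
            }
            BigDecimal revenue = item.extendedPrice.multiply(item.discount);
            totals.merge(item.discount, revenue, BigDecimal::add);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<LineItem> items = Arrays.asList(
                new LineItem("100.00", "0.01", "regular"),
                new LineItem("200.00", "0.01", "regular"),
                new LineItem("50.00", "0.02", "Ignore Discount for this row"),
                new LineItem("300.00", "0.02", "regular"));
        System.out.println(revenueByDiscount(items));
    }
}
```

    The trade-off is the one discussed in this thread: the aggregation logic now lives in hand-written application code, and nothing optimizes it for you.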

    I'm glad to see Daniel here participating in this discussion. So here are my questions to him, as one of the architects of ObjectStore.
    1) Why do you provide your own QL instead of supporting the ODMG standard?
    2) Do you think that the MIS market can still be attractive to ODBMS vendors?
  26. Hi Basil,

    I very much disagree with some of your points.

    Like "But this is the price that you pay for OO freedom".

    I very much doubt that OO models offer you that much freedom. Simply put, objects are constrained to their implementation language, or, if you expose the database objects using CORBA or some equivalent, you are back to the Entity Bean design problems.
    My understanding is that ODBMSs in general ship collections of objects which are instantiated in the client's address space.
    Also the queries generally retrieve objects only (within the classes you pre-determined in the database schema).
    This is far from "freedom" as I see it.

    "AFAIK you can't do such things as you do with RDBMS, but you can achieve the same result with ODBMS if YOU build the MODEL not the QUERY in appropriate way. "

    This is one of the promises of fourth generation languages. You specify the desired results (the QUERY). The MODEL has to be there anyway, either relational/OO/network/hierarchical/ISAM or whatever.

    It should not be necessary to reimplement algorithms over and over again for common data access patterns. Powerful query engines and powerful query optimizers are therefore a definite gain of the RDBMS world, and if ODBMSs think of themselves as the next step (or as logically more advanced), they should never throw them away.
    Apparently they do.
    I understand this downgrade is sometimes justified by very particular use-cases.
    But it shouldn't be presented the way you do.

    As to the ODMG standard I have only one comment:
    You have to buy the ODMG book to read this standard.
    It's not the money, but the idea itself.

    And since there are so many things one can choose to learn, I definitely prefer RDBMS docs, which are free (not to mention developer's licenses, which are also free) and much more detailed, to the point where Oracle, for instance, almost gives you the equivalent of their source code.

    I'm sorry, but for my taste ODMG does a bad job of promoting their standard, so I'll learn the ODMG standard only if my next job depends on it.
    Although I'm very open to learning new things, I find ODMG's motives hard to swallow.
    Did they make that standard to collect royalties on the book?
    Since they made the book, I assume the contract they have with the publisher prevents them from making a PDF file available to the public.
    Very confusing, and very bad for object databases.
  27. I used OD's ObjectStore extensively; my previous company was a "pure OO shop" and we used ObjectStore as the backend of our server architecture. However, in the end we reluctantly switched to a hybrid solution of OODBMS for business data persistence and RDBMS for logging persistence. We could not avoid "evil" Oracle because they provided better performance, flexibility, portability and reliability.

    Of course, it didn't help that eXcelon decided to shift its focus away from ObjectStore and into messaging middleware... unfortunately the ODBMS market just wasn't catching on.

    Well, Oracle and other RDBMS vendors had over 25 years to hone their craft, so I hope ODBMS vendors will, pardon the pun, persist long enough to develop a superior alternative!
  28. Historically, the ORDBMS grew out of the failure of ODBMSs, which began to be released in the late 1980s, to win acceptance from corporate users. The ODBMS has limitations that can prevent it from taking on enterprise-wide tasks. For one thing, it doesn't share a standard query language like SQL. Secondly, it's not as scalable as the RDBMS. ODBMSs have their uses: they work well for individual users or very small groups. But at a certain point, they croak on you. ODBMS-based applications normally perform well up to about 20 to 30 users or 5GB of data.

    As well as being far more scalable, the RDBMS is usually superior in the areas of performance, security, integrity and availability. RDBMS vendors have spent years perfecting these features. They won't want to throw that research out.

    By storing objects in the object side of the ORDBMS but keeping the simpler data in the relational side, users may approach the best of both worlds. That's the reason the Wheel Trans Division opted for CA-Ingres. For example, the application stores customer profiles, which include address and phone number, eligibility, previous destinations, how often they've canceled rides without notice and so on, in relational tables.

    Think about it, man!

    ODBMS is cool, but it ain't the panacea you purport it to be.
  29. I enjoyed your thoughts and interest in Middleware data and object persistence. I also noticed that you mentioned Versant in your posting.

    Have you gone to www.Versant.com lately and looked at Versant enJin (WebSphere and BEA WebLogic Containers)? This speaks directly to your point. "The ability to apply the object paradigm to middle tier processing and data and content management [within an application server environment] is, in IDC's view, an important element of building effective, intelligent eBusiness solutions."

    You have the ability to download enJin. . .

    Let me know if this interests you. . .

    Tom