Discussions

News: Graph Paging: Are we doing O/R mapping wrong?

  1. Graph Paging: Are we doing O/R mapping wrong? (113 messages)

    Brian McCallister has been playing with JDO 2 fetch groups, ZODB, and Prevayler, thinking about TranQL, and looking at TORPEDO. Then something clicked: we may all be doing O/R mapping wrong. Seriously, we probably are. Brian discusses his thoughts.
    The current popular approach is a thin wrapper around JDBC. It is what OJB, Hibernate, and JPOX all do. I cannot comment on Kodo and TopLink as I cannot go browse around their sources, but I suspect it is the same. This is how we are used to thinking about it -- the objects you get are basically a stream (or collection) of database results.

    This isn't really what they are though. They are really closer to a swapped-in page of the entire object graph. The query mechanism for the object graph, and the query mechanism for the backend, get confused (in the con-fuse sense). The JDO spec has the right idea in separating object queries from persistence store queries (I do tend to agree with Gavin King that the JDOQL query language itself is somewhat less than elegant). The editing context can contain more or less than has been queried for, as long as what is accessed is available when it is needed.

    When you need to obtain a handle on an instance, a query language is bloody useful. OGNL defines a better object query language than either OQL, JDOQL or HQL, though -- if you are talking purely objects. HQL evolved as it did to avoid the loss inherent in this abstraction, though, and works nicely. You are querying into the editing context though, and the context can determine, separately from the exact query, what it does not already have loaded (thank you Jeremy and Dain). This is a lot of work, probably best done in a Haskell-style language optimized for doing fun maths rather than pushing bits.
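    To make the fetch-group idea concrete, here is a minimal sketch (not Brian's code) of how JDO 2 keeps the object query separate from the hint about how much of the graph to page in with it. The domain classes and the "orderWithLines" fetch group are assumptions for the example; real classes would be enhanced and declared in metadata.

      import java.util.Collection;
      import java.util.List;
      import java.util.Properties;
      import javax.jdo.JDOHelper;
      import javax.jdo.PersistenceManager;
      import javax.jdo.PersistenceManagerFactory;
      import javax.jdo.Query;

      // Illustrative persistent classes only.
      class Customer { String name; }
      class OrderLine { String product; }
      class Order { Customer customer; List<OrderLine> lines; }

      public class FetchGroupSketch {
          public static void main(String[] args) {
              Properties props = new Properties(); // real driver/URL properties would go here
              PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(props);
              PersistenceManager pm = pmf.getPersistenceManager();
              try {
                  // The object query says which Orders we want...
                  Query query = pm.newQuery(Order.class, "customer.name == :name");

                  // ...while the fetch plan hints how much of the graph to page in with them.
                  pm.getFetchPlan().addGroup("orderWithLines");

                  Collection<?> orders = (Collection<?>) query.execute("Acme");
                  // Each Order's lines are now loaded without a second round trip.
              } finally {
                  pm.close();
              }
          }
      }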

    Read Graph Paging: Are we doing O/R mapping wrong?

    Threaded Messages (113)

  2. Is O/R mapping just plain wrong?[ Go to top ]

    I agree with Brian McCallister that an O/R mapping should hide the data repository behind it; but I would extend it further: I consider that whatever the data source (CSV file, XML document, RDBMS, and many others I forget), you should query your in-memory copy with the same language. After all, it's all data, right? That's why I'd like it to be XPath, because it works for any form of data, tables or trees or whatever.
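    For instance, once any of those sources has been loaded into an in-memory DOM, the same XPath expression works against it no matter where the data came from. A minimal sketch; the file and element names are made up:

      import java.io.File;
      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.xpath.XPath;
      import javax.xml.xpath.XPathConstants;
      import javax.xml.xpath.XPathFactory;
      import org.w3c.dom.Document;
      import org.w3c.dom.NodeList;

      public class XPathQuerySketch {
          public static void main(String[] args) throws Exception {
              // Load the in-memory copy (it could just as well have been built from CSV or JDBC results).
              Document doc = DocumentBuilderFactory.newInstance()
                      .newDocumentBuilder()
                      .parse(new File("orders.xml"));

              XPath xpath = XPathFactory.newInstance().newXPath();
              // One query language, regardless of the original data source.
              NodeList overdue = (NodeList) xpath.evaluate(
                      "/orders/order[@status='overdue']", doc, XPathConstants.NODESET);
              System.out.println("Overdue orders: " + overdue.getLength());
          }
      }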

    My 2 cents...
  3. See OO vs RDBMS debates[ Go to top ]

    That's why OO databases were set to be popular, but then weren't. Trying to do the same sort of functionality over JDBC probably isn't worth the effort. In Hibernate, for example, you can't have an object in multiple sessions; if you keep an in-memory graph, any object will definitely be in multiple sessions.
  4. Trying to make a persistence layer that is independent of database type is a waste of money and time. Relational databases are superior, and trying to hide the fact that a relational database is used will also hide a lot of relational features.

    O/R mapping is wrong because it tries to convert the relational model into an object model with fewer features. A better approach is making an object model on top of JDBC (like http://butler.sourceforge.net). An object model like this should have classes like Table, Record, ForeignKey, Query, etc. This model will help you with statement and value-object population, but still not lose any features of the relational database.
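    A rough sketch of what such a thin object model over JDBC might look like; the class names echo the suggestion above but are hypothetical, not Butler's actual API:

      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.sql.ResultSet;
      import java.sql.SQLException;
      import java.util.ArrayList;
      import java.util.HashMap;
      import java.util.List;
      import java.util.Map;

      class Table {
          final String name;
          Table(String name) { this.name = name; }
      }

      class Record {
          final Map<String, Object> values = new HashMap<String, Object>();
          Object get(String column) { return values.get(column); }
      }

      class Query {
          private final Table table;
          private String where = "1=1";
          private final List<Object> params = new ArrayList<Object>();

          Query(Table table) { this.table = table; }

          Query where(String condition, Object... args) {
              this.where = condition;
              for (Object a : args) params.add(a);
              return this;
          }

          // Populates Records straight from the ResultSet; the relational model stays visible.
          // Usage: new Query(new Table("CUSTOMER")).where("COUNTRY = ?", "NO").execute(con);
          List<Record> execute(Connection con) throws SQLException {
              PreparedStatement ps = con.prepareStatement(
                      "SELECT * FROM " + table.name + " WHERE " + where);
              try {
                  for (int i = 0; i < params.size(); i++) ps.setObject(i + 1, params.get(i));
                  ResultSet rs = ps.executeQuery();
                  List<Record> result = new ArrayList<Record>();
                  while (rs.next()) {
                      Record r = new Record();
                      for (int c = 1; c <= rs.getMetaData().getColumnCount(); c++) {
                          r.values.put(rs.getMetaData().getColumnLabel(c), rs.getObject(c));
                      }
                      result.add(r);
                  }
                  return result;
              } finally {
                  ps.close();
              }
          }
      }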
  5. Is O/R mapping just plain wrong?[ Go to top ]

    Trying to make a persistence layer that is independent of database type is a waste of money and time.
    But since the time and money are already spent, might as well use them. :)
     Relational databases are superior
    Superior to what? Flat files? Depends. They are not superior for persisting an object model.
    and trying to hide the fact that a relational database is used will also hide a lot of relational features. O/R mapping is wrong because it tries to convert the relational model into an object model with fewer features.
    Sure would be nice if we actually had a persistence tool to match our development language.

    An OO model doesn't have fewer features. Just different ones.

    Hiding features? What features do you consistently need from the RDBMS that would outweigh using an ORM?
    A better approach is making an object model on top of JDBC (like http://butler.sourceforge.net).
    You wouldn't happen to have a hand in developing that would you? :)
     An object model like this should have classes like Table, Record, ForeignKey, Query, etc.
    Sure sounds like an object model of the DB model. Much like already exists in JDBC.
    This model will help you with statement and value-object population, but still not lose any features of the relational database.
    Value Objects? Uh-Oh.
  6. Is O/R mapping just plain wrong?[ Go to top ]

    Relational databases are superior
    Superior to what? Flat files? Depends. They are not superior for persisting an object model.
    Relational databases are superior to flat files because they have:
    * Indexing
    * Query possibilities
    * Record level concurrency control
    * Transactions
    * Etc, etc

    The relational model is superior to the network and hierarchical model used in object databases because of its querying capabilities.
    Hiding features? What features do you consistently need from the RDBMS that would outweigh using an ORM?
    Querying. The query languages in ORM products (HQL, EQL, JDO-QL) are mostly not powerful enough, and when they are, they are almost exact copies of SQL. The same thing, but with a different name.
    An object model like this should have classes like Table, Record, ForeignKey, Query, etc.
    Sure sounds like an object model of the DB model. Much like already exists in JDBC.
    Not at all. In JDBC you are sending strings to the database.
    Value Objects? Uh-Oh.
    Almost all design patterns for enterprise applications use value objects. Have a look at www.javasoft.com.
  7. Is O/R mapping just plain wrong?[ Go to top ]

    Relational databases are superior to flat files because they have: * Indexing * Query possibilities * Record level concurrency control * Transactions * Etc, etc
    Sure. If you need all that. :) Sigh. The point was sometimes it really isn't superior if you have static info and ... . Using OO techniques you can hide the implementation and change/test/... as needed.
    The relational model is superior to the network and hierarchical model used in object databases because of its querying capabilities.
    Unless they have good querying capabilities.
    Querying. The query languages in ORM products (HQL, EQL, JDO-QL) are mostly not powerful enough, and when they are, they are almost exact copies of SQL.
    With Hibernate, you can use SQL if necessary. It seldom if ever is. Difference between HQL and SQL - You don't need to change the HQL everywhere when the schema changes. Had it happen big time just recently. Shocked the snot out of the DBAs when we told them we were done. :)
    Not at all. In JDBC you are sending strings to the database.
    Ok.

     
    Almost all design patterns for enterprise applications use value objects. Have a look at www.javasoft.com.
    As discussed elsewhere, this is an Anti-pattern.

    http://java.sun.com/blueprints/patterns/TransferObject.html
    "This class is used as the return type of a remote business method." & "Fetching multiple values in one server roundtrip decreases network traffic and minimizes latency and server resource usage."

    Check out Martin Fowler's Patterns of Enterprise Application Architecture on how to use them IF you need them.
  8. Is O/R mapping just plain wrong?[ Go to top ]

    The relational model is superior to the network and hierarchical model used in object databases because of its querying capabilities.
    Unless they have good querying capabilities.
    Read a book about database fundamentals and you will understand why the network and hierarchical paradigms were abandoned 25 years ago.
    Difference between HQL and SQL - You don't need to change the HQL everywhere when the schema changes.
    This is not true. You can do a lot of schema changes in a relational database without changing the SQL statements.
  9. Is O/R mapping just plain wrong?[ Go to top ]

    Read a book about database fundamentals and you will understand why the network and hierarchical paradigms were abandoned 25 years ago.
    Please, drop the silliness. Don't go telling people to read un-named books. Don't pretend you're smarter than everyone else. (You may be, but telling people that they are idiots doesn't help them want to understand what you're saying.)

    Regarding your claims, I would say that SQL-based RDBMS systems (such as Oracle, DB2, Sybase, etc.) are by far the best general purpose solutions out there today. Most applications use that form of database for persistent storage etc. of data.

    However, network and hierarchical paradigms were not abandoned -- they just moved into the solutions niches in which they belonged. I see these types of systems (often embedded) in massive use, but for specific problems.

    Some day, IMHO, relational systems will move into their own best-fit niche, and whatever replaces them will eventually move into its niche. Progress is hard to stop in this industry .. and there are a lot of good ideas for moving past the limitations of the relational model.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  10. Is O/R mapping just plain wrong?[ Go to top ]

    I am not arguing for the use of relational databases everywhere. But I am arguing for not hiding the fact that you are working with such a database. Most applications don't need to be neutral to database paradigm. Are you trying to say that applications need to be able to switch to another type of database, because relational databases soon will be obsolete?
  11. Is O/R mapping just plain wrong?[ Go to top ]

    I am not arguing for the use of relational databases everywhere. But I am arguing for not hiding the fact that you are working with such a database. Most applications don't need to be neutral to database paradigm.
    (not trying to sound snotty but)
    I would say most applications don't need to be database vendor or database paradigm dependent.

    Would you say applications should be OS independent? How about application server independent?
    Are you trying to say that applications need to be able to switch to another type of database, because relational databases soon will be obsolete?
    I doubt soon. But the less we do about it soon, the less likely it will happen soon. And the more difficult it will be later. It is more about separating the layers and the fact that the relational model doesn't fit well with the OO one. And the other way around. I know my answer is simplistic. But I'm not sure I have the time and space to fully develop the thought.

    I was thinking about this last night. I was wondering how many people here who support the "data" concept have done it with a Java client (Swing, SWT, Echo, etc.), and how well it worked.
  12. Is O/R mapping just plain wrong?[ Go to top ]

    I am not arguing for the use of relational databases everywhere. But I am arguing for not hiding the fact that you are working with such a database.
    I think that is a perfectly valid approach for some applications. I'd even take it a step further and say that it's pointless to hide *which* database you are using in some applications. For example, knowing that you are using a certain version of Oracle, you can do all sorts of amazing and non-standard optimizations that can make orders-of-magnitude differences in performance.

    I'm not claiming to know what percentage of apps should go down and exploit what level of functionality. I know that there are many use cases for which the application should not ignore the native capabilities of the selected RDBMS.
    Most applications don't need to be neutral to database paradigm. Are you trying to say that applications need to be able to switch to another type of database, because relational databases soon will be obsolete?
    No. I agree that most applications don't need to be neutral.

    OTOH, for a lot of applications, it can help the quality of the application to build it on top of an object model for the data, allowing the programmers to work with the abstractions that they are used to. It's not a dissimilar concept from SOA, which allows developers to abstract to an even higher level (services) .. you could say that domain object modeling is to ER as SOA is to stored procedures. In their particular eras, both were used to model and encapsulate certain types of design and functional efforts.

    I think the important thing for an engineer is to understand the requirements well for each application that they're working on, and to be open to using new ideas when they are the most effective, and also open to ignoring the siren call of new ideas when they are not effective solutions. I call it UYB programming. I'll probably write a book about it some day ;-)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  13. Is O/R mapping just plain wrong?[ Go to top ]

    Read a book about database fundamentals and you will understand why the network and hierarchical paradigms were abandoned 25 years ago.
    Weird. 'Cause less than 10 years ago I was using IMS. And it hasn't gone away in the financial industry. Great for "data processing". Not for reporting.

    And I do know DB fundamentals and have lots of training in it. Worked with DB2 (MF, OS/2, Windows, Unix, AS/400), Oracle (Windows, Unix, Linux), SQL Server, Sybase, IMS, Adabas, and some "minor" ones.
    Difference between HQL and SQL - You don't need to change the HQL everywhere when the schema changes.
    This is not true. You can do a lot of schema changes in a relational database without changing the SQL statements.
    First, I didn't say you needed to change the SQL every time. Second, I should have said table names and field names to be a little more precise. Sorry. Yeah, changing the field size is a schema change.
  14. Is O/R mapping just plain wrong?[ Go to top ]

    Weird. 'Cause less than 10 years ago I was using IMS. And it hasn't gone away in the financial industry.
    There are still a lot of COBOL applications in the financial industry too. Is this your programming language of choice too? There are a lot of examples of old and obsolete technologies still in use in old applications, but that is not an argument for using them in new applications.
    Second, I should have said table names and field names to be a little more precise.
    So you want to change table and column names. Even if I don't understand why, this is still very easy to solve. Create a new view with the old table and column names, and the old SQL statements will still work.
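    For example, assuming a hypothetical rename of CUSTOMER/CUST_NAME to CLIENT/CLIENT_NAME, a compatibility view with the old names keeps existing statements working:

      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.Statement;

      public class CompatibilityView {
          public static void main(String[] args) throws Exception {
              // Placeholder connection details.
              Connection con = DriverManager.getConnection("jdbc:yourdb://host/db", "user", "pass");
              Statement st = con.createStatement();
              // Old SQL that still says "SELECT CUST_ID, CUST_NAME FROM CUSTOMER" keeps working.
              st.executeUpdate(
                      "CREATE VIEW CUSTOMER (CUST_ID, CUST_NAME) AS " +
                      "SELECT CLIENT_ID, CLIENT_NAME FROM CLIENT");
              st.close();
              con.close();
          }
      }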
  15. Is O/R mapping just plain wrong?[ Go to top ]

    Weird. 'Cause less than 10 years ago I was using IMS. And it hasn't gone away in the financial industry.
    There are still a lot of COBOL applications in the financial industry too. Is this your programming language of choice too? There are a lot of examples of old and obsolete technologies still in use in old applications, but that is not an argument for using them in new applications.
    You had said abandoned. Did you mean something else? As for using COBOL in new apps - if one is going to deal with "data" one should use a language suited to dealing with "data" as "data". And COBOL is much better at it than Java.

    BTW, for those of you still doing COBOL, but like Eclipse - check out this COBOL plugin - http://www.lemosoft.com/prod_descr.php
    Second, I should have said table names and field names to be a little more precise.
    So you want to change table and column names. Even if I don't understand why, this is still very easy to solve. Create a new view with the old table and column names, and the old SQL statements will still work.
    I usually don't want to. But sometimes business needs change. And thus, so do DBs. Sometimes one gets a command from on high that the naming standards are changing. Or one needs to switch from Sybase to DB2 because it doesn't perform and the DB2 DBAs provide better support. But they have different naming standards. Happened this year.
  16. Is O/R mapping just plain wrong?[ Go to top ]

    As for using COBOL in new apps - If one is going to deal with "data" one should use a language suited to dealing with "data" as "data". And COBOL is much better at it than Java.
    Actually I work a little bit with maintaining COBOL applications too. And trust me, Java is better in every way, except for performance. The problem with Java is that it has been hijacked by OO evangelists who force programmers to write meaningless getters and setters, and strongly resist any attempt to use embedded SQL (SQLJ).
    Sometimes one gets a command from on high that the naming standards are changing.
    If your bosses have nothing else to do but tell you to change table and column names, you are a very poor employee.
    Or one needs to switch from Sybase to DB2 because it doesn't perform and the DB2 DBAs provide better support. But they have different naming standards. Happened this year.
    Yes, I know DB2 doesn't allow more than 8 positions in identifier names. Some database vendors do their best not to follow standards (ANSI). But assuming that you are working with ANSI SQL databases, this is not a problem. And it is also good practice not to have identifier names that are too long, or strange characters, because of problems with portability. Anyway, I am working with an application that runs on several different databases, and there is no major problem writing SQL statements that are portable.
  17. Is O/R mapping just plain wrong?[ Go to top ]

    Trying to make a persistence layer that is independent of database type is a waste of money and time. Relational databases are superior, and trying to hide the fact that a relational database is used will also hide a lot of relational features.
    As far as I can tell, relational databases have never been "superior" at anything (except market penetration); rather they are "good enough" for most purposes, just as all general purpose solutions are, including "Java" as a language and "Unix" as an operating system. If you think that relational databases are fast, you've obviously never tried to get large amounts of data out of them, or sort large amounts of data in them. It took one company that I know of a week just to "SELECT" the data from one of their tables with a simple one-column sort, and most of the operations that they wanted to do with their data were not possible to do within a human lifetime using their database. The same tasks, being performed by a specialized data mining tool, took a few hours.

    As for your comment, ORM isn't intended to "hide the fact that a relational database is used," rather it tries to solve the problem that a relational database is being used in a world that is composed of objects.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  18. This is how we are used to thinking about it -- the objects you get are basically a stream (or collection) of database results. This isn't really what they are though. They are really closer to a swapped in page of the entire object graph.
    This reminds me of a work called Hybrid Adaptive Caching done at MIT by Barbara Liskov:

    http://www.pmg.lcs.mit.edu/~castro/hac-sosp97/published.html

    It's about a distributed object database. I think the problem it is trying to solve is balancing the conflicting requirements of using pages vs. using single objects.

    As for myself, after years of dealing with the so-called impedance mismatch, I think the following:

    - Mapping RDBMSs into objects isn't worth it. Even though Java is my bread-and-butter I now think that DBAs should be left alone to design databases as they see fit. The application should stop pretending that the data is anything other than data (i.e. no behavior.)

    - Query languages should not be used inside applications. SQL is good for manual queries or for stored procs, not for applications.

    - For applications there should be a much lower-level API with the primitives of relational algebra, e.g. cartesian products, projections, etc. And there should be explicit support for hints. In fact, these days everybody should support nested-loop joins, hash joins, etc., so I should be able to specify those.
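    To make that concrete, here is a rough sketch of what such a lower-level API could look like: relational-algebra primitives plus an explicit join hint, rendered as Java. Every name here is hypothetical; no such standard API exists.

      import java.util.List;
      import java.util.Map;

      enum JoinHint { NESTED_LOOP, HASH, MERGE }

      interface Relation {
          Relation project(String... columns);                      // projection
          Relation select(String predicate, Object... args);        // restriction
          Relation product(Relation other);                         // cartesian product
          Relation join(Relation other, String on, JoinHint hint);  // join with an explicit hint
          List<Map<String, Object>> rows();                         // materialize plain rows
      }

      class RelationalSketch {
          static List<Map<String, Object>> bigCustomers(Relation orders, Relation customers) {
              return orders
                      .join(customers, "orders.customer_id = customers.id", JoinHint.HASH)
                      .select("orders.total > ?", 1000)
                      .project("customers.name", "orders.total")
                      .rows(); // no object mapping involved, just column/value maps
          }
      }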

    You can also use O/R mappings successfully (I always use them), but it often seems awkward.

    Guglielmo
  19. As for myself, after years of dealing with the so-called impedance mismatch, I think the following: [...] The application should stop pretending that the data is anything other than data (i.e. no behavior.)

     Guglielmo
    Couldn't express that better than you did - I absolutely agree with you.
    Other opinions out there?

    Regards,
    Stefan
  20. I agree fully[ Go to top ]

    +1
    I was digging deep into all this O/R and OODB stuff beginning in the mid-'90s with C++ and then with Java. I'm still following the debate with some interest, but my conclusion is the same as yours.
  21. Couldn't express that better than you did - I absolutely agree with you. Other opinions out there? Regards, Stefan
    Hi, I respectfully disagree that data should not contain behavior. Isn't one of the goals of OO encapsulation? The data should be very close to the behavior and isolated from parts of the system that don't need to be mucking around with it. On the other hand if you have data that transcends layers of your application then you end up with code that must deal with the data structures as they exist in the relational/xml/graph world and not as they should be represented in the object world.

    I understand that you can have isolation layers such as DAO, EJB, etc but aren't you accomplishing the same encapsulation in a much less elegant way?

    Regards,

    Ted Slusser
  22. Hi, I respectfully disagree that data should not contain behavior. Isn't one of the goals of OO encapsulation? The data should be very close to the behavior and isolated from parts of the system that don't need to be mucking around with it. On the other hand if you have data that transcends layers of your application then you end up with code that must deal with the data structures as they exist in the relational/xml/graph world and not as they should be represented in the object world. I understand that you can have isolation layers such as DAO, EJB, etc but aren't you accomplishing the same encapsulation in a much less elegant way? Regards, Ted Slusser
    I just think that for OLTP people build systems in layers, and in very entrenched and well-established ways, and basically the patterns that emerged put behavior in controllers instead of with the data.

    There is also what (I think) might in the past have been called "the Booch method", which puts behavior together with the data, e.g. a Circle class should have a method called getArea().

    I think there is a place for both Booch and MVC, and the hard part is knowing which is which. I think by now we know that the Booch method definitely works great with Collections and Windowing Systems, but for OLTP people like to separate the business logic from the data. The reason is basically one of timescales: the data (often) lives much longer than the business logic.

    Guglielmo
  23. Data outlives behaviour[ Go to top ]

    The data should be very close to the behavior and isolated from parts of the system that don't need to be mucking around with it.
    The problem with tandem data+behaviour is that data has a much longer life span than behaviour.
    So this marriage looks like a marriage between a turtle and a fly :))

    Data should be independent and not care about any behaviour.
  24. Data wandering to another domain[ Go to top ]

    The problem with tandem data+behaviour is that data has a much longer life span than behaviour. So this marriage looks like a marriage between a turtle and a fly :)) Data should be independent and not care about any behaviour.
    Is that so?

    If you say that data has a longer life span, you may be right for data that goes over different versions of software or data that gets migrated from one system to another.

    Yet the data is not the same anymore, as it is not used in the same problem domain anymore!

    In the best case, you can reuse your schema. This is where RDBMSs really shine. Often this is possible: a customer database can probably be used by many applications without too much change. Still, this is domain sharing, and it breaks the instant an application using this data wants a schema change.

    In the worst case, though, you have to migrate data to this new domain, transforming it as you move along. This may be easy if you have a proper abstraction of your data: Java objects or an XML representation work nicely in this case.
  25. [..] If you say that data has a longer life span, you may be right for data that goes over different versions of software or data that gets migrated from one system to another. Yet the data is not the same anymore, as it is not used in the same problem domain anymore! In the best case, you can reuse your schema.
    Some data is not application-specific, and can truly be shared among applications. For example, reference data in the financial industry is of this type. It is used in the same form in many applications (and across different companies).
  26. Data wandering to another domain[ Go to top ]

    [..] If you say that data has a longer life span, you may be right for data that goes over different versions of software or data that gets migrated from one system to another. Yet the data is not the same anymore, as it is not used in the same problem domain anymore! In the best case, you can reuse your schema.
    Some data is not application-specific, and can truly be shared among applications. For example, reference data in the financial industry is of this type. It is used in the same form in many applications (and across different companies).
    Is it immutable? If so, I can see this falling into a different category. But if it is added to and/or changed, then it is not.
  27. I think the following: - Mapping RDBMSs into objects isn't worth it. Even though Java is my bread-and-butter I now think that DBAs should be left alone to design databases as they see fit. The application should stop pretending that the data is anything other than data (i.e. no behavior.)

    All of the implementations of VMs and OODBs that I'm familiar with always separate the data from code/behavior, but in the end the view is that they are together. So, I see nothing wrong with keeping the view that state and behavior are together. That said, there might be something in your idea of separating state and behavior, though at the moment I can see no real reason why you'd want to do that.

    "Query languages should not be used inside applications."

    Humm, at some level, you are always needing to find something. If you don't use a "query language", how do you propose to do so? I see your proposal, but is that not just another form of a "query language"?


    SQL is good for manual queries or for stored procs, not for applications. - For applications there should be a much lower-level API with the primitives of relational algebra, e.g. cartesian products, projections, etc.

    TopLink successfully models these types of low-level primitives, but here is the real problem, and hence the sense of awkwardness that most people feel: O/R mapping occurs at too high a level, in that it is, for all intents and purposes, part of the application, when it should really be a part of the VM. What we have in a VM is not just a virtual machine. It is a virtual personal computer that is only capable of running a single application. Sure, we can kludge it to run multiple applications, but fundamentally it is a virtual PC. The VM already holds the class data (the contents of the .class file) in perm space and the data in young and old space. So the fact that we have data in a DB just means that we are trying to manage an object graph in a separate, disparate memory space. The VM is already well equipped to deal with managing memory spaces. It does it quite effectively. Unfortunately, when we deal with a disparate memory space (such as data being held in an RDB) we have to create some abstraction of the underlying mechanisms, and then they are exposed to us. They are (of course) hidden from us if we just keep things in Java memory.

    IME, the absolute BEST place to manage any memory space is from within the VM. I draw this conclusion from my work with GemStone/J. With GemStone/J (based on a custom VM), one acquired a context into the “persistent” memory space from a (transactional) session. The elegance in this solution is that to persist, one only has to connect the object to an object whose root is in the persistent memory space (i.e. attach it to the context or to another object attached to the context) and then perform a commit on the session/transaction.

    The market non-acceptance of OODBMS systems is a multi-dimensional problem that is yet another discussion that I will not digress into here. It is safe to say that the market was not ready for OODBMS, and as good as the vendor offerings were, they exposed many of the problems that all persistence mechanisms face but have so far been able to hide. The absence of a standardized query language was another problem.

    Regards, Kirk

    http://www.javaperformancetuning.com
    http://kirk.blog-city.com
  28. Mapping RDBMSs into objects isn't worth it.
    RDBMSs are here to stay. So is OO. As a result, we will have to do something, and that something will necessarily be isomorphic to O/R mapping.
    The application should stop pretending that the data is anything other than data (i.e. no behavior.)
    Yes, the bean-model of an object is inappropriate for objects that represent database rows.
    SQL is good for manual queries or for stored procs, not for applications. - For applications there should be a much lower-level API with the primitives of relational algebra, e.g. cartesian products, projections, etc.
    There should be, and there is!

    Cilantro ORM ( http://www.cilantroORM.us ) models queries as first-class Java objects, so relational algebra primitives (such as intersection, union, and projection) are just methods on the queries. Cilantro exposes the full power of the underlying RDBMS in 100% pure Java: no special query language, no descriptor files, no precompiling, none of that.

    Cilantro is available on SourceForge. I wrote Cilantro and I'm trying to drum up additional interest in it. I would appreciate any comments people have.
  29. - Mapping RDBMSs into objects isn't worth it. Even though Java is my
    > bread-and-butter I now think that DBAs should be left alone to design
    > databases as they see fit. The application should stop pretending that
    > the data is anything other than data (i.e. no behavior.)

    Agreed

    >Query languages should not be used inside applications. SQL is good for manual
    >queries or for stored procs, not for applications.

    Probably I'm missing the point: why shouldn't we use SQL in the application? If you have your queries well separated from the application (not always easy, but possible) you could have the declarative power of SQL (*) and you could share them with the DBA, who could optimize them (or rethink the physical layout of the tables)
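    One way to keep them well separated is to load named statements from a plain file that the DBA can edit and tune independently. A minimal sketch; the file name and query key are made up:

      import java.io.FileInputStream;
      import java.io.IOException;
      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.sql.SQLException;
      import java.util.Properties;

      public class NamedQueries {
          private final Properties queries = new Properties();

          // queries.properties might contain, e.g.:
          // findOverdueOrders=SELECT id, total FROM orders WHERE due_date < ? AND paid = 'N'
          public NamedQueries(String path) throws IOException {
              FileInputStream in = new FileInputStream(path);
              try {
                  queries.load(in);
              } finally {
                  in.close();
              }
          }

          public PreparedStatement prepare(Connection con, String name) throws SQLException {
              String sql = queries.getProperty(name);
              if (sql == null) throw new IllegalArgumentException("No query named " + name);
              return con.prepareStatement(sql);
          }
      }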

    > For applications there should be a much lower-level API with the primitives
    > of relational algebra, e.g. cartesian products, projections, etc.

    If you're saying we need a better query language I agree... if you're talking
    about a real API I disagree.. I'd hate to have to change my application code
    for a modification in the DB.. I'd rather change a file with the queries.

    > And there should be explicit support for hints. In fact, these days
    > everybody should support nested-loop joins, hash joins, etc., so I should
    > be able to specify those.

    Really? I use the Oracle RDBMS and I'm pretty happy with its Cost Based Optimizer (I use "hints" as a last resort.. usually the problem lies in the statistics gathering process). The same "query" can be better implemented with "hash joins" or "nested joins" depending on the amount of data. If I choose to use an API that lets me choose the type of joins, I could choose a sub-optimal query path: unless I re-implement the CBO in my application to choose the better query path.. (not so easy, I guess :-)

    (*) Ok, it's not perfect, but I prefer it to every procedural alternative I can imagine

    Bye,
    Insac
  30. As for myself, after years of dealing with the so-called impedance mismatch, I think the following: - Mapping RDBMSs into objects isn't worth it. Even though Java is my bread-and-butter I now think that DBAs should be left alone to design databases as they see fit. The application should stop pretending that the data is anything other than data (i.e. no behavior.) - Query languages should not be used inside applications. SQL is good for manual queries or for stored procs, not for applications. - For applications there should be a much lower-level API with the primitives of relational algebra, e.g. cartesian products, projections, etc. And there should be explicit support for hints. In fact, these days everybody should support nested-loop joins, hash joins, etc., so I should be able to specify those.
    Right. In a less robust fashion, this is how we did database applications 30 years ago, using network and other old-fashioned databases (for those not as old as me, the term network here refers to a pre-relational database structure, not to networks as we know them today).

    For most applications, I think it is still a more productive paradigm.
  31. I strongly agree with
    - Mapping RDBMSs into objects isn't worth it. Even though Java is my bread-and-butter I now think that DBAs should be left alone to design databases as they see fit. The application should stop pretending that the data is anything other than data (i.e. no behavior.)

    - Query languages should not be used inside applications. SQL is good for manual queries or for stored procs, not for applications.
    though not with
    - For applications there should be a much lower-level API with the primitives of relational algebra, e.g. cartesian products, projections, etc. And there should be explicit support for hints. In fact, these days everybody should support nested-loop joins, hash joins, etc., so I should be able to specify those.
    My take on it is that DBAs shall design databases as they see fit and also design DML statements as they see fit. The statements then shall be compiled to Java classes and here we go - no SQL (or any other xQL) in Java.
    I exercise just this approach using SQLC.

    Pavel.
  32. I graduated with a BSCS in 2000.[ Go to top ]

    I originally thought that you were saying that in your limited experience of only 4 years.... etc.
  33. Graph Querying[ Go to top ]

    Hi Guglielmo, people,

    OK. First, applications don't deal in arbitrary sections of object graph - they deal in tree segments and slices, with occasional 'graph' features such as shared nodes.

    All we really need for access are path expressions in a query. JDOQL provides a fine starting point, e.g.

      Order.lines = the lines of an order
      Order.lines.product = lines and product data needed to display the order

    Basically the structure is there, the field names are defined, all we have to do is say what we want - and not get sidetracked into vast and wasteful SELECT FROM pseudo-SQL syntaxes.

    A more complex example:

      Customer.orders.lines.products = the set of products ordered by a customer

    > This is how we are used to thinking about it -- the objects you get are
    > basically a stream (or collection) of database results. This isn't really what
    > they are though. They are really closer to a swapped in page of the entire
    > object graph.

    'Graph' behaviours have to be implemented, but this is not a useful concept for thinking about or addressing data in OO systems -- because it ignores and overlooks the actual usable structures by which applications address & access data.

    Much better instead to think Tree access for different use cases, with occasional shared nodes.

    PowerMap JDO has some early features for querying with path expressions, allowing exactly this form of bulk data retrieval. Paths can be specified for the result expressions, including plural expressions to retrieve subordinate items.

    This is still early access but check it out if you're interested.

    www.powermapjdo.com


    A few other points on data modelling:

    > Mapping RDBMSs into objects isn't worth it. Even though Java is my
    > bread-and-butter I now think that DBAs should be left alone to design
    > databases as they see fit. The application should stop pretending that the
    > data is anything other than data (i.e. no behavior.)

    Having written a number of business systems, I'm absolutely in favour of separating the object model from the database. This wasn't standard practice in the early days, but damn if keeping things clean & separate ain't a lot better in the long run.

    > Query languages should not be used inside applications. SQL is good for manual
    > queries or for stored procs, not for applications.

    I agree absolutely. SQL is just so retarded and awkward, when you consider how simple most business system logic is.

    Basically business apps come down to, 'iterate that list', 'iterate that sublist', 'aggregate the items' written X hundred number of times.

    We're still working with the Java language, so you still have to write the loops out. But our JDO Query technology is directed to at least select the data efficiently for you.

    > - For applications there should be a much lower-level API with the primitives
    > of relational algebra, e.g. cartesian products, projections, etc.

    Object References will do just fine, in singular and plural forms.

    One of the things people need to realize is that access patterns are not fixed to the mapping or metadata - they depend entirely on which use case the application is performing.

    The Query is just such an ideal platform for specifying concise navigational expressions. Yet I'm still constantly amazed at people pushing those pseudo-SQL syntaxes. These obscure pretty much any understanding of application data access and seem proud to turn the trivial into the obscure... :-)


    Cheers,
    Thomas Whitmore
    www.powermapjdo.com
  34. I don't believe we will ever see a QL using Java syntax that is flexible enough for queries, but is this the reason we are doing "O/R" mapping wrong?

    I look at RDBMSs and SQL and feel the problem is there. SQL projects data in only two dimensions, and this is the main root of the actual “o/r” limitations. When you look at data, you may want to see it from various perspectives and dimensions; if you query for data, you are just selecting a slice of multidimensional data, but what SQL gives you is a very limited two-dimensional view of it.
  35. I don't believe we will ever see a QL using Java syntax that is flexible enough for queries, but is this the reason we are doing "O/R" mapping wrong? I look at RDBMSs and SQL and feel the problem is there. SQL projects data in only two dimensions, and this is the main root of the actual “o/r” limitations. When you look at data, you may want to see it from various perspectives and dimensions; if you query for data, you are just selecting a slice of multidimensional data, but what SQL gives you is a very limited two-dimensional view of it.
    What about using OLAP to handle cases where the data really is multi-dimensional? Obviously, doing real-time OLAP queries has a ton of performance considerations, but in situations where a minor lag is acceptable and transactions aren't needed, using some sort of OLAP tool might be a better fit.

    A common example is viewing sales totals and sales percentage by time (week,month,year), geographic region and category.
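    For that sales example, even plain SQL over JDBC can do a coarse version of the slicing with GROUP BY; a dedicated OLAP engine essentially precomputes and generalizes this across all the dimensions. Table and column names below are invented:

      import java.sql.Connection;
      import java.sql.DriverManager;
      import java.sql.ResultSet;
      import java.sql.Statement;

      public class SalesByDimension {
          public static void main(String[] args) throws Exception {
              // Placeholder connection details.
              Connection con = DriverManager.getConnection("jdbc:yourdb://host/sales", "user", "pass");
              Statement st = con.createStatement();
              ResultSet rs = st.executeQuery(
                      "SELECT region, category, SUM(amount) AS total " +
                      "FROM sales WHERE sale_year = 2004 " +
                      "GROUP BY region, category " +
                      "ORDER BY region, category");
              while (rs.next()) {
                  System.out.println(rs.getString("region") + " / "
                          + rs.getString("category") + ": " + rs.getBigDecimal("total"));
              }
              rs.close();
              st.close();
              con.close();
          }
      }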
  36. I would love to use Oracle Objects. I think they are really nice. But they never really took off because Oracle does not provide any client-side language bindings for them. Their JPublisher is just pathetic. I would love to see a JDO implementation which is able to persist to Oracle objects. It probably will be much simpler than traditional O/R mapping. It supports inheritance, collections and other goodies out of the box.
  37. Alex,

    Can you provide more info? Did you really use Oracle Objects? Do you really use inheritance and overloading? What tool do you use to develop? What do you do when you need to change a parent object (hint: manually drop and recreate ALLLLLLLLL child objects and ALLLLLLLLLL tables that store these objects)?
    This is just a show-stopper... unfortunately...

    Alex V.
  38. I have yet to find anything more flexible than a relational database with a well-normalized schema and the possibility to layer views over it using all kinds of set operations. Java (and many other OO languages), with access paths that are cast in stone at compile time, is not going to rival that any time soon. But let's see what the next big thing brings.
    I have yet to find anything more flexible than a relational database with a well-normalized schema and the possibility to layer views over it using all kinds of set operations. Java (and many other OO languages), with access paths that are cast in stone at compile time, is not going to rival that any time soon. But let's see what the next big thing brings.
    I have yet to see anything more flexible than viewing data either as one big byte array or one big stream of bytes.

    It's much more flexible than a relational database, trust me. My idea is so flexible that all databases are built on top of it, including Oracle and all the other relational databases, not to mention all the OODBMSs.

    ;-)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
    I have yet to find anything more flexible than a relational database with a well-normalized schema and the possibility to layer views over it using all kinds of set operations. Java (and many other OO languages), with access paths that are cast in stone at compile time, is not going to rival that any time soon. But let's see what the next big thing brings.
    I have yet to see anything more flexible than viewing data either as one big byte array or one big stream of bytes. It's much more flexible than a relational database, trust me. My idea is so flexible that all databases are built on top of it, including Oracle and all the other relational databases, not to mention all the OODBMSs. ;-) Peace, Cameron Purdy, Tangosol, Inc. Coherence: Shared Memories for J2EE Clusters
    Yes indeed, and as you just demonstrated, natural language is _the_ most flexible representation of anything. Unfortunately, it is about as useful for expressing formal logical relationships as byte arrays. ;-)
    I have yet to find anything more flexible than a relational database with a well-normalized schema and the possibility to layer views over it using all kinds of set operations. Java (and many other OO languages), with access paths that are cast in stone at compile time, is not going to rival that any time soon. But let's see what the next big thing brings.
    I have yet to see anything more flexible than viewing data either as one big byte array or one big stream of bytes. It's much more flexible than a relational database, trust me. My idea is so flexible that all databases are built on top of it, including Oracle and all the other relational databases, not to mention all the OODBMSs. ;-) Peace, Cameron Purdy, Tangosol, Inc. Coherence: Shared Memories for J2EE Clusters
    Yes indeed, and as you just demonstrated, natural language is _the_ most flexible representation of anything. Unfortunately, it is about as useful for expressing formal logical relationships as byte arrays. ;-)
    Yes, it is not a good idea to express formal logical relationships as BLOBs (Java objects) for this reason, or to keep this garbage in a memory cache as key-value pairs. You do not need an RDBMS if you can express your data this way; BDB is designed for this OO use case and it works without problems for many enterprise systems.
  42. BTW, here is a link with marketing stuff, but good software speaks for itself and does not need lame case studies on this site:
    http://www.sleepycat.com/solutions/customers.shtml
    I have yet to find anything more flexible than a relational database with a well-normalized schema and the possibility to layer views over it using all kinds of set operations. Java (and many other OO languages), with access paths that are cast in stone at compile time, is not going to rival that any time soon. But let's see what the next big thing brings.
    I have yet to see anything more flexible than viewing data either as one big byte array or one big stream of bytes. It's much more flexible than a relational database, trust me. My idea is so flexible that all databases are built on top of it, including Oracle and all the other relational databases, not to mention all the OODBMSs. ;-) Peace, Cameron Purdy, Tangosol, Inc. Coherence: Shared Memories for J2EE Clusters
    Yes indeed, and as you just demonstrated, natural language is _the_ most flexible representation of anything. Unfortunately, it is about as useful for expressing formal logical relationships as byte arrays. ;-)
    Yes, it is not a good idea to express formal logical relationships as BLOBs (Java objects) for this reason, or to keep this garbage in a memory cache as key-value pairs. You do not need an RDBMS if you can express your data this way; BDB is designed for this OO use case and it works without problems for many enterprise systems.
    I know BDB. It's great if you need navigational access, or look up things by key, but it's useless if you have queries of any complexity.
  44. Yes, it is not a good idea to express formal logical relationships as BLOBs ..
    OK, you're going to have to forgive me for not putting the following tag at the end of my last post: "</sarcasm>" ..

    If you go back, I was responding to someone who suggested that all this new-fangled technology was useless because the relational model with full normalization could support everything that anyone ever needed.

    It occurred to me that years ago, back when Oracle and Sybase were just used for toy projects (on those silly "mini" computers that nothing "real" was run on,) everyone asked why you would use this inefficient "SQL thingie" when you had COBOL and flat files and whatever the other acronyms there were that I can't remember anymore.

    Of course, if you're willing to take on all those responsibilities by yourself, you might as well just view all of your data as one big byte array .. but trying to explain the sarcasm any further would sap whatever small remaining amount of humor that it contained.

    So it seems that those once-new-fangled SQL guys are now old-guard and hoping that their pet technology is never eclipsed (no pun intended) by new ideas. That's how our industry works, and why it's still relevant.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  45. ...I was responding to someone who suggested that all this new-fangled technology was useless because the relational model with full normalization could support everything that anyone ever needed. ... So it seems that those once-new-fangled SQL guys are now old-guard and hoping that their pet technology is never eclipsed (no pun intended) by new ideas.
    Cameron, I'm not aware that I suggested OO was useless and my own history is actually quite different from the cliche you offer. I spent the best part of the 1990s as a glowing advocate of everything OO. I worked with OO databases, read all the gurus, wrote O/R frameworks, etc. OO was my native language, so to speak, not COBOL. OO was the glitzy new thing and relational was the old "legacy" dinosaur I had to cope with. It was only later that I started to investigate more of the formal foundations of various knowledge representation mechanisms, including OO, the relational model (which is based on predicate logic), description-logics-based systems (RDF, RDF/S, OWL, ...), set theory, etc. At that point I started to see the limitations of the type systems of mainstream OO and came back to acknowledge the interesting properties and the power of the relational model. I also realized that OO's success was severely hindered by a lack of formal clarity, which is, in my view, one of the reasons why MDA doesn't get off the ground.

    So, what I want to say is this: I don't belong to those who defend an old model just because it's all they know or because it was once the new kid on the block. I'm trying to make informed decisions and I don't think the last word about all of this is ever to be spoken. There's much room for improvement but my feeling is that there hasn't been much activity in terms of searching for better models since OO has gained world domination in mainstream software development. That's a pity.

    Anyway, I did understand your sarcasm very well and my own reply to your posting wasn't meant to sound dead serious either.

    Alexander
  46. It occurred to me that years ago, back when Oracle and Sybase were just used for toy projects (on those silly "mini" computers that nothing "real" was run on,) everyone asked why you would use this inefficient "SQL thingie" when you had COBOL and flat files and whatever the other acronyms there were that I can't remember anymore.
    VSAM. :)
  47. What is the best way?[ Go to top ]

    I graduated with a BSCS in 2000. I still really don't see the big deal about O/R. I looked at various O/R packages after I had already created my own wrapper for JDBC. I basically use the following methods when using JDBC:

    * Most of my classes map table data into class properties from a HashMap
    * Most of my classes map the table data into a DOM node that is assembled into a DOM document with other classes.
    * Rarely do I need to process special queries (Usually for reports).

    The only thing that I think would improve things is a code generation GUI and a graph methodology (Right now all my data can be represented as a table or tree). It is simple to represent a graph in a table with an extra field containing a list of connected nodes. This is the only time I think it is okay to have a field in a table that contains complex data.

    99% of my queries are very simple and don't require joins or complicated SQL. i.e. no need for a special query language.
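    A minimal sketch of that kind of wrapper, mapping each row into a HashMap keyed by column name (the class and method names are just illustrative):

      import java.sql.Connection;
      import java.sql.PreparedStatement;
      import java.sql.ResultSet;
      import java.sql.ResultSetMetaData;
      import java.sql.SQLException;
      import java.util.ArrayList;
      import java.util.HashMap;
      import java.util.List;
      import java.util.Map;

      public class RowMapper {
          // Runs a simple parameterized query and returns each row as a column-name -> value map,
          // which a domain class can then copy into its own properties or a DOM node.
          public static List<Map<String, Object>> query(Connection con, String sql, Object... params)
                  throws SQLException {
              PreparedStatement ps = con.prepareStatement(sql);
              try {
                  for (int i = 0; i < params.length; i++) ps.setObject(i + 1, params[i]);
                  ResultSet rs = ps.executeQuery();
                  ResultSetMetaData meta = rs.getMetaData();
                  List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();
                  while (rs.next()) {
                      Map<String, Object> row = new HashMap<String, Object>();
                      for (int c = 1; c <= meta.getColumnCount(); c++) {
                          row.put(meta.getColumnLabel(c), rs.getObject(c));
                      }
                      rows.add(row);
                  }
                  return rows;
              } finally {
                  ps.close();
              }
          }
      }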

    Do any of the O/R tools support a better Graph Methodology than the one I described above and/or provide a Code Generation GUI?

    Here is a trick question you should ask yourself. Should all data be represented as a Graph? Why wouldn’t you represent data as a graph?
  48. What is the best way?[ Go to top ]

    I still really don't see the big deal about O/R. I looked at various O/R packages after I had already created my own wrapper for JDBC.
    This is all very well. In the terms of Patterns of Enterprise Application Architecture by Martin Fowler, it seems to me that you'd be using something like a generic Table Data Gateway, which works just fine as a persistence solution, at least as long as you don't have a complex domain model that would warrant using an O/R mapping.

    When the domain logic gets complex, it often makes sense to use a domain model, and then it often makes sense to use an O/R mapping solution to persist domain objects that form the model.

    A more thorough discussion can be found in the book -- a good part of it in the chapter "Mapping to Relational Databases" that is available online as a PDF document.


    Disclaimer: I don't profit from anyone reading or buying the book in any way; I just think it's a very good book that helped me a lot in clarifying the decision-making and tradeoffs in exactly this sort of choice (when to use a domain model and when transaction scripts, when to use data mappers and when more straightforward persistence code, when to use a central controller and when a page controller, ...).

    - Timo
  49. Graduate[ Go to top ]

    I graduated with a BSCS in 2000.
    How in the world is that in any way relevant to the rest of your post? Most, if not all, persons posting on this forum probably have an equal education, but you don't see all other posts beginning with those words. Please...
  50. Stop Trolling[ Go to top ]

    I graduated with a BSCS in 2000.
    How in the world is that in any way relevant to the rest of your post? Most, if not all, persons posting on this forum probably have an equal education, but you don't see all other posts beginning with those words. Please...
    I was not aggrandizing my education, you idiot. I am sure that many a PhD posted to this thread with decades more experience than I. That is why I usually read the threads at this website.

    The point was to give the perspective that despite more than four years of experience and testing out EJB and O/R packages like Hibernate I still had not encountered the need for a complex O/R package with a special query language. i.e. I am happy with the simple O/R that uses a JDBC wrapper and interfaces.
  51. Providing hinting about what objects are going to be needed, rather than how to pull them from the rdbms ... becomes a lot more useful as you can express the same intention in a way that lets the system know what you want, rather than flat out telling it. ...
    This type of throughput-oriented hinting is hard to do through any existing o/r mapper I know of. It is not difficult to describe, however. It really begs for a flexible object query language.
    I believe that query languages can go only so far before becoming too cumbersome and convoluted to understand and implement. Making hinting an additional aspect of the query mechanism provides a neat solution to the problem described.

    While designing and implementing the JDX OR-Mapper, we anticipated this issue and came up with the notion of directed queries, such that one can dynamically describe (hint) what parts of the object graph should be fetched in a query. This hinting does not involve specifying complex SQL join statements. We went a step further to let the user describe filter conditions for a collection of objects connected at any level in an object graph. In essence, the object graph to be fetched can be described very precisely, both in depth and in breadth, at runtime. JDX also provides a mechanism to subsequently fetch any branches of an object that were discarded during the original query.

    -- Damodar Periwal
    Software Tree, Inc.
    Simplify Data Integration
  52. I agree with the guy who said that we've been talking about this since the early 90's when ObjectStore, Versant and Gemstone were going to kill Sybase and Oracle.

    I classify this into 2 kinds of problem domains. One is a relatively small application which is the only or primary user of a database. This means a database of under 1/4 TB, maybe a couple hundred users doing screen-based transactions. (Maybe using a J2EE model or any other kind of multi-tier arch.) For this kind of app, the O/R mapping is pretty simple and all these tools work pretty well. These apps could probably migrate to an OODBMS and still be fine.

    Another problem domain is where you have a corporate datastore. This may be anywhere from 0.5 TB and up (the ODS in our shop is over 10 TB). It has multiple primary applications that are heavy readers and updaters. Some of these applications are written in languages other than Java. There are SLAs on transaction response times, so access paths need to be statically bound and therefore much more predictable. The DB is used to feed one or more data warehouses. The schemas often need to evolve without bringing down any of the systems.

    This is where RDBMSs shine.
    1) They provide a security model that is independent of an application to control who has access to do what with which data.
    2) They provide for schema evolution and definition
    3) They separate the role of the programmer who worries about logic and data structures from the one of the DBA who is responsible for backup/restore, physical setup - internal file locations, memory caches etc, index reorgs and much more.
    4) They are optimized for searching or fetching huge amounts of data, oftentimes orders of magnitude faster/cheaper than programming languages in cases where you are searching through hundreds of millions of objects (rows)

    This is not to say that RDBMSs don't have lots of shortcomings, but until some of the above points are addressed by these alternative approaches, you won't see the large shops throwing away their Oracles and DB2s, so we will probably be working on the O/R question for another couple of decades :)

    -Andy F.
  53. Well this thread has turned pretty depressing. I guess we might as well go back to COBOL since it is much better at processing data than Java.

    We really need to move beyond relational databases. Massive corporate databases are a problem, not a solution. I don't know what the move is. I don't really see anything on the horizon. But I do know that every place I have been, and I have been plenty of places, thinking of data as just data has caused problems both great and small.
  54. Graph Paging: Are we doing O/R mapping wrong?[ Go to top ]

    Mark,

    Data is data is data. Only basic relations (let's say two-dimensional)
    are cast in stone; all other fractal object graphs of unknown
    dimension are temporary requests from temporary managers of
    temporary businesses....
    Data is not ephemeral: it will outlive processes, reports, technology, Java, .NET,
    you, me...

    Alex V.
  55. Oracle Objects[ Go to top ]

    Alex,

    I am afraid I can not help you much. I gave up on it 2-3 years ago after trying hard to get any meaningful Java bindings for Oracle objects. I tried TopLink but it was too expensive and not very good at it anyway. Plus, at that time objects were much more limited in their features than they are in 10g now. I would be very interested in your experience. I can see some difficulties in doing tuned writes on objects by O/R mappers. If I recall correctly, it expects the whole object in the update statement.
  56. Graph Paging: Are we doing O/R mapping wrong?[ Go to top ]

    Data is data is data.
    "Data" always has some rules about it. What is its definition? How does it come into existence. How long is it in existence? How is it changed? How does it relate to other "data"? How does it change in relation to "data". How did it get such a great role on a 90's TV show? :)

    Sure, if it is thought of as data, it will "outlive" the code. Actually, it really doesn't. The rules and views are usually converted to a different language.

    I was thinking about this last night and this morning. I currently sit next to data warehouse people. And I hear the things they say. And laugh and cry. "Well, we have to get 13+ systems in sync to add/delete/modify this field." Nothing like moving quickly to changing business needs. I worked last summer at an organization (large, and everyone would know their name) and they passed tons of "data" from one system to another. Two different systems created an online bill and a paper version. And they were always out of sync. And I spent days proving that our code was right and the "data" we got was wrong. This was going on before I showed up and after I left.
  57. Well this thread has turned pretty depressing. I guess we might as well go back to COBOL since it is much better at processing data than Java.
    Yes, COBOL is great.
  58. Here's a thought[ Go to top ]

    I don't know about other people, but the environments I am used to have fluctuating requirements and specs. Without some sort of ORM, the developers end up changing a bunch of code because a column like "name" is split up into three columns in the DB. One other common scenario I see is having to integrate with multiple datastores, with a mix of flat file, B-tree, LDAP and RDBMS. In those situations, having an ORM doesn't make much difference. In situations where there are several database schemas, though, I find ORM is a life saver. It's a nightmare to have domain objects mapped directly to database tables. In these types of situations, I find ORM makes life easier and more manageable. Having said that, for simple applications where there's only one database schema and there will always only be one, ORM will probably seem overkill.

    Often in large corporations the mish-mash of database schemas is fixed in stone. Throwing it away is absolutely not an option, so you'd better have an elegant way of accessing and updating data in those systems. Until someone writes an ORM that proves it is better than Hibernate for complex environments, I'm sticking to what works.
  59. Re: Here's a thought[ Go to top ]

    I think we need to decide who's boss: Objects or the relational model.

    If the relational model is the boss, then don't bother REMODELING the whole domain into objects.. What a waste of time!! Instead, deal with the relational model directly, and spend your time properly layering your business processes with procedures. You get encapsulation by mandating the use of procedures to perform your business processes. Inheritance, polymorphism?? Well, people have lived without them for years, you can too (in fact EJB doesn't really support this either).

    Maybe I'm dense or inexperienced, but I don't see what the benefit is of modeling the business domain twice. If relational rules, then so be it. If you have the option of modeling everything from the OO perspective, then do that and use some automagic persistence service (like Prevayler or whatever) and forget the whole relational modeling entirely.

    What I've concluded is that relational usually rules. Relational is proven, well understood, and there is a science for evaluating it. OO, as much as we like the idea and have tried it, we cannot do the same rigorous analysis of it, and few people, if any, get it right. Data-driven apps therefore need to be relationally based. Use OO for the framework, but keep your business logic as procedures (or cacheable commands) and forget remodeling, O/R mapping, and all that extra baggage. If your OO model is not going to replace the relational one "to make life simpler" for all your enterprise's apps, then why bother?
  60. Re: Here's a thought[ Go to top ]

    I think we need to decide who's boss: Objects or the relational model. ... If your OO model is not going to replace the relational one "to make life simpler" for all your enterprise's apps, then why bother?
    I haven't decided either way, but for the fun of debating, let's say relational is the boss. What happens when you're given the task of writing a transactional application that has to work with 8 different relational database models? The models are different enough that there's a ton of collisions and mismatches. If a single application is going to work with all 8 databases, there are 8 bosses to report to.

    How would you solve this challenge, given those 8 bosses are fixed? This is a common scenario, by the way, in large corporations with multiple divisions and IT departments.
  61. Re: Here's a thought[ Go to top ]

    Well, the (relational) data is king; chances are the apps will be rewritten a few times over before you change the database.
  62. Re: Here's a thought[ Go to top ]

    I dunno.. sounds like you have a big problem.

    When I have 8 dbs to deal with I'll look for that solution.. Many of us have a single database.. Do we need to wrap that in objects, or CORBA, or DCOM, or web services, or whatever today's fad is, to worry about the day that may never come for us?

    Or should we just develop a simpler solution, one that can scale nicely to a few hundred users, that will work fine for several customers, and not worry about a possible future need for integration (when that happens we'll charge again for it)..

    Your problem is that you have 8 kings. Develop the one to rule them all, and put enough clout behind that solution. Then phase out old apps to your new solution and eventually move it all into a single data store. A monumental task in my opinion, but EAI apps are designed to facilitate this.

    Can the current state of O/R mapping handle 8 dbs with any reasonable performance?? I'm ignorant there.. but I'd think you'd need sophisticated Bean Managed Persistence at that point. And that is manual, not O/R mapping.. ouch.
  63. Re: Here's a thought[ Go to top ]

    I dunno.. sounds like you have a big problem. ... Your problem is that you have 8 kings. Develop the one to rule them all, and put enough clout behind that solution. Then phase out old apps to your new solution and eventually move it all into a single data store. ...
    The main downside I see in this approach is that the timeline for phasing out all the other models is decades, if it happens at all. It's not acceptable to say "phase out that old mainframe and the old database model handling those important money transfers." It's only after the new application proves itself with a decade of rock-solid production performance that the possibility of slowly phasing out a 20-year-old mainframe is even considered. Even then I seriously doubt it would happen. The whole idea of one schema to rule them all is also not possible, because each division can't step on the toes of the others.

    It's a delicate balance between integrating with existing legacy systems and allowing for enough flexibility to grow. In some cases, there are lots of entity beans. For the more trivial/moderate cases, ORM might be an acceptable solution. Some of the more hairy problems are related to situations where the object domain model has a one-to-many relationship, but one of the data models has many-to-many. A common example of this in the financial world is the concept of issuer. A bond might have one or more issuers, each of which may assume 100% responsibility in case any of the other issuers default. Some SEC regulations require trading systems to check the exposure of a fund against a set percentage (like 10%), but how the system calculates the exposure depends on several parameters. In some cases, the data is many-to-many, while other systems may flatten the relationship to one-to-many.

    I prefer to take a pragmatic approach and try to find a balance. In most of these scenarios, there is no right solution and never will be, especially when you're trying to build a system based on government regulations, which are written in legalese. It's quite a difficult challenge to solve well and still maintain rock-solid performance, scalability and reliability.
  64. Why not try to satisfy both[ Go to top ]

    We have been happily doing both - good DB design and programming against decent object models (never the database directly). I have to admit our apps do not shuffle terabytes of data (that would be a problem); they are just your average business intelligence/mission support apps with fairly complex logic. We usually design the data model and the object model in parallel and work on them until we are happy with both. Data is king in terms of life span. No question about it. It is also your last resort if you need to do something your object model is incapable of doing (dirty back-door stuff) or to integrate with other systems written in different languages. So your data model should be good, clean, documented and understood by DBAs - no ugly models autogenerated by O/R mappers. We do let our O/R mapping process influence our data model, but as little as possible. As a result our object model is somewhat data-centric but very clean and simple. We try not to put any business logic in it (it changes too much and pollutes the persistence layer) but only hard rules (such as managing bi-directional associations and enforcing various consistency, uniqueness and other rules we usually encode in the metadata layer of our data/object model).
    Once the object model is done we are very happy to never see JDBC in our day-to-day programming. I would like to praise Solarmetric's Kodo JDO here for a very flexible O/R mapping layer which allows us to do all kinds of neat things.

    Alex
  65. Re: Why not try to satisfy both[ Go to top ]

    I just got rid of the Object Model entirely.. and just reuse procedures to load up the data I want to see. Then everyone depends on the data model.

    How many classes/pages do you change when your boss now says "can you add one more data field to that form?"?

    I change one web page view, a config file (that takes care of validation), and a database table. NOTHING in between (if the field requires some sort of calculation, then I may need to change the procedure that does that calculation).

    I bet you have to update your object model.
  66. Re: Why not try to satisfy both[ Go to top ]

    Well, here we go: a "one man show". If that one man leaves the company, what happens to the application?
  67. Yep[ Go to top ]

    +1
  68. Re: Why not try to satisfy both[ Go to top ]

    Well, here we go: a "one man show". If that one man leaves the company, what happens to the application?
    (My reason for discussing these issues is to try to understand why O/R mapping and object-heavy approaches are advocated.. It is possible that I'm just dumb and don't get it. I admit that. That is why I'm throwing out my ideas, so that you can help me believe in the ONE TRUE WAY(tm). If I argue hard it is not out of arrogance, but so I can be taught otherwise.)

    No no no..

    You misinterpret me.

    What I'm trying to do is find a balance between the hacked-up approach common in PHP/ASP development (which is fast, though) and the overdesigned approach in J2EE, especially when enterprise integration is NOT the issue. Of course, this is a philosophical discussion; in real life we have to follow standards so that others can come behind us and maintain what we've done.

    The approach I am advocating is the same one that many used before in client/server apps, where we had a data model and stored procedures for data access. This is not a new thing. Then the app describes the business processes as Commands (GoF pattern) which can be cached if needed (a rough sketch follows at the end of this post).

    Advantages:
    - less code to maintain
    - easier to maintain
    - all my programmers have data model training, they aren't afraid of schemas
    - no object model to maintain
    - you don't need to be a rocket scientist to understand it

    Disadvantages:
    - not the "standard" Java way to do things

    Two model disadvantages:
    - two models is twice the work
    - programmers have a hard time grokking two models
    - changes need to be in both models
    - complexity not needed for simpler apps

    Two model advantages:
    - helps if you have multiple dbs
    - helps if you have multiple datasources
    - good if you need to do some serious caching for heavy scalability
    - "standard" model, so dedicated programmers will understand it (or think they do)
    - helps separate responsibilities in large teams
    - (... add some more here so I know)

    Why exactly is my approach harder to maintain than EJB, or a POJO model?

    (other than the fact that you may be able to find highly trained EJB people.. but you'll need to read a bit of documentation + code to grok my approach??)
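    Here is the kind of thing I mean by a Command wrapping a stored procedure -- just a rough sketch, with a made-up TRANSFER_FUNDS procedure and made-up parameter names:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.Types;

    // The business process lives in the database procedure; the Java side only
    // parameterizes and executes it.
    public interface BusinessCommand {
        Object execute(Connection connection) throws Exception;
    }

    class TransferFundsCommand implements BusinessCommand {
        private final long fromAccount;
        private final long toAccount;
        private final double amount;

        TransferFundsCommand(long fromAccount, long toAccount, double amount) {
            this.fromAccount = fromAccount;
            this.toAccount = toAccount;
            this.amount = amount;
        }

        public Object execute(Connection connection) throws Exception {
            // {call ...} is the standard JDBC escape syntax for stored procedures
            CallableStatement call = connection.prepareCall("{call TRANSFER_FUNDS(?, ?, ?, ?)}");
            try {
                call.setLong(1, fromAccount);
                call.setLong(2, toAccount);
                call.setDouble(3, amount);
                call.registerOutParameter(4, Types.VARCHAR); // confirmation number
                call.execute();
                return call.getString(4);
            } finally {
                call.close();
            }
        }
    }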
  69. Re: Here's a thought[ Go to top ]

    I think we need to decide who's boss: Objects or the relational model.
    You are right, but "ease of use", sexy Java and OO religion are just more popular in the "average programmer world" at this time.
  70. OO Religion[ Go to top ]

    I am really hoping that most people posting on this site would agree that the OO approach to design and development is not a religion, but rather the furthest we have gone so far in our efforts to make our systems easier to design, develop and manage. Everything in the OO approach is based on CLASSIFICATION. Classification, as a way of managing complexity, is yet to be beaten, and that's exactly what one would have to do in order to push the OO approach into the domain of religion.

    It is true that OO databases haven't caught up yet with their far more mature relational counterparts, but that will hopefully change. In the meantime, I am willing to struggle with a less-than-perfect OR mapping approach.

    Aleks
  71. OO Religion[ Go to top ]

    All modeling approaches are based on classification. OO is a very good way to model and code things too, but I do not think it is a solution for all problems, and I see nothing wrong in declarative approaches like SQL and COBOL.
  72. OO Religion[ Go to top ]

    You could say that all modelling techniques are based on some form of classification, but:

    Classification, in the broadest terms, means grouping things according to their characteristics (properties and behaviour). Except in the most trivial cases, the things being classified have some common characteristics, which ALWAYS leads to hierarchical organization of things, where things that are not in leaf nodes are ABSTRACT, and where things further from the root of the hierarchy INHERIT characteristics from the things closer to the root - which introduces POLYMORPHISM.

    To model a system using a technique that doesn't support classification in the broadest terms means accepting a less than optimal way of expressing the complexities of the system (modeling cross-cutting concerns is a separate issue that I don't want to go into here).

    Relational modeling clearly doesn't support classification in the broadest terms, but rather introduces limitations like reducing the set of characteristics of things to properties only (no behaviour and no ability to express polymorphism), and making it extremely difficult to express abstraction and inheritance in a sensible way.

    SQL/COBOL will not die anytime soon, since 99% of today's persistence systems are based on them, but that doesn't mean we don't know that something better is needed. I will live with SQL and relational databases because I have to, but I don't want to propagate a less-than-perfect model to my business tiers.
  73. OO Religion[ Go to top ]

    Relational modeling clearly doesn't support classification in the broadest terms, but rather introduces limitations like reducing the set of characteristics of things to properties only (no behaviour and ability to express polymorphism), and making extremely difficult to express abstraction and inheritance in a sensible way.
    I see no problem expressing abstraction and inheritance in E/R, and it is more powerful than any OO modeling technique that doesn't support rules (UML). OO modeling techniques cannot express behaviour and must use some declarative workaround (a <<stereotype>> picture); E/R does not need any workarounds to express behaviour. SQL, DDL and DML are declarative themselves and clear without pictures; they are probably too abstract and take more time to learn, but it is a great idea. Declarative approaches like AWK and regular expressions are very powerful too, as has been proved in practice and in theory. Java has started to use this technique too (annotations), but it cannot replace declarative languages for data processing or string matching.
  74. OO Religion[ Go to top ]

    I see this forum implementation is a bit crappy and I cannot read my last post; I hope it will be possible to read it on Windows.
  75. OO Religion[ Go to top ]

    I see no problem expressing abstraction and inheritance in E/R
    Yes, you can express it, but you would really be thinking in objects and translating your mental pictures of objects to ER notation, adding in the process implementation details like primary/foreign keys as the way to tie related entities together. If I am looking up an entity in my data store, I would like to be able to do this just by specifying the object type and object identity. With ORDB I am forced to deal with implementation details of how the object is partitioned across the DB schema. This is why I need OR mapping - to shield my application from the data store implementation details as much as I can.
    and it is more powerfull than any OO modeling technique that doesn't support rules (UML).
    UML is just a notation that supports the OO modeling technique. BTW, it also supports the ER technique.
    OO modeling techniques can not express behaviour and it must use some declaratyve way (<<stereotype>>) picture, E/R doe's not need any workarounds to express behaviour .SQL,DDL,DML are declarative themself and clear without pictures, probably it is too abstract and takes more time to learn, but it is a great idea. Declaratyve ways like AWK and regular expressions are very powerfull too, it was proved in practice and in theory. JAVA started to use this technique too (annotations), but it can not replace declaratyve languages for data processing or string maching.
    With OO and UML modeling you express the objects' behaviour through sequence diagrams, state transition diagrams, etc. When you start talking about SQL I don't know anymore if you are talking about implementation, or about SQL as a means to model behaviour. Anyway, SQL is far from being able to express an entity's behaviour, and that is why procedural languages like PL/SQL were invented.

    The more I have to deal with OR mapping, the more I see OR databases as legacy systems that I am forced to live with.
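    As an aside, "look up by object type and identity" is roughly what a JDO 2-style API gives you; a minimal sketch, assuming a hypothetical Person class mapped with a single long key field:

    import javax.jdo.PersistenceManager;

    // Hypothetical persistent class (JDO metadata not shown).
    class Person { private long id; private String name; }

    public class PersonLookup {
        public Person findPerson(PersistenceManager pm, long id) {
            // no table, column or foreign-key knowledge leaks into this call
            return (Person) pm.getObjectById(Person.class, new Long(id));
        }
    }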
  76. OO Religion[ Go to top ]

    Yes, you can express it, but you would be really thinking in objects and translating your mental pictures of objects to ER notation, adding in the process implementation details like primary/foreign keys as the way to tie together related entities. If I am looking up an entity in my data store, I would like to be able to do this just by specifying the object type and object identity. With ORDB I am forced to deal with implementation details on how the object is partitioned in the DB schema. This is why I need OR mapping - to shield my application from the data store implementation details as much as I can.
    It may be a compromise, but I find the ER approach works much better than a pure OO approach when dealing with RDBMS systems. In other words, you can't quite pretend that a relational database is JBOO (just a bunch of objects ;-) ..

    Maybe JDO2 / EJB3 (etc.) will eventually change my mind, but it's just too hard to completely hide certain types of details in Java. I don't know if the OODBMS as a concept is achievable for general-purpose systems. The RDBMS model is awfully long in the tooth, and is the source of all sorts of frustration, but it is the workhorse until someone comes up with something more compelling.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  77. OO Religion[ Go to top ]

    It may be a compromise, but I find the ER approach works much better than a pure OO approach when dealing with RDBMS systems. In other words, you can't quite pretend that a relational database is JBOO (just a bunch of objects ;-)
    Recently I have been trying to achieve this with Kodo, which has already implemented most JDO2 features. Still, I cannot pretend that my DB2 stores objects; I am constrained by the existing data model, so the approach is similar to the "object-relational" modeling process mentioned earlier in this thread.

    But my point was different - it is the RDBMS and ER modeling itself that is becoming the roadblock in the way of pushing data store implementation details further away from applications. I am well aware that I will have to live with relational databases for many more years, but I keep dreaming of better ways.
  78. OO Religion[ Go to top ]

    It may be a compromise, but I find the ER approach works much better than a pure OO approach when dealing with RDBMS systems. In other words, you can't quite pretend that a relational database is JBOO (just a bunch of objects ;-)
    Recently I have been trying to achieve this with Kodo... But my point was different - it is the RDBMS and ER modeling itself that is becoming the roadblock in the way of pushing data store implementation details further away from applications. ... I keep dreaming of better ways.
    But as long as people continue to think "data" and RDBMS first, it will be very difficult.

    How about a Post Relational database? - http://www.intersystems.com/cache/

    And an interesting read - http://www.intersystems.com/cache/technology/whitepapers/baroudi_bloor.pdf

    Not to leave the .Netters out of it - http://www.developerfusion.com/show/4564/
  79. OO Religion[ Go to top ]

    How about a Post Relational database?
    The Pick / Prime crowd just re-named "multi-valued architecture" to "post-relational architecture." Unfortunately, these technologies are (for the most part) pre-relational, with a bit of OO glued on top.

    (Fool me once, ..)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  80. OO Religion[ Go to top ]

    How about a Post Relational database?
    The Pick / Prime crowd just re-named "multi-valued architecture" to "post-relational architecture." Unfortunately, these technologies are (for the most part) pre-relational, with a bit of OO glued on top.(Fool me once, ..)
    I've not used them, so OK. I'll take your, uh, advice. :)

    But I do like their adverts in the magazines. Hooking an RDB to an OO application is not the optimal thing to do. But until we have something that is at least a suitable substitute, and the PHBs will let us use it, we'll deal with it as best we can.
  81. OO Religion[ Go to top ]

    With OO and UML modeling you express the objects' behaviour through sequence diagrams, state transition diagrams, etc. When you start talking about SQL I don't know anymore if you are talking about implementation, or about SQL as a mean to model behaviour. Anyway, SQL is far from being able to express the entity's behaviour, and that is why procedural languages like PL/SQL were invented.The more I have to deal with OR mapping, the more I see OR databases as legacy systems that I am forces to live with.
    There is nothing object-oriented about state diagrams. An object-oriented programming language can be used to implement a finite state machine as the "State" design pattern, but a state transition table is the most common way to implement it (using a transformation from a "composite" or "hierarchy" model to a "flat" one).

    DDL expresses entity behaviour as rules in a declarative way, and you do not need to generate or customize any code to add rules to data; procedural languages can be used to extend SQL or DDL, but not to express entity behaviour.

    I find integrated ER CASE tools, or a plain text editor, a better way to model a database than a UML CASE tool with an ER profile and adapter generators.
    This OO stuff is good to express button behaviour, events, properties, but not entity attributes and relationships.
  82. OO Religion[ Go to top ]

    This OO stuff is good to express button behaviour, events, properties, but not entity attributes and relationships.
    But entities ARE objects, with attributes AND implicit behaviour. Relationships, presented in UML notation, are those lines that connect objects. The impedance mismatch problem is created at the moment when you specify in the model HOW the relationships will be implemented (primary/foreign keys). This problem doesn't exist for higher-level objects (above the database) because programming languages hide the implementation of relationships among objects, so we can easily navigate through relationships without worrying about how object references are resolved. We just need to extend this concept to databases. There is a big issue about the best query language to be used with such object data stores, but I don't want to start on that again.
  83. OO Religion[ Go to top ]

    An object model is not something bad, but there are more ways to make it bad for data access than with ER. This is not a trivial problem once you start to think about data migration and evolution; one way is to avoid this problem if possible. ER modeling has no standard notation (if we do not count extended UML), but it is more mature and I think it solves *data*-related problems better.
  84. OO Religion[ Go to top ]

    E/R doe's not need any workarounds to express behaviour .SQL,DDL,DML are declarative themself and clear without pictures, probably it is too abstract and takes more time to learn, but it is a great idea. Declaratyve ways like AWK and regular expressions are very powerfull too, it was proved in practice and in theory. JAVA started to use this technique too (annotations), but it can not replace declaratyve languages for data processing or string maching.
    When you start talking about SQL I don't know anymore if you are talking about implementation, or about SQL as a means to model behaviour. Anyway, SQL is far from being able to express an entity's behaviour, and that is why procedural languages like PL/SQL were invented.

    The more I have to deal with OR mapping, the more I see OR databases as legacy systems that I am forced to live with.
  85. Here's a thought[ Go to top ]

    Hi Peter

    In other threads you have mentioned the 3K TPS type of systems you work with. Have you used OR mapping (Hibernate?) successfully in these environments?

    Pratheep P
  86. Here's a thought[ Go to top ]

    In other threads you have mentioned the 3K TPS type of systems you work with. Have you used OR mapping (Hibernate?) successfully in these environments?
    For read operations I've seen ORM used, but for write operations like hardcore transactions it's usually hand-coded for optimal performance. I haven't had a chance to try Hibernate in high-throughput situations for writes, so I'm not qualified to make that judgement. You're better off asking the Hibernate mailing list that question. I have my own biases, but I find ORM useful for handling different database models and reducing the cost of trivial changes, like renaming columns or splitting one column into multiple. The situation I've seen first hand is that the UI doesn't change, but some additional columns and tables are added for back-end processing. In those cases, an ORM makes it so the back-end can add necessary concepts without unnecessarily creating more work for GUI developers. Cranking up throughput is hard enough without adding lots of mapping and conversions. For the last two years, I've been working on regulatory compliance, so my focus is very narrow. The types of problems I deal with are related to handling all sorts of data models and making it so the object domain model is consistent and extensible, all within some heavy performance requirements.
  87. Here's a thought[ Go to top ]

    Thanks :-)

    Pratheep
  88. Here's a thought[ Go to top ]

    Thanks :-)
    I guess the bottom line is to take time to know the technology and business requirements thoroughly before choosing a particular approach. I'm not sure I provided any insight, since it's mostly common sense. An additional note: for non-performance-intensive applications where the integration is hairy, I tend to lean towards some sort of mapping approach. In those cases, I prefer to use JAXB-style domain objects and have the applications only use the interface. That way a user can provide their own ObjectFactory, which handles the dirty details. It's not a perfect solution, but I find it fits my needs over other options.
  89. If Java is actually a graph (which I agree with) and persistence means swapping parts of the graph in and out from disk, then the Java objects should simply be stored as binary and not be forced into relational tables.

    Now, this creates tons of problems of course, but with versioning of the persisted classes the majority of these problems could probably be overcome, e.g. with automatic schema evolution.

    Unfortunately, probably the most disturbing problem is with querying these objects for report generation, but I remember a persistence solution (what was its name?) which had a SQL dialect directly on top of the objects.

    So instead of seeing tables as objects, they saw objects as tables.

    Hmmmm. What is the main use of the data? Report generation or access from an application.
  90. If Java is actually a graph (which I agree with) and persistence means swapping parts of the graph in and out from disk, then the Java objects should simply be stored as binary and not be forced into relational tables.
    This does not solve the primary problem of how you manage references in your object graph. The VM understands this problem; it is responsible for managing them. If you persist and/or retrieve a portion of an object graph using TopLink, you must either do the reference management yourself or pollute your object graph with TopLink value objects for proxies. Each of these solutions is cumbersome and is an indication that a solution is being applied at the wrong level or in the wrong place. This is why I feel that the VM itself should be responsible for all interactions with the persistence memory space.
    automatic schema evolution.
    is a problem in every system. If you change the schema in a relational DB, it will be a rare occasion when you are not forced to change SQL. Unfortunately, OODBMS systems really exposed this fact, whereas it doesn't seem to be "in your face" when using an RDBMS - but it's still there... it just seems to be accounted for differently for some reason.
    Unfortunately, probably the most disturbing problem is with querying these objects for report generation
    and here is the rub... lack of a standard query language. That problem was being worked on, but by the time the work started... Oracle and others already had it solved. No doubt, RDBMS is a "dumber" technology (sophisticated under the hood) and thus more understandable to the average Joe.
  91. If Java is actually a graph (which I agree with) and persistence means swapping parts of the graph in and out from disk, then the Java objects should simply be stored as binary and not be forced into relational tables.
    This does not solve the primary problem of how you manage references in your object graph. The VM understands this problem, it is responsible for managing them. If you persist and/or retrieve a portion of an object graph using TopLink...This is why I feel that the VM itself should be responsible for all interactions with the persistence memory space.


    Quite correct. The idea is to approach persistence WITHOUT an RDBMS underneath. The actual VM memory is persisted (probably leaving out some memory-related stuff, but there is NO mapping at all). Most data interaction will be from the application, so the storage should be optimized for that. And when the time comes to access the data from, for example, a reporting tool, or in some ad hoc way, there is a relatively poorly performing SQL interface (in contrast to the otherwise relatively poorly performing Java interface), because it has to map onto the binary objects.
  92. JXPath[ Go to top ]

    Unfortunately, probably the most disturbing problem is with querying these objects for report generation, but I remember a persistence solution (what was its name?) which had a SQL dialect directly on top of the objects. ...
    First, let me point you to a query language I think might be useful: JXPath (http://jakarta.apache.org/commons/jxpath/). It is very much like XPath for XML, but it works on object graphs instead of XML documents. Of course, it is more suited for trees, but you can traverse any graph like a tree -- just watch out for cycles.
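    A tiny sketch of what that looks like, assuming hypothetical AddressBook / Person / Address beans where AddressBook has a "people" list and Person has "last" and "address" properties:

    import java.util.Iterator;
    import org.apache.commons.jxpath.JXPathContext;

    public class AddressBookQueries {
        // All Person objects whose last name is Smith -- no SQL, just a graph query.
        public static Iterator findSmiths(Object addressBook) {
            return JXPathContext.newContext(addressBook).iterate("people[last = 'Smith']");
        }

        // Drill straight through the graph to a single value.
        public static String firstSmithStreet(Object addressBook) {
            return (String) JXPathContext.newContext(addressBook).getValue("people[last = 'Smith']/address/street");
        }
    }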

    Second, I think a powerful way to think about object persistence is to think about what you would do if you had no persistence at all. For example, if you were writing a completely in-memory address book and you needed to frequently search for Person objects by their last name, what would you do? Well, if you use a List, you could do a linear search. Maybe you'd also have a Map keyed off of last name to the same Person objects to speed that up. Either way, the List and the Map are both Objects that you "query", just not with a special query language. That seems to be part of the motivation behind projects like XL2 (http://www.xl2.net). If you have transparent persistence, then you don't ever need to query the store directly but must access your objects through other objects. Once you have transparent persistence, then it doesn't matter if the source of the objects is a DB, the other side of an RMI connection, or wherever.
    What is the main use of the data? Report generation or access from an application.
    I've pondered this a lot when I think about transparent Object persistence. One of the big advantages of an RDBMS is all of the tools that can utilize the standard query language to look at the data. However, this can also be a danger. As others have said, OO is supposed to provide encapsulation. That means an RDBMS is a back door. Someone could go and stick data in your RDBMS tables that fits the table structure but not the Object it is backing. Some would say you should use stored procedures and triggers then, but to truly replicate your OO encapsulation would mean duplicating all business logic in the RDBMS!

    That's what I like about the concept of something like XPath paired with XL2. You can still do reports with a query language, it's just that the query language works with Objects themselves to honor encapsulation rather than violate it.

    Of course, I'm not sure how much even I buy into my own statements above :)
  93. Go iBatis with E/R mapping and you'll find it so much more productive.

    As for going against multiple DBs, we have gone against Oracle, PostgreSQL,
    MS, MySQL shortly. Tweaking to the platform and schema is a cinch.

    DB is King! Not just if the app has to support Fortune 500 traffic (say, like Procter & Gamble) but esp. then. Data is data, all display is tabular (I agree with Vic), keep them in Collections, Maps.

    For rich apps (Swing, SWT, RCP), you don't need to move objects between client and server. Just move a HashMap and an identifier for the insert/update/delete SQL statement you want to execute remotely. Did it and it rocks! That's iBatis for distributed apps.

    We rolled out 4000 copies of a desktop app with iBatis and Postgres as a searchable database. I don't see why complicate our lives with OR mapping.
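    Roughly what that looks like with an iBATIS 2-style SqlMapClient (the statement ids and parameter names here are made up; the mapped statements themselves live in the SQL map XML):

    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import com.ibatis.sqlmap.client.SqlMapClient;

    public class ListingGateway {
        private final SqlMapClient sqlMap;

        public ListingGateway(SqlMapClient sqlMap) {
            this.sqlMap = sqlMap;
        }

        // Rows come back as Maps or simple beans, keyed by the mapped statement id.
        public List findByCity(String city) throws SQLException {
            Map params = new HashMap();
            params.put("city", city);
            return sqlMap.queryForList("findListingsByCity", params);
        }

        // The same Map can travel from a rich client to the server and be executed by id.
        public void insert(Map listing) throws SQLException {
            sqlMap.insert("insertListing", listing);
        }
    }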
  94. Go iBatis with E/R mapping and you'll find it so much more productive. ... Just move a Hashmap and an identifier for the insert/update/delete Sql statement you want to execute remotely. ... I don't see why complicate our lives with OR mapping.
    Yes, it is a good way, and there is no problem wrapping dynamic data structures with typed structs too. It is trivial to generate this stuff if "static" is a very important aspect (I found it is possible to do it at runtime too).
    I cannot understand how tables and SQL are more "complex" than graphs and home-made QLs. I see the need for O/R mapping as a flaw in modeling.
  95. Value of OR Mapping[ Go to top ]

    O/R products try to achieve several independent goals at once:

    (1) an "easy" way of working with the RDBMS in terms of short and easy to understand code on your side;

    (2) reusable objects, i.e. the same object will be used several times in the _same_ or a different application;

    (3) the possibility of representing the data at the layer you work with differently than the underlying storage;

    (4) you get objects, because other people say objects are good, though you are not too sure exactly why;

    (5) more "correct" or optimal code, mostly in the engine, on the theory that it will be better than your hand-crafted solution.

    Looking at these one by one, my HO is:

    (1) Sure the code will be short but it will be far from easy to understand what is going on behind the vendor APIs. The first time you troubleshoot a performance or a mapping problem (and that will surely surface in any moderately complex application), you will dearly pay for all the time you may have saved up to that moment.

    (2) Sometimes you read the data once and then use it in several different places. That may warrant the effort of creating a better repository for the data than whatever you get from the database or your favorite wrapper. Whether you do it by hand or with a tool, the effort should pay off when you use the same result multiple times. But more often you just have a page or screen that has to be populated, or you just go through the data once and then forget it, and so instead of bothering to create objects to read the data in, then take it out and present it, the most effortless approach is to just work with the data in whatever shape or form it comes.

    (3) In my humble, not too long career, I have never seen the need for this. When an application is modeled around and services an RDBMS model, it follows the model to the letter. If you rename a field in the database, I don't see why you would try to hide that from the code that works with the model in the application. Just make the same change throughout the application; what can be clearer than that? Obscure the change in mismatches between the two models and wish good luck to the folks who will maintain it. In the few cases where it is necessary to have independence between a database and an application model, there are better ways than some obscure O/R mapping tool. In that case the two models are very different. There are data conversion products that are far more powerful than any O/R mapping tool. You can see the changes right in front of you, as opposed to having the tool hide them from you, which is one of the goals of O/R and the philosophy of those who architected most of those tools. And if you are in the do-it-yourself camp, just incorporate some model conversion step in your application that isolates the changes, so that they can easily be found and maintained later.

    (4) I see the good side of OO when the data and *all* of the behavior is tightly coupled. For a data model that is persisted to a back end and presented in a front end, it means the objects should contain all the logic that interacts with the database as well as all the logic that populates and collects changes from screens. Moving data between the application and the database is not too different from moving data between the application and the screens. Build an OO model that way, bundle all application layers into one, and you will see its power and level of code reusability within the application. However, keeping the data separate from its persistence or presentation is usually a better idea, and the myriad of frameworks and APIs that help transfer data between application layers make it a relatively painless choice. On one side you have all kinds of JDBC wrappers and frameworks, and on the other side you have Struts and similar frameworks that will work with your beans and JSP screens.

    (5) Really? In some prior life a while ago I was unfortunate enough to be forced to use some commercial O/R products. What a pain! Everything I had seen was a poor attempt to convert a product that had started as something meant to be used in a client/server application into a server-side solution. Endless performance problems and arguments with the vendors, inevitably ending with the excuse, "these are new and immature technologies, we are a new company, this is a new version"... I wonder why the technologies don't seem that immature when I take a simple proven approach.

    My conclusion? O/R frameworks are mostly not worth it. I favor JavaBeans that match the database tables because my latest JDBC wrapper populates them easily using reflection, and my favorite web services tool (Axis with major additions) generates beans from XML schema/WSDL which are directly usable throughout the application. As for an application architecture, I like the book "UML with Components" which advocates breaking down your model domain into independent parts and then creating one major component per model part and making it responsible for all data maintenance in that part. It calls them "System Components" that are reusable across applications and independent of each other, and distinguishes them from "Application Components" that are the higher layer and tie the application together, and are responsible for the UI and the interaction between this application and the environment.

    My 2 cents.
  96. Value of OR Mapping[ Go to top ]

    My conclusion? O/R frameworks are mostly not worth it. I favor JavaBeans that match the database tables because my latest JDBC wrapper populates them easily using reflection, and my favorite web services tool (Axis with major additions) generates beans from XML schema/WSDL which are directly usable throughout the application
    I am glad that you built yourself an O/R mapping tool that serves your needs. Perhaps you can publish it so people can use it, should they decide it fits their needs and is better than Hibernate or some other popular tools. Personally, I wouldn't use your tool because I like some features in other O/R mapping tools, like object caching, query caching, batching, etc.
  97. Value of OR Mapping[ Go to top ]

    Oh, it does not deserve the name O/R tool. It is just a JDBC wrapper API that adds the nice features of extracting all SQL statement texts into configuration files, logging statements including parameter values, populating JavaBeans using reflection, and encapsulating the sequence of connect/execute/close within the wrapper, so that you can't forget to close it yourself. I can't publish it as the company owns it, but there is hardly a point--anybody favoring an API wrapper would most likely use their own.
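    The reflection part is simpler than it sounds; a minimal sketch of how that piece can work (column/property matching rules and type conversions are simplified, names are illustrative):

    import java.beans.BeanInfo;
    import java.beans.Introspector;
    import java.beans.PropertyDescriptor;
    import java.lang.reflect.Method;
    import java.sql.ResultSet;
    import java.sql.ResultSetMetaData;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class BeanPopulator {
        // Copies each ResultSet row into a fresh JavaBean, matching column names
        // to bean properties (FIRST_NAME or FIRSTNAME -> firstName).
        public static List populate(ResultSet rs, Class beanClass) throws Exception {
            Map setters = new HashMap();
            BeanInfo info = Introspector.getBeanInfo(beanClass);
            PropertyDescriptor[] props = info.getPropertyDescriptors();
            for (int i = 0; i < props.length; i++) {
                if (props[i].getWriteMethod() != null) {
                    setters.put(props[i].getName().toLowerCase(), props[i].getWriteMethod());
                }
            }
            List beans = new ArrayList();
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                Object bean = beanClass.newInstance();
                for (int col = 1; col <= meta.getColumnCount(); col++) {
                    String key = meta.getColumnName(col).replaceAll("_", "").toLowerCase();
                    Method setter = (Method) setters.get(key);
                    Object value = rs.getObject(col);
                    if (setter != null && value != null) {
                        setter.invoke(bean, new Object[] { value }); // assumes compatible types
                    }
                }
                beans.add(bean);
            }
            return beans;
        }
    }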

    I understand your point about caching, but allow me to split it into two parts: automatically looking up an object before reading it in case it has already been read (and assuming you are not afraid of stale data), and the actual caching. The latter is useful on its own, and you will often need to also cache objects/data that did not come from the database. The former is sometimes useful, sometimes not, depending on the app. Most read operations are simple enough, so that you don't need any help for keeping track of what was read from the database. But there are applications, and I have worked on such, that have long complex reads going through multiple levels and components calling each other, where it is hard to predict whether (and when) a particular piece of data will be read. For those cases, I fully agree, it is nice to have object caching based on object type and primary key value. My wrapper API does caching of results, but that is not tied to single sessions, transactions or updates. So it is mostly useful for static data.

    The JDBC drivers and the databases we use do the query caching, as long as you use prepared statements. You create a new prepared statement and pass text, and the driver (or server?) matches it with cached statements.

    My wrapper supports batching to the extent that JDBC does it, which, if I am not mistaken, is somewhat limited, as it either requires you to use regular SQL statements whose text and parameters you concatenate, or identical parameterized statements, just with different parameter values. I don't know what Hibernate does. It can't exceed what the JDBC APIs allow it to do. But since it likely generates all the update SQL, it may be doing a nice job of batching the updates while you are not concerning yourself with creating dynamic SQL. OK, that will be in the next version of my wrapper. :-)

    In my company we have gone through some endless discussions on this topic of EJBs vs O/R vs utility APIs, and my personal opinion was that the simple solution of a wrapper would work better for us than any known O/R tool or entity EJBs (which support object caching). By the way, Hibernate was not available when we made the decision.

    I did write a couple of O/R frameworks: one for my thesis and one for the web products at my former company. That one worked pretty well, I was happy with it. But it wasn't written in Java.
  98. Value of OR Mapping[ Go to top ]

    You say Potato (long A), He says Potato (short A) ... .
    You say Wrapper, He says Mapper ... .
  99. O/X mapping is hard[ Go to top ]

    The problem isn't limited to O/R mapping - object-to-XML mapping and remoting technologies like RMI all suffer from the same problem: when you have a graph of objects, instead of an object which simply contains other objects as components, how do you efficiently transform that graph of objects to and from any other representation? Java serialization takes the lazy way out and just serializes the entire graph, which for most serious domain models is out of the question.

    You run into the same problem trying to map an object model that is in one environment like Java into another one like DCOM, .Net, etc. Or even within Java, try to write code that maps the SWT object model onto the Swing one and vice versa. Possible, but quite difficult.

    To really do any of these right, you have to painstakingly tailor the mapping. And it is very sensitive to changes in the domain, you may have to painstakingly re-tailor the mapping.

    This is not so much an O/R problem, it is a fundamental problem with oo technology. Object graphs are expressive, powerful, able to provide encapsulation and polymorphism, all that great stuff. But objects suck at one thing: being interoperable with other technologies and representations.

    Contrast this with "flat" technologies like RDBMS and C. They are low-level, un-sexy, but they are far more interoperable. You can build any kind of layer on top of them you want, in any technology you want. Compare the task of wrapping the windows GUI api in Java (a flat API) with wrapping Qt, an object-oriented one. Qt requires you to inherit from the C++ classes in some cases to get the desired behavior - now how are you going subclass the peer object from Java?
  100. O/X mapping is hard[ Go to top ]

    The problem isn't limited to O/R mapping - object to xml mapping, and remoting technologies like RMI all suffer from the same problem ... objects suck at one thing: being interoperable with other technologies and representations.
    That's a great explanation of the real problem. Much clearer than my attempt.
  101. WebObjects - Enterprise Objects and KVC[ Go to top ]

    I've heard several people here mention using XPath to query object graphs. This seems very natural to me as I've used similar techniques to represent object graphs as XML for use with XSLT and XPath queries.

    Originally developed by NeXT, Apple's WebObjects uses KVC (Key Value Coding) to access Enterprise Object values, including related object graphs. EOs can be based on a generic object base class or a subclass you define with additional business logic. You can retrieve values from objects using simple calls such as:

    String streetAddress = (String)listing.valueForKeyPath("address.street");

    to retrieve the street from a listing's related address object, or ...

    NSArray employees = (NSArray)this.valueForKeyPath("company.employees.firstName");

    to retrieve the first names of all employees.

    You can also set values using a similar technique (not a very useful illustration, but) ...

    this.takeValueForKeyPath("Fred", "company.employees.firstName");

    Which sets the first name of all employees to "Fred". Very powerful.

    In addition, KVC abstracts how EOs present their data by following an ordered list of access methods, including accessors, public fields, default query functions, etc., to handle requests.

    EOs can be configured to pre-fetch related objects by default or can be lazily loaded using "Faults" that are resolved to EOs when requested. Fetch Specifications can be created which encapsulate commonly used queries, and are assigned to Entities for use at runtime.

    WebObjects also offers a lightweight, data-only representation, RawRows, which allows you to retrieve an array of results without creating a full-blown EO for each result. An EO can easily be created from a RawRow object if required. EO properties, such as the number of related entities, can be derived from raw SQL queries and accessed using standard accessors or KVC.

    For those of you who prefer to use Eclipse over Xcode / Project Builder, WOProject is an Eclipse-based IDE for WebObjects development that uses Ant for project development, builds and deployment.

    KVC OVERVIEW
    http://developer.apple.com/documentation/WebObjects/Enterprise_Objects/Introduction/chapter_2_section_6.html

    OBJECT GRAPH / RELATIONSHIPS WITH KVC
    http://developer.apple.com/documentation/WebObjects/Enterprise_Objects/EnterpriseObjects/chapter_3_section_7.html

    WOProject
    http://objectstyle.org/woproject/
  102. Design data first, not objects first[ Go to top ]

    O/R mapping is not merely an activity that occurs after domain modeling; it’s a fundamental design issue. Given the constraint that the state of persistent objects must be stored in a relational database, you’re already doomed to deal with the so-called impedance mismatch, so give up your pure OO design methods and use “object/relational design” methods instead.

    Don’t try to generate relational data structures from OO class definitions. Derive both the class definitions and the database table definitions from an abstract domain model while following some simple design rules:

    1. Don’t allow inheritance relationships in the domain model. (Gang of Four, Joshua Bloch and other gurus tell us to favor composition over inheritance anyway.)

    2. Although each class in the domain model should have exactly one corresponding database table for storing object state, domain classes generally don’t exhibit any persistence capability. They can exhibit any kind of behavior except persistence, i.e., they cannot have methods such as find, store, etc.

    3. For all persistence functionality, include in the design a separate Data Mapper class corresponding to each class/table pair. (This should be a realization of Martin Fowler’s Data Mapper pattern from his book “Patterns of Enterprise Application Architecture”.) Useful implementations of Data Mapper classes can be generated from the database schema.
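
    As an illustration only, a generated Data Mapper along these lines might look roughly like the following (the table, columns and Customer class are invented, and a real generated mapper would also cover insert, update and delete):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Per-class/per-table Data Mapper in the Fowler sense: the domain class knows
    // nothing about persistence, and all SQL lives here.
    public class CustomerMapper {
        // The corresponding domain class, nested here only to keep the sketch
        // self-contained; in practice it would live with the rest of the domain model.
        public static class Customer {
            public long id;
            public String name;
            public String email;
        }

        private final Connection connection;

        public CustomerMapper(Connection connection) {
            this.connection = connection;
        }

        public Customer find(long id) throws SQLException {
            PreparedStatement ps = connection.prepareStatement(
                    "SELECT id, name, email FROM customer WHERE id = ?");
            try {
                ps.setLong(1, id);
                ResultSet rs = ps.executeQuery();
                if (!rs.next()) {
                    return null;
                }
                Customer c = new Customer();
                c.id = rs.getLong("id");
                c.name = rs.getString("name");
                c.email = rs.getString("email");
                return c;
            } finally {
                ps.close();
            }
        }
    }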

    If your domain modeling tool cannot generate DDL, regard the database schema as the definitive representation of the domain model and derive the persistent instance variables for each domain class from the database column definitions. (Whether every instance variable should have a public setter and getter method is a religious debate beyond the scope of this discussion.)
  103. Design data first, not objects first[ Go to top ]

    Given the constraint that the state of persistent objects must be stored in a relational database, you’re already doomed to deal with the so-called impedance mismatch, so give up your pure OO design methods and use “object/relational design” methods instead.
    I fully agree; I guess I went too far with the "OO is a religion" comment.
    3. For all persistence functionality, include in the design a separate Data Mapper class corresponding to each class/table pair. (This should be a realization of Martin Fowler’s Data Mapper pattern from his book “Patterns of Enterprise Application Architecture”.) Useful implementations of Data Mapper classes can be generated from the database schema.
    Data Mapper functionality is already included in O/R mapping tools like Hibernate, Kodo and others. Why wouldn't we just let the O/R mapping tool persist our domain objects?
  104. Data Mapper functionality is already included in O/R mapping tools like Hibernate
    Yes, Hibernate is a great tool, and Hibernate3 is going to be more RDBMS-friendly. O/R mapping tools are not equal; most of them solve only the trivial problems (the 80/20 rule), but that remaining 20% of cases is the most important for system security, performance and maintenance. You will have very big problems if you are going to solve 100% of problems with OOP and mappings. My favorite declarative approaches do not solve 100% of problems either, but that is not a reason to say SQL, RDBMS, COBOL, ... are wrong just because they are old. My resistance is caused by this OO populism: data aggregation on the client, experimental QLs without any mathematical background, client-side security and auditing, transaction ignorance, platform-specific data (Java object BLOBs), ER models polluted with 1:1 relationships, application-level concurrency control, and so on (OO religion, in short). I see no reason to reinvent mature technologies; solve the unsolved problems instead.
  105. inheritance[ Go to top ]

    1. Don’t allow inheritance relationships in the domain model. (Gang of Four, Joshua Bloch and other gurus tell us to favor composition over inheritance anyway.)
    I agree - I try to avoid inheritance. This strays a bit from the topic, but I have a modeling situation where I am curious how it could be better modeled using composition. My model has a number of devices, all of which inherit from a common Device class with common properties like "name", "address" and "model", and then there are subclasses for each specific type of device - a Camera, some kind of serial device, etc. - each with their own device-specific properties, like frame rate, baud rate, whatever. Descending from a common parent is nice because then, in my Hibernate mapping, there is a "device" table with the basic info on all devices that I can join to for reporting. The specific properties for each subtype end up in a set of joined tables, one for each subtype.
    So how could I do this better using composition?
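
    To make the shape of that model concrete (names simplified):

    // The model as described: a common Device base class carrying the shared
    // properties, with a subclass per device type adding its own. In the
    // Hibernate mapping this becomes one "device" table plus one joined table
    // per subclass.
    public class Device {
        private String name;
        private String address;
        private String model;
        // getters/setters omitted
    }

    class Camera extends Device {
        private int frameRate;     // camera-specific
    }

    class SerialDevice extends Device {
        private int baudRate;      // serial-device-specific
    }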
  106. inheritance[ Go to top ]

    So how could I do this better using composition?

    You can't. Inheritance good. Think about your DNA
    and your rich uncle.
  107. This was my hope for Oracle's App Server[ Go to top ]

    When I first heard about Oracle's App Server, this is what I was hoping for -- a J2EE container embedded inside the database so my EJBs would essentially run like stored procedures. It seemed like the ultimate solution to the persistence wars and would have really given Oracle an edge.
  108. I don't pay a tremendous amount of attention to the various O/R mapper products (sorry :-) but I do recognize (and deal with) the ongoing problem of multiple data sources intrinsic to web apps and integration with legacy environments. That's why Service Data Objects (JSR 235?) seem like such a 'decent' (I was going to say great, but that's pretty optimistic) solution for better object-to-<n>-source mappers. I've played with SDO a little bit (from the WSAD 5.1.2 WDO prerelease) and it seems like a pretty good blend of in-core cache service, bi-directional persistence, and uniform navigation. As a disconnected data graph, it seems like it would be pretty much layer-independent as well.

    So, does SDO with at least an RDBMS mediator do what is really needed, or do you think it's just another standard?
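
    For illustration, the "uniform navigation" part looks roughly like this (a sketch only; the property names and paths are invented, and obtaining the data graph from a mediator is not shown):

    import java.util.List;

    import commonj.sdo.DataObject;

    // Sketch of SDO-style navigation: a disconnected data graph is read and
    // updated through path expressions rather than source-specific APIs.
    public class SdoNavigationSketch {
        public static void describe(DataObject customer) {
            String name = customer.getString("name");           // simple property
            String city = customer.getString("address/city");   // nested data object
            List orders = customer.getList("orders");           // multi-valued property

            customer.setString("status", "reviewed");           // change is recorded in the graph
            System.out.println(name + " from " + city + " has " + orders.size() + " orders");
        }
    }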
  109. and ask ourselves questions like: why do i need a database at all?

    when you can have 16GB of main memory for very little cash, do you really need a "database"? perhaps a sophisticated object cache will do, writing through to one of the nice journaling filesystems that are available today.

    now your objects are properly related, as they should be, and we can go about the business of building robust applications.
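
    a toy sketch of what i mean (names made up; a real version would need transactions and crash recovery, which is exactly the hard part that products like Prevayler work out):

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.util.concurrent.ConcurrentHashMap;

    // Toy "object cache that writes through to the filesystem": reads are served
    // entirely from memory; every put is also serialized to disk. Naive on
    // purpose - no transactions, no recovery, and keys are mapped to file names
    // by hash code.
    public class WriteThroughCache<K, V extends Serializable> {
        private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<K, V>();
        private final File dir;

        public WriteThroughCache(File dir) {
            this.dir = dir;
            dir.mkdirs();
        }

        public void put(K key, V value) throws IOException {
            cache.put(key, value);                       // live object graph stays in memory
            File file = new File(dir, key.hashCode() + ".ser");
            ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file));
            try {
                out.writeObject(value);                  // write through to disk
            } finally {
                out.close();
            }
        }

        public V get(K key) {
            return cache.get(key);                       // reads never touch the disk
        }
    }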
  110. and ask ourselves questions like: why do i need a database at all? when you can have 16GB of main memory for very little cash, do you really need a "database"?
    The purpose of using a database is not just writing information to disk. How would you implement search functionality in a client application without select statements? If you only have an object model, you will encounter the same problem that network databases experienced. You can only find objects using the id, or through special "entry" objects. The nice thing with relational databases is that you can find record/objects using any attribute or combination of them.

    BTW: Modern databases cache most of their data in primary memory.

    Fredrik,
    http://butler.sourceforge.net
  111. The purpose of using a database is not just writing information to disk. How would you implement search functionality in a client application without select statements?
    By criteria? Why let RDBMSs have all the fun?
    If you only have an object model, you will encounter the same problem that network databases experienced. You can only find objects using the id, or through special "entry" objects.
    Maybe in the past. There is always the future. It can be emulated currently with Hibernate.
    The nice thing with relational databases is that you can find record/objects using any attribute or combination of them.
    The bad thing is that the required structure is a poor match for the domain and UI portions of the application, and it encourages poor architecture (i.e. integration at the DB level - and before anyone says it, just because everyone else is jumping off the bridge doesn't mean you should).
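
    For concreteness, a criteria query of the kind I mean looks roughly like this against the Hibernate 2.x Criteria API (the helper class and parameter names are made up):

    import java.util.List;

    import net.sf.hibernate.HibernateException;
    import net.sf.hibernate.Session;
    import net.sf.hibernate.expression.Expression;

    // Sketch of criteria-style querying: any mapped attribute of a persistent
    // class can be used as a filter, without writing SQL by hand.
    public class CriteriaSketch {
        public static List findBy(Session session, Class persistentClass,
                                  String property, Object value) throws HibernateException {
            return session.createCriteria(persistentClass)
                          .add(Expression.eq(property, value))
                          .list();
        }
    }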
  112. By criteria? Why let RDBMSs have all the fun?
    This criteria stuff would just be a reinvention of SQL. I have seen a number of criteria implementations, and they are either not powerful enough or extremely similar to a select statement.

    Anyway, if you don't use an RDBMS, you will need some sort of engine to run your criteria. And this engine will be extremely similar to an RDBMS (but with another name).
    If you only have an object model, you will encounter the same problem that network databases experienced. You can only find objects using the id, or through special "entry" objects.
    Maybe in the past. There is always the future. It can be emulated currently with Hibernate.
    Same as above. This emulator will be very similar to an RDBMS.
    The nice thing with relational databases is that you can find record/objects using any attribute or combination of them.
    The bad thing is that the required structure is a poor match for the domain and UI portions of the application, and it encourages poor architecture.
    Why should the database structure be a poor match for the domain model?

    Why would the use of an RDBMS encourage poor architecture??

    Fredrik,
    http://butler.sourceforge.net
  113. Crossing the Chasm[ Go to top ]

    This is a very interesting thread - in fact, I've seen it going on under one form or another for the past 10 years or so... RDBMSs have such a strong foothold in almost all forms of enterprise today that they have ended up designing *us* instead of the other way around. I remember a consulting project I was contracted to do a few years back with a large company that was using Oracle for its e-commerce site, paying about $500K/year for licenses (!!). After a rapid analysis of their data access requirements, I suggested they use LDAP instead - 99.99% of their data was textual and hierarchical, with roughly a 99.99-to-0.01 read-to-write ratio. They got an order-of-magnitude performance improvement and saved $500K/year...

    Why is this relevant? Well - I think LDAP is the most underrated OODB out there - and the most widely adopted - but most folks still think of it as an authentication / user management server... Obviously LDAP is not good for EVERY application - but it can certainly be used "natively" with OOD.

    In short - OODBs are a reality and are being used well when there's a standard and tools to support the standard. JNDI is part of that standard.
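
    For example, querying a directory through plain JNDI looks roughly like this (the host, base DN and filter are made-up examples):

    import java.util.Hashtable;

    import javax.naming.Context;
    import javax.naming.NamingEnumeration;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;
    import javax.naming.directory.SearchControls;
    import javax.naming.directory.SearchResult;

    // Sketch: search an LDAP directory "natively" through the standard JNDI API.
    public class LdapLookupSketch {
        public static void main(String[] args) throws Exception {
            Hashtable env = new Hashtable();
            env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
            env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");

            DirContext ctx = new InitialDirContext(env);
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.SUBTREE_SCOPE);

            // Find every entry whose common name starts with "Smith"
            NamingEnumeration results =
                    ctx.search("ou=people,dc=example,dc=com", "(cn=Smith*)", controls);
            while (results.hasMore()) {
                SearchResult result = (SearchResult) results.next();
                System.out.println(result.getName());
            }
            ctx.close();
        }
    }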

    ...But LDAP / RDBMSs and such are artifacts of the client-server days - large servers, CPU-bound data repositories, hard to manage and maintain.

    Listen up, people! We are well into the third wave of computing paradigms - distributed computing - and today there's yet another, much more esoteric, standard for managing data that calls distributed computing home: Jini/JavaSpaces.

    With JavaSpaces you have a remarkably simple but powerful data management model that keeps data separate from behavior - but in a federated, lightweight and distributed manner. My company - GigaSpaces - offers an Enterprise Application Grid built on top of Jini/JavaSpaces that allows you to query the space using JDBC, JNDI or our own API - very fast, very scalable, and entirely software-based: no heavy and expensive metal, and no overpaid DBAs...

    Cheers,
    Gad Barnea
    GigaSpaces Technologies.
  114. Crossing the Chasm[ Go to top ]

    RDBMSs have such a strong foothold in almost all forms of enterprise today that they have ended up designing *us* instead of the other way around.
    You might ask yourself why they have such a strong foothold. Maybe they are good?
    using Oracle for its e-commerce site, paying about $500K/year for licenses (!!).
    Yes, Oracle is expensive, but there are lots of other vendors. There are several production-stable databases that you can get for free. If you want a cheap or free database, you should definitely choose a relational database. And if you later want a more advanced database, you can easily switch to one, thanks to the standardization of SQL.
    Well - I think that LDAP is the most underrated OODB out there
    LDAP is nothing but a hierarchical database - a kind of database that disappeared from the market 25 years ago. It is only useful in a very specialized area. There are examples of very fast relational databases too.
    but it can certainly be used "natively" with OOD.
    Yes, OO purists do everything to avoid using a relational database. Even picking up obsolete hierarchical databases is an option for them.
    OODBs are a reality and are being used well when there's a standard and tools to support the standard.
    OODBs have failed badly over the last 15 years. They have not fulfilled expectations.
    JNDI is part of that standard.
    JNDI is a standard for accessing directory (hierarchical) services.
    RDBMSs and such are artifacts of the client-server days - large servers, CPU-bound data repositories, hard to manage and maintain.
    But using artifacts from the '60s seems to be OK for you. "Hard to manage and maintain" - you don't know what you are talking about.

    Fredrik Bertilsson
    http://butler.sourceforge.net