Ted Neward: Tech Talk on object databases

  1. Ted Neward sat down with TheServerSide.com to discuss what he called the "Vietnam of Computer Science," referring to object/relational mapping. In this tech talk, you'll learn why Ted describes object/relational mapping as a quagmire and why he believes programmers pretend that the translation from object model to relational model is invisible when it clearly is not. Ted examines object databases, which don't translate to a relational form at all, highlighting db4o as an example of a successfully deployed object database, and describes their strengths and weaknesses compared to object/relational systems.

    JavaOne addendum: On Tuesday, May 8, 2007, I went by the db4o demo booth at JavaOne in the Java Playground section of the Pavilion. There, Nik Wekwerth and Takenori Sato showed off Takenori's Dijkstra algorithm implementation with db4o to find the best route through Tokyo streets, based on various invented constraints and the actual city streets. Takenori said that db4o allowed the streets to be modeled as a set of actual relationships, without specific navigation - in other words, a street had direct references to other streets that connected to it, in a directed graph. It's certainly possible to create directed graphs with relational databases, but it is easier and far faster to express the idea as a set of objects - and with flash drives, there's a distinct advantage in how the data is paged off of disk. It was an excellent, impressive demo. Highly recommended if you're looking for a justification of object databases - much as Ted suggested in his talk, except Ted's genealogical example wasn't as useful as Takenori's demo.
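
The street model described in the addendum can be sketched in plain Java. This is only an illustration of the idea, not Takenori's demo code: the class names (Street, Link, RouteFinder) and the cost values are hypothetical, and no db4o API is used. Each street holds direct references to the streets it leads to, and Dijkstra's algorithm simply follows those references.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical model: a street that holds direct references to the streets it leads to.
class Street {
    final String name;
    final List<Link> outgoing = new ArrayList<>();

    Street(String name) { this.name = name; }

    // Adds a one-way connection, which makes the model a directed graph.
    void connectTo(Street target, double cost) {
        outgoing.add(new Link(target, cost));
    }
}

// A directed, weighted edge to a neighbouring street.
class Link {
    final Street target;
    final double cost;

    Link(Street target, double cost) { this.target = target; this.cost = cost; }
}

class RouteFinder {
    // Priority-queue entry: a street and the cost of the path that reached it.
    private static final class Candidate {
        final Street street;
        final double cost;
        Candidate(Street street, double cost) { this.street = street; this.cost = cost; }
    }

    // Dijkstra's algorithm; returns the cheapest total cost, or infinity if 'to' is unreachable.
    static double cheapestCost(Street from, Street to) {
        Map<Street, Double> best = new HashMap<>();
        PriorityQueue<Candidate> queue =
                new PriorityQueue<>((a, b) -> Double.compare(a.cost, b.cost));
        best.put(from, 0.0);
        queue.add(new Candidate(from, 0.0));
        while (!queue.isEmpty()) {
            Candidate current = queue.poll();
            if (current.cost > best.get(current.street)) continue; // stale queue entry
            if (current.street == to) return current.cost;
            for (Link link : current.street.outgoing) {
                double cost = current.cost + link.cost;
                Double known = best.get(link.target);
                if (known == null || cost < known) {
                    best.put(link.target, cost);
                    queue.add(new Candidate(link.target, cost));
                }
            }
        }
        return Double.POSITIVE_INFINITY;
    }
}
```

Navigation here is plain field access (current.street.outgoing), which is the property the addendum attributes to the object model; a relational version would typically express the same step as a join or a lookup per hop.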

    Threaded Messages (54)

  2. I think this is well worth a listen. I don't think anyone is going to throw their Hibernates or JDOs out of the window after watching this video, but Neward's insights certainly serve as a very worthwhile 'knock at the door': an invitation for people with open minds to entertain thoughts that closed-minded people would choose to shut out, avoid or deny. Any discussion this initiates would be very healthy for the community as a whole. For my money db4o alone, with its explicit fetch depth model and the need to explicitly load relationships beyond the specified fetch depth (i.e., no automatic 'load on demand' or 'lazy loading'), takes the 'transparent' out of 'transparent persistence', so I'll be sticking with ORMs that provide this for now (JDO, Hibernate). It is interesting to note, however, that JPOX now supports db4o as an underlying datastore, which means that you can use db4o and forget about db4o fetch depths and explicit relationship loading (JPOX provides all that 'load on demand' for you, transparently). So we can now have an object database in Java using a well-engineered persistence implementation - wow - all the benefits of an object database that Neward talks about with the benefits of a mature, high performance, transparent persistence wrapper. (Isn't that just what I've craved for the last 10 years?)
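
To make the "fetch depth" complaint concrete, here is a small simulation in plain Java. It is a sketch under stated assumptions and does not use the db4o API: Node and TinyStore are hypothetical names, and the "store" is just two in-memory maps. Objects within the requested depth are filled in; anything beyond it exists only as an empty shell until the caller activates it explicitly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical persistent object: its fields stay empty until the node is activated.
class Node {
    final int id;
    String label; // null while the node is inactive
    Node next;    // null while the node is inactive

    Node(int id) { this.id = id; }

    boolean isActive() { return label != null; }
}

// Hypothetical store that simulates stored rows with two maps.
class TinyStore {
    private final Map<Integer, String> labels = new HashMap<>();
    private final Map<Integer, Integer> nextIds = new HashMap<>();

    // Saves one row; nextId is -1 when the node has no successor.
    void put(int id, String label, int nextId) {
        labels.put(id, label);
        nextIds.put(id, nextId);
    }

    // Fills in 'node' and follows references until 'depth' levels are loaded.
    void activate(Node node, int depth) {
        if (node == null || depth <= 0) return;
        node.label = labels.get(node.id);
        int nextId = nextIds.get(node.id);
        if (nextId >= 0) {
            if (node.next == null) node.next = new Node(nextId); // shell, not yet loaded
            activate(node.next, depth - 1);
        }
    }
}
```

With a chain 1 -> 2 -> 3 and activate(root, 2), the third node is reachable but still inactive, so calling code has to remember to call activate again before reading it. That extra call is the non-transparent part the post objects to.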
  3. Chris,
    For my money db4o alone, with its explicit fetch depth model and the need to explicitly load relationships beyond the specified fetch depth (i.e., no automatic 'load on demand' or 'lazy loading'), takes the 'transparent' out of 'transparent persistence', so I'll be sticking with ORMs that provide this for now
    Transparent activation in db4o is on the way.
    It is interesting to note, however, that JPOX now supports db4o as an underlying datastore which means that you can use db4o
    By going with JPOX, you lose the beautiful simplicity of interacting directly with db4o's programming model. Native Queries are one of the most compelling advantages of using db4o, and should not be discounted.
  4. Transparent activation in db4o is on the way.
    Hmmm, sounds good. I think the db4o developers were opposed to transparent activation/'auto load on demand' a while back - it's good to see they've seen the light! I suppose with db4o's initial market being embedded systems tight control over object activation was probably mandatory. In a server world transparent activation is almost mandatory these days.
    By going with JPOX, you lose the beautiful simplicity of interacting directly with db4o's programming model. Native Queries are one of the most compelling advantages of using db4o, and should not be discounted.
    The JPOX/JDO model is extremely simple also. I can just persist POJOs transparently. Nothing changes in my Java code whether my underlying store is a RDBMS or an Object database - which is really nice because I'm free to 'dip my toes in the water' and try different datastores (including db4o!) without rewriting my app. JPOX/JDO gives me a level of vendor independence which most experienced developers will appreciate if they've ever been burnt by vendor lock in before.
    Native queries sound nice (letting the compiler find syntax errors in the query is very cool) but I've not had any major issues using JDOQL - seems to work just fine.
  5. Transparent activation in db4o
    The "real" OO database used to page fault on access so they could bring in the objects you used rather than have to figure it out ahead of time. Why not do that?
  6. There are two very good reasons why this isn't used: 1. The JVM probably doesn't allow such low level access to the physical hardware to allow page fault based transparent activation to be implemented in a Java application. 2. ObjectStore has held a patent on using page faulting to implement transparent activation from way back in the C++ days. Fortunately Java byte code enhancement as used in JDO (and I believe db4o will use it too) provides a high performance alternative to page faulting which makes such low level (and patented) mechanisms unnecessary. The other option used in non byte code enhancing persistence solutions is the use of proxies but that has its own set of issues.
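
The proxy option mentioned at the end of the post can be sketched with the JDK's own java.lang.reflect.Proxy. This is a minimal illustration, not how JDO, Hibernate or db4o implement it: Customer, StoredCustomer and LazyLoading are hypothetical names, and the "datastore" is a Supplier. One of the "issues" the post alludes to is visible here: JDK proxies only work through interfaces.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical persistent type; JDK dynamic proxies require an interface.
interface Customer {
    String getName();
}

class StoredCustomer implements Customer {
    private final String name;

    StoredCustomer(String name) { this.name = name; }

    public String getName() { return name; }
}

class LazyLoading {
    // Counts how often the simulated datastore was hit.
    static int loads = 0;

    // Returns a stand-in that fetches the real object on the first method call.
    static Customer lazyCustomer(Supplier<Customer> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private Customer target; // stays null until first use

            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get(); // transparent activation happens here
                    loads++;
                }
                return method.invoke(target, args);
            }
        };
        return (Customer) Proxy.newProxyInstance(
                Customer.class.getClassLoader(), new Class<?>[] { Customer.class }, handler);
    }
}
```

A caller writes LazyLoading.lazyCustomer(() -> new StoredCustomer("Ada")) and then uses the result like any Customer; the load is deferred until getName() is first called. Byte code enhancement achieves a similar effect by rewriting field access inside the class itself, so no interface or wrapper object is needed.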
  7. Good to know. Thanks. I'd be interested to know what mechanisms in the JVM give you the low level control to fault a reference. Any pointers on that?
  8. > Transparent activation in db4o. The "real" OO database used to page fault on access so they could bring in the objects you used rather than have to figure it out ahead of time. Why not do that?
    I seem to remember reading somewhere that performance on this isn't as good as you might expect; the OS code to resolve a page fault was a lot slower than object database fetching by itself.
  9. > Transparent activation in db4o. The "real" OO database used to page fault on access so they could bring in the objects you used rather than have to figure it out ahead of time. Why not do that?

    I seem to remember reading somewhere that performance on this isn't as good as you might expect; the OS code to resolve a page fault was a lot slower than object database fetching by itself.
    Feel free to swizzle...=) http://citeseer.ist.psu.edu/cache/papers/cs/2629/http:zSzzSzwww.cs.washington.eduzSzhomeszSznarazSzqualszSzreport.pdf/narasayya95analysis.pdf
  10. By going with JPOX, you lose the beautiful simplicity of interacting directly with db4o's programming model.
    By going with JPOX you gain access to a standardised API (well in fact two ... JDO2 and JPA1) something that DB4O has never had. Some people value datastore agnosticity and use of standards.
  11. Whatever floats your boat

    By going with JPOX you gain access to a standardised API (well in fact two ... JDO2 and JPA1) something that DB4O has never had. Some people value datastore agnosticity and use of standards.
    Agreed. And some will build a DAO abstraction layer on top of that as well. You can insulate your application from depending on db4o with a DAO layer if that's a concern. Db4o's API is so darn simple, it's hard to compare to other 'standardized' APIs. To persist an object, you just call a setter: db.set(object). You can query by example (using the JavaBean convention - a de facto standard). You can query using Native Queries, in which you're interacting directly with the properties on your domain model, which in my opinion is better than any standard. Some people will be comfortable trading datastore agnosticity and standards for development productivity and performance (in terms of memory consumption and speed).
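
For readers who have not seen query by example, here is a rough plain-Java sketch of the idea. It is not db4o's implementation: Pilot and QueryByExample are hypothetical names, the candidates are an in-memory list, and the matching rule (a null field in the template means "don't care") is an assumption of this sketch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain class: a plain bag of attributes, which is where query by example fits best.
class Pilot {
    String name;
    Integer points;

    Pilot(String name, Integer points) {
        this.name = name;
        this.points = points;
    }
}

class QueryByExample {
    // Returns the candidates whose fields equal every non-null field of the template.
    static <T> List<T> query(List<T> candidates, T template) {
        List<T> matches = new ArrayList<>();
        for (T candidate : candidates) {
            if (candidate.getClass() == template.getClass() && matches(candidate, template)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    // Compares declared (not inherited) instance fields via reflection.
    private static boolean matches(Object candidate, Object template) {
        try {
            for (Field field : template.getClass().getDeclaredFields()) {
                if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                    continue; // ignore compiler-generated and static fields
                }
                field.setAccessible(true);
                Object wanted = field.get(template);
                // A null field in the template means "don't care".
                if (wanted != null && !wanted.equals(field.get(candidate))) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

QueryByExample.query(pilots, new Pilot("Ada", null)) returns every pilot named Ada regardless of points. The sketch also shows the limitation raised later in the thread: conditions such as "points greater than 50" cannot be expressed with a template object alone.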
  12. Native Queries

    You can query using Native Queries, in which you're interacting directly with the properties on your domain model, which in my opinion is better than any standard.
    :-) Closures will make Native Queries look really nice. Here is a short example, taken from my blog:

        List cats = database.query( { Cat cat => cat.getName().equals("Wolke") } );

    This is just plain pure Java with no visible syntax additions, so it is already a "standard" before other database vendors will notice.
    -- Carl Rosenberger Chief Software Architect db4objects Inc.
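
The closure syntax in the post was a proposal at the time; closures later arrived in Java 8 as lambda expressions with a slightly different syntax. The sketch below shows the same cat query in that form. CatDatabase is a hypothetical in-memory stand-in, not the db4o ObjectContainer, so it filters a list instead of running against database indexes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Cat {
    private final String name;

    Cat(String name) { this.name = name; }

    String getName() { return name; }
}

// Hypothetical stand-in for a database; it filters an in-memory list.
class CatDatabase {
    private final List<Cat> cats = new ArrayList<>();

    void store(Cat cat) { cats.add(cat); }

    // Returns every stored cat for which the condition holds.
    List<Cat> query(Predicate<Cat> condition) {
        List<Cat> result = new ArrayList<>();
        for (Cat cat : cats) {
            if (condition.test(cat)) {
                result.add(cat);
            }
        }
        return result;
    }
}
```

Usage is a one-liner, List<Cat> cats = database.query(cat -> cat.getName().equals("Wolke"));, and the compiler checks both the getName() call and the type of the result.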
  13. Re: Native Queries

    And for those SQL addicts, you can use this on top of db4o: http://josql.sourceforge.net/, and use SQL to query your objects!
  14. Re: Whatever floats your boat

    By going with JPOX you gain access to a standardised API (well in fact two ... JDO2 and JPA1) something that DB4O has never had. Some people value datastore agnosticity and use of standards.

    Agreed. And some will build a DAO abstraction layer on top of that as well. You can insulate your application from depending on db4o with a DAO layer if that's a concern.

    Hmm, using standard implementation of the DAO pattern ? I mean, with 1 DAO class for each domain class (POJO) ? My best wishes !! Anyone is free to beat himself with a stick wherever he wants.
    Db4o's API is so darn simple, its hard to compare to other 'standardized' APIs. To persist an object, you just call a setter: db.set(object).
    With JDO you call pm.makePersistent(object) with Hibernate you call s.save(object) From what you write it seems that you never gave a try to JDO or Hibernate .....
    You can query by example (using the JavaBean convention - a de facto standard).
    This works only if your persistent object is a simple bag of attributes with setter and getter and nothing else.
    You can query using Native Queries, in which you're interacting directly with the properties on your domain model, which in my opinion is better than any standard.
    JDO allows the access to "native" connection to the underlying datastore, so if really needed you can go on the bare "hardware".


    Some people will be comfortable trading datastore agnosticity and standards for development productivity and performance (in terms of memory consumption and speed).
    Are you really sure you know what you are talking about ? Guido.
  15. Re: Whatever floats your boat

    Hmm, using standard implementation of the DAO pattern ? I mean, with 1 DAO class for each domain class (POJO) ? My best wishes !! Anyone is free to beat himself with a stick wherever he wants.
    Who said the standard implementation of the DAO pattern required one DAO class for each domain class? I merely mentioned DAO's as an abstraction option for those who want to avoid dependencies. DAO's can be made as coarse-grained as you like.
    With JDO you call pm.makePersistent(object) with Hibernate you call s.save(object) From what you write it seems that you never gave a try to JDO or Hibernate .....
    I have used Hibernate for two years and I can tell you that db4o is far easier to use. There is no question that the API is simpler than both JDO and Hibernate, because there practically is no API. Db4o's Native Queries are written in Java code. Writing Native Queries is simpler than JDOQL, HQL and SQL, they're type-safe, and can be refactored more easily.
    You can query by example (using the JavaBean convention - a de facto standard).
    This works only if your persistent object is a simple bag of attributes with setter and getter and nothing else.
    Correct. So what?
    You can query using Native Queries, in which you're interacting directly with the properties on your domain model, which in my opinion is better than any standard.
    JDO allows the access to "native" connection to the underlying datastore, so if really needed you can go on the bare "hardware".
    It's obvious that you don't understand what "Db4o Native Queries" are. They have nothing to do with getting access to the "bare hardware". Did you actually watch Ted Neward's Tech Talk? He discusses Native Queries in response to Question #9.
    Some people will be comfortable trading datastore agnosticity and standards for development productivity and performance (in terms of memory consumption and speed).
    Are you really sure you know what you are talking about ?
    There are plenty of people who would agree that db4o provides productivity and performance gains. Guido, your flame was at best an ignited fart! ;) You made assumptions, you didn't take the time to understand the native queries concept, and worst of all, you didn't disclose the reason for your bias.
  16. Re: Whatever floats your boat

    Hmm, using standard implementation of the DAO pattern ?
    I mean, with 1 DAO class for each domain class (POJO) ?
    My best wishes !!
    Anyone is free to beat himself with a stick wherever he wants.
    Who said the standard implementation of the DAO pattern required one DAO class for each domain class? I merely mentioned DAO's as an abstraction option for those who want to avoid dependencies. DAO's can be made as coarse-grained as you like.
    Over the last two years I've used JDO and Hibernate and never needed to write any DAOs and yet I still avoid dependencies.
    The 'transparent' nature of 'transparent persistence' in JPOX is what allows me to avoid dependencies. When I use the exposed domain model pattern there is no need for an archaic, time-consuming DAO architecture.
    If I were to use db4o alone I would have to litter my code with lots of fetch depth and manual relationship fetching code - that's not transparent and puts db4o dependencies into my code. I know you say transparent activation is on its way but it's already here and working in JPOX and Hibernate.
    Some people will be comfortable trading datastore agnosticity and standards for development productivity and performance (in terms of memory consumption and speed).
    If I run db4o under JPOX I can get the best of both worlds: my code remains datastore agnostic and I don't have to manually handle relationship loading when I reach the end of my preset fetch depths. Isn't that a win/win/win for JPOX, db4o and the developer?
    Guido, your flame was at best an ignited fart! ;)
    Come on now - I always thought that the OODBMS guys were meant to be the nice guys of the industry!
    You made assumptions, you didn't take the time to understand the native queries concept, and worst of all, you didn't disclose the reason for your bias.
    Well I'd better declare mine then: http://stepaheadsoftware.com/products/javelin/javelin.htm http://expojo.com
  17. Re: Whatever floats your boat

    Hmm, using standard implementation of the DAO pattern ?
    I mean, with 1 DAO class for each domain class (POJO) ?
    My best wishes !!
    Anyone is free to beat himself with a stick wherever he wants.
    Who said the standard implementation of the DAO pattern required one DAO class for each domain class? I merely mentioned DAO's as an abstraction option for those who want to avoid dependencies. DAO's can be made as coarse-grained as you like.
    In fact mine was a question. The problem is that with DAO term different people mean different things. What is normally explained is 1 domain class == 1 DAO class.
    With JDO you call pm.makePersistent(object)
    with Hibernate you call s.save(object)
    From what you write it seems that you never gave a try to
    JDO or Hibernate .....
    I have used Hibernate for two years and I can tell you that db4o is far easier to use. There is no question that the API is simpler than both JDO and Hibernate, because there practically is no API. Db4o's Native Queries are written in Java code. Writing Native Queries is simpler than JDOQL, HQL and SQL, they're type-safe, and can be refactored more easily.
    Yes, for sure. Maybe your example was not the best one (DB.set(object)) to show API simplicity.
    You can query by example (using the JavaBean convention - a de facto standard).
    This works only if your persistent object is a simple bag of attributes with setter and getter and nothing else.
    Correct. So what?
    There are a lot of things that are de-facto standard still being wrong things, or, at best, of limited use (remember anemic model ?)
    You can query using Native Queries, in which you're interacting directly with the properties on your domain model, which in my opinion is better than any standard.
    JDO allows the access to "native" connection to the underlying datastore, so if really needed you can go on the bare "hardware".
    It's obvious that you don't understand what "Db4o Native Queries" are. They have nothing to do with getting access to the "bare hardware". Did you actually watch Ted Neward's Tech Talk? He discusses Native Queries in response to Question #9.
    I am sorry, I know very well what native queries are. I have followed rather closely all the discussion about native queries from Carl Rosenberger on JDOCentral (well, here my memory might fail). I think that you didn't catch the quotes around the "hardware". I meant that with JDO you can get direct access to the underlying datastore in the same way as you get an SQL Connection in the RDBMS mapping case, gaining the full power of db4o (the "bare hardware").
    Some people will be comfortable trading datastore agnosticity and standards for development productivity and performance (in terms of memory consumption and speed).
    Are you really sure you know what you are talking about ?
    There are plenty of people who would agree that db4o provides productivity and performance gains. Guido, your flame was at best an ignited fart! ;) You made assumptions, you didn't take the time to understand the native queries concept
    First, congrats on your Oxford studies. And I will not go further even if you deserve something better. Please, can you point out what my assumptions are ? Instead, it seems that you are spreading some...FUD ? Flame ? When ? Please, read my post again. Take your time to think a while.
    , and worst of all, you didn't disclose the reason for your bias.
    Well, don't you think you are being a little ridiculous ? You can find my signature in every post at jpox.org or zeroc.com. I didn't put it here because I had no time and no will to change it. But surely you think that the real reason is that behind the pointed link there is some spectre plotting in the dark. Guido P.S. What is the reason for my bias ? Tell me, I don't know. Do you think that the credits on my site hide some dark interests that cannot be unveiled ? Believe it or not, there is no other interest but a technical one.
  18. Let's put db4o to rest for a little while. Ted's discussion is very interesting, far beyond what db4o offers or the actual defense of Hibernate/JDO (which I see nobody attacking). The core question would be: I'm using ORM, so why should I change my mind and start getting messy with non-standard approaches that nobody uses? In fact I think Ted hits the target when he talks about the light use of the tools and technologies. 99% of the developers you ask will tell you they want the solution fast and simple. Note that they talk about the solution and not the problem, as happens almost always. The trouble is that the actual problem domain may be too large for the local view of the developers, so their decisions may impact the overall solution, and that is what Ted warns us about. Yes, I need to work with some data in a table, and I have a great tool that reads it for me from a relational database, brings it to my dear object world, and lets me work with my data object to my heart's content. But the developer may not be able to see a lot of the wrong things that are happening around him. Let's see:
    1. There is IM (impedance mismatch) when mapping from relational to object. Hmmm... relations... inheritance...
    2. Data objects? That is, a passive object that is just a bunch of data? And now we have objects with just business logic in them? Sounds like old procedural style programming to me, but using objects instead of data structures and libraries.
    3. Bring raw relational data to objects to be processed into business information? Why can't it be processed into business information before being converted into an object?
    4. Tweak the relational model to fit a round object? And what about all the other people who need a clean, "normal" relational model?
    I will go further, and ask: should we model our data using an object model, or a relational one? Actually, developers work by tradition, and it is a tradition to use relational data. Here is what I think:
    1. Relational is suited for massive data with logical relations and coherent sets of attributes. That is usually not business information, but raw tables with data rows.
    2. Object structures are more business oriented: real world representations of particular cases, entities with behavior.
    3. Working with data is optimized in the DBMS, so it should be the DBMS that works on and processes data. Java, for instance, shouldn't be fetching long lists of rows and adding them up.
    4. So, split your data domains, differentiate the massive organized data storage from the particular business one, persist each in the appropriate tool and let the specialized servers do their job with data. Focus Java on real high level business work, and let the objects play their role.
    By this I mean we are not simply changing years of tradition because it is theoretically correct. All those wonderful tools may stay; we can simply give objects a new sense of being. So object databases do make sense. And XML databases too... XML is the main thing transported for SOA, you cannot just ignore it. William Martinez Pomares
  19. Not sure I agree with the statement about objects being closer to business. What happens if the domain model has numerous many-to-many relationships? In a case like that, going with a relational approach would be easier in my opinion. For situations where a UI needs a particular view, I'd just create a view. I don't know how OODBMSs handle many-to-many relationships, but often programmers simplify the model to 1-to-many in the object view. Does anyone know how db4o and other OODBMSs handle models with many many-to-many relationships? peter
  20. Hi Peter. Actually, my first observation would be: which domain are you talking about? You can model a solution using an object-oriented approach or an entity/relation one. In one you have objects that encapsulate their attributes and that are able to inherit from other objects. In the other, you have entities with defined public attributes that relate to other entities. Note that I don't see the relations in the object model and I don't see inheritance in the relational one.

    Let's imagine we have parents and children. A parent may have many children, and a child may have 0-2 parents (more if we count step-mothers and step-fathers). That is a many-to-many relationship. You want to store all the people in the world, so you are able to count how many parents have more than 3 children, or how many children have only one parent. I would think of a relational database for that.

    Now let's think of other business needs. You want to feed a family and also make the parents give away part of their money to their kids to go to school. Objects. For that you may design a Person object, from which Parent is derived. Parent will inherit all of Person's attributes and behavior (like eating what you feed them), but will also contain a list of children, who are actually other persons. Note that in this case, a parent may have another parent or a simple person as a child. Then you send a feed() message to all the family, and to each parent a giveMoney() message.

    As you may see, objects may contain attributes that are lists. That is as close to a relation as you may get. And as you see, objects help model behavior and business rules, while relations and entities help you organize your data for access and queries; there is no behavior implicit in an E/R model. The idea is not to choose one and forget the other, but to use them where appropriate, even both at the same time! About db4o, I'm not an expert on it so I don't know if they handle the relationship concept. William Martinez Pomares.
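
The Person/Parent design described in the post could look roughly like this in Java. It is a sketch only: the feed() and giveMoney() names come from the post, while the fields (meals, money), the even split and the integer arithmetic are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Base class: every person can be fed.
class Person {
    final String name;
    int meals = 0;
    int money = 0;

    Person(String name) { this.name = name; }

    void feed() { meals++; }
}

// A parent is a person who also holds a list of children, who are persons themselves
// (and may in turn be parents).
class Parent extends Person {
    final List<Person> children = new ArrayList<>();

    Parent(String name, int money) {
        super(name);
        this.money = money;
    }

    // Splits 'amount' evenly among the children; any remainder stays with the parent.
    void giveMoney(int amount) {
        if (children.isEmpty() || amount > money) return;
        int share = amount / children.size();
        for (Person child : children) {
            child.money += share;
            money -= share;
        }
    }
}
```

The children list is the "attribute that is a list" from the post: it captures one direction of the relationship, while a question such as "how many parents have more than 3 children" across the whole population is the kind of set-based query the post assigns to the relational side.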
  21. The idea is not to choose one and forget the other, but to use them where appropriate, even both at the same time!
    I agree with that completely. I don't see any issue or competition between OODBMS and RDBMS. Sometimes it makes sense to use an OO centric database like ObjectStore or OLAP. The kind of relationship I was thinking of is too complex to describe in a post. To use the parent-child analogy, the kinds of cases I've had to deal with are more like:

    parent child
    grandparent parent child
    aunt niece
    cousin second cousin
    step brother step parent

    If I had to model this in an OO approach, I might create a Person object with lists for: parent, grandparent, cousins, aunts, uncles, second cousin, etc.

    Doing that feels a bit kludgy to me, when a relational approach would keep my Person object simpler. Many business models I've seen over the last 6 years have these types of complex relationships.

    In contrast, something like configuration files or hierarchical data would fit nicely in an OODBMS.

    peter
  22. The idea is not to choose one and forget the other, but to use them where appropriate, even both at the same time!


    I agree with that completely. I don't see any issue or competition between OODBMS and RDBMS. Sometimes it makes sense to use an OO centric database like ObjectStore or OLAP. The kind of relationship I was thinking of is too complex to describe in a post. To use the parent-child analogy, the kinds of cases I've had to deal with are more like:

    parent child
    grandparent parent child
    aunt niece
    cousin second cousin
    step brother step parent

    If I had to model this in an OO approach, I might create a Person object with lists for: parent, grandparent, cousins, aunts, uncles, second cousin, etc.

    Doing that feels a bit kludgy to me, when a relational approach would keep my Person object simpler. Many business models I've seen over the last 6 years have these types of complex relationships.

    In contrast, something like configuration files or hierarchical data would fit nicely in an OODBMS.

    peter
    Brilliantly analyzed by Martin Fowler, you know. Guido.
  23. Re: the parent-child analogy

    Such a cyclic, many-to-many relationship is one of the best domains for an ODBMS. db4objects' demo is a navigation system handling road data, which consists of tons of directed graphs.

    You can represent a human with one class like this:

    Human 2* Human

    Who is a grandfather, cousin, or niece depends on where you look from. So it's not static; you cannot define it statically.

    Road is defined like this:

    Road *--Cyclic-->* Road

    Takenori
  24. Re: the parent-child analogy

    Such a cyclic, many-to-many relationship is one of the best domains for an ODBMS. db4objects' demo is a navigation system handling road data, which consists of tons of directed graphs.

    You can represent a human with one class like this:

    Human 2* Human

    Who is a grandfather, cousin, or niece depends on where you look from. So it's not static; you cannot define it statically.

    Road is defined like this:

    Road *--Cyclic-->* Road


    Takenori
    I'm gonna sound ignorant here, but not sure I understand. Say I want to make a Person object that doesn't have lists for cousins, uncles, aunts, brothers, sisters, etc. How would using an OODBMS or Db4o make it better than relational tables? is there a link for the demo? peter lin
  25. Re: the parent-child analogy

    Hi Peter, Sorry, the demo app is not yet ready to link to, so let me show a brief piece of code to illustrate the idea. I would like to keep Person as simple as possible.

        class Person {
            public static final byte GENDER_MALE = 0;
            public static final byte GENDER_FEMALE = 1;

            private byte gender;
            private Person mother;
            private Person father;
            private List<Person> children;

            // accessors (getGender, getMother, getFather, getChildren) assumed, not shown
        }

    Getting an uncle, a cousin, or any other relative can be implemented as a method like this:

        public List<Person> getUncles(Person fromWhom) {
            List<Person> uncles = new ArrayList<Person>();

            // mother's side: every son of the maternal grandfather
            Person grandFather = fromWhom.getMother().getFather();
            for (Person child : grandFather.getChildren()) {
                if (child.getGender() == Person.GENDER_MALE) {
                    uncles.add(child);
                }
            }

            // father's side: every son of the paternal grandfather, except the father himself
            grandFather = fromWhom.getFather().getFather();
            for (Person child : grandFather.getChildren()) {
                if (child.getGender() == Person.GENDER_MALE
                        && !child.equals(fromWhom.getFather())) {
                    uncles.add(child);
                }
            }
            return uncles;
        }

    When you take a look at it, it is easy to understand, isn't it? This describes the real world well. And you can use many OO design patterns like Strategy and Composite. When you need a navigational feature or more OO designs, an ODBMS is better, while for a set-based feature, an RDBMS is better. It depends on your application. Takenori
  26. Re: the parent-child analogy

    OK, I see what you mean, but that implies I need to load all persons related to that instance and iterate over them in memory to filter them. Related to that, what if I'm building a genealogy system for Mormons and both the relationships and the database are huge? The benefit I see with the relational approach is that I can create an indexed view or materialized view. The approach you describe could be used with an RDBMS also, so I still don't see the benefit. Does db4o provide the equivalent of indexed views and lazy loading so that performance will be predictable? Note I didn't say "performance is fast". How does db4o store the data in the approach you describe? And how does that translate into actual benefit from a development and runtime perspective? I'm asking because I've used OLAP in the past for directory services. At least with OLAP, there isn't the ability to create indexed views. I know some online directory sites use a combination of OODBMS and RDBMS to complement each other. The OODBMS is used for fast synonym and typo lookup. The actual data is stored relationally. I honestly think there's no competition between OODBMS and RDBMS. It's about finding a workable solution that solves the problem at hand for me. peter
  27. Native Queries Uncles Example

    Takenori's posting just showed one way relations can look with OO. His example does not demonstrate db4o's querying capabilities; it just shows what traversal could look like. Here is a possible way to get all uncles using Native Queries:

        public class Person {
            private Gender gender;
            private Person mother;
            private Person father;

            public ObjectSet uncles(ObjectContainer db) {
                final Person me = this;
                return db.query(new Predicate() {
                    public boolean match(Person p) {
                        return (me.mother.mother == p.mother
                                || me.mother.father == p.father
                                || me.father.mother == p.mother
                                || me.father.father == p.father)
                            && me.father != p
                            && p.gender == Gender.MALE;
                    }
                });
            }
        }

    You also asked how to store such a construct in db4o. Now that one is really easy:

        ObjectContainer db = Db4o.openFile("my.db");
        db.set(new Person());
        db.close();

    Done. :-) db4o analyzes your classes and stores all reachable objects for every object that you store. No schema definitions. No XML files. Just store any object of any complexity with one line of code.
    -- Carl Rosenberger Chief Software Architect db4objects Inc.
  28. If I'm reading the code correctly, that is a behavior of the Person object. It isn't a behavior of the object database. What happens when I'm searching for all employees that make X salary and there are 300K employees? Wouldn't it be better to have the database (relational or OO) optimize the query? I've used rich objects and data objects in the past, so it's nothing new. If the dataset is small, I may consider using an ODBMS approach. In the case where the datasets are huge, with complex relationships and dozens of views, a relational approach feels like a better fit.

    When I asked "how does it store the data?" I didn't mean the physical storage. My fault for not asking the right question. What I meant is this:

    1. does the database index the data to improve queries?
    2. does the database internally generate object IDs to ensure object identity is consistent?
    3. does the database provide partitioning schemes and how does it handle indexing with respect to partitioning?
    4. does the database have the ability to partition, and replicate indexes for efficient distributed queries?
    5. when I tell the database to save a subclass of Person, which has a list of grandparents, sisters and brothers, how does it handle the inheritance?

    OODBMSs are great for certain kinds of tasks, but I think overselling the benefits of an OODBMS only serves to discredit its usefulness. My biased 2 cents. peter
  29. If I'm reading the code correctly, that is a behavior of the Person object. It isn't a behavior of the object database. What happens when I'm searching for all employees that make X salary and there's 300K employees? Wouldn't it be better to have the database (relational or OO) optimize the query?
    That's exactly what Native Queries are all about: the code you write within the Predicate#match() method gets analyzed at source code or bytecode level at compile time, load time or execution time, and the query runs against database indexes. Answering your questions for db4o:
    1. does the database index the data to improve queries?
    Yes.
    2. does the database internally generate object IDs to ensure object identity is consistent?
    Yes.
    3. does the database provide partitioning schemes and how does it handle indexing with respect to partitioning?
    No, but creating db4o databases comes at zero cost (no schema maintenance) and you can replicate between multiple db4o databases using dRS if you like.
    4. does the database have the ability to partition, and replicate indexes for efficient distributed queries?
    Same as above: Replication is the magic word if you want load balancing.
    5. when I tell the database save a subclass of Person, which has a list of grandparents, sisters and brothers, how does it handle the inheritance?
    It just simply stores *any object* you like, it's born OO. -- Carl Rosenberger Chief Software Architect db4objects Inc.
  30. What if I don't want to query the objects in-memory because:

    a) I need to support 50 concurrent requests per second
    b) the dataset is over 20K
    c) I don't want the server to chew up precious CPU cycles
    d) I want to keep memory usage to a minimum

    I could write my own SQL compiler that generates an optimized query plan, but that doesn't solve problems a-d. Replication isn't the same thing as partitioning. There's a reason why partitioning is used. I've used partitioning to divide data by geographic region, or by business division. In the case of ETL (extract, transform, load) operations on data warehouses, the dataset is in the terabyte range, so how does replication help?

    I think small embedded OO databases are useful for small embedded situations, but they don't feel appropriate for the kind of work I do. If I were to write a wireless application, it would be silly to use an embedded relational database. It's just a poor fit and rather awkward. Using an OO database for something like a phone or PDA makes much more sense.

    I'm no expert on databases, but the indexes and primary keys affect how one partitions the data, which is why I asked about how db4o indexes. I'm still unclear about whether db4o supports the concept of views and indexed views. Or is that simply just another object? How does db4o avoid data duplication, if I do want to have multiple views of the same data? peter
  31. What if I don't want to query the objects in-memory
    I think there is still a misunderstanding. Native Queries do not load all objects into memory. The Native Query optimizer analyzes the #match() method and converts it to a database query that uses indexes; for db4o that would be SODA. Indexed queries will not load objects into memory. On the contrary, we use BTree algebra within the query processor to construct virtual sets that consume hardly any memory at all.
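    To make the difference between scanning objects and answering from an index concrete, here is a minimal plain-Java sketch. It is not db4o code and says nothing about db4o internals: a TreeMap stands in for a sorted field index, and the names Employee, SalaryIndex and earningAtLeast are hypothetical, chosen only to mirror the salary question asked earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only: none of these names come from db4o.
class Employee {
    final String name;
    final int salary;

    Employee(String name, int salary) {
        this.name = name;
        this.salary = salary;
    }
}

// A TreeMap plays the role of a sorted index on the salary field.
class SalaryIndex {
    private final TreeMap<Integer, List<Employee>> bySalary = new TreeMap<>();

    void add(Employee e) {
        bySalary.computeIfAbsent(e.salary, k -> new ArrayList<>()).add(e);
    }

    // Range lookup: only the part of the sorted structure at or above
    // minSalary is visited, so non-matching employees are never examined.
    List<Employee> earningAtLeast(int minSalary) {
        List<Employee> result = new ArrayList<>();
        for (List<Employee> bucket : bySalary.tailMap(minSalary, true).values()) {
            result.addAll(bucket);
        }
        return result;
    }
}
```

    With an index like this, a condition such as "salary >= X" is answered by walking the matching range of the sorted structure rather than by testing every stored object one by one.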
    I'm no expert on databases, but the indexes and primary keys affect how one partitions the data, which is why I asked about how db4o indexes.
    db4o creates BTree indexes on fields, if you ask it to do so, either by the configuration API or by annotations.
    I'm still unclear about whether db4o supports the concept of views and indexed views. Or is that simply just another object.
    Currently db4o offers three possibilities if your application wants a different "view" on stored objects:

    (1) The reflection layer is pluggable. You can change what a "class" is and what an "object" is. If you want to do that, you can write a reflector to work against getters and setters or against .NET properties instead of fields. In the reflector you could also map attributes to others.
    (2) Translators convert objects to other objects when they are stored.
    (3) Aliases allow you to "rename" classes at runtime, for instance to work against a different package or namespace, or to share the same objects between Java and .NET.

    For all of the above, you always get objects, of course. :-)
    How does db4o avoid data duplication, if I do want to have multiple views of the same data?
    I don't see where there should be data duplication if the conversion happens in both directions in the same way. db4o keeps a (weak) reference to all instantiated objects, so it understands updates. -- Carl Rosenberger Chief Software Architect db4objects Inc.
  32. Although this is specific to db4o, I'm curious.

    1. It seems uncles() is a method that is part of Person. Could I make a search for Persons in another class that is not a Person? Or should all classes contain their own search methods?
    2. If so, in the match() method we directly access private attributes. That means you cannot write that code in another class not being a Person, right? Can I access attributes by method, like getMother()?
    3. I see the search is just a match operation. Right?
    4. When you add the object to the database, will all its private attributes be stored too?

    Thanks!
  33. 1. It seems uncles() is a method that is part of person. Could I make a search for Persons in another class that is not a person? Or should all classes contain their own search methods?
    You can write queries in any class, anywhere. The native query optimizer detects the signature of Predicate objects.
    2. If so, in the match() method we directly access private attributes. That means you cannot write that code in another class not being a Person, right? Can I access attributes by method, like getMother()?
    Yes, you can use methods instead of fields, even methods cascading to other methods, but there are limits to what the query optimizer can "understand". That's why the query optimizer should provide feedback, if a query is run unoptimized.
    3. I see the search is just a match operation. Right?
    That's correct.
    4. When you add the object to the database, will all its private attributes be stored too?
    Yes. -- Carl Rosenberger Chief Software Architect db4objects Inc.
  34. Very good solution, Takenori. Here I find one problem actual developers have, me included: we are so used to working in one way, to thinking in one way, that solutions outside our known solution space are not as easily visualized. That is what I was referring to when I said there is a steep learning curve.

    Now, it also brings to the table another issue I presented above in one of my first posts. The actual business logic to obtain the list of uncles is data processing logic. The list is the real business information. In that sense, as I suggested above, all data processing business logic should be executed in the server, where the access to the data is optimized, and also where the specialized data processing language lies.

    Let me explain: if I had modeled this in E/R and used an RDBMS, then I would create a stored procedure to process all required tables, returning just the list. I would not load all the table rows into my business logic tier to do the processing there. I would expect the same for an OODBMS. So, is that code you wrote supposed to be executed in the BL tier? If so, it means all those objects should be loaded into the BL tier and the data processing done there. The OODBMS would then be just a place to store object data and do some searching with that data. To really use the OODBMS search capabilities, I would then need to make the lists of children public and perform several searches.

    Now, what if we accept that an OODBMS stores and manages objects, complete with behavior? That would require the OODBMS to store a particular object type, let's say Java objects. Now let's imagine that I can create stored procedures, written in Java (oh, no learning curve!) but executed in the OODBMS tier, to process objects. Objects can respond. And the actual application will receive the processed list of objects.

    It seems the road path example presented was amazing. Could you explain what the role of the OODBMS was in there? What features did the server provide to help with that? Where was the logic placed? Memory and communication footprint? Interesting indeed. William Martinez Pomares.
  35. The kind of relationship I was thinking of is too complex to describe in a post. To use the parent-child analogy, the kinds of cases I've had to deal with are more like:

    parent child
    grandparent parent child
    aunt niece
    cousin second cousin
    step brother step parent

    If I had to model this in an OO approach, I might create a Person object with lists for: parent, grandparent, cousins, aunts, uncles, second cousin, etc.

    Doing that feels a bit kludgy to me, when a relational approach would keep my Person object simpler.
    With OO you don't have to explicitly model each relationship. You can model the fundamental ones using explicit relationships, e.g., my biological mother, my biological father - every mammal has one of each (except Dolly the cloned sheep, but forget about Dolly for now, OK!) - and perhaps a spouse or de facto relationship.
    We then let the 'behaviour' of objects work out the rest based on these fundamental explicit relationships. For example, to find a person's grandparents, call person.getGrandParents(), which navigates to both parents and then navigates to both of their parents - voila - you have grandparents, but you never had to explicitly create a grandparent relationship. You can do the same for uncles, aunts and sibling relationships, and all this behaviour is encapsulated in the Person object. The caller doesn't even have to know which relationships are explicitly declared and which are derived - it all just works.
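    A minimal sketch of that idea in plain Java, with no persistence involved: only mother and father are stored explicitly, and grandparents are derived by navigation. The method name getGrandParents() is taken from the description above; the field names, the constructor and the getParents() helper are assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class Person {
    final String name;
    // The only explicit relationships; null means "unknown".
    final Person mother;
    final Person father;

    Person(String name, Person mother, Person father) {
        this.name = name;
        this.mother = mother;
        this.father = father;
    }

    // Helper (an assumption of this sketch): the known parents as a set.
    Set<Person> getParents() {
        Set<Person> parents = new LinkedHashSet<>();
        if (mother != null) parents.add(mother);
        if (father != null) parents.add(father);
        return parents;
    }

    // Derived relationship: navigate to each parent, then to their parents.
    // No grandparent field is ever stored.
    Set<Person> getGrandParents() {
        Set<Person> grandParents = new LinkedHashSet<>();
        for (Person parent : getParents()) {
            grandParents.addAll(parent.getParents());
        }
        return grandParents;
    }
}
```

    The caller only sees getGrandParents() and cannot tell that the result is computed rather than stored.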
    That's the magic I've been working with for nearly 15 years and I find it hard to see why some people still just don't get it. Maybe they're scared they'll fall off the edge of the earth or something...
  36. these fundamental explicit relationships
    How would you encode this information without setting up explicit relationships? If the relationships are explicit there's no need for navigation.
  37. Typically objects are much better at modeling things in the real world (like business entities etc.,) than tables and rows because the real world is made up of objects, not tables and rows - imagine how boring a world made up of tables and rows would be.
    Some extremely closed-minded relational database people put blinkers on and pretend everything in the world looks like tables and rows, but they're only doing themselves emotional harm that they may never recover from ;)
  38. Typically objects are much better at modeling things in the real world (like business entities etc.,) than tables and rows because the real world is made up of objects, not tables and rows
    I don't think the relational model encourages you to think about data in terms of rows and columns, so much as predicates - statements of fact, with a placeholder in each column for nouns, adjectives, etc. In this sense the relational model is very much about the real world: each "row" is a statement, with a context provided by its table (a concept that is usually lacking in object models). And if, as the previous poster observed (quite insightfully), our persistent objects are just bags of fields, then it's hard to see where the advantage is there...
  39. Typically objects are much better at modeling things in the real world (like business entities etc.,) than tables and rows because the real world is made up of objects, not tables and rows


    I don't think the relational model encourages you to think about data in terms of rows and columns, so much as predicates - statements of fact, with a placeholder in each column for nouns, adjectives, etc. In this sense the relational model is very much about the real world, each "row" is a statement, with a context provided by its table (which concept is lacking in object models usually).

    And if, as the previous poster observed (quite insightfully), our persistent objects are just bags of fields, then it's hard to see where the advantage is there...
    OK, entities are (almost) always something close to real world things. The problem is that, looking at a relational model of the world, you miss the behavioural part of an entity that is tightly connected to the intrinsic nature of the entity itself. This behavioural part must not be confused with the "way entities are used in a particular application, in a particular time frame"; those are the business rules. Obviously, if the persistent classes are modeled as plain vanilla bags of attributes, you are simply seeing a world of things without life: puppets that you can stretch beyond any (intrinsic) physical rule. Guido.
  40. entities are (almost) always something close to real world things.
    But their form, through normalization, doesn't match the real world thing anymore and that's where the issues come from.
    simply seeing a world of things without life
    When I have a door, for example, does it have a life or does its life come from relationships? The door has a different life as it's being manufactured, as it's being sold, and in your house. Which is its life? The problem with OO is that life is broader than your objects, Horatio.
  41. > entities are (almost) always something close to real world things.

    But their form, through normalization, doesn't match the real world thing anymore and that's where the issues come from.

    > simply seeing a world of things without life

    When I have a door, for example, does it have a life or does its life come from relationships? The door has a different life as it's being manufactured, as it's being sold, and in your house. Which is its life? The problem with OO is that life is broader than your objects, Horatio.
    Quoting (almost) Rumbaugh: an object is an identifiable entity in the problem domain. Recalling a terrible example from Data and Reality: "my car is blue". What is blue? An attribute? It depends on the problem domain. What if I want to query for cars whose color has a certain RGB configuration? Back to doors: what if the problem domain deals with opening and closing doors? If a door is a bag of attributes, then it is up to the business logic to guarantee that an open door cannot be opened. If a door has life, opening an open door results in an IllegalStateException. The business rules control IF and WHY a door must be opened or closed. This is not visible if you look at a relational model of doors. Guido.
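    A door "with life" could look like the following minimal sketch. The class and method names are assumptions for illustration, and the symmetric check in close() is an addition that the post does not mention.

```java
// Hypothetical Door whose own behaviour guards its state.
class Door {
    private boolean open; // private state; a new door starts closed

    boolean isOpen() {
        return open;
    }

    void open() {
        if (open) {
            // The intrinsic rule lives in the object, not in the caller.
            throw new IllegalStateException("door is already open");
        }
        open = true;
    }

    // Assumption of this sketch: closing a closed door is rejected too.
    void close() {
        if (!open) {
            throw new IllegalStateException("door is already closed");
        }
        open = false;
    }
}
```

    Whether and why a given door should be opened at all remains a business rule that sits outside this class.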
  42. Ok. Valraven: It seems there is a confusion between life and life cycle. I think Guido refers to "life" as the behavioral rules of the object, not the stages it passes through during its lifetime.

    Now, we have a reality here: the relational model was created to model data entities and their relations, while object orientation was created to model real life entities, which includes behavior.

    There is an actual IM problem when we talk about object databases. If a DB is going to store objects: are we going to store only the data from objects, or are we going to store their behavior as well?

    If we are going to store only the data, there should be a way to "complete" the object with its missing behavior part when loaded, or even when searched for. What do I mean? Objects have public and private attributes. To look for a door in an open state, I would look for objects that return true when asked obj.isOpen(), if I'm not able to access the private state attribute. Note that I'm not breaking encapsulation just for the sake of searching for an object.

    But to have the above, I need to store the complete object, not only the data. Thus, it wouldn't be a database, but an "objectbase", where complete instances live. That sounds more like JavaSpaces, doesn't it, but with enhanced store and search capabilities. Now, it also implies the object must be something generic, which it is not: you have Java objects, Ruby objects, .NET objects, etc. So, you cannot store "objects" with behavior as a generic thing.

    If we go back to the first idea, to save only data and to complete it with behavior when loaded into my application, we have problems with encapsulation and we limit the objects' potential. We are also tightly coupling the application language to the repository. So relational DBs are generic, but object DBs are language specific. Hmm.

    So, in short, objects are to model the business rules and may need to be persisted. Relational entities are to model data and their relationships. Two separate things, two separate ways to persist in a repository. Two ways to search, organize and model. The discussion should then be directed to:

    1. When should I use each?
    2. How can I combine each?
    3. What should I expect from an Object Database?

    William Martinez Pomares.
  43. In ObjectStore and other OODBMSs I've seen and used, only the "persistent" fields of each object are stored. The class reference is maintained so the object can be reinstantiated when needed. The problem is that if you change the fields of the class, then (at least as of 2 years ago) you have to unload and reload the objects' data. (Unload with the old class and reload using the new.) It's analogous to changing a compiled schema in a network (e.g. CODASYL) database. The schema is a mapping of the data on the logical or physical store, so if you change it, it is no longer valid for the data in the store. Hence the unload/reload. There would be no point in storing the class definition for each instance, would there? As far as relationships go, the solution is completely up to you. You have access to a full set of collections. ObjectStore implements the collection interfaces of Java. Also, you have object references which are maintained when an object is persisted, unless you explicitly marked it as a transient reference. So the answer is that you implement the relationships the same way you would on the heap. Realistically you'd want to ensure that you weren't doing crazy things that you might get away with on the heap but not in a physical store. There is nothing magical about OODBMSs. You can apply your knowledge of file systems, indexes and data structures and pretty much guess what they are doing. ObjectStore works very much like a network database, with the same performance advantages and the same flexibility disadvantages (in terms of managing change and unanticipated navigation paths).
  44. Ok. That confirms that OO databases are partial stores of objects (just the data component, and may not be the whole data after all), and that they are tightly coupled to the implementation language. Not all objects can be persisted. That's a limitation, and thus OODBMS cannot be used as a replacement of RDBMS, although they may be needed for some modeling. William Martinez Pomares
  45. Valraven: It seems there is a confusion between life and life cycle. I think Guido refers to "life" as the behavioral rules of the object, not the stages it passes during its lifetime.
    These aren't stages. These are objects representing a thing in particular contexts. The underlying thing exists separately from the objects that only represent behaviours. From your POV you only see the door as one thing, when really it is many and may be many more tomorrow. Let's say the door goes to the dump and a scientist wants to track how long it takes to decompose. Or perhaps the door is a collectible and it is resold on eBay. The doorness is transcendent. So you can't represent anything by objects alone, unless you stick to a very small world. Unfortunately, this is exactly what happens over time as systems evolve and change.
  46. These aren't stages. These are objects representing a thing in particular contexts. The underlying thing exists separately from the objects that only represent behaviours. From your POV you only see the door as one thing, when really it is many and may be many more tomorrow. Let's say the door goes to the dump and a scientist wants to track how long it takes to decompose. Or perhaps the door is a collectible and it is resold on eBay. The doorness is transcendent. So you can't represent anything by objects alone, unless you stick to a very small world. Unfortunately, this is exactly what happens over time as systems evolve and change.
    I agree with you. The Door object is not an absolute representation. It is just a model of something, abstracted, and thus not detail complete. It has just the value and behavior we need to solve my specific problem. Those are the worlds you mention. Same happens with entities, they have only the needed fields and relations. One interesting question would be: which is easier to evolve, OO or E/R? William Martinez Pomares
  47. One interesting question would be: which is easier to evolve, OO or E/R?

    William Martinez Pomares
    (With apologies to Brooks) The "evolution" you ask about: is it difficult due to accidental complexity or essential complexity? Should one even consider the kind of context being suggested for the door (door for manufacturer, door in a house, door as a collectible item, door in a dump, door's decomposition rate) in an application? Even though the 'doorness' is transcendent, one is not interested in doorness for the heck of it, but rather in a particular context. That (to me) is 'abstraction' in OO: represent (model, as a verb) objects at the relevant/appropriate level "in the context" of the application. If all of those contexts really, absolutely need to be considered in an application domain, then we are dealing with an essentially complex problem. OO / ER (or functional) are really just conceptual tools to define, frame, and solve the multiple facets of a problem. Considering which is more easily evolved (OO/ER/functional) is (I think) actually just an accidental complexity problem: today's tools and technology make OODBMS evolution hard (and language specific), whereas RDBMSs have lots of tooling to make that easy. As a practitioner, I have no qualms or objections about discussing the tools of the trade (and their limitations, applicability, etc.). The trouble with debates like these is that they tend to become a tool-promotion exercise, rather than the tool-selection exercise (for a particular problem domain) that practitioners should be debating. I am enjoying the debate though (as long as it doesn't focus on db4o alone).
  48. > Valraven: It seems there is a confusion between life
    > and life cycle. I think Guido refers to "life" as
    > the behavioral rules of the object, not the stages it
    > passes during its lifetime.

    These aren't stages. These are objects representing a thing in particular contexts. The underlying thing exists separately from the objects that only represent behaviours. From your POV you only see the door as one thing, when really it is many and may be many more tomorrow. Let's say the door goes to the dump and a scientist wants to track how long it takes to decompose. Or perhaps the door is a collectible and it is resold on eBay. The doorness is transcendent. So you can't represent anything by objects alone, unless you stick to a very small world. Unfortunately, this is exactly what happens over time as systems evolve and change.
    If you separate what an object IS from how an object is used, you have partitioned your domain into something slowly evolving (the entity essence) and other, more dynamic objects (implementing business rules which, by the way, may have persistent state too). I mean, a document is a document, with its chapters, paragraphs, images and so on. A business rule is how a certain company decides to manage the approval. You should design the Document object thinking only of its structure and consistency rules. These (almost) never change. Guido
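    As a rough sketch of that separation: the Document class below holds only structure and its own consistency rules, while approval is a separate, replaceable rule object. All names here (Document, ApprovalRule, MinimumChaptersRule) and the specific checks are hypothetical examples, not something proposed in the thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// What the object IS: structure plus intrinsic consistency rules.
class Document {
    private final String title;
    private final List<String> chapters = new ArrayList<>();

    Document(String title) {
        if (title == null || title.trim().isEmpty()) {
            throw new IllegalArgumentException("a document needs a title");
        }
        this.title = title;
    }

    void addChapter(String heading) {
        if (heading == null || heading.trim().isEmpty()) {
            throw new IllegalArgumentException("a chapter needs a heading");
        }
        chapters.add(heading);
    }

    String getTitle() {
        return title;
    }

    List<String> getChapters() {
        return Collections.unmodifiableList(chapters);
    }
}

// How the object is USED: a company-specific rule that can change freely.
interface ApprovalRule {
    boolean approves(Document document);
}

class MinimumChaptersRule implements ApprovalRule {
    private final int minimum;

    MinimumChaptersRule(int minimum) {
        this.minimum = minimum;
    }

    public boolean approves(Document document) {
        return document.getChapters().size() >= minimum;
    }
}
```

    Swapping in a different ApprovalRule changes how documents are approved without touching Document itself.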
  49. If you separate what an object IS from how an object is used
    The only thing an object is is identity. Everything else comes from how it is used and that will evolve as its ecosystem requires.
  50. If you separate what an object IS from how an object is used

    The only thing an object is is identity. Everything else comes from how it is used and that will evolve as its ecosystem requires.
    Hmm. I would add that the object interacts; it is not used. And also pay attention to what the object can do. The object will then evolve with its ecosystem or die. William Martinez Pomares
  51. Ok.
    Valraven: It seems there is a confusion between life and life cycle. I think Guido refers to "life" as the behavioral rules of the object, not the stages it passes during its lifetime.

    Now, we have a reality here: the Relational model was created to model data entities and their relations, while the object orientation was created to model real life entities, which includes behavior.

    There is an actual IM problem when we talk about Object Databases. If a DB is going to store objects: are we going to store only the data from objects, or are we going to store also their behavior?

    If we are going to store only the data, there should be a way to "complete" the object with its actual missing behavior part when loaded, or even when searched for. What do I mean? Objects have public and private attributes. To look for door with an open state I would look for objects that return true when asked obj.isOpen(), if I'm not able to access the private state attribute. Note I'm not breaking encapsulation just for the sake of searching an object.

    But to have the above, I need to store the complete object, not only the data. Thus, it wouldn't be a database, but an "objectbase", where complete instances live. That sounds more like JavaSpaces, doesn't it, but with enhanced store and search capabilities. Now, it also implies the object must be something generic, which it is not: you have Java objects, Ruby objects, .NET objects, etc. So, you cannot store "objects" with behavior as a generic thing.

    If we go back to the first idea, to save only data and to complete it with behavior when loaded into my application, we have problems with encapsulation and we limit the objects' potential. We are also tightly coupling the application language to the repository. So relational DBs are generic, but object DBs are language specific. Hmm.

    So, in short, objects are to model the business rules and may need to be persisted. Relational entities are to model data and their relationships. Two separate things, two separate ways to persist in a repository. Two ways to search, organize and model. The discussion should then be directed to:

    1. When should I use each?
    2. How can I combine each?
    3. What should I expect from an Object Database?

    William Martinez Pomares.
    Nice post, really. I think that, while we wait for the holy grail, the only reasonable word (that works) is compromise. We should assume (I mean coexist with the idea) that making an object persistent means taking a snapshot of its state (in the GoF sense) and "serializing" it to secondary storage. This secondary storage allows you to search for snapshots that satisfy given criteria. Looked at from this perspective, breaking encapsulation isn't that painful, because you are not accessing THE object but only a snapshot of its (internal) state. And this compromise works in a lot of cases. Guido
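    A small plain-Java sketch of this snapshot idea, using an in-memory list as a stand-in for secondary storage. The names Car, CarSnapshot and SnapshotStore are hypothetical, and the search runs over the stored snapshots, never over the live object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class Car {
    private String color; // private state, never read directly by the store

    Car(String color) {
        this.color = color;
    }

    void repaint(String newColor) {
        this.color = newColor;
    }

    // The object itself decides what goes into its snapshot.
    CarSnapshot snapshot() {
        return new CarSnapshot(color);
    }
}

// Immutable copy of the state at the moment of saving.
final class CarSnapshot {
    final String color;

    CarSnapshot(String color) {
        this.color = color;
    }
}

// Stand-in for secondary storage: keeps snapshots and searches them.
class SnapshotStore {
    private final List<CarSnapshot> snapshots = new ArrayList<>();

    void save(Car car) {
        snapshots.add(car.snapshot());
    }

    List<CarSnapshot> find(Predicate<CarSnapshot> criteria) {
        List<CarSnapshot> result = new ArrayList<>();
        for (CarSnapshot s : snapshots) {
            if (criteria.test(s)) {
                result.add(s);
            }
        }
        return result;
    }
}
```

    A query such as store.find(s -> s.color.equals("blue")) inspects only the saved copies, so repainting a car after saving it does not change what the store returns until the car is saved again.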
  52. So, in short, objects are to model the business rules and may need to be persisted.
    The only way out of this is a services perspective. You can't use only an OODBMS, because that limits who can access the objects to specific language bindings. You can't use only an RDBMS, because then the behavior is in the apps. You can't move the behavior to the apps, because it won't scale and it again limits new creators of behaviors. So you create a service interface based on protocols, so that any language/app can access the service. It's CORBA all over again. So nothing really works.
  53. Typically objects are much better at modeling things in the real world (like business entities etc.,) than tables and rows because the real world is made up of objects, not tables and rows


    I don't think the relational model encourages you to think about data in terms of rows and columns, so much as predicates - statements of fact, with a placeholder in each column for nouns, adjectives, etc. In this sense the relational model is very much about the real world, each "row" is a statement, with a context provided by its table (which concept is lacking in object models usually).

    And if, as the previous poster observed (quite insightfully), our persistent objects are just bags of fields, then it's hard to see where the advantage is there...

    OK,
    entities are (almost) always something close to real world things.
    The problem is that, looking at a relational model of the world, you miss the behavioural part of an entity that is tightly connected to the intrinsic nature of the entity itself.
    This behavioural part must not be confused with the "way entities are used in a particular application, in a particular time frame"; those are the business rules.
    Obviously, if the persistent classes are modeled as plain vanilla bags of attributes, you are simply seeing a world of things without life: puppets that you can stretch beyond any (intrinsic) physical rule.

    Guido.
    It (the relational model) also makes you think about joins, about "where is this located", and about which columns are used for what and when.
  54. Over the past 20 years I’ve had several opportunities to develop applications using an ODBMS: Gemstone with Smalltalk and ObjectStore with Java. The good news is that they are very fast, and eliminating the O-R mapping obstacle makes the design much simpler and easier to implement. There are 2 problems I see.

    The first is the general resistance in most organizations to using an ODBMS. Most of that stems from fear and ignorance, but there is one objection which is valid. ODBMSs, like network (e.g. CODASYL) databases, are less flexible when data access patterns do not align with the schema. An RDBMS performs reasonably well when you do ad hoc queries that the database designer did not anticipate. Also, adding new columns to existing tables is easy and has minimal impact. No re-orgs are required. Not so with the ODBMS products I’ve used. Generally we’ve found that the ODBMS functions well in high-transaction systems, as a middle tier cache, in high level knowledge management, and in graph and network based problems. The database of record and the operational data store (Inmon), if we have one, always end up being an RDBMS.

    The second problem is the pre-processing and other hoops we have to go through to implement persistence. Pre-processing using byte code injection has complicated debugging. Sometimes it is accidentally skipped. There is also the issue of ensuring that the object's actual storage area is where you want it. In ObjectStore with the Javlin layer, an object is stored with the first persistent object that references it at the time it is stored. You have to be aware of that and control it to make sure that the objects are stored where you want them. That is not exactly transparent. A lot of physical schema knowledge is required. You also have to decide what is transient in terms of persistence versus the language transient marker. They won’t always be the same.

    There used to be a 3rd valid objection: the lack of standards for access and query, which was probably the major factor ensuring that ODBMSs did not catch on when RDBMSs were still in their infancy. I’d like to see ODBMS support built into the JVM and a schema with which you could declare where you wanted your classes stored. Additional hints could be supplied in annotations. No pre-processing or byte code injection. Now that would be transparent. ODBMS vendors could implement the actual storage mechanisms and the management and admin tools.
  55. Do you people think Object Databases CAN go Relational themselves? Please have a look at http://tob.ableverse.org and its open source killer application http://www.webofweb.net