InterSystems has released Cache 2007, a post-relational database

  1. InterSystems has released Cache 2007, their object-oriented database system, which they term a "post-relational database." Cache is unique in that it allows the use of SQL to query an object database. There are two primary aspects to the release: Zen, designed specifically for browser-based application development, and Jalapeno, which focuses on Java database applications. A "post-relational database" is, to InterSystems, a database that combines objects and SQL on top of an underlying multidimensional storage engine.

    Zen is an add-on for Cache that manages the presentation layer for data. It is a framework targeted at maximum development speed, includes a library of pre-built components, and was designed for i18n from the beginning. It includes declarative security features, with access control at the database-integration, page, and component levels.

    Jalapeno focuses on Java database applications. It attempts to address some perceived weaknesses of EJB and Hibernate: both are based on object-relational mapping, and Jalapeno aims to get rid of that layer, because ORM has a significant impact on both development and runtime. In addition, EJB and Hibernate are more database-centric than Java-centric. One of Jalapeno's goals is to turn that around and make something Java-centric instead of database-centric.

    In Jalapeno, the developer goes to his IDE and starts writing Java code, defining Java classes for the entity model, some of which go into the database and some of which may not. At some point, the developer marks as persistent the ones that go into the database. Sooner or later, compilation occurs; the add-in for the IDE gets notified, looks to see which classes are persistent, examines them, and creates Cache database equivalents of those classes as well as the code needed to connect the application to the database. A month later, if the data model needs to change (rename a property, move one)… the plug-in applies the same refactoring to the database, so there is no unload/reload step.

    There are two deployment options. From a Java perspective, all access to the database is through the generated Java objects; there is a direct automatic connection between the object and the Cache database. The developer can also store the objects via JDBC to any compatible database, so Jalapeno can serve as an ORM layer itself.

    Rich domain models are created through object database modeling. The object model is translated into Cache, so that object validations carry across .NET, Java, or any supported language. Any object class that can be defined can be accessed through SQL; a minimal sketch of such access follows below. However, a single class might not be represented as a single table, because what can be done in a single class hierarchy might require multiple tables for an accurate relational view.
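    To make the "any object class can be accessed through SQL" claim concrete, here is a minimal sketch of querying a persistent class over plain JDBC. The driver class name com.intersys.jdbc.CacheDriver, the URL format, and the credentials are assumptions based on InterSystems' documentation as I recall it; the Person class stands in for any class the Jalapeno plug-in has projected into Cache.

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.ResultSet;
        import java.sql.Statement;

        public class CacheSqlAccess {
            public static void main(String[] args) throws Exception {
                // Driver class and URL format are assumptions; adjust for your install.
                Class.forName("com.intersys.jdbc.CacheDriver");
                Connection con = DriverManager.getConnection(
                        "jdbc:Cache://localhost:1972/USER", "_SYSTEM", "SYS");
                try {
                    // Person is an ordinary Java class that was marked persistent;
                    // the IDE add-in projected it into Cache as a SQL-visible table.
                    Statement st = con.createStatement();
                    ResultSet rs = st.executeQuery(
                            "SELECT name, ssn FROM Person WHERE name LIKE 'A%'");
                    while (rs.next()) {
                        System.out.println(rs.getString("name") + " / " + rs.getString("ssn"));
                    }
                } finally {
                    con.close();
                }
            }
        }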

    Threaded Messages (16)

  2. I'm a little confused by this claim. Here at GemStone, we are hard at work on providing a very seamless capability to view object data through relational APIs. However, any way you slice it, an SQL query must SOMEHOW be mapped to underlying object namespaces, properties, collections, and methods. I guess if you are overlaying an SQL query-only engine on existing object technology, you'd call it "relational/object" mapping. GemFire provides querying on our distributed object data fabric today with a robust implementation of the Object Data Management Group's Object Query Language (OQL) specification. Next year we will be merging our relational caching and continuous query technology (Real-Time Events) with the object caching technology, at which point SQL queries against objects through a GemFire-provided R/O query engine will become available. The R/O overhead in the implementation currently being developed is very low compared to the work of actually executing the query (particularly if distributed)--essentially mapping the SQL to OQL and allowing us to leverage technology we've been improving since the days of GemStone's Facets object database. BTW, Cache certainly isn't the first to do this. I believe that GigaSpaces provides some JDBC capabilities into their object cache today. Cheers, Gideon GemFire--The Enterprise Data Fabric http://www.gemstone.com
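    For simple queries, the SQL-to-OQL mapping Gideon describes is close to mechanical. A minimal sketch against GemFire's query API, assuming a configured distributed system and a /persons region (the region name and the empty bootstrap properties are invented for illustration):

        import java.util.Properties;

        import com.gemstone.gemfire.cache.Cache;
        import com.gemstone.gemfire.cache.CacheFactory;
        import com.gemstone.gemfire.cache.query.Query;
        import com.gemstone.gemfire.cache.query.QueryService;
        import com.gemstone.gemfire.cache.query.SelectResults;
        import com.gemstone.gemfire.distributed.DistributedSystem;

        public class OqlSketch {
            public static void main(String[] args) throws Exception {
                // Connect and create the cache; region configuration comes from cache.xml.
                DistributedSystem ds = DistributedSystem.connect(new Properties());
                Cache cache = CacheFactory.create(ds);

                // SQL "SELECT * FROM persons WHERE age > 30" maps almost one-to-one
                // to OQL; the differences are the region path and object navigation.
                QueryService qs = cache.getQueryService();
                Query query = qs.newQuery("SELECT DISTINCT * FROM /persons p WHERE p.age > 30");
                SelectResults results = (SelectResults) query.execute();
                for (Object row : results) {
                    System.out.println(row);
                }
                cache.close();
            }
        }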
  3. Some applications will use a distributed fabric to store state that isn't backed by a database, or that was loaded from one or more databases and mapped to a common schema stored in the fabric. So queries wouldn't be against the various databases; they'd be against what's in the fabric. I think we'll all have JDBC access to data kept in a cache fabric in the near future. What's more interesting is which query language to support. SQL is obviously the main candidate: even though it lacks objecty features, the third-party support for it (widgets, consoles, reporting tools and so on) makes it almost unbeatable.
  4. I'm not a Cache expert, but learned a bit about it on a past project. Although marketing folks would like you to believe that it is a true OODB that can persist Java objects, the Cache database is essentially a collection of sparse arrays of delimited fields. It was developed for the M (aka "MUMPS") language, primarily to store records with a variable number of fields, as needed for clinical results. InterSystems has since put a great marketing spin on it, and it pretty much dominates the healthcare industry. As a general-purpose database, it is really quite primitive, especially in terms of multi-user support, transaction isolation and recovery, and scalability, but it performs reasonably well for smaller, less interactive, niche application areas like health care. Not sure what's new in Jalapeno, but unless they have made some major improvements in the past few months, there is really nothing to get excited about. Incidentally, I had also worked with GemStone/J a few years ago, and was very impressed with its seamless integration with the Java VM and blazing performance even in a distributed configuration. On the other hand, ad-hoc query support was non-existent, reporting was a nightmare, and there were almost no tools for administration. While both databases are interesting and can perform well in their own little worlds, I don't see either one becoming a mainstream relational database killer any time soon.
  5. Although marketing folks would like you to believe that it is a true OODB that can persist Java objects, the Cache database is essentially a collection of sparse arrays of delimited fields.
    Ouch. I typed www.theserverside.com in my browser, but somehow I ended up on www.thedailywtf.com instead.
  6. I'm not a Cache expert, but learned a bit about it on a past project.
    Thank you for the information!
  7. Post-Relational?

    Isn't this a hierarchical database? And wouldn't it then be better described as Pre-Relational? ;-D
  8. How about a real OODB... db4o

    In that article, it looks like you still have to annotate the heck out of your objects and even include dependencies on Cache:

        import com.intersys.pojo.annotations.CacheClass;
        import com.intersys.pojo.annotations.Index;

        @CacheClass(name="Person", primaryKey="ID", sqlTableName="PERSON")
        @Index(description="Name Index on Person table", name="PersonIndexOne",
               propertyNames={"name"}, sqlName="PersonIDX")
        public class Person {
            public String name;
            public String ssn;
            public String telephone;
        }

    If you used db4o, it truly supports plain old Java objects, and your class would look like this:

        public class Person {
            public String name;
            public String ssn;
            public String telephone;
        }

    And it has full querying capabilities, ACID transactions, and all the other good stuff you'd expect in a database.
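    To back that up with usage rather than just the class shape, here is a minimal db4o sketch: store an instance, then query it back with a native query. Only the file name and field values are invented; Db4o.openFile, set(), and Predicate-based query() are the classic db4o calls (set() was renamed store() in later releases).

        import com.db4o.Db4o;
        import com.db4o.ObjectContainer;
        import com.db4o.ObjectSet;
        import com.db4o.query.Predicate;

        public class Db4oSketch {
            public static void main(String[] args) {
                ObjectContainer db = Db4o.openFile("people.db4o");
                try {
                    // Persist a plain object: no annotations, no mapping file.
                    Person p = new Person();
                    p.name = "Alice";
                    p.ssn = "123-45-6789";
                    p.telephone = "555-0100";
                    db.set(p); // store() in later db4o releases

                    // Native query: plain Java code instead of a query language.
                    ObjectSet<Person> result = db.query(new Predicate<Person>() {
                        public boolean match(Person candidate) {
                            return candidate.name.startsWith("A");
                        }
                    });
                    while (result.hasNext()) {
                        System.out.println(result.next().name);
                    }
                } finally {
                    db.close();
                }
            }
        }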
  9. Re: How about a real OODB... db4o

    Travis, It's a good idea to identify yourself as a db4o employee. Just add a sig (use Firefox signature plug-in) with a link to db4o. Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  10. Seems there is quite a bit of FUD here about Cache and some complete misunderstandings. We put it in our labs and development environments for thorough testing before we chose it.

    1. The SQL you use to get data out of Cache contains the object/schema handles. That is all the binding you need to get data out of Cache. Then you just iterate over your Java objects (typed objects if using Java 5); a sketch of that pattern follows below.
    2. Jalapeno is an updated version of their Java binding with support for Java 5 annotations. It also works with Java 1.4, as you do not have to annotate your objects. You annotate for customization, for example to stipulate what you want for one-to-many mappings.
    3. Scalability: I achieved 10-20,000 transactions per second over JDBC on a PowerBook G4 laptop going against a 500 GB database, pulling records 2 MB in size, with zero optimization. We will be using it for 10+ terabytes of data and several hundred thousand users. Our bottlenecks have been on the JVM side. As far as healthcare goes: I have worked in the healthcare and bio-informatics industries, and the data needs are quite diverse, some being quite extensive. To dominate that is impressive on its own. Cache is not a niche database because of that, and it is used in other sectors as well.
    4. Some features we like: it is a multi-faceted database: OODB, multi-dimensional DB, etc. What they have on the backend is a sparse array structure that allows them to project out a particular view of the data: object, relational, multi-dimensional, etc. This is an important reason we chose it: more than one way to view our data quickly, plus extensibility for future requirements, as change is the one guarantee we generally have in our project(s).
    5. Refactoring: being able to refactor your persistent objects and have the IDE sync the changes with the database seamlessly is a huge feature. No updating of mapping files; just do your normal refactoring like always.
    6. We have a very large (and growing), quite non-trivial domain model, and trying to map that to a relational model would be quite painful and take a long time (300+ entities to start with . . . . the project is ongoing). I won't get into the O/R impedance mismatch, as there are quite a number of articles and books on the subject.
    7. Standards support / learning curve: XML, Web Services, etc. Java 5 support is nice, but there is also SQL, which most developers are familiar with. They are busy at work adding facets of EJB 3 support as well. For our team there was a very small learning curve to get up and running.
    8. Excitement: our team has licenses to commercial databases and still chose Cache. I think it can replace any RDBMS today. The issue is not technology, it is architect and developer mindset, IMHO. A lot of people think relational first when it comes to solutions (new or old). The willingness to try something outside of the norm is a hurdle that is not easy to overcome. There are a number of reasons post/non-relational databases are not as widely known and used. Pain of schema evolution used to be a large factor, but that was years ago, and most vendors have quite good solutions now. As always, use what you have proven works for your requirements/timeframes/etc. For us, the speed of development and the overall feature set make that Cache.
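    A minimal sketch of what point 1 looks like in practice, assuming plain JDBC and the Person class from the announcement. The hand-rolled row mapping here is for illustration only; per the announcement, the actual Jalapeno binding hands back instances of the generated classes directly.

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import java.sql.SQLException;
        import java.util.ArrayList;
        import java.util.List;

        public class TypedResults {
            // Run SQL against the object schema and come back with typed objects.
            public static List<Person> findPeople(Connection con) throws SQLException {
                PreparedStatement ps = con.prepareStatement(
                        "SELECT name, ssn, telephone FROM Person");
                ResultSet rs = ps.executeQuery();
                List<Person> people = new ArrayList<Person>();
                while (rs.next()) {
                    Person p = new Person();
                    p.name = rs.getString("name");
                    p.ssn = rs.getString("ssn");
                    p.telephone = rs.getString("telephone");
                    people.add(p);
                }
                return people;
            }
        }

    Callers then just iterate, for (Person p : findPeople(con)) { ... }, which is the typed, Java 5 style of iteration point 1 describes.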
  11. fud ??

    I think Howard D'Souza gave a pretty fair description of Cache. I am working with it right now on an old system using the sparse arrays ("globals" in Cache jargon), and it has no redeeming qualities at all; it is something that should have been forgotten a long time ago. And Cache ObjectScript is a language that makes Perl seem elegant and simple. I know how it works: people invest a lot of time in learning it (as did I), and it can be difficult to face the fact that that time was not very well spent and that the database world has moved on.
  12. Re: fud ??

    I'd be interested to know what versions, language bindings and systems you are using. We don't use any Cache object script and don't have to. We've also only been using the Java/Pojo binding in the 5.x series on Linux/Mac/Windows/AIX and have not experienced any of the issues previously described in our design/development/testing, so we came to different conclusions.
  13. old application

    I am using 5.1 on my workstation, but the db server currently runs 4-something. I am filling a relational OLAP data warehouse from a MUMPS-style application, and the objects or JDBC are of no use in accessing raw globals whose records are strings separated by ^, with four different styles of storing dates. Globals are in essence very inflexible, and you can only search them in one way (depth first), so they are more one-dimensional than multidimensional ;-). The internals (objects are compiled to MUMPS) are enough to put me off Cache.
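    For readers who haven't seen raw globals, the extraction work described above ends up looking something like the sketch below. The field layout and the two date patterns are invented for illustration; real records vary per application.

        import java.text.ParseException;
        import java.text.SimpleDateFormat;
        import java.util.Date;

        public class GlobalRecordParser {
            // The field layout and these date patterns are invented examples;
            // a real system may mix several more styles in the same global.
            private static final SimpleDateFormat[] DATE_FORMATS = {
                new SimpleDateFormat("yyyyMMdd"),
                new SimpleDateFormat("dd-MM-yyyy"),
            };

            public static Date parseDate(String raw) {
                for (SimpleDateFormat f : DATE_FORMATS) {
                    try {
                        return f.parse(raw);
                    } catch (ParseException ignored) {
                        // not this style; try the next known format
                    }
                }
                throw new IllegalArgumentException("Unknown date style: " + raw);
            }

            public static void main(String[] args) {
                // A record pulled from a global: caret-delimited fields, no schema.
                String record = "DOE,JOHN^19991231^42^M";
                String[] fields = record.split("\\^");
                System.out.println("Name: " + fields[0] + ", date: " + parseDate(fields[1]));
            }
        }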
  14. Scalability: I achieved 10-20,000 transactions per second over JDBC on a Powerbook G4 laptop going against a 500 GB database pulling records 2MB in size with zero optimization.
    Hmm ... that is 20-40GB/s. Pretty amazing job for a G4. You had a 500GB in-memory cache, right?
  15. Pretty amazing job

    Yes, indeed, pretty amazing. I know from experience that it takes a lot longer to load the same data into a Cache database than into a MySQL database with MyISAM tables, so Cache is not as exceptionally fast as the marketing says.
  16. Why

    I'm no Cache expert either; my only experiences with it have been the hardcore marketing folk at InterSystems trying to sell it to me. From everything I've seen (and heard), I just don't see the point. I don't see anything Cache offers that you couldn't do with any db backend + Hibernate, for example, and you wouldn't be locked into some weird proprietary OO db speak. The tools looked like a nightmare to use, too. It's been around for ages, and perhaps once upon a time it may have offered some benefit, but in today's world ORM tools are fast, free, reliable, and common, and you can leverage whatever db backend you choose. I asked the sales rep repeatedly why I should choose Cache over Oracle/SQLServer+Hibernate, and after being handed around a few different sales reps and technical reps I got the answer "performance," but nobody would show me any performance data at all. I'm sorry, but for that kind of $$ I'm not about to take the sales rep's word for it.
  17. 1. Performance: My PowerBook G4 had 2 GB of memory and was one of several clients on a clean network with high throughput, the same target environment as for production deployment. The database itself was on backend IBM POWER5-based servers with enough physical memory to put it entirely in memory if needed, and one attached SAN. If you query the entire database, of course it will go into memory; we did not need to. I used JMeter for my test harness. The focus was to architect the application, system, and deployment architectures for scalability and throughput. We found the bottlenecks at the JVM level, as it was 32-bit. 64-bit testing is upcoming, and we are looking for great improvement.
    2. Why: There has been a lot of debate about the object-relational impedance mismatch over the years (a few sources: Ambler, Wikipedia, OODBMS articles). Surely O/R mapping helps the issue, but why stop there? Why have multiple models (domain model, logical data model, physical data model) when you can have just one? Most projects that use relational databases have data architects and/or DBAs. By contrast, most post-relational/OODBMSs don't require DBAs. There is also no gap (read: politics, design theory, turf wars, deployment, etc.) between architects, developers, and DBAs. I have used all manner of object databases (Objectivity, ObjectStore, etc.) and relational databases (Oracle, DB2, Sybase, MySQL, etc.) and found Cache quite good for what we are doing, because it is not just an object database or an SQL database. I don't listen to sales jobs either, which is why I mentioned lab, development, and testing.
    3. Hibernate, JDO, etc.: these are manual, and in some cases automated, mappings to relational ends (a few OODBMSs may also be supporting some JDO/EJB 3). But how many projects still enlist the data architect/DBA to build/optimize/maintain the schemas vs. using the generated schemas? How does that work in environments that need complex schemas, like OLAP? We've found limitations and frustrations in O/R mapping frameworks/tooling when managing large and/or complex schemas. Some of the runtime speed comes from removing layers like O/RMs, which need to manage a lot of abstraction (classes, metadata, etc.). The design/development speed comes from automatically shoving the schemas into the database in minutes and having them match my domain model 1:1.
    4. Tooling: The database console is a web interface; I really like that aspect. We partnered with InterSystems and said we want all-Java tools, not the development studio they offer, if that is what you speak of. We use just the Java JDBC driver, the persistence annotations, and the web interface. We also collaborated on the upcoming plugins for NetBeans, Eclipse, and IntelliJ. They have been a great development partner, as we tend to question everything and have high demands.
    5. Vendor lock-in: The point here is architecture and design. We are not tied to any database per se. You design your data access layers for abstraction, and you put that requirement on your database technology as well. But the level of abstraction is completely dependent on the need. Coupling will occur somewhere; the point is to put it in a place that won't be too painful later, when/if you have to change or extend. With Cache, we have a very simple data access layer (sketched below) and Java 5 annotations where we want them.
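    A sketch of the kind of thin data access layer point 5 describes. The interface and method names are invented, but the shape is the point: callers depend only on the interface, so a Cache-backed implementation can be swapped for another store without touching them.

        import java.util.List;

        // Callers program against this interface; only the implementation knows
        // whether the store behind it is Cache, Oracle, or anything else.
        public interface PersonRepository {
            void save(Person person);
            Person findBySsn(String ssn);
            List<Person> findByName(String namePrefix);
        }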