Interesting debate on API-less clustering and correctness


  1. Billy Newport and Ari Zilka from Terracotta have started a very interesting debate over the feasibility and applicability of traditional clustered hashmaps (like JCache implementations) vs. the newer API-less clustering solutions that are emerging in the market.

    The discussion was started by "Object Identity, Tradition and DSO - Part 2", which explained Terracotta's approach to "API-less clustering solutions," arguing that object identity was a crucial factor lost in clustered hashmap solutions:
Traditional API-based clustered caches present developers with an API, typically a hashmap. Java developers who wish to use this API to distribute their domain objects are faced with a difficult task. It is impossible to efficiently distribute a natural domain model: inevitably, the model has to be able to slice itself apart and glue itself back together using relational keys. This means developers have to write and maintain a lot of extra code just to make their domain distributable. And just writing that extra code forces Java developers to think like relational database designers.
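    To make the quoted complaint concrete, here is a hedged toy of the "slice apart and glue back together" style it describes; all key formats and names here are invented for illustration, not taken from any product:

```java
import java.util.HashMap;
import java.util.Map;

// Instead of a natural object graph (customer.getOrders()), the model is
// flattened into map entries joined by relational-style keys so each piece
// can live in a clustered hashmap independently.
public class SlicedModel {
    static final Map<String, Object> cache = new HashMap<>();

    static void store(String customerId, String name, String orderId, double total) {
        // Each fragment is cached under its own key; the graph is
        // reassembled later by following foreign keys, DB-designer style.
        cache.put("customer:" + customerId, name);
        cache.put("order:" + orderId, total);
        cache.put("customer-orders:" + customerId, orderId); // manual "join" key
    }

    static String describe(String customerId) {
        // Glue-back: three lookups and two key joins for one logical object
        String name = (String) cache.get("customer:" + customerId);
        String orderId = (String) cache.get("customer-orders:" + customerId);
        Double total = (Double) cache.get("order:" + orderId);
        return name + " owes " + total;
    }
}
```

    The point of the sketch is the boilerplate itself: none of this code has anything to do with the business problem.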
    Billy Newport responded with "Opinion on APIless caching," stating the case for solutions that expose mechanisms to the developer, so the developer can tune for optimum performance:
    The more I think about this, the more I don't think it's going to work for the mass market. I say this because most people have difficulty with multi-threaded programming. It's hard; it's always been hard. We offer multi-threading to WebSphere customers with the WorkManager APIs, and as long as it stays simple the customers are happy, but it's easy to get tied up in deadlocks, scaling problems, and bottlenecks. It's hard to write multi-threaded code that scales vertically. Microsoft is just getting Windows to do this now. It is absolutely non-trivial. Operating system vendors and database vendors have large investments in this kind of code. Why do we think customers can now write perfect multi-threaded code that can be transparently distributed using a non-invasive approach?
    Following Billy's post, a Terracotta engineer suggested API-less solutions were better because they left the distribution mechanism to those who do it better, much like has been seen in the J2EE space (where replicating everything that EJBs do is hard - but not always necessary):
    With an API, you are leaving that heavy-lifting to the developer. There is no performance advantage to an API when trying to cluster those changes. Performance is arguably ONLY achieved by an APIless solution (again, the example of GC comes to mind). I have trouble believing that with my operating system, for example, I as the user should be keeping track of and moving the most popular blocks of files into cache to avoid I/O. Another example: when is the last time you wrote code to optimize what data or instructions were in your CPU's L2 cache when writing a Java app? Seems like an API forces you to ask the wrong person w/ the wrong tools to get the task done.
    Billy responded to this (in his "Final word on caching with APIs and no APIs") by pointing out that the "knobs and special... clauses" (using SQL as an example) allowed very high performance given a skilled developer.

    The final word from the API-less advocates came in the form of "Scalability vs. Correctness," asserting four points, including the original assertion that "object identity is the problem":
    1. API-based caching helps with only part of the problem
    2. APIs are unnecessary AND harmful because they add complexity (much like memory management used to be an API but has been factored away)
    3. API-based caches violate basic engineering principles: tuning during design is just too hard
    4. API-based caches do not address the complexity introduced by caching the database
    What do you think? Others have suggested that object identity across a cluster is the ultimate problem of clustering solutions, yet developers seem to manage without it, in many cases (and there's a very active market for tools based on clustering APIs). Is it as important as some make it out to be?

    Threaded Messages (46)

  2. O/R mapping

    In many real-world J2EE projects, some kind of O/R mapping layer is going to be used (Hibernate, entity beans, home-grown, whatever), and the real issue then becomes the relationship between that O/R mapper and the caching product (API-based or not). The Terracotta blog unfortunately does not discuss this aspect much. Despite claims to the contrary, no O/R mapping tool or caching solution that I know of is fully "transparent". Typically, an O/R mapping tool supports some kind of pluggability with a cache, but that seems to be mostly limited to lookups by primary key. For any set-oriented query (e.g. expressed with Hibernate's HQL), the O/R mapping library will just bypass the cache, generate some SQL and hit the database. It seems to me that to actually take advantage of the cache in non-trivial cases, as the cache does not have an SQL interface, one must write the query programmatically with the Collections API or use some vendor-provided API that simplifies that. Then, if the data's not cached, use the O/R mapping to query the database, map the results into objects... and put these objects in the cache. I don't know if any solution on the market can actually achieve much better than that.
    I'm still a tad unconvinced about the API-less promise, as in such cases the reality usually turns out to be that the intrinsic complexity of the problem space has not been eliminated but shifted someplace else (e.g. using locks correctly as opposed to using the hashmap correctly). Just as one could probably argue that O/R mapping tools alleviate the O/R impedance mismatch in some ways... while introducing a new set of challenges such as "how do I integrate this thingie that has its own concepts of sessions, detached objects, transactions, query language, etc. with my distributed cache?"
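    The lookup-then-populate dance described in the post above is the classic cache-aside pattern. A minimal sketch, with the database stubbed out by a hypothetical loadFromDb() standing in for an O/R-mapping query:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside: check the map first; only on a miss, go to the "database"
// and remember the mapped result for subsequent lookups.
public class CacheAside {
    static final Map<Long, String> cache = new ConcurrentHashMap<>();
    static int dbHits = 0; // counts simulated database round-trips

    static String loadFromDb(Long id) {
        dbHits++;              // in real life: SQL issued via the O/R mapper
        return "row-" + id;
    }

    static String findById(long id) {
        // computeIfAbsent falls through to the loader only on a cache miss
        return cache.computeIfAbsent(id, CacheAside::loadFromDb);
    }
}
```

    Note that this only works for by-key lookups, which is exactly the limitation the post complains about: a set-oriented query has no single key to probe.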
  3. O/R mapping

    Typically, an O/R mapping tool supports some kind of pluggability with a cache but that seems to be mostly limited to lookups by primary key. For any set-oriented query (e.g. expressed with Hibernate's HQL), the O/R mapping library will just bypass the cache, generate some SQL and hit the database.
    Hibernate also supports cached queries, here are some quotes from their docs:
    20.4. The Query Cache
     Query result sets may also be cached. This is only useful for queries that are run frequently with the same parameters.
    ....
     Note that the query cache does not cache the state of the actual entities in the result set; it caches only identifier values and results of value type. So the query cache should always be used in conjunction with the second-level cache.
    ....
     If you require fine-grained control over query cache expiration policies, you may specify a named cache region for a particular query by calling Query.setCacheRegion().
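    Concretely, per the docs quoted above, using the query cache takes two steps: enable it globally, then opt in per query. A hedged sketch (the region name below is hypothetical; a second-level cache provider must also be configured):

```
# hibernate.properties sketch: the query cache is off by default, and since
# it stores only identifiers, it depends on the second-level cache as well
hibernate.cache.use_query_cache=true
```

    Then, in code, something like query.setCacheable(true), plus an optional query.setCacheRegion("query.Products") as quoted above, opts an individual query into caching.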
  4. O/R mapping

    In many real-world J2EE projects, some kind of O/R mapping layer is going to be used (Hibernate, entity beans, home-grown, whatever), and the real issue then becomes the relationship between that O/R mapper and the caching product (API-based or not). The Terracotta blog unfortunately does not discuss this aspect much. Despite claims to the contrary, no O/R mapping tool or caching solution that I know of is fully "transparent".

    I just thought I would suggest something here. Terracotta has horizontal object clustering/caching. Terracotta has vertical JDBC caching in an event-based manner (snooping the changes to the DB from underneath). Put the two dimensions of I/O together and you can update objects when the DB changes and update the DB when objects change. SQL would be localized inside the O/R mapper and not visible to the end-user developer anywhere in the business app. The only requirement is Hibernate-like configuration. Stay tuned.
    I'm still a tad unconvinced about the API-less promise, as in such cases the reality usually turns out to be that the intrinsic complexity of the problem space has not been eliminated but shifted someplace else (e.g. using locks correctly as opposed to using the hashmap correctly). Just as one could probably argue that O/R mapping tools alleviate the O/R impedance mismatch in some ways... while introducing a new set of challenges such as "how do I integrate this thingie that has its own concepts of sessions, detached objects, transactions, query language, etc. with my distributed cache?"
    Amen and hallelujah w.r.t. the HQL point and the more general concept of shifting the burden. You are focusing on exactly the appropriate challenges for the O/R mappers and the API-based caches. They are most often assembled piecemeal to solve this problem. Please drop me an e-mail and we can connect. I will try to convince you about the capability to deliver transparency even around O/R mapping, and I would love your feedback.
  5. re: O/R mapping

    -- plug --
    At the risk of going off topic - it might be worthwhile pointing out that we have an RC-1 for a new product, called Jofti (http://www.jofti.com), that supplies an index and query interface to some of the popular caches.


    This does not really add much to the current discussion - but does reside in the non-trivial Cache usage space that Alain is referring to.
    -- end of plug --

    Slightly more on topic, I have to agree with Billy: it is about developer control and the forces that produce APIs to start with.

    I noticed from the Terracotta site that a developer's simple usage of shared objects was starting to run into issues where locking had to be introduced in order to make the code function correctly (see
    http://www.terracottatech.com/forums/viewtopic.php?t=5 ).

     I am curious as to how many changes one would have to make to use this sort of thing with the read/write-lock idiom or other concurrent structures that do not use synchronisation directly.

    IMHO it seems like the solution is pushing its way up into the code, and developers have to be aware of the implications of using it (which does not seem too far off from an API).
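    For reference, the read/write-lock idiom mentioned above, in plain java.util.concurrent terms. Whether a transparent clustering layer can intercept these explicit lock acquisitions the way it intercepts synchronized blocks is exactly the open question the poster raises:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// A shared map guarded by a read/write lock instead of synchronized blocks.
public class GuardedShared {
    private final Map<String, String> shared = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public String get(String key) {
        lock.readLock().lock();           // many readers may hold this at once
        try { return shared.get(key); }
        finally { lock.readLock().unlock(); }
    }

    public void put(String key, String value) {
        lock.writeLock().lock();          // writers are exclusive
        try { shared.put(key, value); }
        finally { lock.writeLock().unlock(); }
    }
}
```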
  6. re: O/R mapping

    -- plug -- At the risk of going off topic - it might be worthwhile pointing out that we have an RC-1 for a new product, called Jofti (http://www.jofti.com), that does supply an index and query interface to some of the popular caches. This does not really add much to the current discussion - but does reside in the non-trivial Cache usage space that Alain is referring to. -- end of plug --

    Very kewl.
    I noticed from the Terracotta site that a developer's simple usage of shared objects was starting to run into issues where locking had to be introduced in order to make the code function correctly (see http://www.terracottatech.com/forums/viewtopic.php?t=5 ). 

    Please read the support entry more carefully. The person was trying to cluster something our first beta release didn't support. There are, logically speaking, applications that no one can cluster, API-based or otherwise. If my main() for example does:
    int i = 1;
    and then exits, why cluster main() or i or anything? Over-simplified perhaps, but close to what the support thread was talking about, BTW.
  7. O/R mapping

    Hi Alain.
    I'm still a tad unconvinced about the API-less promise, as in such cases the reality usually turns out to be that the intrinsic complexity of the problem space has not been eliminated but shifted someplace else (e.g. using locks correctly as opposed to using the hashmap correctly).

    Yes, this is very true and an important point. There is a subtle distinction in Terracotta's notion of 'transparency' that is sometimes lost: Terracotta DSO is transparent to your domain model, but it certainly is not transparent to _you_, the developer. Writing a distributed application is inherently complex, and Terracotta doesn't claim that we can eliminate that complexity. What we do claim is that we give you a better, declarative way to manage that complexity.

    As you say, in a sense this really is just shifting the problem around. However, Terracotta's claim is that the shift yields two huge benefits:

    1) You don't have to pervert your domain model to accommodate the cache. You don't have to tune for the cluster at design time. This is the key point I was trying to get across in part 2 of "Object Identity, Tradition and DSO."

    2) A huge performance boost. We don't have to java.io.Serialize because we can send only the deltas. Some benchmarks of DSO and other leading cache products are in the works and the preliminary results are astoundingly in favor of DSO.
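    A hedged illustration of the "send only the deltas" idea: Terracotta's actual mechanism works at the field/bytecode level, so this map-based toy only conveys the concept, with all names invented:

```java
import java.util.HashMap;
import java.util.Map;

// Instead of serializing the whole object on every change, ship just the
// fields that differ from the last replicated snapshot.
public class DeltaTracker {
    static Map<String, Object> diff(Map<String, Object> before, Map<String, Object> after) {
        Map<String, Object> delta = new HashMap<>();
        for (Map.Entry<String, Object> e : after.entrySet()) {
            if (!e.getValue().equals(before.get(e.getKey()))) {
                delta.put(e.getKey(), e.getValue()); // changed or new field only
            }
        }
        return delta;
    }
}
```

    For a large object with one mutated field, the delta is one entry rather than a full serialized graph, which is where the claimed bandwidth win comes from.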
    ...Then if the data's not cached, use the O/R mapping to query the database and map the results into objects... and put these objects in the cache. I don't know if any solution on the market can actually achieve much better than that.

    Yup, that's definitely a key problem. The really exciting thing for us at Terracotta is that our server has both a clustered cache and a state-of-the-art JDBC cache. We've only scratched the surface of what we can do to integrate the two; we think that in the medium term we can do a lot to address the problems you describe. We may yet build you that silver bullet. :)

    Cheers,
    -p


    ----
    Patrick Calahan
    Product Manager
    Terracotta, Inc.
  8. 2 cents

    If someone tells you that any application that can run on one machine can run efficiently on a cluster without special hardware (RDMA), don't believe them. It's true for certain workloads, but it's not true in general.

    My money is on Billy :)
  9. 2 cents

    I am confused. How does special / faster HW change the playing field for API-based vs. API-less clustering?
  10. Confused also

    It doesn't but it'd sure be nice for either, no? :)

    Billy
  11. 2 cents

    I have tried to post a reply, but in vain. TSS gives J2EE a bad name :-) !
  12. now it works?

    I should have known ..
  13. 2 cents

    I am confused. How does special / faster HW change the playing field for API-based vs. API-less clustering?

    To me it's relevant.

    I figure one of the main selling points of the "API-less" approach is that existing applications don't have to be modified.

    But the lack of an API does not make this possible for all applications. If an application's workload has heavy data sharing the application will be slow on a cluster, unless the cluster has a low-latency interconnect (and you have a number of other ducks all in a row.)

    Guglielmo
  14. 2 cents

    UDAPL is something I've been looking at for a while now. It's C-only, but it works on either Ethernet or InfiniBand, which is necessary as an IB-only technology has no mass-market appeal.

    Billy
  15. 2 cents

    UDAPL is something I've been looking at for a while now. It's C-only, but it works on either Ethernet or InfiniBand, which is necessary as an IB-only technology has no mass-market appeal. - Billy

    Clusters (stateless ones) are already less popular than SMPs (since SMPs are so cheap now), and clusters with coherent caches are even less popular. But the customers who really need IB will use it. There is value in providing systems for those users too, and eventually it could pick up. The Mellanox IB on-board silicon is $69, and you can buy it already integrated in blade servers; e.g. APPRO sells them for $4-5000 per blade.

    I think what's needed here is an integrator company. Nobody wants to do integration, but that would actually be a high-margin business. You can sell them the chassis, the blades, the OS, and the JVM. We can all learn from IBM ;)
  16. Please correct me if I'm wrong, but:

    When we are talking about data clustering, it's all about the data consistency model. The more restrictive the model you choose, the less performance you get. Which model to choose depends on the application's nature. So, you need an API to formalize your model, right? You need an API to deal with distributed shared memory (a distributed hashmap in the simplest case). You can't get something by paying nothing :). You need an API to do threading, you need an API to do I/O, and you need an API to deal with clustered (distributed) memory. That's it, end of story.

    It's just an illusion (until network interfaces become as fast as memory), or dirty PR, when someone claims "We can do transparent clustering for any application at almost no penalty." The key word here is almost. There may be almost no penalty in a presentation. But that almost may grow a lot (and it will) in your particular use case...

    Just a simple example... Suppose you have to do multithreaded programming in a hardware environment which does not support synchronization at the hardware level... What are you going to do? (Yes, there is an algorithm available which is proven correct for dealing with this, but suppose you're not aware of it.) What if the transparent engine does not support all the necessary functions to address your particular use case? What if the data consistency model it uses is much more restrictive than you really need?

    All I'm trying to say is that some sort of API is unavoidable and strong math behind it is a must.

    Even a fairly simple and elegant API like JavaSpaces/tuple spaces has a lot of implementation gotchas and yet-to-be-done features.
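    To see what "simple and elegant" means here, a toy sketch of a tuple-space-style API in the spirit of the JavaSpaces model mentioned above (the real JavaSpaces interface matches Entry templates and involves leases and transactions; this stripped-down version only conveys the shape of the API):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// write() publishes an entry into the space; take() removes one, blocking
// until an entry is available. Even this tiny API forces the caller to
// commit to blocking semantics -- a consistency decision that cannot be
// hidden from the developer.
public class TinySpace<T> {
    private final BlockingQueue<T> entries = new LinkedBlockingQueue<>();

    public void write(T entry) {
        entries.add(entry);
    }

    public T take() {
        try {
            return entries.take(); // blocks until an entry exists
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```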
  17. The point here is more about clustering (data + CPU slicing) than just data slicing.
    With Java you don't need to use an API to synchronize: you can do declarative synchronization, and then it's up to the JVM to try the most optimized way to do it.
    The declarative approach moves the burden to the underlying JVM.
    To me the point here is: are we at a point where the JVM has enough control to do _effective and efficient_ clustering?
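    The declarative synchronization referred to above is just the language's own keyword: the developer states intent (this method is exclusive) and the JVM chooses the mechanism. A minimal sketch; the idea is that a clustering runtime could, in principle, widen the monitor's scope from one VM to a cluster without any code change:

```java
// "synchronized" declares intent, not mechanism: the JVM owns the monitor
// implementation, which is what leaves room for a runtime to reinterpret it.
public class Counter {
    private int count;

    public synchronized void increment() {
        count++; // guarded by this object's monitor
    }

    public synchronized int value() {
        return count;
    }
}
```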
  18. Hi Simone,

    You are not alone ! ;o)

    Here is my point of view. On the one hand, you can effectively offer some dedicated API (grey box, but still a box as any library providing an API) to let the developer manage the object synchronisation. On the other hand, you can leave it to an underlying framework (black box, JVM based or not) that would be setup by the administrator.

    Nevertheless, both the developer and the administrator have to know what kind of application they are developing, including its constraints in terms of latency, network load, etc., to ensure proper quality of service.

    So in some particular cases (such as very bad network conditions) you can imagine that a closed, dedicated framework might not behave very well (insofar as the situation was not taken into account by the framework's developers), and that an API solution would be the only one to give the appropriate freedom. Some years ago you really could not rely on the network being efficiently available, and your strategy for data sharing was absolutely different, as losing the network connection with a 9600 bps modem was absolutely not exceptional. So of course you have to know the limits of the black box, but this is very true for the grey box also.

    But if the black box covers 95% or more of normal situations (i.e. proper bandwidth and CPU on the nodes), there is no reason why the administrator would not be able to tune the application's behaviour, changing the replication strategy if necessary. The framework might be intelligent enough to gracefully and automatically change this strategy when the network conditions degrade or improve, for instance. These kinds of strategies would take a team of developers quite a long time to implement and test, so why not buy a product off the shelf that is proven to do it right in 95% of cases? As long as the price is reasonable, of course ;o)

    So in the end it is up to the application architect to ensure that the scope of the application is within these 95% boundaries.

    To make a comparison, 8 years ago I was doing C++ database access using plain ODBC, which led to high maintenance costs (i.e. whenever the database schema changed). Nowadays I use object-relational frameworks (EJB, Hibernate, JDO), and have found the need to use vendor extensions or special optimisation features in only a few cases.

    Of course when performance is critical, sometimes you need to use these special features. But I guess that sooner or later these features will either be manageable by the application administrator at deployment time or at runtime. As another example, see the various indexing capabilities of an RDBMS: no developer work is needed to speed up a particular request, just build an index on the criteria and run!

    Even better, the runtime strategy can be auto-tuned by the system itself, such as the JVM's GC behaviour. Of course these optimisations might not be the best ones 100% of the time, but neither are the developer's.

    I really believe this idea of auto-sharing java objects is brilliant :o)

    My 2 cents,

    Christian
  19. My experience with custom-built (or customer-tailored) frameworks shows that API-less caching is more effective the more abstract and high-level an approach the framework/environment uses.
    We did both instance-oriented caching and set-oriented caching in a completely transparent way. That worked 100% where everything was delegated to the framework; conversely, where the developer had control, we had to introduce an API.

    The more details are delegated to the framework, the more freedom it has in deciding how to handle them.
    With Java we delegate a lot of things to the JVM (pointers, for instance), and so we can probably achieve much higher caching/clustering performance than in other environments.

    Also a couple of simple considerations:
    Maybe using RMI we could create special stubs/skeletons that are capable of migrating automatically.
    RMI development involves some kind of synchronization, so maybe special JVMs might be able to partition RMI objects among the cluster so as to minimize bandwidth and maximize CPU utilization.
    But then, maybe, special JVMs would be able to do the RMI work without all the API-oriented interfaces and lookups...
    And then the only real need would be synchronization, which, alone, is much easier than synchronization+clustering...
  20. http://research.sun.com/techrep/1994/smli_tr-94-29.pdf

    From the abstract:
    We look at a number of distributed systems that have attempted to paper over the distinction between local and remote objects, and show that such systems fail to support basic requirements of robustness and reliability. These failures have been masked in the past by the small size of the distributed systems that have been built. In the enterprise-wide distributed systems foreseen in the near future, however, such a masking will be impossible.
  21. http://research.sun.com/techrep/1994/smli_tr-94-29.pdf
    From the abstract:
    We look at a number of distributed systems that have attempted to paper over the distinction between local and remote objects, and show that such systems fail to support basic requirements of robustness and reliability. These failures have been masked in the past by the small size of the distributed systems that have been built. In the enterprise-wide distributed systems foreseen in the near future, however, such a masking will be impossible.

    I have been WAITING for someone to cite this Waldo paper as a reference. I have heard this argument, mostly from BEA folks, who have yet to help customers with the core scalability issues around clustering.

    In short, RMI's RemoteException gets thrown when a developer is explicitly invoking BUSINESS LOGIC on a distributed deployment and that invocation fails for one of many possible reasons. But the entire point of API-less or API-based clustering is that if the state and coordination information could be invisibly shared (or, without clustering, if I could just get a giant single VM), I would not have used RMI in the first place. I would not have worried about pieces and parts of my business logic that need to exist as stubs and run in a central location, etc. As soon as I start using RMI within a single application, the VM and container have failed to scale in the ways my application demands, and I am now, as a developer, doing someone else's [abstractable] job by myself. Citing RMI's RemoteException as proof that clustering cannot be transparent just fundamentally misses the point that clustering in this case is not a business need or concern but an IT artifact. Terracotta and our competitors are not competing with RMI, web services, and the like; we are helping them get used appropriately.
  22. I have been WAITING for someone to cite this Waldo paper as a reference.

    I am happy to oblige ;)
    As soon as I start using RMI within a single application, the VM and container have failed to scale in ways my application demands and I am now, as a developer, doing someone else's [abstractable] job by myself.

    For certain workloads the developer needs to understand the difference between a method call and a network roundtrip, or the application will crawl. So hiding the difference in this case is a disservice.
  23. Object Identity

    In my opinion, caching object identities is meaningless for 95% of all major applications. Caching, by its very nature, is most of the time caching of data.
    As soon as you start creating cached object identities, they of course must supply full XA transaction support across your entire cache and in essence do all the stuff, and more, that something like the Terracotta product gives you.
    This is not only overkill for a vast number of applications. Far worse, it introduces yet another infrastructure component to complicate systems management.
    Yet this essentially solves a problem that is already solved:
    Want to cache mostly static data: use a distributed hash map.
    Want to cache dynamic data: use your database (the one you already paid for). The odd thing is that your cache will often be faster this way than by introducing any other product. And here is another thing: you don't need 2PC, and you don't need to bother with how to tune the "locking schemes" on your distributed objects.
  24. Challenge for the API-based crowd

    I would like to put out a call for source code to any sample application that folks feel will prove their point (those who are asserting that API-based clustering is a necessity). Maybe we are asking to be beaten over the head here, but I think seeing physical examples will be very productive.

    If you produce source that works with JCache, for example, I will alter it to work with TC and we can diff.

    Thanks...
  25. Challenge for the API-based crowd

    I would like to put out a call for source code to any sample application that folks feel will prove their point (those who are asserting that API-based clustering is a necessity).

    I don't think it's a necessity. Go ahead and use TC over UDP for all possible applications, and watch what happens.
  26. This is not only overkill for a vast number of applications. Far worse, it introduces yet another infrastructure component to complicate systems management. Yet this essentially solves a problem that is already solved: Want to cache mostly static data: use a distributed hash map. Want to cache dynamic data: use your database (the one you already paid for).

    Karl, if a database were able to provide the throughput and performance that applications needed, then nobody would buy any of these useless cache thingies, and all these caching software companies would be out of business. (We're now wrapping up our 15th consecutive profitable quarter, so don't get your hopes up. ;-)

    The simple truth is that database latencies are unacceptable for many applications, and database throughput is totally inadequate for many applications. Databases do not scale to the needs of modern distributed and grid applications. Period.

    If you're still working on PetStore or PetShop or whatever and MySQL is good enough for you, then more power to you, don't waste your time with this stupid cache stuff. But don't insult the intelligence of people who have already found that a database isn't fast enough, and can't handle anywhere near enough load for their applications.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Infinite linear scalability of data throughput and capacity
  27. Scaling

    Databases do not scale to the needs of modern distributed and grid applications. Period. If you're still working on PetStore or PetShop or whatever and MySQL is good enough for you, then more power to you; don't waste your time with this stupid cache stuff. But don't insult the intelligence of people who have already found that a database isn't fast enough, and can't handle anywhere near enough load for their applications.
    Amen to that, I would just add that quite often, databases do not scale beyond certain limits at acceptable cost. Not everyone can afford a 128-CPU Solaris monster box with associated Oracle license, especially if similar or better RAS properties can be achieved by clever use of caching at much much lower cost... And indeed at the high end, some applications can probably not meet their requirements altogether without some serious amount of caching. This is also, incidentally, why Tangosol is doing well and why new players are entering the market (Terracotta and IBM being recent examples) with different approaches. May a thousand flowers bloom. I wish them all well.
    Infinite linear scalability of data throughput and capacity
    Nice one :-)
  28. But don't insult the intelligence of people who have already found that a database isn't fast enough, and can't handle anywhere near enough load for their applications.

    Go easy on Karl - he was porting the JDK to Linux for a long time when Sun was pretending that Linux didn't exist.
  29. Go easy on Karl - he was porting the JDK to Linux for a long time when Sun was pretending that Linux didn't exist.

    I just found it funny that he'd be suggesting to use the database more when all of our customers are desperately trying to use their databases less ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: If you used Coherence, you would be clustered by now.
  30. I just found it funny that he'd be suggesting to use the database more when all of our customers are desperately trying to use their databases less

    It's okay. Actually, I think I got confused with a different Karl, so I am not going to object any more ..
  31. infinite scalability

    Tangosol Coherence: Infinite linear scalability of data throughput and capacity

    "So I have just one wish for you--the good luck to be somewhere where you are free to maintain the kind of integrity I have described, and where you do not feel forced by a need to maintain your position in the organization, or financial support, or so on, to lose your integrity. May you have that freedom."

    http://www.physics.brocku.ca/etc/cargo_cult_science.html
  32. Databases are supposed to be designed to efficiently handle caching and data consistency and yadda yadda yadda ("yadda" is a highly technical term).

    Now massive development effort is going into putting increasingly complex layers between the developer and the database in order to hide its complexities and overcome its bottlenecks.

    Doesn't this indicate that something is wrong? It seems like either (1) RDBMSes are "broken" for modern development, or (2) modern developers don't know how to use databases effectively.

    Why isn't someone writing an ACID-compliant database that is amenable to object-oriented data access and storage, instead of finding more and more ways to stick more stuff between the application code and the RDBMS?
  33. The most common issue is that, by and large, RDBMS systems are designed to run on a single host. To get really good performance out of them directly, people tend to either a) stick them on huge boxes, b) partition their data so they're using several separate physical databases, or c) replicate the databases in some manner.

    By contrast something like Coherence is inevitably distributed. Your load is distributed across the machines in your cluster - if you have 10 machines your load is spread over 10 machines. It's rather rare to see a database truly running distributed on 10 machines but with something like Coherence it's S.O.P.
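    The load-spreading described above can be sketched as simple key partitioning. This is a toy illustration, not how Coherence actually works - real products use consistent hashing, replication, and network transport, all of which this in-process version omits:

    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Toy sketch of a partitioned cache: each key is owned by exactly one
    // "node", so load spreads across however many nodes you configure.
    public class PartitionedCache {
        private final Map<String, String>[] nodes;

        @SuppressWarnings("unchecked")
        public PartitionedCache(int nodeCount) {
            nodes = new Map[nodeCount];
            for (int i = 0; i < nodeCount; i++) {
                nodes[i] = new HashMap<>();
            }
        }

        // Deterministically pick the owning node for a key.
        // Math.floorMod avoids negative indexes from negative hash codes.
        private int ownerOf(String key) {
            return Math.floorMod(key.hashCode(), nodes.length);
        }

        public void put(String key, String value) {
            nodes[ownerOf(key)].put(key, value);
        }

        public String get(String key) {
            return nodes[ownerOf(key)].get(key);
        }

        public static void main(String[] args) {
            PartitionedCache cache = new PartitionedCache(10);
            for (int i = 0; i < 1000; i++) {
                cache.put("key-" + i, "value-" + i);
            }
            System.out.println(cache.get("key-42")); // prints "value-42"
        }
    }
    ```

    With 10 nodes, roughly a tenth of the entries (and the traffic for them) lands on each one, which is the "load spread over 10 machines" effect the post describes.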
  34. The most common issue is that, by and large, RDBMS systems are designed to run on a single host. To get really good performance out of them directly, people tend to either a) stick them on huge boxes, b) partition their data so they're using several separate physical databases, or c) replicate the databases in some manner.

    ... or d) Buy infiniband adapters and a switch, and replace the big box with a few blade servers. It's supported in both Oracle and DB2.

    And you get the additional benefit of improved availability, and this means you can spend LESS on your components. You don't need hot-swappable cpus, etc.

    But most DBAs still don't know about these things, and that puts a real damper on it. But it's all here.
  35. It's not just that most DBAs don't know about these things. Factor in the cost of Infiniband and then factor in the cost of licensing Oracle and/or DB2 to do this. Then get your network people to buy into a very custom network with servers colocated physically very close to one another.

    By comparison, you can use a distributed cache and literally buy a bunch of cheap linux boxes and let your network guys network them normally.

    The minute you throw in special hardware requirements then you can do almost anything you want. I know mainframes that basically let you wire buses directly to each other.

    But this will never be mass market. The interesting solutions (to me) are those that get great performance and scalability on commodity hardware.
  36. The interesting solutions (to me) are those that get great performance and scalability on commodity hardware.

    Infiniband is cheap. Few people use it, but most of the people that use it buy hundreds or thousands of nodes. NSA buys thousands of opteron/infiniband blades. An investment bank which was private for a very long time and finally went public buys thousands of the same.

    Plus, if you get infiniband adapters you can stop buying fibre channel adapters.
  37. I think you're missing the obvious point here. Joe blow can buy 10 linux blades and hook 'em up via Ethernet using strictly commodity stuff. And end up with a distributed cache that blows the doors off of standard RDBMS setups.

    You could do the same with an Infiniband setup. You'd have adapters that are far from commodity. The hookups wouldn't be ethernet, so it would be a highly specialized portion of your company's network ops team. Given that it's so rare compared to Ethernet the TCO is going to be _way_ higher.

    I'm not saying you can't create such a solution. Clearly you can, and in my original post I was careful not to say it was impossible. But it's far from the norm. Why? For the simple reason that for many applications you can get the same performance and scalability effects much, much cheaper using standard stuff that everybody understands.

    I'm sure the NSA uses things like Infiniband, and I know banking apps that can benefit as well. But by and large those are very specialized apps that need every little erg of power they can get out of their hardware. But below that extreme there are many people who 1) have serious scalability problems with their database, and 2) could cheaply use a distributed cache on a bunch of cheap, ethernet-networked machines and solve their problems.
  38. Specialized hardware

    It's right here, and both Tangosol and TerraCotta are partners.

    http://www.azulsystems.com

    --bob (working @azul, TAB member of TC)
  39. I'm sure the NSA uses things like Infiniband, and I know banking apps that can benefit as well. But by and large those are very specialized apps that need every little erg of power they can get out of their hardware. But below that extreme there are many people who 1) have serious scalability problems with their database, and 2) could cheaply use a distributed cache on a bunch of cheap, ethernet-networked machines and solve their problems.

    I know. I am not talking about those easy apps.
  40. there are many people who 1) have serious scalability problems with their database, and 2) could cheaply use a distributed cache on a bunch of cheap, ethernet-networked machines and solve their problems.

    I know. I am not talking about those easy apps.

    Which easy apps were you talking about then? ;-)

    What NSA apps do they waste my tax dollars on that actually require infiniband? From what I've seen, 90% of government hardware and software purchases are complete waste. I was at a certain government agency that was sending tens of millions of dollars to Oracle a year in order to license Apache software.

    Since I have some experience with infiniband, I can tell you that not a lot of Java apps even benefit from it. The generic TCP/IP bindings drop the throughput of 10Gbps infiniband down into the 2Gbps range, making it not much better than a $20 GigE card, but at 100x the loaded per-port cost. The latency goes up significantly, too.

    Peace,

    Cameron Purdy
    Tangosol Coherence: With Cost-Optimized Parallel Query for Java Data Grids
  41. Which easy apps were you talking about then? ;-)

    They are not easy, they are hard, and when you encounter them, you'll know ;-)
    What NSA apps do they waste my tax dollars on that actually require infiniband?

    They don't use it for OLTP - only for computations. Supercomputers have kept infiniband alive so far, after the companies that initially created it tried to kill it for lack of interest in clusters. But where else can you get RDMA so cheap, with 30Gbps bandwidth?
    Since I have some experience with infiniband, I can tell you that not a lot of Java apps even benefit from it.

    Definitely, I mean, imagine some guy serializing a Java object over an IB link :)

    But IT can benefit from being able to deploy databases on a cluster rather than an SMP. You can probably look up the math.
    The generic TCP/IP bindings drop the throughput of 10Gbps infiniband down into the 2Gbps range, making it not much better than a $20 GigE card, but at 100x the loaded per-port cost. The latency goes up significantly, too.

    Four letters: R D M A
  42. There's a lot of discussion in this thread about the practical aspects of clustered caches vs clustered databases and the costs of the hardware etc etc required for them...

    But in the long haul, are caches and traditional RDBMSes both necessary?

    It seems to me that an RDBMS could "absorb" the cache functionality if it became much more "object aware" and more easily distributable.

    That being said, a distributed cache could absorb some of the functionality of the RDBMS and potentially make it unnecessary.

    It just seems like having both is overkill. The data ends up existing in too many places, resulting in too much work being done to keep them all in sync. In addition, having so many layers makes the application more complex.

    Why can't we have a single transactional persistence mechanism that runs in-process, scales linearly by adding more nodes, and is accessed through a Hibernate-like API?
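    To make the kind of API the poster is asking for concrete, here is a rough sketch of a Hibernate-style unit-of-work facade over an in-process store. Every name here (GridSession, etc.) is invented for illustration, and a plain map stands in for the cluster-backed, linearly scalable storage being wished for:

    ```java
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical facade: writes are buffered in a unit of work and only
    // become visible in the backing store on commit, loosely mimicking a
    // Hibernate Session. The HashMap stands in for a distributed store.
    public class GridSession {
        private final Map<Object, Object> store = new HashMap<>();
        private final Map<Object, Object> pending = new HashMap<>();

        public void save(Object id, Object entity) {
            pending.put(id, entity); // buffered until commit
        }

        public Object load(Object id) {
            // Reads see this session's own uncommitted writes first.
            Object o = pending.get(id);
            return (o != null) ? o : store.get(id);
        }

        public void commit() {
            store.putAll(pending);
            pending.clear();
        }

        public void rollback() {
            pending.clear();
        }

        public static void main(String[] args) {
            GridSession session = new GridSession();
            session.save(42L, "Order#42");
            session.commit();
            System.out.println(session.load(42L)); // prints "Order#42"
        }
    }
    ```

    A real implementation would of course need the hard parts the thread is arguing about - distribution, transactional isolation across nodes, and durability - but the surface area of the API could plausibly stay this small.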
  43. It seems to me that an RDBMS could "absorb" the cache functionality if it became much more "object aware" and more easily distributable.

    that's probably why Oracle bought TimesTen

    http://www.oracle.com/timesten/
  44. It seems to me that an RDBMS could "absorb" the cache functionality if it became much more "object aware" and more easily distributable.

    that's probably why Oracle bought TimesTen

    Except that it's not "object aware" and it's not distributed. ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Applications with 100% Availability since 2003
  45. But in the long haul, are caches and traditional RDBMSes both necessary? It seems to me that an RDBMS could "absorb" the cache functionality if it became much more "object aware" and more easily distributable.

    You can manage objects as BLOBs in databases today or use ADTs (awkward data types). But that doesn't mean there should be one solution for all data management concerns. We live in an age where organizations are being drowned in a deluge of information. At times you want to persist this information, and at other times you just want to analyze fast-moving data or act on it instantaneously. There are lots of cases where you don't want a transactional (ACID) system - for instance, when dealing with market data in financial trading systems, stale data is worse than no data.
    Hey, the RDBMS vendor strategy has been "one size fits all" - well, it doesn't.
    Jim Gray's article nailed it for me: http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=293

    We think of caching not just as a layer on top of your RDB to make it faster, but as a first class architectural component for your middle tier for managing and distributing OPERATIONAL data very efficiently. It is about data arriving from any number of sources - increasingly these sources will be streaming in nature.

    - Jags Ramnarayan
    GemStone Systems
    GemFire - The Enterprise Data Fabric
    ( http://www.gemstone.com )
  46. We think of caching not just as a layer on top of your RDB to make it faster, but as a first class architectural component for your middle tier for managing and distributing OPERATIONAL data very efficiently. It is about data arriving from any number of sources ...

    .. all that, and I'd add that it's about having the ability to do it at in-memory speeds ("zero-latency") without trading off the qualities of availability and reliability that make the database a very safe choice.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Transactional Caching
  47. .. all that, and I'd add that it's about having the ability to do it at in-memory speeds ("zero-latency") without trading off the qualities of availability and reliability that make the database a very safe choice.

    If your cache is distributed, then you aren't guaranteed in-memory speeds, because the object might be in memory on another system.