Blog entry: Clustering - EJBs vs JMS vs POJOs

  1. Blog entry: Clustering - EJBs vs JMS vs POJOs (23 messages)

    Pragmatic Problem Solving, a blog from a Terracotta employee, has posted "Clustering - EJBs vs JMS vs POJOs," a discussion of the pain points in clustering EJB, JMS, and plain old Java objects. Of course, Terracotta's product makes a point of trying to solve these pain points, but that doesn't invalidate the blog's information. Speaking as an employee of a services company (i.e., not Terracotta), he describes the initial problem:
    When I started work, it was very obvious to me that there needed to be an effort to redesign the existing architecture of the system. It was a typical consumer facing application serving web and mobile clients. The entire application was written in Java (JSE and JEE) in a typical 3-tier architecture topology. The application was deployed on 2 application server nodes. All transient and persistent data was pushed into the Database to address high availability. As it turned out, as the user-base grew, the database was the major bottleneck towards RAS and performance. We had a couple of options: pay a major database vendor an astronomical sum to buy their clustering solution, or redesign the architecture to be a high-performing RAS system. Choosing the first option was tempting, but it just meant we were pushing the real shortcomings of our architecture "under the carpet", over and above having to spend an astronomical sum. We chose the latter.
    Speaking of clustering EJBs, he points out some real problems with JNDI and the information around it:
    Some of the known shortcomings of EJBs are that they are too heavyweight and make you rely on a heavyweight container. EJB3 has somewhat changed that. In terms of clustering, we found the major pain point to be the JNDI discovery. If you implement an independent JNDI tree for each node in the cluster, failover is the developer's responsibility. This is because the remote proxies retrieved through a JNDI call are pinned to the local server only. If a method call to an EJB fails, the developer has to write extra code to connect to a dispatcher service, obtain another active server address, and make another call. This means extra latency. If you have a central JNDI tree, retrieving a reference to an EJB is a two-step process: first look up a home object (Interoperable Object Reference or IOR) from a name server and second pick the first server location from the several pointed to by the IOR to get the home and the remote. If the first call fails, the stub needs to fetch another home or remote from the next server in the IOR and so on. This means that the name servers become a single point of failure. Also, adding another name server means every node in the cluster needs to bind its objects to this new server.
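    [Editor's note: the failover code the author says developers had to write, catching a failed call and retrying against another server, might look roughly like the sketch below. It is an illustration only. RemoteCall, callWithFailover, and the node-a/node-b addresses are hypothetical names made up for this example, and no real JNDI lookup or EJB stub is involved.]

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: every name here is illustrative, not a real JNDI or EJB API. */
final class FailoverSketch {

    /** Stands in for a remote call that may fail when the target node is down. */
    interface RemoteCall<T> {
        T invoke(String serverAddress) throws Exception;
    }

    /**
     * Tries each known server in order and returns the first successful result.
     * Every failed attempt costs a round trip, which is the extra latency
     * the blog post complains about.
     */
    static <T> T callWithFailover(List<String> servers, RemoteCall<T> call) {
        Exception last = null;
        for (String server : servers) {
            try {
                return call.invoke(server);
            } catch (Exception e) {
                last = e; // remember the failure and move on to the next node
            }
        }
        throw new IllegalStateException("all " + servers.size() + " servers failed", last);
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("node-a:1099", "node-b:1099");
        String result = callWithFailover(servers, addr -> {
            if (addr.startsWith("node-a")) {
                throw new java.io.IOException("node-a is down"); // simulated outage
            }
            return "handled by " + addr;
        });
        System.out.println(result); // handled by node-b:1099
    }
}
```

    The retry loop lives in application code here, which is the maintenance burden the post describes; application servers that offer transparent failover move the same loop into the stub.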
    [Editor's note: would JNDI federation help with this? Isn't part of the role of JNDI to return valid references to remote objects, no matter where they are?] Shifting to JMS:
    The application developer had to make sure that all of these blocks were coherent across the cluster. The developer had to define points at which changes to objects were shipped across to the other node in the cluster. Getting this right was a delicate balancing act, as shipping these changes entailed serializing the relative object graph on the local disk and shipping the entire object graph across the wire. If you do this too often, the performance of the entire system will suffer, while if you do this too seldom, the business will be affected. This turned out to be quite a maintenance overhead. Dev cycles were taking longer and longer as we were spending a lot of time maintaining the JMS layer. Adding any feature meant we had to make sure that the cluster coherency wasn't broken. ... The irony is that every block in our technology stack was written in pure JAVA as POJOs, and yet there was a significant overhead to distribute and maintain the state of these POJOs across multiple JVMs. One can argue that we could have taken the route of clustering our database layer. I will still argue that doing so would have pushed the problems in our architecture under the DB abstraction, which would have surfaced later as our usage grew.
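    [Editor's note: to make the serialization overhead concrete, here is a small sketch using only standard Java serialization. It assumes a hypothetical Account class; nothing here is taken from the author's system. It shows two things the quote alludes to: a one-element change still re-serializes the whole graph, and the object that comes out the other side is a copy, not the original.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SerializationCostSketch {

    /** Hypothetical domain object; any Serializable graph would do. */
    static final class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final String owner;
        final List<String> history = new ArrayList<>();
        Account(String owner) { this.owner = owner; }
    }

    /** Serializes the whole graph reachable from the given object. */
    static byte[] serialize(Object graph) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(graph);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds a fresh copy of the graph from its serialized form. */
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account account = new Account("alice");
        for (int i = 0; i < 1000; i++) {
            account.history.add("txn-" + i);
        }
        int before = serialize(account).length;
        account.history.add("txn-1000");       // one small change...
        int after = serialize(account).length;  // ...still ships the entire graph

        Account copy = (Account) deserialize(serialize(account));
        System.out.println("bytes before change: " + before);
        System.out.println("bytes after change:  " + after);
        System.out.println("same object after round trip? " + (copy == account)); // false
    }
}
```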
    Of course, as an employee of Terracotta, he's familiar with what Terracotta does to address this problem, by providing clustering at the JVM level:
    Terracotta allows you to write your apps as plain POJOs, and declaratively distributes these POJOs across the cluster. All you have to do is pick and choose what needs to be shared in your technology stack and make such declarations in the Terracotta XML configuration file. You just have to declare the top level object (e.g. a HashMap instance), and Terracotta figures out at runtime the entire object graph held within the top level shared object. Terracotta maintains the cluster-wide object identity at runtime. This means obj1 == obj2 will not break across the cluster. All you need to do in the app is get() and mutate, without an explicit put(). Terracotta guarantees that the cluster state is always coherent and lets you spend your time writing business logic. ... All the pain points I mentioned above are turned into gain points by Terracotta: no serialization, cluster-wide object identity, fine grained sharing of data and high performance.
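    [Editor's note: the "get() and mutate, without an explicit put()" point is easier to see with a single-JVM analogy. The sketch below contrasts a plain HashMap, which hands out references, with a hypothetical CopyingStore that hands out copies the way serialization-based caches typically do. It is an analogy for the programming model only and assumes nothing about how Terracotta or any other product is implemented.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GetAndMutateSketch {

    /** Hypothetical store that hands out copies instead of references. */
    static final class CopyingStore {
        private final Map<String, List<String>> data = new HashMap<>();

        void put(String key, List<String> value) {
            data.put(key, new ArrayList<>(value)); // keep a private copy
        }

        List<String> get(String key) {
            List<String> value = data.get(key);
            return value == null ? null : new ArrayList<>(value); // return a copy
        }
    }

    public static void main(String[] args) {
        // By-reference sharing: mutate what get() returns, no put() needed.
        Map<String, List<String>> shared = new HashMap<>();
        shared.put("cart", new ArrayList<>());
        shared.get("cart").add("book");
        System.out.println(shared.get("cart")); // [book]

        // Copy-based sharing: the mutation is invisible until it is put() back.
        CopyingStore store = new CopyingStore();
        store.put("cart", new ArrayList<>());
        List<String> cart = store.get("cart");
        cart.add("book");
        System.out.println(store.get("cart")); // []
        store.put("cart", cart);
        System.out.println(store.get("cart")); // [book]
    }
}
```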
    JBoss addresses some of the clustering problems addressed in this entry through the use of JGroups, which manages the JNDI problems brought up early in the blog, and it's arguable that modifying the actual architecture to support a clustering JVM is less desirable than using a transparent layer to locate clustered instances. What do you think of the two approaches?

    Threaded Messages (23)

  2. [quote]Terracotta is the only technology available today that lets you distribute POJOs as-is.[/quote] Is this claim accurate? I wonder if Cameron or someone could briefly explain the differences between Terracotta and something like Coherence.
  3. [quote]Terracotta is the only technology available today that lets you distribute POJOs as-is.[/quote]

    Is this claim accurate? I wonder if Cameron or someone could briefly explain the differences between Terracotta and something like Coherence.
    Yes, this claim is accurate. I think what the blog means is that if you need to augment your class definition with new interfaces or edit object relationships so that those relationships survive a marshalling / unmarshalling set of events, and especially if you have to call out to proprietary APIs that require more libraries before your application can continue to compile / run, then your classes are no longer POJOs "as-is". Since Terracotta replicates objects at the heap level, you make performance-oriented optimizations as opposed to changing the model for other reasons. That's all he is saying (at least in this sentence).
  4. Terracotta is the only technology available today that lets you distribute POJOs as-is.


    Yes, this claim is accurate. I think what the blog means is that if you need to augment your class definition with new interfaces or edit object relationships so that those relationships survive a marshalling / unmarshalling set of events, and especially if you have to call out to proprietary APIs that require more libraries before your application can continue to compile / run, then your classes are no longer POJOs "as-is".
    According to your definition, Ari, JBoss Cache's PojoCache module should be the first in the market that has this capability. :-) We don't require an extra interface for your POJO either. You can use either annotations or external XML to mark your POJO for byte code weaving (as you do). And although it takes a different approach, PojoCache also does field-level replication and preserves object relationships across the cluster. We have used PojoCache in JBoss AS HTTP session clustering (field-level granularity replication) starting in 4.0.4. - Ben Wang JBoss Cache
  5. How much?

    This is the 2nd time you promote a Terracotta blog ad. How much does this kind of ad cost? Can you provide me with a quote, please? Thanks, Andreas
  6. Re: How much?

    This is the 2nd time you promote a Terracotta blog ad. How much does this kind of ad cost? Can you provide me with a quote, please?

    Thanks,
    Andreas
    :) The question is, do you _want_ an ad like this one?
  7. Re: How much?

    Actually no. But I'm sure they pay for it! :-) -- Andreas
  8. Re: How much?

    Actually no. But I'm sure they pay for it! :-)
    -- Andreas
    Ad, hmph! Terracotta had nothing to do with this, apart from the posting in the first place. I thought the content was interesting, and the fact that it was a Terracotta employee was secondary.
  9. I think you will find that most mature application server platforms offer a JNDI solution that, underneath, supports transparent failover at the object reference level. They also have various mechanisms to reduce the overhead in lookups and bindings.

    An important point to consider in any such analysis is whether application modules deployed today are truly distributed & partitioned across hardware and processes. More than likely you will see the same enterprise application deployed as a whole (web app, ejb, rar) on each node within a cluster with all components collocated, making it extremely unlikely that failover would in fact be required.

    I hope we would all agree that it is much easier to write a system/application/component today that explicitly supports clustering of relevant data via a contractual (plugin/extension) interface that could be layered on a distributed or local cache solution than to attempt to understand the various execution and state points that require replication across processes for a large enterprise application.

    I think the technology being promoted is very interesting but it would be much better if the company concentrated on providing add-ons for standard integration points such as HTTP session management and Spring, and less on the underlying technology itself. The technology should be viewed more as an enabler, and competitive edge, for rapid delivery of integrations and not as the key selling point. If the company can continue to quickly deliver to market new and extended integrations with a high degree of scalability and reliability then they have proven the solution to be superior. In my opinion, the more they concentrate on the transparency part, the more they will run into architects with a large list of questions and concerns who are unwilling to take the risk in a production environment, especially when they are more than likely unsure of the quality and execution behaviors of their own existing applications.

    This reminds me of a posting I made a while back on AOP (see end postings) where I raised the issue of needing to understand the implementation (code: fields, execution flow,...) of a component rather than the contract (interface, lifecycle). http://weblogs.java.net/blog/kgh/archive/2006/03/aop_madness_and.html

    Kind regards,
    William Louth
    JXInsight Product Architect
    CTO, JInspired
    "Java EE tuning, testing, tracing, and monitoring with JXInsight"
    http://www.jinspired.com
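    [Editor's note: one possible reading of the "contractual (plugin/extension) interface" suggested above is sketched below. StateStore, LocalStateStore, and LoginCounter are hypothetical names invented for this example, not part of any product mentioned in the thread. The component depends only on the contract, so a distributed cache could in principle be substituted for the local map without touching the component.]

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class StateStoreSketch {

    /** Hypothetical contract: the application depends on this, not on a clustering product. */
    interface StateStore<K, V> {
        void save(K key, V value);
        Optional<V> load(K key);
    }

    /** Local in-memory implementation; a distributed cache could sit behind the same interface. */
    static final class LocalStateStore<K, V> implements StateStore<K, V> {
        private final Map<K, V> entries = new ConcurrentHashMap<>();

        @Override
        public void save(K key, V value) {
            entries.put(key, value);
        }

        @Override
        public Optional<V> load(K key) {
            return Optional.ofNullable(entries.get(key));
        }
    }

    /** Application component that only knows the contract. */
    static final class LoginCounter {
        private final StateStore<String, Integer> store;

        LoginCounter(StateStore<String, Integer> store) {
            this.store = store;
        }

        /** Load-then-save is not atomic; that is acceptable for a sketch. */
        int recordLogin(String user) {
            int next = store.load(user).orElse(0) + 1;
            store.save(user, next); // explicit replication point, chosen by the developer
            return next;
        }
    }

    public static void main(String[] args) {
        LoginCounter counter = new LoginCounter(new LocalStateStore<>());
        counter.recordLogin("alice");
        System.out.println(counter.recordLogin("alice")); // 2
    }
}
```

    The trade-off under debate in the thread is visible in recordLogin: the explicit save() call is a point the developer must remember to write, which is what the transparent approach aims to remove.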
  10. If the company can continue to quickly deliver to market new and extended integrations with a high degree of scalability and reliability then they have proven the solution to be superior.
    The question here is whether the market actually needs any more integrations. I've seen a lot of examples where people don't want anything new at all. The same happens with AOP. However, this does not mean that there are no people who are able to get it and take advantage of the new technology.
  11. I think the technology being promoted is very interesting but it would be much better if the company concentrated on providing add-ons for standard integration points such as HTTP session management and Spring, and less on the underlying technology itself. The technology should be viewed more as an enabler, and competitive edge, for rapid delivery of integrations and not as the key selling point. If the company can continue to quickly deliver to market new and extended integrations with a high degree of scalability and reliability then they have proven the solution to be superior. In my opinion, the more they concentrate on the transparency part, the more they will run into architects with a large list of questions and concerns who are unwilling to take the risk in a production environment, especially when they are more than likely unsure of the quality and execution behaviors of their own existing applications.
    Completely agree. Transparency, drop-in, and such claims are hard to quantify and contextualize. Our goal is to solve problems for developers, and not to claim that developers are no longer needed. It is absolutely vital that we work within the community and support the components of the application stack that people wish to share across JVMs. This doesn't mean we aren't unique and that POJO clustering is only for software vendors--quite the opposite. What it does mean is that most developers need to know that the popular frameworks and containers are currently integrated with and constantly working with Terracotta to deliver more and more drop-in / out of the box value for sharing application state w/ less effort. Simultaneously, I am not going to stop anyone who wants to embrace a Terracotta-specific model. Thanks for the feedback! Very insightful
  12. I don't see any explanation of the 'pain-points' of using JMS. There is only some mention of the pain of maintaining a custom 'JMS layer'. Was this 'JMS layer' necessary? Was it well designed? I can't count the number of times I've seen people add layers to a design that add no value and often become a maintenance nightmare. I've had to surgically remove a few. This post only speaks to the pain points of using their JMS layer. This is a typical story of how things go when an architecture is cobbled together with no well-thought-out design and new technologies are randomly applied to solve problems instead of using brains.
  13. Choosing the first option was tempting, but it just meant we were pushing the real shortcomings of our architecture "under the carpet", over and above having to spend an astronomical sum.
    I don't see how selecting the new technology is any less "brushing under the carpet", or any less of an "astronomical sum", than the DB clustering solution is. I'm not suggesting Terracotta costs more, or even the same, as DB clustering, but it seems pretty much like a black magic layer to rely upon, just like the DB clustering system is. They're both transparent, both outside of the application, and both proprietary and suffer from vendor lock-in. I'm sure it's a fine system, worth considering, etc. But to handwave away vendor-supplied clustering on the DB in favor of vendor-supplied clustering at the JVM when they both suffer from "under the carpet"-ness and "astronomical sum"-ness doesn't seem particularly fair to me.
  14. Choosing the first option was tempting, but it just meant we were pushing the real shortcomings of our architecture "under the carpet", over and above having to spend an astronomical sum.
    I don't see the problem of pushing both options though. Certainly, at JBoss, we do the same with JBoss Cache. I.e., we use JBoss Cache in our clustering stack (e.g., http session, ha-jndi, and ejb3 sfsb and entity) but also we push it as a standalone. After all, clustering includes client side failover and loadbalance as well, not just state replication. :-) -Ben Wang JBoss Cache
  15. Choosing the first option was tempting, but it just meant we were pushing the real shortcomings of our architecture "under the carpet", over and above having to spend an astronomical sum.


    I don't see the problem of pushing both options though. Certainly, at JBoss, we do the same with JBoss Cache. I.e., we use JBoss Cache in our clustering stack (e.g., http session, ha-jndi, and ejb3 sfsb and entity) but also we push it as a standalone. After all, clustering includes client side failover and loadbalance as well, not just state replication. :-)

    -Ben Wang
    JBoss Cache
    Yes Ben. Clustering is a lot more than state replication. But loadbalancing and client side failover sounds like a part of highly available state. Please enlighten me as to how our ability to fire wait() and notify() across VM's is not more than "state replication"? Let's not do this sort of thing, please. I am not saying the post Joe covered here on TSS is flawless and Terracotta is the only solution that works. I am just trying to provide what I think are facts. Facts are important. In your next response, you assert, for example, that I was inaccurate in that your product meets our definition but I just checked your documentation for the current and next releases of your product and there are still these "Trees" and import statements and putObject() calls that, even if inserted via AOP, fail to pass the definition. Now, I am sure you will respond to this trying to show how I am wrong, but I am trying to stick to facts so, if I am wrong, I apologize but my intention is to add value when interacting in the community. --Ari
  16. Yes Ben. Clustering is a lot more than state replication. But loadbalancing and client side failover sounds like a part of highly available state. Please enlighten me as to how our ability to fire wait() and notify() across VM's is not more than "state replication"?
    You already rely on byte code modification and synchronous network operations, so I have a hard time understanding why you would suggest that wait() and notify() are substantially different from any other operation. Are you suggesting that there is something special about these particular method calls, other than the fact that they are final+native? Let me ask it a different way, since there are at least two obvious ways to implement the functionality using byte code modification: If I call wait / notify via reflection, will it still work? Peace, Cameron Purdy Tangosol Coherence: Java Clustered Cache
  17. Are you suggesting that there is something special about these particular method calls, other than the fact that they are final+native? Let me ask it a different way, since there are at least two obvious ways to implement the functionality using byte code modification: If I call wait / notify via reflection, will it still work?

    Peace,

    Cameron Purdy
    Tangosol Coherence: Java Clustered Cache
    Cameron, I am not sure I understand your point. 1. There is something special about it. wait() and notify() can't just be instrumented as point cuts. You have to know which threads to notify, which objects are being waited on, etc. and you need to know about object identity mappings from one JVM to another. 2. It is very powerful to be able to use the native (not meaning OS-native but native to the language) constructs for signaling. 3. Yes, Terracotta can work with wait() and notify() called via reflection. In general, we work with reflective field sets. 4. Calling wait() and notify() via reflection when synchronization is required is probably a VERY BAD idea...can't think of why one would do it.
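    [Editor's note: for readers unsure what "wait / notify via reflection" means in practice, here is a single-JVM sketch in plain Java. The class and method names are made up for this example. It shows only that a reflective notifyAll() behaves like a direct call when the caller owns the monitor; it says nothing about how Terracotta, Coherence, or any other product handles the cross-JVM case.]

```java
import java.lang.reflect.Method;

final class ReflectiveNotifySketch {
    private static final Object LOCK = new Object();
    private static boolean ready = false; // guarded by LOCK

    /** Returns true if the waiting thread was released by a reflective notifyAll(). */
    static boolean wakeWaiterReflectively() {
        synchronized (LOCK) {
            ready = false;
        }
        Thread waiter = new Thread(() -> {
            synchronized (LOCK) {
                while (!ready) { // loop guards against spurious wakeups
                    try {
                        LOCK.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
        });
        waiter.start();
        try {
            Method notifyAll = Object.class.getMethod("notifyAll");
            synchronized (LOCK) {       // the caller must own the monitor
                ready = true;
                notifyAll.invoke(LOCK); // same effect as LOCK.notifyAll()
            }
            waiter.join(5_000);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(wakeWaiterReflectively()); // true
    }
}
```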
  18. Choosing the first option was tempting, but it just meant we were pushing the real shortcomings of our architecture "under the carpet", over and above having to spend an astronomical sum.


    I don't see the problem of pushing both options though. Certainly, at JBoss, we do the same with JBoss Cache. I.e., we use JBoss Cache in our clustering stack (e.g., http session, ha-jndi, and ejb3 sfsb and entity) but also we push it as a standalone. After all, clustering includes client side failover and loadbalance as well, not just state replication. :-)

    -Ben Wang
    JBoss Cache
    You are right, Ben. There is nothing wrong in pushing both options, namely clustering your database as well. The point I was trying to make was that if the need for scaling out the DB layer is a side effect/consequence of pushing too much transient state into the DB, then you need to look at your architecture. If the volume of your business persistent data (system of record style information) drives this decision, then that is an entirely different discussion. Regards, Kunal Bhasin.
  19. Heap Trick

    Seems that the Terracotta Server which mediates calls between cluster nodes is a single point of failure. Physician cluster thyself?
  20. Re: Heap Trick

    Seems that the Terracotta Server which mediates calls between cluster nodes is a single point of failure. Physician cluster thyself?
    Andrew, You are correct sir. It could be a SPoF, and is, thus, clustered. Thanks for pointing out that vital piece of information for everyone. --Ari
  21. None of the statements regarding EJB, JNDI and JMS in this article were true even when I worked on WebLogic at BEA quite some time ago. Frankly, I find this trend of making untrue assumptions and then using them to support a discussion of how a vendor's "solution" solves a non-existent problem a bit disturbing. Regards, Slava Imeshev
  22. None of the statements regarding EJB, JNDI and JMS in this article were true even when I worked on WebLogic at BEA quite some time ago.

    Frankly, I find this trend of making untrue assumptions and then using them to support a discussion of how a vendor's "solution" solves a non-existent problem a bit disturbing.

    Regards,

    Slava Imeshev
    Slava, You are correct. Weblogic is far more powerful than most developers realize. And, in all the time I have spent with BEA engineers, I have never found them unable to find a way to build an application using Weblogic. Very smart people. Still, we should never assume that there is no room for improvement. --Ari

  23. Slava,

    You are correct. Weblogic is far more powerful than most developers realize. And, in all the time I have spent with BEA engineers, I have never found them unable to find a way to build an application using Weblogic.

    --Ari
    I think you're misreading what I have said. I have said that those "problems" listed in the original posting don't exist now. They were solved by products such as WebLogic long ago. Presenting this as features only available to smart (I agree here :) WebLogic engineers doesn't look good to me...

    Slava


  24. Slava,

    You are correct. Weblogic is far more powerful than most developers realize. And, in all the time I have spent with BEA engineers, I have never found them unable to find a way to build an application using Weblogic.

    --Ari


    I think you're misreading what I have said. I have said that those "problems" listed in the original posting don't exist now. They were solved by products such as WebLogic long ago. Presenting this as features only available to smart (I agree here :) WebLogic engineers doesn't look good to me...

    Slava
    Slava, The intention of my post wasn't to attack any vendor nor was it to make *untrue* assumptions. The discussion is my personal view, based on my experiences. Note that these experiences were from early 2001 on, and I realize that a lot of technologies have improved since then. But some of the fundamental overheads in clustering like serialization still exist in JMS, EJBs etc. I just wanted to share my experience and show how I feel Terracotta solves some of the problems I experienced and alleviates certain overheads. Regards, Kunal.