Terracotta's Scalability Story


News: Terracotta's Scalability Story

  1. Terracotta's Scalability Story (8 messages)

    Open Terracotta is a product that delivers JVM-level clustering as a runtime infrastructure service. It is open source and available under a Mozilla-based license. Open Terracotta provides Java applications with a runtime environment that allows developers to trust critical parts of the heap as reliable and capable of scaling through shared access across multiple machines. The technology hinges on a clustering server that keeps a shared view of objects across JVMs. The key question around the scalability of a Terracotta-based application can only be answered by analyzing the architecture and the alternatives. Terracotta uses bytecode instrumentation to adapt the target application at class load time. In this class-loading phase it extends the application to ensure that the semantics of the Java Language Specification (JLS) are correctly maintained across the cluster, including object references, thread coordination, and garbage collection. Also worth mentioning is that Terracotta does not use Java serialization, which means that any POJO can be shared across the cluster. It also means that Terracotta does not send the whole object graph of the POJO state to all nodes; instead it breaks the graph down into pure field-level data and sends only the actual "delta" over the wire - the actual changes, i.e. the data that is stale on the other node(s). This article goes into more detail.
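To make the "no serialization, field-level deltas" point concrete, here is a minimal sketch of the kind of plain POJO the article describes. The class name and fields are hypothetical, and the comments describe what Terracotta's instrumentation would do to such a class according to the text above; nothing here is Terracotta API.

```java
// Hypothetical sketch: a plain POJO of the kind Terracotta can share.
// It does NOT implement Serializable. Under Terracotta, a class like this
// would be instrumented at class load time, and only the fields actually
// written would travel over the wire as a delta.
public class SharedCounter {
    private long hits;        // only this field's new value ships when it changes
    private String lastUser;  // shipped only when reassigned

    public synchronized void record(String user) {
        hits++;               // instrumented field write
        lastUser = user;      // instrumented field write
    }

    public synchronized long getHits() {
        return hits;
    }

    public synchronized String getLastUser() {
        return lastUser;
    }
}
```

Note that the class is just ordinary Java with ordinary synchronization; the clustering behavior would come entirely from the load-time instrumentation, not from any API calls in the code.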
  2. break the failover mechanism by restarting all servers?
  3. I don't know much about JBoss Cache. But I've talked to a few admins of large clusters, and here is what I've heard:

     * They tend to take down parts of a cluster - say 5 or 10 nodes at a time - to do maintenance and upgrades.
     * With systems that rely on replicating data to a spare node, one can accidentally wipe out parts of the state by taking out both the active and the passive of a buddy pair.
     * Provisioning is very difficult when a client and a server are squished together: what size machines, what kinds of networks, and how many? So they end up having to buy higher-end machines and networks for all nodes.
     * They have complained that peered and buddy systems are at risk of cascading failures.

     All that said, I have also heard people who feel strongly in the other direction. I've heard:

     * The buddy system is "simpler" because I don't have to start a separate server.
     * Peer-to-peer is new and client-server is old.
     * At some theoretical point the server could become the bottleneck.

     Etc... I guess the answer is do both and you'll satisfy the most people.
  4. Thanks Steve, for that elaborate response. I have a general question about Terracotta and failover in general with the HTTPSession setAttribute() method. One thing I like about Terracotta is the ability to resynchronize state from the server only when the backup servers need to; hence if the primary never fails, there is no need to replicate. Plus, only the last state is replicated instead of replicating on every call to setAttribute(). Using AOP, would it not be possible to update the server copy on every setAttribute() on objects that need failover, and replicate that state to the backups on every getAttribute()? This method might not be as good as Terracotta's approach but would still be great to have for simpler needs. Sameer
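The "replicate on every setAttribute()" idea Sameer raises can be sketched with a toy session wrapper. This is purely illustrative, assuming a plain Map stands in for HttpSession and for a single backup node; a real implementation would wrap javax.servlet.http.HttpSession or use an AOP interceptor, and `replicate` would be a network call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of replicating session state eagerly on every write,
// rather than once per request. Maps stand in for the real session and for
// a remote backup node; all names here are made up for illustration.
public class ReplicatingSession {
    private final Map<String, Object> local = new HashMap<>();
    private final Map<String, Object> backup = new HashMap<>(); // stand-in for a backup server

    public void setAttribute(String name, Object value) {
        local.put(name, value);
        replicate(name, value); // pushed to the backup on every write
    }

    public Object getAttribute(String name) {
        return local.get(name);
    }

    private void replicate(String name, Object value) {
        // In practice this would be a network call to the backup node.
        backup.put(name, value);
    }

    // Visible for inspection in this sketch only.
    Object backupCopy(String name) {
        return backup.get(name);
    }
}
```

The trade-off versus Terracotta's approach, as described in the thread, is that this replicates eagerly on every write even when the primary never fails, whereas Terracotta only resynchronizes state when a backup actually needs it.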
  5. ...an HTTPSession clustering mechanism which does not require objects to implement the Serializable interface. Should have looked before posting. Sameer
  6. Not sure how other buddy replication mechanisms work, but in JBoss Cache, sysadmins are able to provide "hints" on how buddies are selected (details in the JBC user guide). This allows them to identify which nodes are buddies to which, so with this information, taking down parts of a cluster won't end up wiping out parts of the state. Cascading failures are also something JBoss Cache addresses, by using lazy gravitation of data: just because a node drops out, you won't end up with a storm of network traffic migrating data - instead, this only happens when data is accessed/requested. The biggest advantage (IMO) of a p2p structure over a client-server one is the self-healing nature of the cluster. Nothing about new and old, just a very tangible benefit that clustering "just works" as nodes are added/removed.
  7. Benchmarks

    I saw the Terracotta demo two weeks ago. Very impressive. It would be interesting to see a comparison with a simple application - a sort of OO7 benchmark for ODBMSs. Ciao, Luca Garulli RomaFramework.org AssetData.it
  8. I think the Terracotta guys are driving a new generation of computing. They are not reinventing the wheel, of course; Terracotta's transparent synchronization approach reminds me of the ObjectStore days, when a read of or change to a persistent object in the client was notified to the ObjectStore server. The technique was different, of course - ObjectStore used memory page faults to detect field changes (using C++) - and the Terracotta way, bytecode instrumentation (enhancement, weaving, dynamic modification - too many terms), is cleaner; we live in the Java era. JDO is similar in the persistent ORM world (without the network). My product, JNIEasy, uses a similar approach in the Java-native integration world: the native memory is like Terracotta's server - any Java field change is automatically propagated to native memory, and any Java field read really reads from native memory; if the native memory is modified, much like by "a native client", the Java memory is updated when the field is requested, so a Java POJO behaves very much the same as the attached C++ object. I think bytecode instrumentation has been largely ignored and unjustly criticized (for instance, JDO is technically superior to JPA thanks to bytecode enhancement); if the behavior is well documented, it is a gift and a key enabler of a new "pervasive computing" era. Terracotta is preparing a new distributed JVM fork, surely... :) Anyway, I see a performance problem with Terracotta's fine-grained (per-field) notification when several fields are modified simultaneously - too many separate calls to the server, isn't it? Good work. Jose M. Arranz Innowhere S.S. S.L.
  9. Anyway, I see a performance problem with Terracotta's fine-grained (per-field) notification when several fields are modified simultaneously - too many separate calls to the server, isn't it?

    Good work

    Jose M. Arranz
    Innowhere S.S. S.L.
    Good question. Terracotta does not push each field one by one. It pushes batches of field-level changes on thread-memory boundaries (a term I just made up, but basically what it sounds like, in that we honor the memory model). --Ari
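Ari's batching answer can be illustrated with a toy sketch: field writes made while a lock is held are buffered and flushed as one transaction when the lock is released, instead of one server call per field. All names here are mine, not Terracotta API, and strings stand in for real field-level deltas.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of batching field-level deltas on lock
// boundaries: writes accumulate in a pending buffer and are shipped as a
// single transaction at monitor exit, honoring the Java memory model's
// visibility point rather than sending each field change individually.
public class DeltaBatcher {
    private final List<String> pending = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();

    public void recordWrite(String fieldDelta) {
        pending.add(fieldDelta); // buffered, not sent yet
    }

    public void onMonitorExit() {
        // One "network call" carries the whole batch of changes.
        flushed.add(new ArrayList<>(pending));
        pending.clear();
    }

    // In this sketch, each element is one batched transaction.
    public List<List<String>> transactions() {
        return flushed;
    }
}
```

So two field writes inside one synchronized block would produce a single flushed transaction containing two deltas, not two separate round trips to the server.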