Tangosol Announces the Immediate Availability of Coherence 3.1

  1. Tangosol, Inc. has announced the immediate availability of Coherence 3.1.

    Coherence is used for in-memory distributed data management, clustered caching, and data-intensive grid computing for mission critical enterprise Java technology applications.

    Coherence 3.1 Data Grid Features

    Applications built with Coherence can exhibit continuous availability, to the point of surviving datacenter failure without service interruption or degradation. The new data grid capabilities include:

    Coherence RealTime Desktop can provide immediate access at the desktop to the most up-to-date information, including the very latest transactions, even if that information consists of massive amounts of highly dynamic data.

    As part of a Coherence Data Grid 3.1 solution, the Coherence RTD module resides on the desktop and assures continuous availability of 100% up-to-date information, updated instantaneously with each transaction and completely transparently to the user. The information maintained on each desktop can be tailored to each individual user, providing access only to the data relevant to that user's scope of responsibility and automatically notifying the user of updates of interest.

    To achieve scalability into the tens of thousands of desktops, and to ensure optimum performance at every desktop, client connections are load-balanced across the entire Data Grid using Coherence*Extend, a facility for securely accessing a Data Grid using the standard Java Message Service (JMS).

    Coherence Data Grid Agents and Coherence Aggregate Functions open up the compute and data resources of the entire Data Grid to an application's custom processing requirements. Using these capabilities, an application can execute data-related processing in parallel across the entire Data Grid, achieving limitless linear scalability for both data management and application processing. Coherence automatically, dynamically and resiliently partitions data management responsibilities over the hundreds or thousands of servers in a Data Grid, and uses its knowledge of that data distribution to achieve 100% locality of processing across the Data Grid. That means that an application can process entire data sets in parallel without any data actually crossing the wire. Coherence supports custom Data Grid Agents, and these agents can be targeted to any specific subset of the data, to a query, or to the entire data domain. These new capabilities have already been used in such applications as real-time risk and distributed trading systems.
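
    As a rough sketch of what such an agent can look like in code, the class below uses the InvocableMap.EntryProcessor API that Coherence exposes for in-place processing. The cache contents (Double prices keyed by ticker symbol) and the pricing logic are illustrative assumptions, and the class and method names are given as recalled from the 3.1 API, so the product documentation should be treated as authoritative:

        import java.io.Serializable;
        import java.util.HashMap;
        import java.util.Map;
        import java.util.Set;

        import com.tangosol.util.InvocableMap;

        // Agent that re-prices each cached entry where it lives, so the data itself
        // never crosses the wire; only the agent and its small results do.
        public class RepriceAgent implements InvocableMap.EntryProcessor, Serializable {

            private final double factor;

            public RepriceAgent(double factor) {
                this.factor = factor;
            }

            public Object process(InvocableMap.Entry entry) {
                // The cached value is assumed to be a Double price keyed by ticker symbol.
                double repriced = ((Double) entry.getValue()).doubleValue() * factor;
                entry.setValue(new Double(repriced));   // written back on the owning member
                return new Double(repriced);
            }

            public Map processAll(Set entries) {
                Map results = new HashMap();
                for (Object o : entries) {
                    InvocableMap.Entry entry = (InvocableMap.Entry) o;
                    results.put(entry.getKey(), process(entry));
                }
                return results;
            }
        }

    Invoking the agent against a query or the whole cache is then a one-liner along the lines of cache.invokeAll(filter, new RepriceAgent(1.01)), with the cache obtained from CacheFactory.getCache(...).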

    The Data Grid Aggregate Functions capability in Coherence 3.1 takes this parallel processing concept to its natural conclusion for algorithms such as Monte Carlo simulations: Each server in the Data Grid will process in parallel the data that is available locally, and then the partial results from each server are rolled up into a final aggregate result, achieving the maximum theoretical throughput as defined by Amdahl's Law. Applications such as algorithmic trading and real-time risk can plug in their own custom aggregate functions to achieve massive throughput in grid environments.
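
    For illustration, summing a numeric attribute across the whole grid with one of the built-in aggregators might look like the sketch below. The cache name and the getDesk()/getNotional() accessors are assumptions, and the aggregator and filter class names are as recalled from the 3.1 API, so check them against the documentation:

        import com.tangosol.net.CacheFactory;
        import com.tangosol.net.NamedCache;
        import com.tangosol.util.aggregator.DoubleSum;
        import com.tangosol.util.filter.EqualsFilter;

        public class ExposureReport {
            public static void main(String[] args) {
                NamedCache positions = CacheFactory.getCache("positions");

                // Each storage-enabled member sums the entries it owns in parallel;
                // the partial sums are then rolled up into one result on the caller.
                Double exposure = (Double) positions.aggregate(
                        new EqualsFilter("getDesk", "rates"),
                        new DoubleSum("getNotional"));

                System.out.println("Total exposure for the rates desk: " + exposure);
            }
        }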

    Coherence Distributed Work Manager provides a grid-enabled implementation of the same standard CommonJ Work Manager API provided in BEA WebLogic and IBM WebSphere. Using the Coherence Distributed Work Manager, an application can submit a collection of work that needs to be executed, and the Coherence Distributed Work Manager will automatically distribute the work for execution across the entire grid, using as many servers as are available and needed for the work items submitted. The Coherence Distributed Work Manager is limited only by the total number of execution threads that can be allocated across the entire grid.

    Distribution of work items can be tailored to prioritize the selection of servers within the grid that will provide the optimum processing performance, based on considerations such as locality of data or access to a particular EIS, gateway or mainframe service needed by that work item.
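
    Since the API is the standard CommonJ Work Manager interface described above, submitting work looks the same regardless of which implementation is behind it. A minimal sketch follows; how the Coherence WorkManager instance itself is obtained is product-specific and not shown, and PriceBook is a made-up work item:

        import java.util.ArrayList;
        import java.util.Collection;

        import commonj.work.Work;
        import commonj.work.WorkManager;

        public class GridPricingJob {

            // One unit of work; Work extends Runnable and adds daemon/release semantics.
            public static class PriceBook implements Work {
                private final String bookId;
                public PriceBook(String bookId) { this.bookId = bookId; }
                public void run()         { /* price every position in this book */ }
                public boolean isDaemon() { return false; }
                public void release()     { /* cooperative cancellation hook */ }
            }

            public static void priceAll(WorkManager workManager, String[] bookIds)
                    throws Exception {
                Collection items = new ArrayList();
                for (String bookId : bookIds) {
                    // Each schedule() call can be dispatched to any member of the grid
                    // that has an execution thread available.
                    items.add(workManager.schedule(new PriceBook(bookId)));
                }
                // Block until every submitted work item has completed.
                workManager.waitForAll(items, WorkManager.INDEFINITE);
            }
        }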

    Added Functionality to Existing Features

    JMX Management Services Extended to Cover Coherence*Web and Application MBeans

    Coherence*Web is an out-of-the-box HTTP Session management solution that provides linear scalability of HTTP Session management up to hundreds of servers, providing instant and transparent failover with no loss of session data. In Coherence 3.1, the JMX management services available within Coherence now expose a unified, cluster-wide view of all of the management and monitoring capabilities of Coherence*Web. Also starting with Coherence 3.1, applications can expose their own management and monitoring information through Coherence, providing a unified operational view of the entire application.
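
    As a reminder of what an "application MBean" is, the sketch below registers a trivial standard JMX bean; surfacing such beans in Coherence's unified, cluster-wide view is then a matter of Coherence management configuration, which is not shown here. The bean and its object name are made-up examples:

        // OrderStatsMBean.java: the management interface (standard MBean naming:
        // the interface is the implementation class name plus the suffix "MBean").
        public interface OrderStatsMBean {
            long getOrdersProcessed();
        }

        // OrderStats.java: the application object exposing the statistic.
        import java.lang.management.ManagementFactory;
        import javax.management.MBeanServer;
        import javax.management.ObjectName;

        public class OrderStats implements OrderStatsMBean {
            private volatile long ordersProcessed;

            public long getOrdersProcessed() { return ordersProcessed; }
            public void recordOrder()        { ordersProcessed++; }

            public static void main(String[] args) throws Exception {
                MBeanServer server = ManagementFactory.getPlatformMBeanServer();
                server.registerMBean(new OrderStats(),
                        new ObjectName("example:type=OrderStats"));
            }
        }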

    New Coherence Third Party Integration

    BEA WebLogic Portal Integration delivers Wide Area Network (WAN)-capable clustered session management and caching for BEA WebLogic Portal applications. Whenever Coherence is present, BEA WebLogic Portal will automatically offload all portal and personalization information caching to Coherence, freeing up JVM resources for portal applications and enabling clustered portal caches. Working in combination, BEA WebLogic Portal and Coherence also provide technology that can fill in performance and capability gaps for organizations building WSRP-federated Portal applications, enabling such applications to efficiently share large amounts of workflow, documents and live data in real time across any number of federated portlets. Tangosol has developed a blueprint for this implementation, which is available from the BEA dev2dev website and www.tangosol.com as "Getting Peak Scalable Performance from Federated Portals and WSRP."

    Hibernate Integration adds true distributed transactional caching to Hibernate-based applications, with no code changes. Automatic Cache Loaders for Hibernate are also included.
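
    The "no code changes" point is that the integration is switched on through Hibernate configuration. The sketch below shows the equivalent settings applied programmatically for compactness; the same two properties would normally live in hibernate.cfg.xml or hibernate.properties, and the provider class name is an assumption recalled from the Coherence/Hibernate integration of that era, so verify it against the integration documentation:

        import org.hibernate.SessionFactory;
        import org.hibernate.cfg.Configuration;

        public class SessionFactoryBuilder {
            public static SessionFactory build() {
                Configuration cfg = new Configuration().configure(); // reads hibernate.cfg.xml
                // Enable the second-level cache and plug in the Coherence cache provider.
                cfg.setProperty("hibernate.cache.use_second_level_cache", "true");
                cfg.setProperty("hibernate.cache.provider_class",
                        "com.tangosol.coherence.hibernate.CoherenceCacheProvider");
                return cfg.buildSessionFactory();
            }
        }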

    BerkeleyDB Integration can improve Coherence performance for data sets too large to fit completely in memory.

    Threaded Messages (44)

  2. Roll your own in a few weeks[ Go to top ]

    FYI, you can roll your own clustered cache (and more) in a few weeks by utilizing the totem protocol. I have an Apache-licensed implementation here.

    Since it's token-based it's not recommended for large clusters, but if you are clustering for availability, with 3-4 nodes, it's great. And it's easy to code against because all the messages are totally ordered.

    Guglielmo
  3. Roll your own in a few weeks[ Go to top ]

    Guglielmo
    Your EVS4J project is nice, but I don't think it's so easy to turn it into a distributed clustered cache with good documentation, etc., as Tangosol has (and many others have - I'd love to see a standard benchmark for this field one day), leaving aside support, training, and so on.
    Simply consider the various cache topologies, the disk overflow features, the distributed queries, and in Tangosol 3.1 the continuous query (CEP-like) capabilities. It is much more than a reliable multicast protocol.

    Or am I missing something?

    One for the Tangosol team: given that 64-bit allows for huge heaps, would you advise using Tangosol with large heaps (say, 250 GB) or with disk overflow? Do you foresee any change in your space given this fact (issues with feeding the cache, the actual need for a sophisticated cache except for write propagation, etc.)?
  4. giant distributed caches[ Go to top ]

    One for Tangosol team: given that 64bit allows for huge heaps, would you advise to use Tangosol with large heaps (say 250 gig) or with disk overflow? Do you foresee some change in your space given this fact (issue with feeding the cache, actual need for a sophisticated cache except for write propagation, etc)

    We're seeing larger heaps than with the JDK 1.2 / 1.3 era JVMs, especially with products like Azul (pauseless GC) and WebLogic Realtime (JRockit).

    However, the way to achieve a 250GB cache is still to spread it over multiple JVMs. For example, it's easy to do just by starting up 25 JVMs with 10GB each or 100 JVMs with 2.5GB each, etc.

    Obviously, many apps don't need to cache the entire 250GB in heap, so the overflow functionality can be very useful. (In our 3.1 release, we also added an option to use BerkeleyDB for disk overflow.)

    Also, there's a pretty good set of ideas around organizing the caching at:

    http://wiki.tangosol.com/display/COH31UG/Coherence+3.1+Home

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  5. giant distributed caches[ Go to top ]

    We're seeing larger heaps than with the JDK 1.2 / 1.3 era JVMs, especially with products like Azul (pauseless GC) and WebLogic Realtime (JRockit). However, the way to achieve a 250GB cache is still to spread it over multiple JVMs. For example, it's easy to do just by starting up 25 JVMs with 10GB each or 100 JVMs with 2.5GB each, etc.

    No, it's not: each VM competes for OS resources, such as threads for GC. There's a practical limit to how many VMs you can run on a given box, but really there's no reason why one VM can't handle this size, e.g. this 360GB SPECjbb result: http://www.spec.org/jbb2005/results/res2006q1/jbb2005-20051221-00055.html. Yes, it's a benchmark, but it proves that it can be done; you just need to architect for it.

    Granted, in any deployment the hardware config matters, but don't go starting new VMs just to lower the cache size; you'd just be moving the bottleneck elsewhere.
  6. giant distributed caches[ Go to top ]

    No it's not, each VM competes for OS resources like threads for gc for example. There's a practical limit as to how many VMs you can run on a given box, but really there's no reason why 1 VM can't handle this size

    I'm always up for learning new stuff. I haven't seen very large heaps being used successfully in production, but if there are ways to do it, I'd like to learn about them.

    The largest heaps I've seen personally are Azul-based heaps of close to 100GB. Those weren't in production, though. (We were doing testing on an Azul cluster.)

    Regarding the large caches that our customers have, they are generally spread over quite a few commodity blades, e.g. 120 blades running 2 JVMs each.
    Granted in any deployment the hardware config matters, but don't go starting new VMs just to lower the cache size, you'd just be moving the bottleneck elsewhere.

    It's not about "starting new VMs just to lower the cache size", but rather about "scaling an in-memory data management system across a large number of nodes". By having more servers in the data grid, you're getting more an more processing capability available for searching, managing and crunching the data.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  7. giant distributed caches[ Go to top ]

    The largest heaps I've seen personally are Azul-based heaps of close to 100GB. Those weren't in production, though. (We were doing testing on an Azul cluster.) Regarding the large caches that our customers have, they are generally spread over quite a few commodity blades, e.g. 120 blades running 2 JVMs each.

    A potential issue with huge heaps (say, hundreds of GB or even > 1 TB) is the time it would take to fill up the cache! Especially if the data is loaded from "expensive" sources - say, over a slow link or from a legacy system you're not allowed to load too much, or if the raw data goes through a complex aggregation process before being stored in the cache. I recently read an article about a massive data mining application used by an intelligence agency. As the only option (for performance reasons) was to cache the whole data set in RAM, they had acquired a monster supercomputer with 1 TB of RAM I believe. But booting the machine and loading the full data set into RAM took one whole week!

    Now, that might be acceptable because the machine has rock-solid reliability. But imagine doing that in a JVM and on commodity hardware. I know JVMs have come a long way and don't crash *that* often, but in any case, if a crash required an MTTR of one full week, it would be advisable to split the data set across multiple machines and JVMs to ensure acceptable availability and fault resilience.
  8. giant distributed caches[ Go to top ]

    A potential issue with huge heaps (say, hundreds of GB or even > 1 TB) is the time it would take to fill up the cache!

    To pre-load, you can do parallel loading. Or you can do lazy loading on demand. Or mix the two.
    Especially if the data is loaded from "expensive" sources - say, over a slow link or from a legacy system you're not allowed to load too much, or if the raw data goes through a complex aggregation process before being stored in the cache.

    Yup. Those all make it much more complicated.
    But booting the machine and loading the full data set into RAM took one whole week! Now, that might be acceptable because the machine has rock-solid reliability. But imagine doing that in a JVM and on commodity hardware. I know JVMs have come a long way and don't crash *that* often, but in any case if a crash required a MTTR of one full week, it would be advisable to split the data set across multiple machines and JVMs to ensure acceptable availability and fault-resilience.

    With Coherence, it doesn't matter if a JVM dies. There's no data lost, even when the data is split (dynamically partitioned) across the machines. (This is accomplished by the use of data redundancy, i.e. turning the data grid into a giant in-memory "RAID array" of objects.)

    Our goal is a sub-second server-level MTTR, i.e. from the application's point of view, it should take the data grid less than one second to recover from a server dying.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
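
    As a rough illustration of the parallel pre-loading mentioned above, the sketch below pushes batched chunks of data into a cache from a small thread pool. The cache name, the TradeSource interface and the batch layout are all made-up assumptions:

        import java.util.Map;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.TimeUnit;

        import com.tangosol.net.CacheFactory;
        import com.tangosol.net.NamedCache;

        public class CachePreloader {

            // A made-up data-access interface: loadBatch(i) returns one key-to-value
            // chunk of the full data set (for example, one slice of a large table).
            public interface TradeSource {
                Map loadBatch(int batch);
            }

            public static void preload(final TradeSource source, int batches, int threads)
                    throws InterruptedException {
                final NamedCache cache = CacheFactory.getCache("trades");
                ExecutorService pool = Executors.newFixedThreadPool(threads);
                for (int i = 0; i < batches; i++) {
                    final int batch = i;
                    pool.execute(new Runnable() {
                        public void run() {
                            // putAll() sends a whole chunk in one round trip instead of
                            // one network call per entry.
                            cache.putAll(source.loadBatch(batch));
                        }
                    });
                }
                pool.shutdown();
                pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
            }
        }
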
  9. giant distributed caches[ Go to top ]

    BTW, I have a question: what is the byte overhead of keeping an object cached?

    In my last job I did some measurements of the actual size of java objects and it's scary ...

    What's the space overhead due to Coherence?

    Guglielmo

    Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering

    .. or the World's First Pure-Java Terminal Driver
  10. giant distributed caches[ Go to top ]

    With Coherence, it doesn't matter if a JVM dies. There's no data lost, even when the data is split (dynamically partitioned) across the machines. (This is accomplished by the use of data redundancy, i.e. turning the data grid into a giant in-memory "RAID array" of objects.) Our goal is a sub-second server-level MTTR, i.e. the impact to the application of a server dying should take less than one second from the data grid to recover from.
    Sure, I'm aware of that. My point was that the alternative suggested earlier in the thread (one single monster JVM with hundreds of GB of heap) raised issues that do not occur with a typical Coherence solution that partitions and replicates the data across multiple machines / JVMs. Parallel data loading is also much more practical/efficient in this scenario.
  11. giant distributed caches[ Go to top ]

    Sure, I'm aware of that. My point was that the alternative suggested earlier in the thread (one single monster JVM with hundreds of GB of heap) raised issues that do not occur with a typical Coherence solution that partitions and replicates the data across multiple machines / JVMs. Parallel data loading is also much more practical/efficient in this scenario.

    Oops, it looks like I misunderstood and got it backwards ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  12. A potential issue with huge heaps (say, hundreds of GB or even > 1 TB) is the time it would take to fill up the cache!
    To pre-load, you can do parallel loading. Or you can do lazy loading on demand. Or mix the two.
    Especially if the data is loaded from "expensive" sources - say, over a slow link or from a legacy system you're not allowed to load too much, or if the raw data goes through a complex aggregation process before being stored in the cache.
    Yup. Those all make it much more complicated.
    But booting the machine and loading the full data set into RAM took one whole week! Now, that might be acceptable because the machine has rock-solid reliability. But imagine doing that in a JVM and on commodity hardware. I know JVMs have come a long way and don't crash *that* often, but in any case if a crash required a MTTR of one full week, it would be advisable to split the data set across multiple machines and JVMs to ensure acceptable availability and fault-resilience.
    With Coherence, it doesn't matter if a JVM dies. There's no data lost, even when the data is split (dynamically partitioned) across the machines. (This is accomplished by the use of data redundancy, i.e. turning the data grid into a giant in-memory "RAID array" of objects.) Our goal is a sub-second server-level MTTR, i.e. the impact to the application of a server dying should take less than one second from the data grid to recover from.

    Let's be honest here . . . Although partitioned caching can assure continued availability when any given node crashes, there is still a significant impact when a node is lost--it is not as simple as the stated "less than one second for the data grid to recover." The clustered system needs to create a new backup for data that is no longer redundant, and if that's 100GB of data, you can expect a lengthy period of degraded overall system performance. Throw in any kind of network difficulties, and the entire system could grind to a halt. We have to consider these possibilities very carefully for any mission-critical application that uses partitioned caching.

    Cheers,

    Gideon
    www.gemstone.com
  13. Let's be honest here . . . Although partitioned caching can assure continued availability when any given node crashes, there is still a significant impact when a node is lost--it is not as simple as the stated "less than one second for the data grid to recover." The clustered system needs to create a new backup for data that is no longer redundant, and if that's 100GB of data, you can expect a lengthy period of degraded overall system performance.

    This is a bit of a worst-case scenario, but it doesn't have to be this bad. In fact, this is why I prefer more, smaller JVMs to fewer, bigger ones - because startup and recovery impacts grow with the size of your JVMs.

    If you're talking about 100GB of data, putting that on only, say, 3 servers means you're going to pay a penalty if one goes down: a great deal of data is going to have to flow if you lose one. Coherence seems to be intelligent about recovering in these cases (e.g. it won't halt and try to back up all 33GB or whatever all at once, but will stream it), but you still will eat some resources while this is happening and have a window of vulnerability.

    But what if you've got that 100GB on 10 machines, or 20 truly modest ones? Then the impact of losing any given node isn't going to be that great, because each machine is only carrying 1/10th or 1/20th of the entire dataset (not factoring in backups, of course). In fact, on bigger clusters losing one machine is no big deal and a bit of a non-event.

    This is a rough clustering equivalent to the RAID concept - Coherence makes it easy to cluster lots of smaller commodity boxes, which can have several advantages over going with a small number of monster boxes.
    Throw in any kind of network difficulties, and the entire system could grind to a halt.

    Well, that's kinda true of any system. You're right in general that there are tradeoffs and you do want to be careful, but it's not all doom and gloom. There are some surprisingly good options that balance performance, scalability, and recoverability pretty well.
  14. Let's be honest here . . . Although partitioned caching can assure continued availability when any given node crashes, there is still a significant impact when a node is lost--it is not as simple as the stated "less than one second for the data grid to recover." The clustered system needs to create a new backup for data that is no longer redundant, and if that's 100GB of data, you can expect a lengthy period of degraded overall system performance.
    This is a bit of a worst-case scenario, but it doesn't have to be this bad. In fact this is why I prefer more/smaller JVMs than fewer/bigger - because startup and recovery impacts are bigger the bigger your JVMs are. If you're talking 100GB of data, putting that only on, say, 3 servers means you're going to pay a penalty if one goes down, a great deal of data is going to have to flow if you lose one. Coherence seems to be intelligent about recovering in these cases (e.g. it won't halt and try to backup all 33GB or whatever all at once but will stream it), but you still will eat some resources while this is happening and have a window of vulnerability. But what if you've got that 100GB in 10 machines, or 20 truly modest ones? Then the impact of losing any given node isn't going to be that great, because each machine is only carrying 1/10th or 1/20th of the entire dataset (not factoring in backups of course). In fact on bigger clusters losing 1 machine is no big deal and a bit of a non-event. This is a rough clustering equivalent to the RAID concept - Coherence makes it easy to cluster lots of smaller commodity boxes, which can have several advantages over going with a small number of monster boxes.
    Throw in any kind of network difficulties, and the entire system could grind to a halt.
    Well, that's kinda true of any system. You're right in general that there are tradeoffs and you do want to be careful, but it's not all doom and gloom. There are some surprisingly good options that balance performance, scalability, and recoverability pretty well.

    I have to agree with all of the above! We must simply be careful not to assume that the impact of failure is only a one-second hiccup. The use-cases we've found most suitable to large VMs generally involve large data sets against which queries need to be run on a variety of (often unpredictable) dimensions. Here you are generally left with the options of either putting all the data in a very big VM, creating copies of large portions of the data in many VMs, or severely degrading query performance by trying to apply joins across VMs. The first solution uses the least amount of resources for the best performance, but does leave you with a longer load/recovery time.

    Cheers,

    Gideon
    www.gemstone.com
  15. Let's be honest here . . . Although partitioned caching can assure continued availability when any given node crashes, there is still a significant impact when a node is lost--it is not as simple as the stated "less than one second for the data grid to recover." The clustered system needs to create a new backup for data that is no longer redundant, and if that's 100GB of data, you can expect a lengthy period of degraded overall system performance.

    First, the point of grid-partitioned in-memory data is to avoid having a single server manage 100GB of data. In fact, that's why we invented it.

    Second, Coherence typically detects node failure within a few milliseconds, but due to the fault tolerant algorithms being used, the failover can take longer (e.g. up to a second) to occur. The failover processing itself occurs in-memory, so it is almost instantaneous.

    Third, while the new backups have not yet been created at this point, the system does continue to run while those backups are being created (asynchronously, of course).

    Regarding the claim about a "lengthy period of degraded overall system performance", that may have been true in certain cases with older releases of Coherence, but I think you'll be pleasantly surprised by the improvements in our 3.1 release.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  16. Let's be honest here . . . Although partitioned caching can assure continued availability when any given node crashes, there is still a significant impact when a node is lost--it is not as simple as the stated "less than one second for the data grid to recover." The clustered system needs to create a new backup for data that is no longer redundant, and if that's 100GB of data, you can expect a lengthy period of degraded overall system performance.
    First, the point of grid-partitioned in-memory data is to avoid having a single server manage 100GB of data. In fact, that's why we invented it. Second, Coherence typically detects node failure within a few milliseconds, but due to the fault tolerant algorithms being used, the failover can take longer (e.g. up to a second) to occur. The failover processing itself occurs in-memory, so it is almost instantaneous. Third, while the new backups have not yet been created at this point, the system does continue to run while those backups are being created (asynchronously, of course). Regarding the claim about a "lengthy period of degraded overall system performance", that may have been true in certain cases with older releases of Coherence, but I think you'll be pleasantly surprised by the improvements in our 3.1 release.

    I agree that in many use-cases the advantages of many smaller processes unified into a single logical cache are clear, from both a runtime-efficiency and a hardware-ROI perspective. There are many patterns, however, where larger processes provide an advantage (see an example in my post above). Indeed, we have even run into use-cases lately that benefit from applying partitioned caching to a large number of 64-bit processes in order to create multi-terabyte logical caches.

    For failover times, every distributed caching product should be very fast once failure is confirmed. The real difficulty is in determining whether a pause in communication is due to a full-fledged system, process, or hardware component failure rather than a temporary overload of some resource or something like a lengthy GC. How to react to different scenarios is usually best determined by system administrators, and thus we must provide them with tuning options (e.g. retry attempts, retry intervals, etc) and configurable algorithms.

    Regarding the statement about having “invented” partitioned caching . . . this seems like a bit of a stretch (or perhaps just Marketing :-). Most of the theory underlying partitioned caching has been around since before many of us were born, and real implementations (though not productized) have existed since before GemFire or Coherence were even conceived.

    Cheers,

    Gideon
    GemFire--The Enterprise Data Fabric
    http://www.gemstone.com
  17. [..] we have even run into use-cases lately that benefit from applying partitioned caching to a large number of 64-bit processes in order to create multi-terabyte logical caches.

    Yes, that's one of the reasons why we invented it.
    Regarding the statement about having "invented" partitioned caching . . . this seems like a bit of a stretch (or perhaps just Marketing :-). Most of the theory underlying partitioned caching has been around since before many of us were born, and real implementations (though not productized) have existed since before GemFire or Coherence were even conceived.

    Coherence 1.2 (summer 2002) included a cache service we called "distributed cache", because it "distributed" the data evenly across all the cluster members, while eliminating single points of failure by the strict application of redundancy. Since the term "distributed cache" was so widely used in the industry, we later changed our terminology to "partitioned cache", which we believed at the time to be our own made-up term for marketing purposes (although Google shows that there were previous uses of the term, related primarily to hardware/EE concepts.)

    Any way you look at it, what Coherence offers is completely unparalleled: A dynamically and transparently partitioned single-system image that resiliently expands and contracts with the size of the cluster on which it runs. Our customers have called it a "clustered RAID of objects", which is a pretty good way to explain it: We provide dynamically load-balanced redundancy of data, and we do it automatically and transparently to utilize the full extent of resources (e.g. servers) provided to Coherence. Battle-proven in some of the world's largest-scale applications for four years.

    So, to answer your question:
    . . . this seems like a bit of a stretch (or perhaps just Marketing :-)

    Yes, we selected the term "partitioned cache" for marketing reasons, and it has served us well.

    However, as you'll note from the press release (2006), we're light-years beyond just doing cache partitioning (2002). Our customers are running real-time scale-out applications today that are reliably modeling and processing massive volumes of live data in parallel across data grids of hundreds of servers. Parallel query. Parallel aggregators. Data grid agents. Work management. Real time complex event processing (CEP). Continuous query (CQC). Real time desktops (RTD). Today. In production.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  18. Yes, that's one of the reasons why we invented it.

    Given the overlap with what we do and likely others in the space, it will be good to understand what the exact nature of the invention is (is there a patent issued or filed?), because we also introduced the first version of our Distributed HashMap technology back in 2002 - map entries spread across many processes, dynamic rebalancing of buckets, bucket splitting, at most single network hop guarantee, yada, yada.

    Our technology borrows concepts that are quite well known: (1) the many DSM (distributed shared memory) projects that were implemented in the 1990s (here is a link for the curious: http://www.ics.uci.edu/~javid/dsm.html), and (2) our own orthogonally persistent object database, where we can manage data pages across many shared memory segments.

    regards,
    -- Jags Ramnarayan

    http://www.gemstone.com
    GemFire - The Enterprise Data Fabric
  19. Coherence Partitioned Caching[ Go to top ]

    .. we also introduced the first version of our Distributed HashMap technology back in 2002 - map entries spread across many processes, dynamic rebalancing of buckets, bucket splitting, at most single network hop guarantee, yada, yada.

    Hi Jags, after we pioneered the partitioned cache capabilities in our Coherence 1.2 release -- and not coincidentally following Gemstone's downloading of Coherence after agreeing to our evaluation use license -- I had a number of conversations with Bruce Schuhardt (engineering) and Mike Nastos (bizdev) from Gemstone. Gemstone was very interested in our innovations, including the potential for licensing our innovations.

    Nonetheless, while I do find the reference to your "distributed hashmap" implementation to be interesting, what Coherence offers is completely unparalleled: A dynamically and transparently partitioned single-system image that resiliently expands and contracts with the size of the cluster on which it runs.

    Furthermore, with this release, Coherence dramatically extends those capabilities to support scalable and reliable Data Grid-wide data processing, parallel calculation and parallel aggregation. These unparalleled capabilities also represent pioneering work by Tangosol.

    Our customers' many successes with Coherence and our software's wide adoption speak for themselves.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  20. Yes, that's one of the reasons why we invented it

    Cameron, let me first say that I didn’t mean to upset you or challenge all of the hard work you’ve done to create a great product. We all compete in the marketplace with excellent products built by talented engineers, and we each have very strong advantages built on both our own ideas and customer feedback.

    As my colleague notes, however, the research and development that led to partitioned caching implementations on the market today (GemFire, Coherence, and others) is derived from years of hard work by brilliant visionaries older than ourselves. Even though GemStone owns many patents and has many more pending in our enterprise quality data fabric product suite, we know that we stand on the shoulders of many other true distributed computing pioneers. I cannot resist adding, though, that many of the engineers at GemStone have been with the company and working on this class of problems since the early 1980’s.

    Is your product a good one? Sure. Is your product “unparalleled”? Absolutely not. The truth is that you have advantages and more exposure in certain use-cases, and others have advantages and more exposure for others. After reading your last post it is very hard for me to resist countering all of your points with examples of GemFire’s features, capabilities, and happy customers, but this isn’t the place to do it. We will see you in the marketplace and let our prospects decide!

    Cheers,

    Gideon
    http://www.gemstone.com
  21. giant distributed caches[ Go to top ]

    It's not about "starting new VMs just to lower the cache size", but rather about "scaling an in-memory data management system across a large number of nodes". By having more servers in the data grid, you're getting more an more processing capability available for searching, managing and crunching the data.

    And of course more servers require more licenses, and more licenses = more $$$, which from a business standpoint may make sense (for the vendor), but technically there's no reason why a single JVM can't scale up to the limits of the hardware. If there is, the JVM vendors need to know about it.

    Cameron,

    Yesterday at 9:15 PM I got mail that says "Terracotta Ships Version 1.5 - API-less Clustering And Caching", and at 9:50 PM one that says "Tangosol Announces the Immediate Availability of Coherence 3.1". Coincidence? Hmmm. So I went over to the Terracotta site blog.terracottatech.com and read the latest entry (signed "ari"), which ends with this note: "Clustered Hashmaps are thinly veiled databases. We need to stick to Java when we are working with objects and databases when working with business data. Don't you agree?"

    care to comment? (please let's keep this technical, both of you provide compelling products but I think users deserve to know the difference(s))
  22. giant distributed caches[ Go to top ]

    Cameron, yesterday at 9:15 PM I get mail that says "Terracotta Ships Version 1.5 - API-less Clustering And Caching", and at 9:50 PM one that says "Tangosol Announces the Immediate Availability of Coherence 3.1". Coincidence? hmmm.
    Actually, it probably was a coincidence. Our 3.1 release had been in development for over a year, and the GA went out on 18 February:

    http://tangosol.com/news.jsp

    We saved the announcement until the day of the Wall Street show. It's just a clever marketing ploy ;-)
    So i went over to the Terracotta site blog.terracottatech.com, read the latest entry (signed "ari") which ends with this note "Clustered Hashmaps are thinly veiled databases. We need to stick to Java when we are working with objects and databases when working with business data. Don't you agree?"

    care to comment? (please let's keep this technical, both of you provide compelling products but I think users deserve to know the difference(s))
    Sorry, I honestly don't know much about their solution, other than the "API-less" claim. I went to the marketing presentation that they did at the recent show on Wall Street, and (to me) it sounded like they're trying to do what Coherence does, but by using AOP instead of providing an API.

    It's not surprising that other companies are trying to emulate our success -- Coherence has achieved over a thousand successful production deployments including in many of the world's major banks and financial services firms, Tangosol has more than doubled revenues five years in a row, and we've run sixteen straight quarters of profitability.

    Even though we have managed to win most of the market share today, there is still intense competition in this market, and there is a lot of money being invested in it. To cite your example, Terracotta just took $13.5mm in VC money from Goldman Sachs, but they are just one of a dozen small competitors in this space.

    In the end, we know that competition is good for the customer, so I think that anyone that has selected Java as the basis for their enterprise infrastructure has to be pretty happy to see this occurring. I'm sure that Terracotta is a startup with promising technology and some great people, so I don't want to diminish the work that they're doing in any way. Ben Wang (jboss) seemed to think that the technology in Terracotta is similar to JBossCacheAop:

    http://www.theserverside.com/news/thread.tss?thread_id=36243f

    Unfortunately, to date the only reference to their use is in Goldman Sachs (see above), and according to the marketing presentation at the Wall Street show, it is not in production use, so that severely limits the amount of knowledge that I and others have about them. From published information, the technical approach that Terracotta uses is client/server, relies on byte code manipulation to track object changes, uses XML descriptors to define data synchronization points, and communicates object deltas (which, similar to JDO 1.0, is possible by tracking the object changes using byte code manipulation). Theoretically, it should have efficiency advantages if you have a very large object graph (actual “hard” Java references among a huge number of Java objects) and you frequently access that graph but only change small bits of information within it. According to the presentation by Goldman Sachs at the Wall Street show, they were able to get a proof-of-concept running but the scalability was "unacceptable".



    From what I saw, the most interesting (promising) part was the ability to navigate large graphs of objects, and (using the byte code manipulation features) have those references be resolved on demand, thus loading the object graph as needed across the network. It's similar to how the old C++ OODBMS solutions worked, except they tended to use memory page faults (kind of like a giant swap file).

    Assuming that my technical information is correct (and I'm sure Ari will be glad to jump in to correct me if I am wrong), IMHO the major differences are:

    - Tangosol Coherence is truly peer-to-peer. It runs completely inside the application, with no extra JVMs or machines necessary. All servers are "hot" (active) all the time; it's NOT hierarchical, master/slave, etc. Servers can be added / removed / killed at any time, and Coherence will dynamically and resiliently load balance the data management across whatever machines are available.

    - Tangosol Coherence has an API, and doesn't use byte code manipulation to obviate the use of the API.

    - Tangosol Coherence is much more feature rich, and those features are production-proven.

    - Tangosol Coherence is designed explicitly for limitless horizontal scalability, including support for optimistic transactions, true dynamic and resilient partitioning of data sets, load balancing, parallel query, indexing, iterative cost-based query evaluation, parallel execution of data operations and parallel aggregation support. Coherence has a number of very large deployments, particularly for use with compute grids (like DataSynapse) and in extremely large J2EE applications.

    - Tangosol Coherence provides WAN clustering support. One of our Wall Street customers lost an entire datacenter due to a fire last year, and a business critical system running Coherence kept running without interruption because it had been clustered (on many machines) across two datacenters.

    The "ilities" (High Availability, reliability, performant scalability, serviceability, manageability, etc.) are claimed in all companies' marketing literature, which is what marketing is supposed to do, I suppose. Tangosol has the distinct advantage in that most of the firms on Wall Street (and many on Main Street ;-) are already using our software in production, and so by experience and word of mouth it has established itself as a very trusted solution.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  23. giant distributed caches[ Go to top ]

    Coherence vs. Terracotta? Let me think of some unfair characterization that would irritate everyone... Oh yes, what about this?

    - Coherence is the world's most expensive hashmap.
    - Terracotta is the world's most expensive AOP interceptor.

    I'd say something bad on ObjectGrid too but I have a policy of only criticizing the stuff that people can actually download.

    For the Americans: :-)
  24. giant distributed caches[ Go to top ]

    - Coherence is the world's most expensive hashmap.
    - Terracotta is the world's most expensive AOP interceptor.

    I looked at the Terracotta flash demo on their site. It's very nicely done. When we launched Coherence 1.0, we had the clustered drawing app and the IM chat as our demos, but we didn't have Bob's voiceover on a cool flash presentation ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  25. giant distributed caches[ Go to top ]

    Unfortunately, to date the only customer account that we've encountered them at is Goldman Sachs (see above), which limits the amount of knowledge that I have about them. From what they've told me, the technical approach that Terracotta uses is client/server, relies on byte code manipulation to track object changes, uses XML descriptors to define data synchronization points, and communicates object deltas (which, similar to JDO 1.0, is possible by tracking the object changes using byte code manipulation). It should have efficiency advantages if you have a very large object graph and you frequently access that graph but only change small bits of information in it. According to the VP from Goldman Sachs, they were able to get their proof-of-concept to scale up to two machines, but hit a serious scalability wall beyond that.

    Actually, I believe you are under NDA with Goldman Sachs and should not be saying most of what you are. If I were to violate my NDA, I think there would be some VERY IMPORTANT scalability information that folks on TSS who have been interested in Terracotta vs. Tangosol would want to hear.
    Assuming that my technical information is correct (and I'm sure Ari will be glad to jump in to correct me if I am wrong), IMHO the major differences are: - Tangosol Coherence is truly peer-to-peer.

    Here's where I will start and stop, Cameron. We've had this discussion before--blind assertions are not in the spirit of this forum. For example, "infinite scalability" and "truly peer-to-peer" seem a tad misleading as product claims, especially because you are calling Terracotta "master / slave". If Tangosol were to copy everything everywhere, as you imply when you compare to us as "master / slave", you would bottleneck on the network after just a couple of servers. But Tangosol does not bottleneck. This is because you have the ability to copy things only to particular machines, thus providing "hybrid peer-to-peer", and you look JUST LIKE a master/slave architecture except you provide the master in-process. Let's not mislead folks, now. Capabilities such as "I can lose any machine and not lose data" while also providing "infinite scalability" are just impossibilities when put together and come down to lots of double-talk.

    Goldman came out to Web Services on Wall Street and clearly said, "Terracotta transparency works and scalability is very close FOR ONE OF OUR LARGEST APPLICATIONS." Let's not hide that statement from this audience, please.
  26. giant distributed caches[ Go to top ]

    For example, "infinite scalability" and "truly peer-to-peer" seem a tad misleading as product claims, especially because you are calling Terracotta "master / slave".

    This is a quote from the "Technology deep dive" section of the Terracotta web site:

    "The Terracotta server. At the heart of the Terracotta architecture sits the Terracotta Server". There is also mention of a "hot back-up" server that can be employed. But there is no mention of adding extra servers for scalability.

    I think Cameron lapsed a bit too far into marketing speak and you're right to criticize that, but I think the essence of his arguments is accurate. The Terracotta design appears to require a special Terracotta server for it to work, and I haven't been able to find any throughput or scalability information on the product anywhere.

    Coherence, on the other hand, really is peer-to-peer, and I've witnessed it scaling out to over 50 machines. No server is required, and it really does scale as claimed.
     If Tangosol were to copy everything everywhere, as you imply when you compare to us as "master / slave", you would bottleneck on the network after just a couple of servers. But Tangosol does not bottlneck. This is because you have the ability to copy things only to particular machines thus providing "hybrid peer-to-peer" and you look JUST LIKE a master/slave architecture except you provide the master in-process.

    This is not an accurate description of how Coherence works - well, at least not of how the distributed cache works (replicated is another matter). Coherence is true peer-to-peer. They appear to do some dynamic mastering for coordination purposes, but data does _not_ have to flow through a master. For example, if you have 50 JVMs in a Coherence cluster, there's no single node or set of nodes coordinating all the updates.
     Let's not mislead folks, now. Capabilities such as "I can lose any machine and not lose data" while also providing "infinite scalability" are just impossibilities when put together and come down to lots of double-talk.

    Like I said, Cameron is deep in his marketing role right now, but while the words are over the top the meaning is accurate. Coherence scales exceptionally well. I've personally taken it up into the dozens of JVMs realm spread over many machines and it scaled as claimed.

    I've also tested the fault tolerance and it too works as advertised. With the distributed cache you can specify how many "backups" you want (which are determined dynamically). More backups give you more data security but carry a performance and space penalty. But, for example, if you specify one backup you can lose an entire JVM with zero impact to the cluster.
  27. giant distributed caches[ Go to top ]

    blind assertions are not in the spirit of this forum. For example, "infinite scalability" and "truly peer-to-peer" seem a tad misleading as product claims

    I just want to add that even in the best case (a perfect product) scalability is a property of the workload (the nature of the workload, the level of sharing etc.) and the hardware.

    As far as Cameron's claims go, they are not really misleading, but it's just that the properties of a clustered cache are not really desirable, and something undesirable which scales well doesn't become more desirable.

    What people need is horizontal scalability of the database. That's because when you move data from the database into a cache you get totally different concurrency control properties, if I can even go so far as to call that concurrency control, and of course distributed queries (joins?) OVER IP BASED NETWORKS is an undesirable approach with undesirable results.

    When you look at Coherence you can tell right away that they did not escape these problems because they have lots of different "kinds" of caches depending on what kind of workload you have, and when people have lots of solutions to the same problem it's because they are all approximate solutions.

    As far as TC goes, as I have explained in previous posts the behavior of a java application which is transparently persisted is completely different concurrency-wise from business logic running against an RDBMS. And of course it must be far from being transparent when it comes to performance.

    These days there is only one right way to scale databases or JVMs: InfiniBand. The PathScale IB adapter can plug directly (via an HTX slot) into an Opteron processor, providing <2us RDMA latency. The kernel is not involved, resulting in zero load on the processor.

    Guglielmo

    Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering

    .. or the World's First Pure-Java Terminal Driver
  28. giant distributed caches[ Go to top ]

    As far as Cameron's claims go, they are not really misleading, but it's just that the properties of a clustered cache are not really desirable, and something undesirable which scales well doesn't become more desirable.

    That's quite a claim. Fortunately, we seem to find a huge number of customers that desire clustered caches, data grids, information fabrics, etc.

    In fact, I'm here sitting in an investment bank listening to a description of how they're going to use our software, and how it solves literally dozens of their problems that they can't solve with anything else.
    What people need is horizontal scalability of the database.

    What people really need is horizontal scalability of the data. Horizontally scaling the database is one way that some applications can achieve that.
    That's because when you move data from the database into a cache you get totally different concurrency control properties ..

    Like locking and transactions? ;-)
    .. of course distributed queries (joins?) OVER IP BASED NETWORKS is an undesirable approach with undesirable results.

    Could you be more specific?
    When you look at Coherence you can tell right away that they did not escape these problems because they have lots of different "kinds" of caches depending on what kind of workload you have, and when people have lots of solutions to the same problem it's because they are all approximate solutions.

    Or maybe it's because our different configurable topologies solve different problems, yet all through the same simple programming model. For example, replication pushes data to all interested parties, while partitioning is used to load-balance the data across all servers, yet both are transparently accessed via the same API.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
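
    To make the "same simple programming model" point above concrete: application code asks for a cache by name and uses the plain Map operations, while the choice of topology (replicated, partitioned, near, overflow) is made in configuration. A minimal sketch follows, with an arbitrary cache name:

        import com.tangosol.net.CacheFactory;
        import com.tangosol.net.NamedCache;

        public class QuoteHolder {
            public static void main(String[] args) {
                // Whether "quotes" is backed by a replicated or a partitioned scheme is
                // decided by the cache configuration, not by this code.
                NamedCache quotes = CacheFactory.getCache("quotes");
                quotes.put("ORCL", new Double(13.25));
                System.out.println("ORCL = " + quotes.get("ORCL"));
            }
        }
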
  29. giant distributed caches[ Go to top ]

    The right way to scale databases or jvms is only one these days: Infiniband. The Pathscale IB adapter can plug directly (HTX slot) into an opteron processor, providing <2us RDMA latency. The kernel is not involved, resulting in 0 load on the processor.

    And if you're performing great on IP? And not even close to saturating the network?

    Don't get me wrong, I think the high-speed interconnects have their place, but it's still a niche, not mainstream. Products like Coherence are used for mainstream application development and deployment. A financial services company can just go out and deploy Coherence on plain commodity hardware and networks. An InfiniBand solution is going to be very high cost and involve tons of networking people and specialized setups. If you're building a SAN cluster, chances are you'll use InfiniBand or Fibre Channel. But if you're building an application cluster, plain old IP does just fine.
  30. giant distributed caches[ Go to top ]

    A financial services company can just go out and deploy Coherence on plain commodity hardware and networks. An infiniband solution is going to be very high cost and involve tons of networking people and specialized setups. If you're building a SAN cluster, chances are you'll use Infiniband or Fibrechannel. But if you're building an application cluster plain old IP does just fine.

    I know. What I am saying is that it's about time that IB becomes mainstream. It's cheap. Mellanox sells $69 NIC chips.
  31. giant distributed caches[ Go to top ]

    What I am saying is that it's about time that IB becomes mainstream. It's cheap. Mellanox sells $69 nic chips.

    I'd say that that's highly unlikely. There's a huge investment in Ethernet and IP that goes way beyond the cost of a NIC.

    The reality is that InfiniBand is not a cost-effective option for mainstream cases, which negates the rest of your post entirely, at least for mainstream clustering and caching needs.
  32. giant distributed caches[ Go to top ]

    I'd say that that's highly unlikely. There's a huge investment in Ethernet and IP that goes way beyond the cost of a NIC. The reality is that Infiniband is not a cost effective option for mainstream cases. Which negates the rest of your post entirely, at least for mainstream clustering and caching needs.

    It's cost-effective, Mike.

    It's wasting all these CPU cycles on context switches that is not cost-effective.

    Guglielmo

    Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering

    .. or the World's First Pure-Java Terminal Driver
  33. giant distributed caches[ Go to top ]

    It's cost-effective, Mike. It's wasting all these cpu cycles on context switches which is not.

    Guglielmo, what kind of distributed system requires such low latency that multi-threading is prohibitive?

    Perhaps what you really want is parallel sysplex, which makes the IB latencies look rather glacial ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  34. giant distributed caches[ Go to top ]

    It's cost-effective, Mike.

    It's wasting all these cpu cycles on context switches which is not.

    You're joking, right?

    I've seen Coherence used to cluster dozens of machines - let's call it 30 just for yucks. They're somewhat arbitrarily allocated generic Linux boxes using the company's standard network infrastructure. If you need to grow the grid you make a regular IT request for a machine in a given data center, and you get another generic Linux box hooked into the generic network.

    That's cost effective.

    Networking boxes with InfiniBand? Hah! We're no longer talking about using standard, well-known stuff. Now you're talking about special networks, special operations staff that know about it, etc., etc. There are multiple levels of headaches involved here at nearly every level of your hardware and software stack.

    Cost isn't a measure just of how much a NIC costs.
  35. giant distributed caches[ Go to top ]

    It's cost-effective, Mike. It's wasting all these cpu cycles on context switches which is not.
    You're joking, right?

    I am definitely not kidding.

    Guglielmo
  36. <blockquoteEven though we have managed to win most of the market share today, there is still intense competition in this market, and there is a lot of money being invested in it. To cite your example, Terracotta just took $13.5mm in VC money from Goldman Sachs, but they are just one of a dozen small competitors in this space.
    Just to set the record straight, GemStone is by no means a "small" competitor--we are probably 6 or 7 times bigger than Tangosol, with a 25-year history in object technologies. Furthermore, we have a much broader view of the distributed caching space, offering native C++ caching, XML-optimized caching, comprehensive web-services interfaces, shared-memory solutions, extreme high-performance JDBC-based continuous querying, and superior stability and performance.

    It is somewhat amusing that after Cameron has for some time derided the idea of "Data Fabric"--which Gemstone has pioneered--Tangosol is all of a sudden claiming to have invented it! It is always so easy for marketing departments to throw around claims . . .

    One thing I will agree with is that competition is great for the market. It drives innovation, creates price-pressures, and generally only helps consumers. But don't always believe the hype.

    Cheers,

    Gideon
    www.gemstone.com
  37. Roll your own in a few weeks[ Go to top ]

    Guglielmo, your EVS4J project is nice, but I don't think it's so easy to turn that into a distributed clustered cache with nice docs etc. as Tangosol has (and many others have - I'd love to see some standard benchmark around this field one day), leaving aside support, training, etc. Simply take the various cache topologies, the disk overflow features, or the distributed queries, and in Tangosol 3.1 the continuous query (CEP-like) things. It is much more than a reliable multicast protocol. Or am I missing something?

    I was specifically addressing my message to people who have one specific need, and are happy to get that for free.

    I would also guess that EVS4J performs better than Coherence for workloads with 100% sharing and all nodes sending and receiving.

    Guglielmo
  38. Roll your own in a few weeks

    I was specifically addressing my message to people who have one specific need, and are happy to get that for free.

    There probably are companies that write their own caching to save money, but the truth is that building any real functionality takes a lot more than a week. In the industries we work in, companies are loath to invest their best employees' time to write, test, debug and maintain something that they can get "off the shelf" from a trusted and established vendor like Tangosol, and they realize that there is huge risk involved in building their own distributed data management system. (Companies also tend to avoid writing their own databases, application servers, messaging backbones and operating systems.)

    However, it's a lot of fun to work with distributed systems, so having worked in this area now for a number of years, I can totally understand why you would enjoy writing a "totem" implementation and why people would enjoy writing distributed caches or other distributed services. I find working on these systems to be a rewarding challenge, and it would be hypocritical of me to discourage others from doing the same.

    Personally, I hope that in five years we can all look back and wonder how applications were ever built without clustering and scale-out distribution of load in mind.

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  39. Roll your own in a few weeks

    Actually, when I said "build your own," I didn't really mean "build Coherence on your own"; I meant "build a horizontally scaled application on your own."

    Guglielmo

    Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering

    .. or the World's First Pure-Java Terminal Driver
  40. Congratulations + Desktop Query

    It has been a pleasure seeing Cameron & Co. build up and scale out both the technology and the business.

    Large corporations place a high premium on many aspects of a software solution other than just price (free), such as manageability, usability, applicability, extensibility.... Tangosol appears to improve all of these areas with each product release.

    I recently integrated JXInsight with Coherence's Distributed WorkManager implementation; the integration is detailed in an Insight article on our website:
    http://www.jinspired.com/products/jdbinsight/coherencework.html
    Setting up the distributed execution of work items took only a matter of minutes.
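
    For anyone who has not used it, here is a minimal sketch of scheduling work items through the CommonJ Work Manager API. The commonj.work interfaces are the standard ones; the com.tangosol.coherence.commonj.WorkManager class name and its (name, thread count) constructor are assumptions about the Coherence implementation, and the PriceWork task is made up for the example.

    // Minimal sketch: distributing units of work with the CommonJ Work Manager API.
    import java.io.Serializable;
    import java.util.Arrays;

    import commonj.work.Work;
    import commonj.work.WorkItem;
    import commonj.work.WorkManager;

    public class GridWorkExample {

        // A unit of work should be serializable so it can be shipped to grid members.
        public static class PriceWork implements Work, Serializable {
            private final String symbol;

            public PriceWork(String symbol) {
                this.symbol = symbol;
            }

            public void run() {
                // Placeholder computation, executed on whichever member picks this up.
                System.out.println("pricing " + symbol);
            }

            public boolean isDaemon() { return false; }
            public void release()     { /* nothing to cancel in this sketch */ }
        }

        public static void main(String[] args) throws Exception {
            // Assumed Coherence-specific constructor; in a container you would
            // normally obtain a WorkManager via JNDI instead.
            WorkManager wm = new com.tangosol.coherence.commonj.WorkManager("example", 4);

            WorkItem a = wm.schedule(new PriceWork("ORCL"));
            WorkItem b = wm.schedule(new PriceWork("IBM"));

            // Block until both items have been executed somewhere on the grid.
            wm.waitForAll(Arrays.asList(a, b), WorkManager.INDEFINITE);
            System.out.println("both work items completed");
        }
    }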

    ------------------------------------------------------------

    Hi Cameron,

    The RealTime Desktop feature looks very interesting and could solve a large performance problem we have detected for a global services and software corporation. Can you tell me how it is priced? What are the unit costs for each desktop installation? I assume the customer must already have a Java messaging infrastructure. Is this correct?


    Kind regards,

    William Louth
    JXInsight Product Architect
    CTO, JInspired

    "J*EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  41. Congratulations + Desktop Query

    It has been a pleasure seeing Cameron & Co. build up and scale out both the technology and the business.

    Large corporations place a high premium on many aspects of a software solution other than just price (free), such as manageability, usability, applicability, extensibility.... Tangosol appears to improve all of these areas with each product release.

    Thank you. We certainly are working to improve all of these areas with each release.
    The RealTime Desktop feature looks very interesting and could solve a large performance problem we have detected for a global services and software corporation. Can you tell me how it is priced?

    I am embarrassed to say that I'm not certain how it is priced. (All of our other product pricing information is published on our web site.) Since it's such a new thing, only a few companies have rolled it out at this point, and their agreements with us already covered it.

    As soon as I get the information back on price, I'll post it (and it will be on our web site as well).
    I assume the customer must already have a Java messaging infrastructure. Is this correct?

    Yes. To date, we have several deployments on Tibco EMS and a small number on WebLogic JMS. Do you know what messaging infrastructure the customer has?

    In the next release, the requirement for a messaging infrastructure goes away completely. And since the APIs are all the same regardless of whether the code is running over a messaging infrastructure, running without a messaging infrastructure, or running inside the data grid, the application development work won't have to wait and it won't have to change when the new capabilities are released.
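
    As a rough illustration of that point, a client is written purely against the NamedCache API, so the transport is chosen by configuration rather than by code. This is only a sketch; the cache name "trades" is made up for the example.

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class DesktopClient {
        public static void main(String[] args) {
            // Whether this call is served by a local cluster member or by
            // Coherence*Extend over a messaging infrastructure is decided by
            // the cache configuration, not by this code.
            NamedCache cache = CacheFactory.getCache("trades");

            cache.put("T1", "IBM 100 @ 82.50");
            System.out.println("latest: " + cache.get("T1"));

            CacheFactory.shutdown();
        }
    }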

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  42. Desktop Query

    I am embarrassed to say that I'm not certain how it is priced. (All of our other product pricing information is published on our web site.) Since it's such a new thing, only a few companies have rolled it out at this point, and their agreements with us already covered it. As soon as I get the information back on price, I'll post it (and it will be on our web site as well).
    Just FYI: it does not seem to be described in the 3.1 user guide yet, although there is content on the continuous query / data fabric / Extend-JMS feature set that is, I guess, the foundation of the RealTime Desktop.
  43. Desktop Query

    I'm told that you need a "Local Edition" license for each client on top of the Enterprise Edition for your grid (assuming you want to use Extend-JMS with JavaBean event notification on the clients). I might be mistaken, but ContinuousQuery over Extend-JMS would seem to be much less useful without the JavaBean event notification; you would have to poll your NamedCache data to detect changes. Is that correct, Cameron?

    cheers
    craig
  44. Desktop Query

    I'm told that you need a "Local Edition" license for each client on top of the Enterprise Edition for your grid (assuming you want to use Extend-JMS with JavaBean event notification on the clients).

    The client access license used to be called "Local Edition". In 3.1, the new "Coherence Realtime Desktop" includes the Continuous Query Cache on the desktop over the Coherence*Extend which uses JMS to connect into a Coherence Enterprise data grid.

    (That was a run-on sentence.)
    I might be mistaken, but ContinuousQuery over Extend-JMS would seem to be much less useful without the JavaBean event notification; you would have to poll your NamedCache data to detect changes.

    Exactly. That is why Continuous Query Caching would not work without event support.
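
    To make the event point concrete, here is a minimal sketch of the listener-driven alternative to polling. NamedCache, ContinuousQueryCache, MapListener and EqualsFilter are the standard Coherence types; the cache name "positions" and the getDesk() accessor used in the filter are made up for the example.

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;
    import com.tangosol.net.cache.ContinuousQueryCache;
    import com.tangosol.util.MapEvent;
    import com.tangosol.util.MapListener;
    import com.tangosol.util.filter.EqualsFilter;

    public class PositionView {
        public static void main(String[] args) throws InterruptedException {
            NamedCache positions = CacheFactory.getCache("positions");

            // A local, continuously updated view of just the entries that
            // match the filter (here, a hypothetical getDesk() accessor).
            ContinuousQueryCache view =
                new ContinuousQueryCache(positions, new EqualsFilter("getDesk", "FX"));

            // Event notification instead of polling: the listener is invoked
            // as inserts, updates and deletes arrive from the grid.
            view.addMapListener(new MapListener() {
                public void entryInserted(MapEvent evt) {
                    System.out.println("added:   " + evt.getNewValue());
                }
                public void entryUpdated(MapEvent evt) {
                    System.out.println("changed: " + evt.getNewValue());
                }
                public void entryDeleted(MapEvent evt) {
                    System.out.println("removed: " + evt.getOldValue());
                }
            });

            // Keep the process alive so events can be delivered.
            Thread.sleep(60000);
        }
    }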

    In terms of the query, event and Continuous Query features, there's a lot more information at:

    http://wiki.tangosol.com/display/COH31UG/Provide+a+Queryable+Data+Fabric
    http://wiki.tangosol.com/display/COH31UG/Deliver+events+for+changes+as+they+occur
    http://wiki.tangosol.com/display/COH31UG/Continuous+Query

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  45. What are the requirements for using Coherence HTTP Session replication in an existing servlet application?

    Let's say there is an existing servlet application with a big object tree stored as a single attribute in the HttpSession. The application makes a single call to session.setAttribute() when the object tree is initialized. Do we have to modify the code to call setAttribute() after every change?

    Valery
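
    For clarity, here is a minimal sketch, using only the standard servlet API, of the two usage patterns the question contrasts. Whether the replication layer detects in-place changes or requires the second setAttribute() call is exactly what is being asked, so no Coherence-specific behavior is assumed; the "model" attribute name is made up.

    import javax.servlet.http.HttpSession;

    public class SessionPatterns {
        // Pattern 1: the attribute is set once, when the object tree is created.
        static void initialize(HttpSession session, Object bigObjectTree) {
            session.setAttribute("model", bigObjectTree);
        }

        // Pattern 2: after mutating the tree in place, setAttribute() is called
        // again to tell the container that the attribute has changed.
        static void mutate(HttpSession session) {
            Object tree = session.getAttribute("model");
            // ... modify the object tree in place ...
            session.setAttribute("model", tree);
        }
    }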