News: OpenSymphony Announces OSCache 2.0

  1. OpenSymphony Announces OSCache 2.0 (31 messages)

    OpenSymphony has announced OSCache 2.0 final. This release features an overhauled codebase that enabled many enhancements and bugfixes. The new features include: JavaGroups 2.1 support for clustering, JMS clustering support, many more configuration options, and a large performance enhancement. TheServerSide.com uses OSCache.

    New Features

    - Now supports JavaGroups version 2.1.
    - JMS Clustering support has been added [Romulus Pasca].
    - Clustering code has been refactored. As a result of this, some of the clustering configuration has changed since beta 1 - please see the updated clustering documentation for details.
    - Performance enhancement: When running under JRE 1.3.x, the LRUCache will now attempt to use the Jakarta Commons Collections SequencedHashMap. If commons-collections.jar is not present, the code falls back to using a LinkedList and a warning is logged. Note that under JRE 1.4.x and higher the commons-collections.jar is not required. (A rough sketch of this fallback appears after this list.)
    - Config.getProperties() method added.
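
    As a rough illustration only (this is not OSCache's actual source; the class below is made up for the example), the JRE 1.3.x fallback described above might look something like this in Java:

    import java.util.LinkedList;

    // Illustrative sketch: prefer the Commons Collections SequencedHashMap when it is
    // on the classpath (an insertion-ordered map, so LRU reordering is cheap); otherwise
    // log a warning and fall back to a plain LinkedList of keys, where reordering is O(n).
    public class LruKeyStoreFactory {
        public static Object createKeyStore() {
            try {
                return Class.forName("org.apache.commons.collections.SequencedHashMap")
                            .newInstance();
            } catch (Exception e) {
                System.err.println("WARN: commons-collections.jar not found; using LinkedList");
                return new LinkedList();
            }
        }
    }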

    Links

    Visit the OSCache Homepage: http://www.opensymphony.com/oscache

    Download OSCache 2.0

    OSCache 2.0 Change Log

    OSCache 2.0 Beta Announcement

    Threaded Messages (31)

  2. OpenSymphony Announces OSCache 2.0

    TheServerSide.com uses OSCache

    Just for caching HTML pages, or for more?
  3. OpenSymphony Announces OSCache 2.0

    We use it for caching at the edge (the pages themselves). Since we use Tangosol Coherence for our data cache, we don't have a need for anything fancy.

    We don't go crazy with <os:cache> tags around the site; rather, we went through and found good candidates (data doesn't change often, takes effort to generate, etc.).
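
    For instance (just a hedged sketch -- the taglib URI, prefix and attribute names depend on how the OSCache tag library is configured locally, so check the OSCache docs for the exact form), wrapping a slow, rarely-changing fragment looks roughly like this:

    <%@ taglib uri="oscache" prefix="os" %>

    <%-- Cache this fragment for 30 minutes under an explicit key; good candidates are
         fragments that are expensive to build but change infrequently. --%>
    <os:cache key="frontPageHeadlines" time="1800">
        <%@ include file="headlines.jsp" %>
    </os:cache>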

    Dion
  4. OpenSymphony Announces OSCache 2.0

    Release 2.0 looks nice. They've even integrated JavaGroups for clustering support.

    OS Cache 2.0 Easily share live data across a cluster!
  5. sig theft

    Corby: OS Cache 2.0 Easily share live data across a cluster!

    OK, I get the hint. I'm updating my sig.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  6. sig theft

    All I have to say is that there's a bit of Larry in all of us! ;-)

    Cheers,

    Smythe
  7. sig theft

    Coherence: Clustered JCache for Grid Computing!

    There you go! When you say "for Grid Computing", are you simply noting that clustered caches in general make grid computing easier, or are there special features in Coherence that facilitate grid computing?

    OS Cache 2.0: Clustered Proprietary APIs for Grid Computing!
  8. sig theft

    Corby: There you go! When you say "for Grid Computing", are you simply noting that clustered caches in general make grid computing easier, or are there special features in Coherence that facilitate grid computing?

    Yes, both. We work with companies like Platform and DataSynapse to do the provisioning and management side of grid computing, and our software was initially used as the state management within Java-based software running in the grid. Starting with release 2.0 or 2.1 (I can't remember which) we supported grid invocation (targetable executable agents) for building SOA apps for grid deployment. The agent invocation supports fire&forget, request/response, async response (poll), etc., so it's very powerful/flexible. The state management is the most popular aspect, though, because it's very hard to do correctly, and we actually provide coherent caches and concurrency control and transactions (even 2PC) across the grid with no single point of failure and very high scalability (e.g. our partitioned cache scales linearly to the extent of the switched fabric!).

    We have never really focused on the web caching side, which is where OS Cache and Chutney and SpiderCache etc. have been focused. (It's kind of like on TheServerSide.com site, where the data is managed by Coherence and the web pages and fragments are cached by OSCache.) No one seems to mind if their web server farm cache is a little bit incoherent, because they're dealing with Internet latencies at that point anyway, and the data is all non-transactional (it's "just" content). Generally, what that means is they aren't willing to pay for the 100% uptime with 0% data loss and transactional/coherency guarantees for their web caches. In those cases, we see lots and lots of OSCache getting used. (Yes, lots of our customers use OSCache. No, we don't mind. ;-)

    For performance (the "is caching data actually worth it" question), I'll give you an example from some customer stats I got yesterday from a rather successful ecommerce web site: They switched their session management from a high-end ****** database to Coherence, and their session access times have dropped from over 200ms (database) to under 1ms (clustered cache), without losing transactional or concurrency control QoS, and actually resulting in higher up-time. In other words, even though it is "just" a cache, they cannot get old/stale data, they do not lose any data if a server dies, and the session accesses are over 300x as fast.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  9. First, please forgive me if I'm posting my question to the wrong forum. I'm already involved in a "clustering a web app" project and I'm considering Coherence and OSCache for HTTP session clustering (over 5 LAN nodes). Am I making a good choice with those services/tools for this project? Should I consider other tools? I'm talking about a 15 KB HTTP session (rather big, isn't it?) on WebLogic 6.1.
    Sorry again for hijacking this thread, but the point is that it is very relevant to my current work.

    PS: Cameron, your software is quite good (great quality, indeed!) but I have to consider several factors for my client (money, for example).


    Thanks in advance, guys.
    Best Regards.
    Jesus.
  10. Any idea about

    Can anyone compare SpiritCache 2.0 (spirit-soft.com), Coherence and OSCache 2.0?
  11. Any idea about

    SpiritCache 2.0 (spirit-soft.com), Coherence and OSCache 2.0

    The first difference is that OS Cache is open source. That is a pretty nice feature for a lot of people. For clustering, OS Cache uses JavaGroups, an open source "reliable multicast" implementation ("reliable" means it retries packets if they get dropped), not entirely dissimilar from PGM (Pretty Good Multicast). OS Cache has a relatively simple API and a lot of built-in features for supporting web page (.jsp and fragment) caching, and you have the sources if you want to add something that it doesn't support. I've read that the OS Cache implementation plans to support the JCache spec once it's done.
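
    For a sense of how simple the API is, here is a hedged sketch of the programmatic usage pattern as we understand it from the OSCache docs (class and method names should be verified against your OSCache version; loadPriceFromDatabase is just a hypothetical stand-in):

    import com.opensymphony.oscache.base.NeedsRefreshException;
    import com.opensymphony.oscache.general.GeneralCacheAdministrator;

    public class PriceLookup {
        private static final GeneralCacheAdministrator admin = new GeneralCacheAdministrator();

        public Object getPrice(String sku) {
            String key = "price-" + sku;
            try {
                // Serve the cached value if it is younger than five minutes.
                return admin.getFromCache(key, 300);
            } catch (NeedsRefreshException nre) {
                try {
                    Object fresh = loadPriceFromDatabase(sku);  // hypothetical expensive lookup
                    admin.putInCache(key, fresh);               // repopulate the entry
                    return fresh;
                } catch (RuntimeException e) {
                    admin.cancelUpdate(key);                    // release the pending-update lock
                    throw e;
                }
            }
        }

        private Object loadPriceFromDatabase(String sku) {
            return new Integer(42);  // stand-in for a real database call
        }
    }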

    SpiritCache was written by James Strachan, now with Core Developer Network, previously of Jelly / JDOM (or DOM4J?) / etc. fame, a Java guru (and Mac fan ;-) in his own right. It implements the preliminary JCache spec, is based on JMS messaging, and supports multi-tier hub/spoke architectures. I've never used it, but several of our customers have, and I've heard some good things about it. (I'll send James a link to this discussion.)

    Since I know something about it ... Coherence implements the preliminary JCache spec and is based on a peer-to-peer clustered implementation from the ground up, so that all services (clustering, caching, grid) have no single-point-of-failure or single-point-of-bottleneck, and it uses an ATM-model clustered datagram-based protocol that dynamically switches between unicast and multicast to ensure that it makes optimal use of network bandwidth. (Using the ATM model, it is very scalable and it even clusters on WANs.)

    In addition to replicated caching (both optimistic and pessimistic concurrency models), it offers transparent cluster-partitioned caches, near caches (including seppuku and versioning support), write-through and write-behind caches, distributed (parallel) cache queries (with declarative indexing and cost-based query optimization), transactional caching, cluster-wide concurrency control, support for grid computing, etc.

    IMHO, the main difference is that Coherence was designed to actually manage data in the cluster, not just cache it, so that applications can actually count on the cache to be correct. Several of our customers rely on Coherence to keep their apps running when their databases fail or go down for maintenance, for example, or they use write-behind caching, which means that the up-to-date transactional data is in the cache and has not yet even been written to the database; that is why we stress reliability/availability so highly. Regarding session management, we have a session management upgrade in our upcoming 2.3 release that will be deployed in a 100+ server cluster this quarter. (I'm not saying which app server either ;-)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  12. You don't have to say...

    /CP/
    Regarding session management, we have a session management upgrade in our upcoming 2.3 release that will be deployed in a 100+ server cluster this quarter. (I'm not saying which app server either ;-)
    /CP/

    In light of the fact that you're not mentioning the app server, I'm almost certain it's [you-know-what]!

    Hint: The chicks really dig it cuz it's sexy :-P

    Raffi
  13. Any idea about

    <Cameron>
    Coherence implements ... all services (clustering, caching, grid) have no single-point-of-failure or single-point-of-bottleneck...
    </Cameron>

    As a disclosure, I work on a product called xNova™. One of the 15 system services we provide is a distributed cache service for Java and .NET (also multicast/unicast based).

    I have a question about the _single-point-of-bottleneck_ in the Coherence cache. As far as I know, Coherence comes with a Versioned-Near-Cache, which is a distributed map fronted by a local cache; it uses version numbers to update/invalidate references on all nodes across the cluster whenever a modification happens.

    My question is: "If Coherence needs to multicast/unicast messages to all nodes in the cluster, wouldn’t the slowest node take the longest time to reply and, therefore, create the _single-point-of-bottleneck_?"

    Can you clarify where I am wrong?

    Thanks,
    Dmitriy Setrakyan
    xNova™ - Reusable System Services for Java and .NET
  14. Any idea about

    Dmitriy: My question is: "If Coherence needs to multicast/unicast messages to all nodes in the cluster, wouldn’t the slowest node take the longest time to reply and, therefore, create the _single-point-of-bottleneck_?"

    The answer is that it depends on the operation, because it would have to be an operation that is a request/reply (synchronous blocking) that goes to all members in the cluster, and that is extremely rare -- and AFAIK never for caching. For example, locking does not involve multicast or multi-member buy-in, so it would not be affected. Neither replicated nor partitioned cache access involves multicast.

    My understanding is that about the only thing this could affect is a long (massive) and rapid series of updates to a replicated cache with more than two members, since those updates need to be pushed to all members through the respective issuers of the data; note that even the responsibility for the updates is distributed across the cluster to load balance those operations. What will happen initially is that data will begin to be acknowledged by the faster members as it is being received, letting subsequent update operations commence, since not all members have to immediately acknowledge the update for it to be a success (just enough to ensure survivability in the cluster).

    However, after a while the slowest member could theoretically become so backed up that it would affect the memory utilization of the other, faster members in the cluster (because they have to retain all packets that have not been acknowledged to ensure reliable in-order delivery), in which case they would have to throttle their non-critical communications (actually, it throttles the threads that are generating the traffic). (I didn't write the code behind what I'm describing, so I may not be 100% correct on the implementation, although I was involved with peer review of big portions of it, so I think I'm correct.)

    In practice, we basically never see this situation, but we can definitely force it to occur in our stress tests. If you can force it to happen, when you grab a stack trace from the JVM you'll see that the client (non-critical) threads are blocked in a call to PacketPublisher.drainOverflow() or something like that. The tolerances for the point at which it enters the various stages of overflow are configurable, so you can balance between worst-case excessive memory utilization and throttling. The other reason I think we don't see this is that the CPU cycles used by Coherence are generally pretty low. Accepting a replicated cache update uses very few cycles; there's very little that the member has to do. So you'd probably have to have some pretty slow machines with an amazingly fast network (like an old Sun box on a gigabit switch ;-). We stress test the replicated cache with 48 nodes, from 400MHz Sun slugs up to about 3GHz Pentium 4 speed demons, with all 48 nodes running 50 threads doing bulk cache operations for days on end, and we test it on several linked switches (and even a hub, in order to force a large number of packets to be lost).

    One other thing: The clustered protocol we use is completely asynchronous in nature, so packets do not have to be acknowledged (ACK or NACK) immediately. This allows for very efficient burst mode, which often will compensate for relatively slow members. For example, we've seen a slow Sun box back up (get behind on) close to 10,000 packets (or messages? I can't remember which) and then process them in a fraction of a second because of the weird thread scheduling that Solaris has.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  15. Any idea about

    Hm…
     
    I am still a little confused. I understand that locking does not require all-node participation because with a distributed map architecture you probably only lock data on the cluster node that is responsible for storing that data. I think the picture changes a little bit when you modify data that can potentially reside on all nodes in the cluster. I am still not clear about Versioned-Near-Cache.
     
    If I understand correctly, a Versioned-Near-Cache is a distributed map fronted with local (or near) caches on all cluster nodes, and it ensures coherency of the front local caches by sending/receiving change-version notifications (correct me if I am wrong). Now, in a cluster of N nodes, if you modify data on Node 1, all other N-1 nodes must be notified that this data changed so the local caches can mark it as dirty. If Node 1 does not wait for replies from the N-1 other nodes before it completes the transaction (i.e. asynchronous mode), then for a short period of time some nodes in the cluster may be unknowingly caching stale data. Now suppose that a client initiated an update on Node 1 and then tried to read the data on Node 2. Does it mean that this client may get stale data from Node 2?
     
    I would like to stress that my question only refers to scenarios in which any data may or does reside on all N nodes in the cluster and the client cannot afford to read stale data.
     
    Thanks again,
    Dmitriy Setrakyan
    xNova™ - Reusable System Services for Java and .NET
  16. Any idea about

    I believe in a distributed system such as this, there is _always_ a window where someone can see stale data. And there's a tradeoff with how big of a window you're willing to live with vs. the performance of the cluster.

         -Mike
  17. Any idea about

    <Mike>
    I believe in a distributed system such as this, there is _always_ a window where someone can see stale data. And there's a tradeoff with how big of a window you're willing to live with vs. the performance of the cluster.
    </Mike>

    Hm...

    There are cases when it is acceptable to read stale data, for example when you have the read-committed isolation level or are in auto-commit mode. However, often it is not acceptable to have windows of “unknown”. For example, if you perform multiple reads in one transaction and you want all of them to be consistent with each other (i.e. serializable isolation level), then you would need an all-or-none effect. So, maybe Tangosol checks whether data was changed by some other node (this most likely involves a network trip) even for serializable read-only optimistic transactions... then it would answer my question: "reads get somewhat slower but the one-node-bottleneck is removed."

    Regards,
    Dmitriy Setrakyan
    xNova™ - Reusable System Services for Java and .NET
  18. Any idea about

    \Dmitriy Setrakyan\
    There are cases when it is acceptable to read stale data, for example when you have the read-committed isolation level or are in auto-commit mode. However, often it is not acceptable to have windows of “unknown”. For example, if you perform multiple reads in one transaction and you want all of them to be consistent with each other (i.e. serializable isolation level), then you would need an all-or-none effect. So, maybe Tangosol checks whether data was changed by some other node (this most likely involves a network trip) even for serializable read-only optimistic transactions... then it would answer my question: "reads get somewhat slower but the one-node-bottleneck is removed."
    \Dmitriy Setrakyan\

    What you're proposing would nullify the point of having any sort of local cache.

    Instead of thinking in terms of the purest ACID guarantees and a single RDBMS server, think in terms of a distributed system which is similar in some ways to resources coupled via XA. Such a system guarantees that all participants will stay in sync, but there is still some lag time involved. There is a reduction in your ACID guarantees (relative to a Serializable isolation level), but this is a tradeoff for greatly improved performance. This is especially critical if you're speaking of a multi-node cluster where the number of nodes is 3 or greater - synchronizing at the level you're talking about, effectively attempting to guarantee "simultaneity" of updates, would imply a very large performance hit.

    Cameron I'm sure can talk in greater detail on how Coherence does it, but in general terms you don't go for a Serializable goal in distributed systems. The costs are just too high.

         -Mike
  19. Any idea about

    Let me chime in for a second....

    <Mike>
    but in general terms you don't go for a Serializable goal in distributed systems. The costs are just too high.
    </Mike>

    We have several major equity trading companies in Japan using our products, and having a "synchronous" (serializable isolation level behavior) DB-backed cache was an absolute requirement from the get-go for very obvious reasons... The performance that we provide with our cache is at least an order of magnitude better than no cache (direct DB trips). So, I would say that there are ample use cases for scenarios such as the one mentioned in the previous post, at least from what I can judge based on our customers.

    Regards,
    Nikita.
    Fitech Labs.
  20. Any idea about

    \Nikita Ivanov\
    We have several major equity trading companies in Japan using our products, and having a "synchronous" (serializable isolation level behavior) DB-backed cache was an absolute requirement from the get-go for very obvious reasons... The performance that we provide with our cache is at least an order of magnitude better than no cache (direct DB trips). So, I would say that there are ample use cases for scenarios such as the one mentioned in the previous post, at least from what I can judge based on our customers.
    \Nikita Ivanov\

    There are cases where you do need synchronous behavior, but you try to avoid them if possible because they are pretty high in cost. Perhaps not as high as a full RDBMS access, but still high :-) In my own work, our synchronous headache is synchronous disk writes for mirrored disks that are separated by several kilometers.

    Also, I noticed I was a little inaccurate on Serializable vs. "stale data" in this thread (I'll blame it on answering on too little sleep). What I was really thinking but didn't convey well is that you can get Serializable-like semantics - not too hard if you're using versioning - but "stale" data is a bit fuzzier. A distributed caching system can make sure that transactional updates are applied consistently as they propagate (e.g. if you changed 4 things, all 4 changes get done consistently so you're not reading partial state). But, at the same time, the _global_ state of the entire cluster may be slightly out of whack at any given time due to propagation delays (e.g. Node A may be updated and ready with new data slightly before Nodes B and C are).

        -Mike
  21. Any idea about

    <Mike>
    What I was really thinking but didn't convey well is that you can get Serializable-like semantics - not too hard if you're using versioning...
    </Mike>
     
    That is true. Versioning or any other type of invalidation message.
     
    <Mike>
    ... but "stale" data is a bit fuzzier. A distributed caching system can make sure that transactional updates arrive are done consistently as they propagate (e.g. if you changed 4 things, all 4 changes get done consistently so you're not reading partial state). But, at the same time, the _global_ state of the entire cluster may be slightly out of whack at any given time due to propagation delays (e.g. Node A may be updated and ready with new data slightly before Node B and Node C is).
    <Mike>
     
    Of course. However, nodes B and C must have some way to check whether the data they access is being or has been updated within another transaction on Node A – whether this check is local within the VM or has to query other nodes is up to the implementation.
     
    The point I was trying to convey is that with “optimistic serializable transactions” one cannot commit without ensuring that the other corresponding resources (cluster nodes in this case) agree – hence the “prepare” stage. If this step were not required and all optimistic transactions were allowed to commit before getting agreement from the other participating nodes, then OptimisticLockFailureException would never be thrown. The truth is that for “mutating serializable optimistic transactions” synchronous behavior _is_ required... whether it uses TCP, multicast, or many unicast messages to resolve concurrency issues is an implementation detail.

    I would like to reiterate again: Serializable != Heavy. This statement is true for a cache (at least in our product) and false for a database. In our system, for “read-only” transactions we resolve all “serializable” issues within the VM (no network trips), which is extremely lightweight. I specifically put emphasis on “read-only serializable transactions” since the most natural application of distributed caches is within “read-mostly” systems, and the great majority of transactions _are_ “read-only”. Thus a user should be able to run database transactions in lightweight “read-committed” mode while enjoying the full benefits of “serializable” behavior at the cache level and still win significantly on performance. For “mutating transactions” in any product, a cache operation will be somewhat heavier than a plain JDBC update since it also requires coherency management across all cluster nodes (details are mainly implementation specific).
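
    To make the read-only case concrete, here is a generic, hypothetical sketch (not xNova's or any other product's actual code) of how a read-only "serializable" transaction can be validated entirely in-VM: record the version observed on every local read, then verify at commit time that none of those versions has moved.

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    public class ReadSetValidator {
        // key -> Versioned entry; assumed to be kept coherent by the cache layer
        private final Map localCache;
        // key -> version number observed when this transaction first read the key
        private final Map versionsRead = new HashMap();

        public ReadSetValidator(Map localCache) {
            this.localCache = localCache;
        }

        public Object read(Object key) {
            // purely in-VM read, no network trip; assumes the key is present
            Versioned v = (Versioned) localCache.get(key);
            versionsRead.put(key, new Long(v.version));
            return v.value;
        }

        public void commitReadOnly() {
            for (Iterator it = versionsRead.entrySet().iterator(); it.hasNext(); ) {
                Map.Entry e = (Map.Entry) it.next();
                Versioned current = (Versioned) localCache.get(e.getKey());
                long seen = ((Long) e.getValue()).longValue();
                if (current == null || current.version != seen) {
                    // a concurrent writer changed something we read: fail like an optimistic lock
                    throw new IllegalStateException("serializable read set is no longer consistent");
                }
            }
        }

        public static class Versioned {
            public final long version;
            public final Object value;
            public Versioned(long version, Object value) { this.version = version; this.value = value; }
        }
    }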

    Regards,
    Dmitriy Setrakyan
    xNova™ - Reusable System Services for Java and .NET
  22. Any idea about

    Dmitriy: I would like to reiterate again: Serializable != Heavy. This statement is true for a cache (at least in our product) and false for a database.

    The example you gave was reading three keys in a short period of time and then committing. That is a very simple example, but what if there is a search through the cache based on criteria other than the key? (Meaning that *any* insertion/deletion/update could theoretically invalidate the transaction.) What if those three reads are simply staggered over a period of time? How do you guarantee serializable semantics for that without excessive copying and/or excessive frequency of rollback if changes are occurring?

    I'm not suggesting that you do not answer these questions well with your software; I'm simply pointing out that there are many more complex problems than the one you highlighted, and there are serious trade-offs in solving those problems.

    Dmitriy: for “read-only” transactions we resolve all “serializable” issues within the VM (no network trips)

    Yes, that is how Coherence works for the same situation.

    Dmitriy: I specifically put emphasis on “read-only serializable transactions” since the most natural application of distributed caches is within “read-mostly” systems, and the great majority of transactions _are_ “read-only”. [...] For “mutating transactions” in any product, a cache operation will be somewhat heavier than a plain JDBC update since it also requires coherency management across all cluster nodes (details are mainly implementation specific).

    While read-only and read-mostly applications are the most common types, and benefit greatly from caching, the biggest gains in scalable performance are to be had by managing the read/write transactions in the cluster. It's a fairly logical conclusion: Since read/write transactions are the most expensive, they are the most in need of a scalable solution.

    However, doing high-frequency (and especially high-concurrency) read/write transactions in a cluster-replicated cache model will scale poorly. That's because the scalability of a replicated cache is expressed in inverse correlation to the size of the cluster and the (frequency * size) of updates to the cache.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  23. Any idea about

    I hope no one minds that we're now so far off the original topic ... but hopefully this subject matter is somewhat relevant.

    Dmitriy: I am still a little confused. I understand that locking does not require all-node participation because with a distributed map architecture you probably only lock data on the cluster node that is responsible for storing that data. I think the picture changes a little bit when you modify data that can potentially reside on all nodes in the cluster.

    Right. That is the replicated cache example I gave. In our replicated cache, by default all data in the cache is replicated to all nodes that join the clustered service that contains the cache. All modifications are then committed to all nodes by the issuer. The issuer is whatever node in the cluster is responsible for updating the other nodes for a particular key. (That way in-order delivery guarantees coherency, because the sequence of updates for a particular key comes from only one node. That avoids the need for a total ordering protocol, for example, which is a technical PITA and not very scalable.)

    Dmitriy: I am still not clear about Versioned-Near-Cache. If I understand correctly, a Versioned-Near-Cache is a distributed map fronted with local (or near) caches on all cluster nodes, and it ensures coherency of the front local caches by sending/receiving change-version notifications (correct me if I am wrong).

    That's pretty close. We define a near cache as a combination of a local cache and a distributed cache. There are a number of trade-offs in terms of coherency, scalable performance, latency, complexity, etc. when deciding which type of near cache to use. We support an expiration-based near cache (what I sarcastically call a "dumb near cache" ;-), a seppuku-listener-based near cache (where the entries commit suicide based on cache events) and a versioned near cache, that uses a replicated cache of data version information to provide both concurrency control and invalidation (by doing both off the same cache, we can assume the sequence provided by in-order delivery of messages, and thus guarantee coherency.) The most efficient of these, and least coherent for read/write data, is the expiry-based model. The only one that is guaranteed coherent (you cannot read stale data when using concurrency control) is the versioned near cache. The seppuku implementation suffers a small window of incoherency, which may be reasonable for read-mostly / optimistic data but is not used for transactional purposes because of the chance of reading stale data.

    Dmitriy: Now, in a cluster of N nodes, if you modify data on Node 1, all other N-1 nodes must be notified that this data changed so the local caches can mark it as dirty.

    In the current version, in the versioned case, the version indicator for the new data is replicated to all nodes, so yes, that is correct. Similarly, the seppuku event is delivered to all nodes, so again it is correct. The downside with this implementation is that often only one or two nodes will have the data to discard, so it is wasteful of bandwidth and processing. As a result, in the 2.3 release, we have expanded the event interfaces to allow signing up for different granularities of events, including the ability to filter events pre-delivery (cutting the network out entirely for useless events) and the ability to listen to specific keys, as well as "lite" events that only carry key and action (insert/update/delete) information. The near cache in 2.3 will thus be able to listen to just the events it needs to manage its own coherency, and should scale even better than it does today (it is already very scalable.)

    Dmitriy: If Node 1 does not wait for replies from the N-1 other nodes before it completes the transaction (i.e. asynchronous mode), then for a short period of time some nodes in the cluster may be unknowingly caching stale data. Now suppose that a client initiated an update on Node 1 and then tried to read the data on Node 2. Does it mean that this client may get stale data from Node 2?

    Without any concurrency control or transaction management, yes you can obviously read within the "time window of update," meaning stale data. With pessimistic concurrency transactions and/or concurrency control, the versioned near cache guarantees that you cannot read stale data.
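
    As a hedged sketch of that pessimistic pattern (the API names follow the Coherence documentation as we recall it and should be verified against your release; the cache name "accounts" is just an example), a read-modify-write under an explicit cluster-wide lock looks roughly like this:

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    public class BalanceUpdater {
        public void credit(Object acctId, int amount) {
            NamedCache cache = CacheFactory.getCache("accounts");
            if (cache.lock(acctId, -1)) {                           // block until the cluster-wide lock is granted
                try {
                    Integer balance = (Integer) cache.get(acctId);  // cannot be stale while we hold the lock
                    cache.put(acctId, new Integer(balance.intValue() + amount));
                } finally {
                    cache.unlock(acctId);                           // always release the lock
                }
            }
        }
    }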

    Dmitriy: I would like to stress that my question only refers to scenarios in which any data may or does reside on all N nodes in the cluster and the client cannot afford to read stale data.

    Right. In a distributed system, stale data refers to data that is changing or has changed elsewhere; so the only way to prevent stale data is to prevent those changes, which means concurrency control. I think the previous sentence is worth reading a couple of times -- the only way to prevent reading stale data is to prevent changes altogether for the data in question. (Optimistic transactions avoid this little issue by not demanding coherency until the prepare phase, which is when the actual locks are requested.)

    Mike: I believe in a distributed system such as this, there is _always_ a window where someone can see stale data. And there's a tradeoff with how big of a window you're willing to live with vs. the performance of the cluster.

    Very true; the only caveat I would add is our ability to assure coherency when using concurrency control.

    Dmitriy: There are cases when it is acceptable to read stale data, for example when you have the read-committed isolation level or are in auto-commit mode. However, often it is not acceptable to have windows of “unknown”. For example, if you perform multiple reads in one transaction and you want all of them to be consistent with each other (i.e. serializable isolation level), then you would need an all-or-none effect. So, maybe Tangosol checks whether data was changed by some other node (this most likely involves a network trip) even for serializable read-only optimistic transactions... then it would answer my question: "reads get somewhat slower but the one-node-bottleneck is removed."

    Well, we support both pessimistic and optimistic concurrency models for serializable transactions, but I would personally suggest that you avoid them ;-) because you can imagine how expensive it is to "shut everyone out" (pessimistic) of the data scope of the transaction, or even "copy everything on read and lock/verify later" (optimistic).

    We do suggest optimistic/read-committed if possible, although it does shift some burden to the application developer. In our 3.x series, we have some improvements scheduled for transaction management that should cut resource usage for some of the more elaborate schemes, and make transaction management more scalable. All that said, we have no customer complaints on the current transactional implementation, including the J2CA adapters for WebLogic, WebSphere and JBoss (I think those are the ones we support ... I can't remember for sure about WebSphere.)

    I'm a big fan of transactional caching functionality, and I hope to be directly involved with our CMP EJB caching project. I think a "drop-in" cache for app servers that makes CMP EJBs truly scalably performant in a cluster is a home run, just based on the number of customers already asking for it. We have most of the pieces already, and our customers already wire stuff like this together, but I'd like to make it an "out of the box" feature like what we're doing with distributed session management in 2.3.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  24. OT? you're kidding...

    <cpurdy>I hope no one minds that we're now so far off the original topic ... but hopefully this subject matter is somewhat relevant.</cpurdy>

    You're kidding, aren't you Cameron? This is one of the most fascinating and well-behaved technical conversations on TSS for ages, and no mentions of JBoss/Java Republic/politics etc... heaven !-)

    oh shit, I hope I haven't let the cat out of the bag...
  25. Any idea about

    I think the point was somewhat muddied...

    The transaction point is pretty simple. Following are the schematic examples of "read-committed" and "serializable" transactions:

    1. "read-committed" scenario.

    // Only schematic.

    cache.startTx("read-committed");

    value1 = cache.get(key1);
    value2 = cache.get(key2);
    value3 = cache.get(key3);

    cache.commitTx();

    Notice that the reads are NOT consistent, so in most cases you can’t do any operations involving all 3 values (or any two, for that matter).

    2. "serializable" scenario.

    // Only schematic.
    cache.startTx("serializable");

    value1 = cache.get(key1);
    value2 = cache.get(key2);
    value3 = cache.get(key3);

    cache.commitTx();

    // Do data computations/processing...

    Notice that since all 3 GETs are guaranteed to be consistent, all 3 "values" can be involved in further computations/processing.

    What we found in our customers’ systems is that many cache transactions are of the 2nd type, i.e. "serializable", since usually more than one value is needed from the cache (from the DB) to perform some business operation, and in most cases such GETs must be consistent.

    Of course there are many scenarios where "read-committed" is sufficient. And the implementation of "serializable" behavior is probably very different in different products, but I highly doubt one can build anything beyond a toy with just "read-committed" support. We actually architected the xNova cache service so that it involves absolutely no network trips for read-only serializable transactions. We _do_ recommend that our customers use them, since there is virtually no extra performance hit for read-only serializable behavior in our system (and not that big of a hit for mutating transactions either).

    From my perspective, the "serializable" performance is one of the most important cache characteristics, since "read-committed" behavior involves almost no logic at all.

    Btw, we also resolve locks only at the prepare stage of cache transactions and optimistically prevent transactions from committing if serializable behavior cannot be guaranteed. I also agree with the point about giving the user the ability to define data topology, i.e. on which node(s) every piece of data may reside... we are actually already implementing this functionality, to be released in the upcoming 2.0 version of our product.

    Thanks for the "great" technical discussion!

    Regards,
    Dmitriy Setrakyan
    xNova™ - Reusable System Services for Java and .NET
  26. JCache APIs?

    Will it support the JCache API set?

    Billy, IBM
    (Blog)
  27. OS Cache

    Congrats guys! Great project....
  28. What do I use...?

    In our company we expect to create a system that will support 4500 concurrent users. The database server does not have enough juice to handle that many concurrent connections. I see caching as the only way to go.

    Do I cache data at the JSP level (with hand-crafted expiry whenever the underlying data is updated), or do I cache data at the DB level...

    The JSP approach seems poor from a maintainability perspective (What happens when I add a new JSP for updating the underlying data...) and the Data Cache approach seems expensive!

    We have decided to do JSP caching of select components in the page and to let the data be slightly outdated 'sometimes'. Worst case, we're thinking of providing a link to the user that will expire all the session caches and rebuild all the data anew... (Wouldn't try that with transactional systems, though.)

    Is this approach simply a practical approach, or is this one more case of "fools rush in where ..."?
  29. What do I use...?

    Pradeep,

    The more toward the "front" you can cache, the easier it is. See if caching the JSPs will do the trick. The key is that the further back toward the transactional and persistent tiers you have to cache, the more expensive and complex it can be.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  30. What do I use...?

    If you are going to cache data that only an authorized user can view, an important consideration is security. If your security layer is a bit deeper, or has hooks that go beyond the initial layer, you might have to let the request penetrate deeper.
  31. What do I use...?

    Very right - security is a concern, and that is why we were planning to use session-based caches. If someone changes the user's permissions while that user is on the site, we can let the user hit the dirty cache... But our controller will need to verify submitted data against rights before updating the DB. The app has roughly a 25:1 read:write ratio, and this approach seems acceptable right now.

    Since our team has not really implemented serious caching before, we are really working out the details of what could go wrong if we do too much caching - for instance, we thought of leaving pages in the browser cache with the Last-Modified header, and then gave up because we were not sure how the various animals out there would handle it...
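
    For what it's worth, the Last-Modified idea is usually wired through the standard servlet hooks rather than hand-rolled; here's a hedged sketch (ReportServlet is hypothetical, not anyone's real code) of the mechanism, where the container answers conditional GETs with 304 Not Modified on your behalf:

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class ReportServlet extends HttpServlet {
        // updated whenever the underlying data changes; volatile for cross-thread visibility
        private volatile long lastGenerated = System.currentTimeMillis();

        protected long getLastModified(HttpServletRequest req) {
            // the container compares this to If-Modified-Since and can reply 304 itself
            return lastGenerated;
        }

        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            resp.setContentType("text/html");
            resp.getWriter().println("<html><!-- expensive, cacheable page --></html>");
        }
    }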

    I think we will stick to JSP caching as it looks simpler currently (Thanks CP for that one).
  32. JCache JSR?

    Are the OpenSymphony guys part of the JCache JSR?

    Cool stuff though guys. I look forward to playing with it.


    Hans