Gathering Competing Opinions on Terracotta's Ehcache and BigMemory


  1. Terracotta's announcement of its new release, BigMemory, which can manage in-memory stores as large as 350 GB, has those in the Java community who care about scalability and performance talking.

    Like I often do, I pinged Joseph Ottinger for his opinion on the announcement. Joe is the former Editor-in-Chief of TheServerSide.com and is currently with GigaSpaces, so he has good insight into what interests the TheServerSide.com community while also bringing a competitor's perspective to the discussion.

    I should probably say that while Joe works for GigaSpaces, he's a bit of a loose cannon when it comes to his opinions, so don't take this as an official position paper from his current employer. These are purely Joe's thoughts on the topic. 

    Terracotta and BigMemory: A Competing Opinion


  2. Nice to See...

    Nice to see your name in the byline once again, Joe! Here's hoping this isn't just a one-off.

  3. "There are a number of flaws in this approach"

    Ok, I just finished reading the article, and I didn't notice any of the flaws the article claims. For example, saying that a key/value-only cache is a flaw is hardly fair, since they could simply be targeting the market of people whose problems have key/value behaviour, and there are plenty of problems that fit that pattern. Not a flaw IMO, just a specific solution for a specific problem.

    Also, you seem to indicate that the GC-like behaviour they implement is somehow bad, which is not the case: if the system works correctly, and it can be benchmarked to show performance gains for a specific customer problem, and I buy the license/support from them (or whatever sales model they have), then I simply do not care how they did it. If this solution helps me solve my problem, and I can benchmark the system before making the purchase to see evidence that it does, that is all the evidence I need.

    Regarding the profiling/management of cache behaviour, if they do not already support it, I am sure they will in the future, since that is generally a supporting function which is not hard to implement once you have the "kernel" of the idea up and running.

    To conclude: I don't consider the things you described to be "flaws", and if BigMemory works and is able to solve my particular problem, then it is definitely an option to consider in any future decision-making process.

  4. Dude, if it works for you, then by all means use it - note that I was not saying "avoid this tripe," because it's not tripe, and it's not something to be avoided. However, it's being presented as a generic catch-all, and that's not actually accurate.

    It doesn't avoid GC problems for the JVM; it avoids JVM GC pauses for Ehcache, assuming you configure it a certain way. There's nothing wrong with that, but it's worth weighing the strengths (i.e., simple API, lots of breathing room) against the weaknesses (i.e., simple API, most of that breathing room isn't necessary, and you still pay for the memory being managed).
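    For context, the "certain way" is basically a per-cache switch plus a JVM flag for direct memory - roughly along these lines (attribute names are as I recall them from the Ehcache 2.x-era BigMemory docs, and the cache name is made up, so treat this as a sketch rather than gospel):

        <!-- ehcache.xml: let the "quotes" cache spill into off-heap (direct) memory -->
        <cache name="quotes"
               maxElementsInMemory="10000"
               overflowToOffHeap="true"
               maxMemoryOffHeap="4g"/>

        # and allow the JVM to allocate that much direct (off-heap) memory
        java -XX:MaxDirectMemorySize=4g -Xmx1g ...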

    This isn't a competitor whining about a product offering (I think!) - GigaSpaces has its own view of BigMemory (mostly "congrats, Terracotta," which I echo), and I am not speaking for GigaSpaces here. This is my view of a product offering whose marketing has been just absolutely marvelous, beyond what the product warrants (IMO).

    Again, if BigMemory satisfies your requirements, by all means, use it! But one of the things I said in the article still applies: it means your problem has to fit Ehcache's solution, rather than the solution being generic enough to fit your problem.

    I do have to admit, I'd be fascinated to speak to the "hundreds" of customers Ari spoke to - because our market research (along with partners' research) indicates a few odd observations:

    1) A JVM can easily manage 10G of heap with tuning. Stop-the-world pauses of the kind Terracotta reports can certainly happen, but they shouldn't happen post-tuning (a typical set of tuning flags is sketched after this list).

    2) Most of our customers simply haven't needed massive data caches. Massive data, sure, but relatively little in direct access, maybe 15G or so.
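    To make point 1 concrete, "tuning" here means nothing exotic - a starting point along the lines below, adjusted per application. These are standard HotSpot flags of the CMS era (G1 existed too, behind -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC on Java 6); the numbers are illustrative, not a recommendation:

        java -Xms10g -Xmx10g -Xmn2g \
             -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
             -XX:+CMSParallelRemarkEnabled \
             -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly \
             -verbose:gc -XX:+PrintGCDetails \
             -jar app.jar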

    The data I've seen - which is purely anecdotal, coming from me, okay? - suggests that BigMemory is an interesting and old (I've seen similar implementations in 2002 or so if memory serves, largely abandoned due to lack of need/interest) solution to a problem that most don't have.

    In all fairness, it might be that people don't have the problem because the solution didn't exist - except it DID exist, and has existed for years (although not as part of ehcache, I suppose.)

    I'm fine with you disagreeing with me, point by point - but I'd say that I'm satisfied with even that result, as long as you've thought about it.

  5. 2) Massive data, sure, but relatively little in direct access, maybe 15G or so.

    Could you please clarify, since I couldn't figure it out by myself: does this mean that caches are in the 15G-or-so size range, while the "massive data" part refers to the permanent store (DB or wherever data is permanently kept)?

  6. Typo:

    "DB or whererver data is permanently kept"

  7. 2) Massive data, sure, but relatively little in direct access, maybe 15G or so.

    Could you please clarify, since I couldn't figure it out by myself: does this mean that caches are in the 15G-or-so size range, while the "massive data" part refers to the permanent store (DB or wherever data is permanently kept)?

    Yes, in practice in the environments I've seen - again, anecdotal data from me, myself, and I - a 15G cache is more than enough to handle live data (which can be, and often is, much larger).

  8. I understand that you are speaking on your own behalf, so no worries.

     

    "15G cache is more than enough to handle live data (which can be and often is much larger.)"

    That is what I am trying to say exactly, there are systems which require more than 15G sized caches, truth to be told they are not too common, but they are out there.

    If we assume that BigMemory works (and this is something that can be validated once it has been released), and according to some reports it has been tested successfully on 100G and 350G caches, and I do have system which requires 100G+ of live data, IMO it is much better, looking from development (easier development model), execution (no penalty for distributed operations) and operations (single machine to manage and monitor) perspective, to scale that on single machine than on the cluster.

    If you have requirement for 350G cache size, and single node can host 20G, you end up with cluster of ~15 nodes, which has bigger operations cost, is slower (perhaps it is not noticable to application, but networking will again influence some other system sharing the same network) and has harder development model (you can't hide distributed nature of the system).

    To conclude my opinion: I see the market need for BigMemory.

  9. *nod* I can see a potential market need - but I'd say that in most of the situations where you'd need a giant cache like that, you'd see massive gains in using a cluster or data grid, because you wouldn't want all of that data served by the processing power a single machine can muster. The gains stretch beyond just "can I store all that data" to "what can I do with it?" -- which is a pretty big point.

    350G of cached data ... yuck. Even if you need it locally (a point that can easily be argued), the fact that it'd be temporary data in the cache makes that data less valuable -- IMO.

  10. NYSE alone produces more than 50 million records per day, and if we assume a single record is at best ~30 bytes (symbol, bid, offer, timestamp), you get ~1.5GB per day. Now if you want to do some fancy cross-day / cross-exchange calculations, it means you can keep a pretty long history in-memory and do the calculations pretty damn fast.
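    A quick back-of-the-envelope version of that arithmetic (figures as quoted above, nothing measured):

        public class MarketDataSizing {
            public static void main(String[] args) {
                long recordsPerDay = 50000000L;  // NYSE: 50+ million records per day, as stated above
                long bytesPerRecord = 30L;       // symbol, bid, offer, timestamp - a deliberately low estimate
                double gbPerDay = recordsPerDay * bytesPerRecord / 1e9;
                System.out.printf("~%.1f GB/day; a 350 GB cache holds ~%.0f days of raw ticks%n",
                        gbPerDay, 350 / gbPerDay);  // ~1.5 GB/day, roughly 230 days of history
            }
        }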

    Just saying.

  11. I understand that you are speaking on your own behalf, so no worries.

     

    "15G cache is more than enough to handle live data (which can be and often is much larger.)"

    That is what I am trying to say exactly, there are systems which require more than 15G sized caches, truth to be told they are not too common, but they are out there.

    If we assume that BigMemory works (and this is something that can be validated once it has been released), and according to some reports it has been tested successfully on 100G and 350G caches, and I do have system which requires 100G+ of live data, IMO it is much better, looking from development (easier development model), execution (no penalty for distributed operations) and operations (single machine to manage and monitor) perspective, to scale that on single machine than on the cluster.

    If you have requirement for 350G cache size, and single node can host 20G, you end up with cluster of ~15 nodes, which has bigger operations cost, is slower (perhaps it is not noticable to application, but networking will again influence some other system sharing the same network) and has harder development model (you can't hide distributed nature of the system).

    To conclude my opinion: I see the market need for BigMemory.

    I've also seen cases where caches bigger than the 2Gb you can get on a 32-bit JVM, or bigger than 15Gb, would have come in handy. I've seen cases where a customer's data center was near full capacity, but they had lots of mainframes. Asking them to start up a couple of LPARs with 64Gb of RAM each and create a 50Gb cache would be easier than asking the infrastructure people to find room for 20 cheap servers.

    For me, the factors I'd consider with regard to using fewer nodes with large caches would be: current hardware availability, capital budget, established rules about production hardware, architecture of the application, size of cache, projected growth for the next 12/24 months and buy-in from the stakeholders.

    I can see a startup leaning towards lots of cheap systems, but honestly, finding the rack space often becomes an issue. Buying, setting up, configuring and testing a couple dozen or a few hundred servers isn't that easy, unless you're Google. Many existing financial firms and health care providers have lots of mainframes sitting around. Why not load up a cache with 80Gb of data? :)

  12. *thumbs*

    *thumbs* 

    Great discussion.

  13. 32 bits, eh?

    Why wouldn't you just crank up a 64-bit JVM, then, and roll that way? Or, for that matter, if 20 commodity servers is hard to find room for: get an Azul system, and manage it that way.

    (Taking off devil's advocate hat for a bit now.)

  14. 32 bits, eh?

    Why wouldn't you just crank up a 64-bit JVM, then, and roll that way? Or, for that matter, if 20 commodity servers is hard to find room for: get an Azul system, and manage it that way.

    (Taking off devil's advocate hat for a bit now.)

    This isn't meant to be a slam on IBM-centric shops, but I've seen places with mainframes that don't use a 64-bit JVM for a simple reason: it's not approved. Getting approval can be done, but it can take a year from start to finish. If a company already has a dozen fully loaded mainframes, it would be silly not to consider using large caches on a 32-bit JVM. I don't like that kind of corporate infrastructure policy, but such policies do exist.

    I think we can agree those kinds of policies are a pain, but sometimes you have to work within those limits.

  15. "For me, the factors I'd consider with regard to using fewer nodes with large caches would be: current hardware availability, capital budget, established rules about production hardware, architecture of the application, size of cache, projected growth for the next 12/24 months and "buy in" from the stake holders."

    I second this. There are some many variables involved in decision making, including those coming from the top with which you can't influence at all, that having different solutions for same set of problems is beneficial, and BigMemory seems to be just that: solution that satisfies certain set of requirements some businesses have.

  16. "For me, the factors I'd consider with regard to using fewer nodes with large caches would be: current hardware availability, capital budget, established rules about production hardware, architecture of the application, size of cache, projected growth for the next 12/24 months and "buy in" from the stake holders."

    I second this. There are some many variables involved in decision making, including those coming from the top with which you can't influence at all, that having different solutions for same set of problems is beneficial, and BigMemory seems to be just that: solution that satisfies certain set of requirements some businesses have.

    @Peter Lin and @Chief Thrall

    Gentlemen, I agree with all the points you have presented here. The analysis was thoughtful, and I'll definitely be checking out BigMemory in the near future.

    We need to give the JVM all the help it can get in squeezing out the last bit of memory a server can offer without the GC stopping traffic to collect the garbage. Plus, BigMemory sounds greener (i.e. fewer servers instead of more servers in a cluster) :)

  17. Yes, in practice in the environments I've seen - again, anecdotal data from me, myself, and I - a 15G cache is more than enough to handle live data (which can be, and often is, much larger).

     

    A lot of RAM is needed, for example, for optimization calculations. I worked on a train schedule optimization application for a small Swiss private railway company for which 4 GB was just barely enough. Larger railway companies only run the planning for individual regions and plug the regional optimization results together, because running the planning for the entire network at once would require so much RAM that it is simply not possible. There are more applications of shared memory than just doing the caching for a web server ... :-)

  18. Oh, I don't doubt the existence of applications with huge memory requirements. I've written them myself! But I'd say that those apps wouldn't benefit quite so much from an unlimited cache as from an unlimited heap.

  19. key/value etc.

    Not to be too contrarian, but key/value access (not query access) is what most people who are caching data need, and caches do that very efficiently. Look no further than the success of memcached and various nosql databases. For that matter, look at ehCache.

    At any rate, it looks like this will be a nice addition to ehCache.

    Peace,

    Cameron Purdy | Oracle Coherence

    http://coherence.oracle.com/
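    For anyone following along who hasn't used it, "key/value access" here is literally this simple with the classic net.sf.ehcache 2.x API (the cache name and values are made up, and the "quotes" cache is assumed to be defined in ehcache.xml):

        import net.sf.ehcache.Cache;
        import net.sf.ehcache.CacheManager;
        import net.sf.ehcache.Element;

        public class KeyValueExample {
            public static void main(String[] args) {
                CacheManager manager = CacheManager.create();  // reads ehcache.xml from the classpath
                Cache quotes = manager.getCache("quotes");     // assumes a cache named "quotes" is configured
                quotes.put(new Element("ORCL", "26.42"));      // put by key
                Element hit = quotes.get("ORCL");              // get by key - no query language involved
                System.out.println(hit == null ? "miss" : hit.getObjectValue());
                manager.shutdown();
            }
        }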

  20. key/value etc.

    Not to be too contrarian, but key/value access (not query access) is what most people who are caching data need, and caches do that very efficiently. Look no further than the success of memcached and various nosql databases. For that matter, look at ehCache.

    Cameron, there are countless examples showing that many key/value stores eventually added query semantics. Google Bigtable did that through JPA, HBase through Hive/Pig, and even Coherence announced support for SQL-style queries. There is a common reason for that: when the data store manages a larger amount of data it becomes more critical to your application, and with that comes the demand for better query support and transaction semantics. If the off-heap architecture is only useful for simple key/value semantics, you're limiting yourself severely with regard to future enhancements in that direction.

    Personally, I believe that the big opportunity in BigMemory is in making memory the system of record, and not just a simple cache-aside as in memcached.

     

  21. key/value etc.

    Not to be too contrarian, but key/value access (not query access) is what most people who are caching data need, and caches do that very efficiently. Look no further than the success of memcached and various nosql databases. For that matter, look at ehCache.

    At any rate, it looks like this will be a nice addition to ehCache.

    Peace,

    Cameron Purdy | Oracle Coherence

    http://coherence.oracle.com/

    Cameron, this is what I was saying about how the tool determines that everything looks like a nail (because all the tool is, is a hammer). You mentioned NoSQL DBs and memcached; I don't think they're in the same ballpark!

    IMO, if you're using a pure cache, and Ehcache fulfills your needs, by all means use it - which I've said a few times. However, I'd say that once you've gone to a data grid that can support simpler and more powerful architectures - and even Coherence would qualify here, although obviously I prefer a competitor - it's always painful and slightly sad to go backwards.

  22. Scale-out vs Scale-up

    There are basically two parts to the discussion WRT BigMemory.

    1. Scale-up vs scale-out - I actually wrote a fairly lengthy analysis on that part here: http://natishalom.typepad.com/nati_shaloms_blog/2010/09/scale-up-vs-scale-out.html, where I pointed out the reasons why people still scale out even if their entire data set fits in a single machine. Scaling out doesn't mean small commodity hardware; scale-out means that we can basically leverage big memory to manage our entire data set, which could be several TB, and not just a portion of the data as most systems do today. A recent benchmark that we did on Cisco UCS (http://www.gigaspaces.com/files/main/GigaSpaces-Cisco_Joint_Solution_WP-May16.pdf) actually demonstrates that.

    2. The other part of the big-memory discussion is whether we need to bypass the JVM to manage big memory. The claims made so far to justify the Terracotta approach are that:

    a. JVM tuning is a pain

    b. Managing the heap for limited key/value caching can be done more effectively by an external heap manager that can be tailored to key/value caching scenarios

    My argument is that this is one of the options for handling large memory, and not necessarily the best one.

    The alternative approaches that I see would be:

    1. Using a pre-tuned appliance model could take away much of the operational tuning complexity

    2. Newer GC implementations such as G1 do a pretty good job of managing heaps in the tens of gigabytes, and Azul Zing claims to provide pauseless GC that can handle hundreds of gigabytes of memory

    So the point is not whether there is a benefit in having large memory, but whether moving data off-heap is necessarily the right approach. As I pointed out, there are alternative approaches for dealing with big memory without moving the data off the heap, with all the benefits that come with that, i.e. performance, and the fact that all your existing tools - debuggers, profilers, memory dumps, etc. - will keep working just the same.

    Speaking of operational cost - having a solution that manages memory outside your VM could potentially reduce the operational cost of JVM tuning, but the fact that most other tools (system monitoring, debuggers, profilers, etc.) wouldn't work with it could actually turn out to be a fairly big operational cost and risk (assuming there is going to be a maturity cycle, as with any new technology), so you're basically trading one operational cost for another.


  23. Note to editor

    What happened to the preview button?

    Also, hyperlinks don't work the way they used to! Is there a way to edit a previous post?

  24. Scale-out vs Scale-up

    "Scaleup vs Scale out"

    I dont necessarily think this is about scale up or out, but rather about utilizing existing hardware/software in more efficient manner with increased SLA guarantees. (No GC means better SLA, draining entire available RAM via heap bypass means better utilization of existing hardware investment, avoiding management of 15 machines in grid means less operating expenses etc). I dont see what prevents BigMemory to scale both up and out.

    One thing regarding the Azul is that it is not plug and play with the remaining stack, since last time I checked they roll they own OS, toolchain, JVM etc (based on open source projects of course) which is a risk, and it is important factor in decision making: if BigMemory is plug and play solution that works and integrates more easily with existing infrastructure and provides same benefits as Azul, I would go with BM since it is less risky. With Big Memory I can inject cache without having to modify almost nothing, at least that is how I hope it will work. It is transparent to application, I just plug it in, and boom, it works.

  25. Scale-out vs Scale-up

    To continue from my last post:

    "1. Using a pre-tuned appliance model could take much of the operational tuning complexity

    2. Newer GC version such as G1 does a pretty good job in managing x10G and Azul Zing claim to provide pausless GC that can handle x100G memory."

    I do not believe either of those are better when all factors are considered: 2) is not plug and play and too intrusive, and 1), if I understood the meaning correctly, refers only to GJ tuning problem, which is just tip of the iceberg of what we have discussed thus far.

  26. Scale-out vs Scale-up

    I don't necessarily think this is about scale-up or scale-out, but rather about utilizing existing hardware/software in a more efficient manner with increased SLA guarantees (no GC means a better SLA, draining all the available RAM via heap bypass means better utilization of the existing hardware investment, avoiding the management of 15 machines in a grid means lower operating expenses, etc.). I don't see what prevents BigMemory from scaling both up and out.

    I'm not sure I follow your argument. My point was that the question is not about trading big memory vs. lots of small memory instances per se.

    Assuming that in both options we're talking about the same VM sizes and the same cluster size, where do you see the operational benefit of memory bypass?

    Azul is based on a virtual appliance, i.e. you spawn a VMware or KVM instance to get your big memory. Do you see operational complexity with that option?

    WRT the stack, how do you think monitoring tools, debuggers, etc. would work with memory bypass?

     

     

  27. Scale-out vs Scale-up

    "i'm not sure i follow your argument..My point was that the question is not about trading big-memory vs lots of small memory instances per-se."

    My error then, I misread your post.

    "Assuming that in both options were talking on the same size of VM's and same cluster size where do you see the operational benefit with memory bypass? "

    If I have system with 300GB cache, if I use BigMemory on single machine I have high (all) SLA guarantees, low operations cost, monitoring discussed below.

    If I use for example, clustered GS with 15 commodity machine cluser (20GB per machine), I have high (GC) SLA guarantees (you have penalty for networking and such), higher operations cost, good (but more complex) monitoring.

    If I use single GS node (same hardware as BM deployment), I have lower SLA guarantees (GC pauses), low operations cost, good monitoring.

    Do you agree with these statements?

    "Azul is based on Virtual Applicance i.e. you spawn a VMware or KVM instance to spawn your big memory. Do you see an operational complexity with that option?"

    The main problem with Azul is high risk in case things go wrong (critical bug detected etc). Last time I checked they are shipping their own Linux build, own gcc toolchain build, own JVM build and own hardware, and that is lots of places where things can go wrong since they ship entire stack. I hope that you understand what I am targeting here, it is clear Azul has good product, but there is also significant risk exposure associated with the fact that you are based entirely on single vendor solution.

    "WRT to the stack how do you think monitoring tools, debuggers etc. would work with Memory Bypass?"

    There are couple of simple ideas that pop into my layman's mind, such as that BM internally proxies all calls to BM, aggregates those statistics and statuses and exposes them to monitoring tools (inclduing integration with existing solutions).The thing that you probably can't expose is perhaps interactive direct debugging of BM off-heap memory layout and such, but BM internally can know pretty much anything that you may want to monitor. Do you have any specific monitoring concerns?
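    (For what it's worth, the plain-JMX version of that idea is only a handful of lines - something like the sketch below, with made-up names, just to show the kind of counters a cache wrapper could expose to JConsole or any JMX-aware monitor. The interface and class would live in separate source files.)

        import java.lang.management.ManagementFactory;
        import java.util.concurrent.atomic.AtomicLong;
        import javax.management.ObjectName;

        // Management interface: standard JMX naming convention (<Impl>MBean).
        public interface CacheStatsMBean {
            long getHits();
            long getMisses();
        }

        // Implementation: the cache wrapper bumps these counters on every get().
        public class CacheStats implements CacheStatsMBean {
            private final AtomicLong hits = new AtomicLong();
            private final AtomicLong misses = new AtomicLong();

            public void recordHit()  { hits.incrementAndGet(); }
            public void recordMiss() { misses.incrementAndGet(); }
            public long getHits()    { return hits.get(); }
            public long getMisses()  { return misses.get(); }

            public static CacheStats register(String cacheName) throws Exception {
                CacheStats stats = new CacheStats();
                ManagementFactory.getPlatformMBeanServer().registerMBean(
                        stats, new ObjectName("example:type=CacheStats,name=" + cacheName));
                return stats;   // visible in JConsole/VisualVM or any JMX-based monitoring tool
            }
        }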

  28. Scale-out vs Scale-up

    If I have a system with a 300GB cache and I use BigMemory on a single machine, I have high (all) SLA guarantees, low operations cost, and monitoring as discussed below.

    If I use, for example, clustered GS on a cluster of 15 commodity machines (20GB per machine), I have high (GC) SLA guarantees (but a penalty for networking and such), higher operations cost, and good (but more complex) monitoring.

    The comparison could be the exact same machine running 15 VMs of 20GB each, and not necessarily 15 machines with 20G each.

    In GigaSpaces' case you only need one agent per machine regardless of the configuration (big/small VMs, scale-up/out) that you choose. Basically the agent will spawn VMs first on the same machine and then on other machines as your capacity requirements grow. It will also kill VMs to free the memory as soon as there is no need for it.

    From an operational perspective you're not exposed to the management of the actual VMs; that is taken care of by the agent. We also provide an option to use agent-less machines and provision machines on demand, so in essence your operational complexity could be significantly lower while your flexibility remains high. Our recent UCS integration does that on bare-metal machines.

    If I use a single GS node (the same hardware as the BM deployment), I have lower SLA guarantees (GC pauses), low operations cost, and good monitoring.

    Do you agree with these statements?

    The lower-SLA argument at this point is theoretical, and I personally have a lot of doubts that it is indeed lower as presented. Unfortunately it will take time to get to a point where we can make a more objective comparison between regular large-heap and off-heap scenarios. Until that happens I would take everything that has been said in that regard with a grain of salt.

    The main problem with Azul is the high risk in case things go wrong (a critical bug is detected, etc.). Last time I checked they were shipping their own Linux build, their own gcc toolchain build, their own JVM build and their own hardware, and that is a lot of places where things can go wrong, since they ship the entire stack. I hope you understand what I am getting at here: it is clear Azul has a good product, but there is also significant risk exposure associated with the fact that you are relying entirely on a single-vendor solution.

    I see your point. Personally I believe that the risk could work the other way as well, i.e. the fact that they control the entire stack translates to fewer moving parts, less tuning overhead, better testability and easier support (on their end). That should translate to lower maintenance and a lower chance of failures as a result of misconfiguration (many statistics show that roughly 80% of failures happen due to human error and misconfiguration). See a recent study in that regard here:

    70% of disasters are caused by human error. This statistic was generated by the Uptime Institute following a study of 4,500 data center issues.

    On the monitoring side you mentioned:

    There are a couple of simple ideas that pop into my layman's mind, such as BM internally proxying all calls, aggregating the statistics and statuses and exposing them to monitoring tools (including integration with existing solutions). The thing that you probably can't expose is interactive direct debugging of BM's off-heap memory layout and such, but internally BM can know pretty much anything that you may want to monitor. Do you have any specific monitoring concerns?

    I'm sure there are probably ways to get past those issues. It sounds to me like you're underestimating the effort and complexity of doing that, and therefore the alternative risk that is associated with that limitation.

     

     

  29. Scale-out vs Scale-up

    The comparison could be the exact same machine running 15 VMs of 20GB each, and not necessarily 15 machines with 20G each.

    From my perspective this is not an optimal solution. Yes, it works, but if BM works too, it provides a better solution, since in your case there is a waste of RAM and OS resources for every JVM instance. For example, for a hello-world application (a simple program which just invokes System.in.read()) on my machine, with the default configuration, the overhead is 5 MB, so a cluster of 15 JVM instances on a single machine running this hello-world application with default JVM parameters wastes 75MB. I personally would prefer better utilization of my existing hardware resources, and BM seems to offer that.
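    (For reference, a measurement along those lines is just a few lines of Java - something like the following, with the real per-process overhead visible on top of the heap figure in top/ps; numbers will obviously vary by JVM version and flags.)

        public class HeapBaseline {
            public static void main(String[] args) throws Exception {
                Runtime rt = Runtime.getRuntime();
                long used = rt.totalMemory() - rt.freeMemory();
                // heap in use before doing any real work; OS tools (top, ps) show the
                // full per-process overhead on top of this
                System.out.printf("baseline heap: %.1f MB%n", used / (1024.0 * 1024.0));
                System.in.read();   // park the process so it can be inspected with jps/jmap/top
            }
        }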


    "The lower SLA argument at this point is theoretical and i personally have lots of doubts that this is indeed lower as been presented."

    In concluded that GS with single machine with single JVM would have lower SLA because only Azul claims to have solved GC problem, and from my understanding you don't have pauseless GC implementation at the moment. In case you use multi-JVM configuration on single machine, see above.

     

    "Personally i believe that the risk could work another ways as well"

    Well it is clear that we disagree on this topic, since I prefer ironed solutions (such as Sun's JVM which has been used in production millions of times).

     

    "It sounds to me that your underestimating the effort, the complexity of doing that and therfore the alternative risk that is associated with that limitation."

    This is entirely up to Terracotta (I don't care if it is complex or not, only if it works), and if they deliver monitoring/profiling support, that's one more benefit I as user can include in decision process which caching solution to use, and in case they don't, then it is one drawback to be considered.

  30. Scale-out vs Scale-up

    The comparison could be the exact same machine running 15 VMs of 20GB each, and not necessarily 15 machines with 20G each.

    From my perspective this is not an optimal solution. Yes, it works, but if BM works too, it provides a better solution, since in your case there is a waste of RAM and OS resources for every JVM instance. For example, for a hello-world application (a simple program which just invokes System.in.read()) on my machine, with the default configuration, the overhead is 5 MB, so a cluster of 15 JVM instances on a single machine running this hello-world application with default JVM parameters wastes 75MB. I personally would prefer better utilization of my existing hardware resources, and BM seems to offer that.

    This is a straw man - you'd not need BM for "Hello, World." I still think that a pure-cache application, while it could benefit from a large read-mostly cache, would do better with a distributed heap.

    It's also more failure-tolerant: the benefit of a unicycle is that you have fewer moving parts, but the downside is that the unicycle is harder to ride and a flat tire is a "problem with finality." A distributed system, meaning one that lives on multiple machines and not just multiple JVMs on one machine, can tolerate isolation/machine failure, whereas having everything on one box cannot. If it's an orange-jumpsuit application (meaning, "one whose failure can lead to you wearing a nice orange jumpsuit"), putting all your eggs in one basket ain't so wise.

    Another aspect is that read-mostly caches typically want to do something with the data being cached; that means processing, and huge gouts of memory available to a static amount of processing power means that there's an imbalance in your future - although this assumes that you've a decent ratio of processing power to RAM already. If you've lots of CPU power to spare, well, obviously the ratios change in your favor.

    A third aspect, one that others have pointed out (even BigMemory, obliquely) is that most cache requests focus on a small subset of the entire dataset; in that case, you have to factor in the cost of a cache miss. If it's not that big of a deal for you to miss something in cache, then you might as well cache the things you use a lot, and ignore the things you don't. A read-mostly cache has negligible effect on a GC cycle in the first place.

    "The lower SLA argument at this point is theoretical and i personally have lots of doubts that this is indeed lower as been presented."

    In concluded that GS with single machine with single JVM would have lower SLA because only Azul claims to have solved GC problem, and from my understanding you don't have pauseless GC implementation at the moment. In case you use multi-JVM configuration on single machine, see above.

    Have you actually tested Azul, to see that it's a risk to use their toolchain? Or is this FUD? Given your reasoning, I suspect FUD, although I don't think it's malicious.

    "Personally i believe that the risk could work another ways as well"

    Well it is clear that we disagree on this topic, since I prefer ironed solutions (such as Sun's JVM which has been used in production millions of times).

    Uh oh, that DOES sound like FUD.

    "It sounds to me that your underestimating the effort, the complexity of doing that and therfore the alternative risk that is associated with that limitation."

    This is entirely up to Terracotta (I don't care if it is complex or not, only if it works), and if they deliver monitoring/profiling support, that's one more benefit I as user can include in decision process which caching solution to use, and in case they don't, then it is one drawback to be considered.

    Heh, if you don't care if it's complex or not, only that it works, then Azul enters back into the conversation!

    It's okay, honestly; IMO you've a bias towards Ehcache and Terracotta, which is fine... but your arguments sound like confirmation bias rather than standalone arguments. "It's complex, that's bad for product X, because complexity is bad" right alongside "It's complex, that's fine for product Y because I don't care as long as it works" doesn't quite hold up.

    I really, really, really wish I had a machine capable of handling 64G of RAM - and that I had a local cluster that also comprised a decent amount of RAM. I'd love to test these things out with varying read/write ratios.

    Any sponsors? :)

  31. Scale-out vs Scale-up

    "Have you actually tested Azul, to see that it's a risk to use their toolchain?"

    No, I haven't tested it; I am using common sense here. Azul is a great idea and a great product, and I would recommend that anyone whose problem is in a similar domain check out their product and see if it meets their needs, but as you may know there is a risk factor in any investment, including this one. If my problem can be solved with multiple technical solutions, I would always choose the one with the least amount of risk (risk factors being: market share, the availability of a workforce skilled in working with/fixing the given tools, how easy it is to replace the tool in my system, the revenue/profit of the vendor ...).

     

    "Or is this FUD? Given your reasoning, I suspect FUD, although I don't think it's malicious....Uh oh, that DOES sound like FUD"

    This is uncalled for. I have politely expressed my point of view, and you are basically flagging all the effort that went into writing my posts as FUD just because you do not agree with me. I planned to refute your post point by point, but there is no need now, since I see that you don't want to hear what I have to say or accept that we have different opinions; you just want me to agree with you.

  32. Scale-out vs Scale-up

    No, I don't think all the effort that went into your posts is FUD. Like I said, I don't think it's malicious, nor do I misunderstand you - but you're using contradictory reasoning to justify a preference.

    Look: the use cases for ALL of this (BigMemory, distributed computing) are outliers: situations for which the normal solutions have failed or been proven insufficient. The nature of the problems themselves says that the tried-and-true "ironed solutions like Sun's JVM which have been used in production millions of times" (sic) are not sufficient, and outliers like Azul, BigMemory, etc., are options.

    Making a claim that the "ironed solutions" are preferable isn't justified by the nature of the problem itself. If the proven solutions were so great, you'd just allocate a heck of a JVM-managed heap, stuff a lot of tenured data onto it, and use ConcurrentHashMap or a variant (depending on the specific nature of the cache you need).
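    That on-heap alternative really is as boring as it sounds - a sketch, with the heap pre-sized via -Xms/-Xmx (sizes illustrative):

        import java.util.concurrent.ConcurrentHashMap;

        // run with e.g. java -Xms10g -Xmx10g OnHeapCache
        public class OnHeapCache {
            private final ConcurrentHashMap<String, byte[]> map =
                    new ConcurrentHashMap<String, byte[]>();

            public void put(String key, byte[] value) { map.put(key, value); }
            public byte[] get(String key)             { return map.get(key); }

            public static void main(String[] args) {
                OnHeapCache cache = new OnHeapCache();
                cache.put("key-1", new byte[1024]);   // 1KB element, as in the benchmarks discussed here
                System.out.println(cache.get("key-1").length + " bytes served from the heap");
            }
        }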

    As stated, the cost of a stop-the-world GC of a large, active heap- the raison d'etre of BigMemory in the first place - is easily avoided, because tuning a JVM isn't hard, and most of the data in a cache is simple to refer to or expire, based on requirements; a GC simply isn't as expensive as people seem to think it is. (It can be, but it doesn't have to be.)

    Remember: it's a cache! The data in it is fairly static as it is, so GC isn't going to spend much time collecting (or walking, for that matter) the information in it.

    Again, for the Nth time, this is not to say to avoid BigMemory. If you're more comfortable with it, by all means, please use it.

    But saying that products like Azul are unsatisfactory because they present "more risk" when compared to BigMemory is misinformed. Not malicious - at no point do I think you're showing any malice or intent to deceive. But it's misinformed nonetheless.

    At the heart of it, if Bigmemory were such a grand idea, I think that other cache vendors would have kept their similar implementations around and more in the forefront. They haven't, and given the issues around off-heap memory, I think I understand why; it's just not worth it for what you get out of it.

    Using a failed solution (with "failed solution" meaning "a solution that others have tried in the past and abandoned") seems to be HIGHER risk than using a lateral thought in the first place. (Although I suppose "higher risk" means "has an unknown success rate," whereas other approaches based on off-heap memory have a known success rate - one that's not very high.)

    My friend, I'm not accusing you of anything. I disagree with your reasoning, and I don't think the reasoning you've presented is entirely justified. I wasn't trying to call you out.

    My humblest apologies for any offense taken.

  33. Scale-out vs Scale-up

    Well, OK then, no offense taken; I am glad to hear that we agree that we don't agree. Feel free to ignore the post I made before this one regarding "crossing the line".

    I will soon answer in more detail.

  34. Scale-out vs Scale-up

    "But saying that products like Azul are unsatisfactory because they present "more risk" when compared to BigMemory is misinformed...Making a claim that the "ironed solutions" are preferable isn't justified by the nature of the problem itself."

    If we assume that I have a problem which can be solved successfully with both Azul and BM, the reason I believe that Azul rates higher on the risk scale is that in a worst-case scenario (a complex bug, etc.), in the BM case I have to deal with a single Java library, while in the Azul case I might need to deal with multiple components, including the JVM, the gcc toolchain, the OS or the hardware itself. What happens in the case of a fatal bug? In the BM case I could probably patch the system to run with a lower SLA in a couple of hours (falling back to a heap-based cache); in the Azul case, what can I do if, God forbid, there is a bug in their JVM, or OS, or in the coherence protocol used in their CPUs? I can't do anything, and the system is down until it is resolved, and being down may not be an option. (I am not saying there are bugs in either of those solutions; I am asking what my options are if a bug does occur.)

    This does not mean that either of these solutions is technically superior, but if they are of approximately the same quality, to the point that my application cannot detect the difference given its specific business requirements, I would choose the less risky one.

     

    "As stated, the cost of a stop-the-world GC of a large, active heap- the raison d'etre of BigMemory in the first place - is easily avoided, because tuning a JVM isn't hard,"

    I tend to disagree, since it (the claim that GC can easily be tuned for large data sets) is contradicted by the innovation coming from Azul, and I doubt Azul would have created their entire stack if there were an easier solution.

     

    "Heh, if you don't care if it's complex or not, only that it works, then Azul enters back into the conversation!"

    I don't care how it works as long as it is not intrusive in both technical and business terms. There is a difference between using a single Java library (BM), which I can plug into my existing system, and introducing a new JVM, a new OS and new hardware; similarly, with grid-based solutions I am introducing new hardware, a new programming model and so on. Some people don't care about those factors, and some do.

    And to conclude, I see a business need for all of these solutions - Azul, BM, GS, Coherence and others; no single product satisfies all of them.

  35. Scale-out vs Scale-up

    "As stated, the cost of a stop-the-world GC of a large, active heap- the raison d'etre of BigMemory in the first place - is easily avoided, because tuning a JVM isn't hard,"

    I tend to disagree, since it (the claim that GC can easily be tuned for large data sets) is contradicted by the innovation coming from Azul, and I doubt Azul would have created their entire stack if there were an easier solution.

    This would only be truly valid if Azul's sole reason for being were GC - it's not. It's also massive multithreading.

    And to conclude, I see a business need for all of these solutions - Azul, BM, GS, Coherence and others; no single product satisfies all of them.

    Fully agreed, and something I've been saying for the whole conversation.

  36. Scale-out vs Scale-up

    "Any sponsors?"

    You are crossing the line here by implying that I am somehow sponsored to post here, and now you are badmouthing Terracotta as well, even though they have nothing to do with me. I have been posting on this account for years, and have defended Sun, Oracle, Google and many more when I subjectively believed that the objective truth was on their side.

  37. Scale-out vs Scale-up

    "Any sponsors?"

    You are crossing the line here by implying that I am somehow sponsored to post here, and now you are badmouthing Terracotta as well, even though they have nothing to do with me. I have been posting on this account for years, and have defended Sun, Oracle, Google and many more when I subjectively believed that the objective truth was on their side.

    Um, no? I was asking for people to sponsor me having a decent cluster of machines at home that I could use to test. Note the paragraph previous to "any sponsors?", written by me:

    I really, really, really wish I had a machine capable of handling 64G of RAM - and that I had a local cluster that also comprised a decent amount of RAM. I'd love to test these things out with varying read/write ratios.

    Any sponsors? :)

    At no point was I suggesting that you were being paid to say anything, mate.

  38. Scale-out vs Scale-up

    Yeah, never mind, ignore my post. I overreacted. I'll reply soon to your post with technical arguments in an attempt to summarize my points.

  39. Scale-out vs Scale-up

    The comparison could be the exact same machine running 15 VMs of 20GB each, and not necessarily 15 machines with 20G each.

    From my perspective this is not an optimal solution. Yes, it works, but if BM works too, it provides a better solution, since in your case there is a waste of RAM and OS resources for every JVM instance. For example, for a hello-world application (a simple program which just invokes System.in.read()) on my machine, with the default configuration, the overhead is 5 MB, so a cluster of 15 JVM instances on a single machine running this hello-world application with default JVM parameters wastes 75MB. I personally would prefer better utilization of my existing hardware resources, and BM seems to offer that.

    Utilization is a broad term, so let me explain:

    With memory bypass - MBP (let's not use the term "big memory", as in both cases we're talking about big memory, just different ways of achieving it) - you're trading throughput for predictable latency. The benchmark indicates ~100k ops/sec. With a standard HashTable you'd expect to get 10x-100x the throughput for the same workload. So in terms of utilization you're actually lowering utilization severely to achieve predictable latency. My point in that regard is that even without pauseless GC, a standard VM with proper tuning wouldn't yield any significant GC spikes under the EXACT SAME SCENARIO, i.e. significantly lower throughput, 90% reads where out of the 90% only 10% of the keys are accessed. (This calls for further independent validation.)

    To sum that up: with multiple VMs per machine you're trading a fraction of a percent of memory overhead (75M out of 350G) for up to 100x lower throughput.

    Now, with that said, if you're only looking for a side cache (limited to key/value) with big memory, your best choice is probably ConcurrentHashMap with proper GC tuning. If you're looking for a read/write cache that will front-end your database with built-in redundancy, then you need transaction support and extended query semantics to manage that data.

    Currently the entire discussion around memory bypass is centered on the first scenario and doesn't seem to apply to the second (many of the performance assumptions would break). Would you agree with that statement?

    If that is the case, then the memory-bypass approach may just be overkill.

     

     

     

     


    "The lower SLA argument at this point is theoretical and i personally have lots of doubts that this is indeed lower as been presented."

    In concluded that GS with single machine with single JVM would have lower SLA because only Azul claims to have solved GC problem, and from my understanding you don't have pauseless GC implementation at the moment. In case you use multi-JVM configuration on single machine, see above.

     

    "Personally i believe that the risk could work another ways as well"

    Well it is clear that we disagree on this topic, since I prefer ironed solutions (such as Sun's JVM which has been used in production millions of times).

     

    "It sounds to me that your underestimating the effort, the complexity of doing that and therfore the alternative risk that is associated with that limitation."

    This is entirely up to Terracotta (I don't care if it is complex or not, only if it works), and if they deliver monitoring/profiling support, that's one more benefit I as user can include in decision process which caching solution to use, and in case they don't, then it is one drawback to be considered.

  40. Scale-out vs Scale-up

    "To sum that up with mutliple VM's per machine your trading a fruction of % memory overhead (75M  out of 350G) with x100% lower utilization on throughput."

    1) I don't have GS installation, but do have Coherence, and single instance with 0 data uses 44MB, which for cluster of 15 members sums up to 660MB, which is small percentage of 300GB, and I agree with you that memory utilization for this specific case (300GB, 15 cluster members, 20GB per member) is not relevant. That does not mean that memory utilization may not be relevant with different data distribution patterns.

    2) If you meant that single machine multi-VM GS based configuration has higher througput I tend to disagree since I don't think that access to off-heap is slower than networking coordination & transfer, even if it is loopback (for this 300GB case). If you however meant that simple Map approach offers higher throughput, I disagree with this as explained below.

     

    "With standard HashTable your expected to get x100, x10 the throuputt for the same workload."

    3) I do no agree that current GC algorithms (excluding Azul) can handle large heap sizes (30GB+) for non-static data. We could argue what defines data as static (which write percentage etc), but observations I made convince me so, hence I do not agree that HashTable/Map based approach is viable. Unfortunately I don't have hardware to test that myself now.

     

    "If your looking for read/write cache that will front end your database with built-in redundancy then you would need transaction support, extended query semantics to manage that data. "

    4) Well indeed, if I need redundancy or extended query support, BM does not seem to solve that type of problem, and I was trying to point out that there are sets of problems with key/value behavior that can be solved with BM.

    5) To summarize: a) I disagree that standard GC (excluding Azul) can handle efficiently huge heaps (hence Map approach does not hold), b) single machine multi-VM grid solution adds both memory penalty (though not significant) and performance penalty, c) if I do not need redundancy and extended query support and similar, but rather have problem that fits to what BM can solve, I would consider BM as possible solution, d).

    6) Only one thing can be 100% correct as to which caching solution to choose: benchmark them for your specific application and see what you gain and what you lose.

  41. Scale-out vs Scale-up

    2) If you meant that a single-machine, multi-VM, GS-based configuration has higher throughput, I tend to disagree, since I don't think that access to off-heap memory is slower than networking coordination and transfer, even over loopback (for this 300GB case). If, however, you meant that the simple Map approach offers higher throughput, I disagree with that too, as explained below.

    I think we're running into a terminology gap. If in your view big memory is a local cache with a big HashTable, then you're right. I thought we started this entire thread (scale-up vs. scale-out) with an understanding that big memory is the ability to manage terabytes of data in-memory, effectively and on fewer machines. Even in a single-machine environment I would assume that the common scenario would be a shared (central) cache serving multiple clients/servers rather than having each server run hundreds of gigabytes in a single VM.

     

     

  42. The Ehcache BigMemory Performance Benchmark and Results are now published on ehcache.org

    Greg Luck
    Founder and CTO Ehcache, Terracotta

  43. The numbers look strong, and I really like the ease of use and plug-and-play capability.

    This means only one thing for us consumers of cache solutions on the market: even fiercer competition, more innovation, better products, and hopefully even lower prices. In the best case caching becomes a commodity (if it is not one already), though I am not sure vendors would like that part to happen at all. :)

  44. The numbers look strong, and I really like the ease of use and plug-and-play capability.

    This means only one thing for us consumers of cache solutions on the market: even fiercer competition, more innovation, better products, and hopefully even lower prices. In the best case caching becomes a commodity (if it is not one already), though I am not sure vendors would like that part to happen at all. :)

    The sentiment expressed in your last paragraph is only valid if Oracle doesn't buy up all the potentially viable cache solution vendors out there. They currently sell some three Java EE app servers and three or so DBMS solutions, for example. So selling half a dozen distributed caching solutions shouldn't be a problem. :)

  45. "The sentiment expressed in your last paragraph is only valid if Oracle doesn't buy up all the potentially viable cache solution vendors out there"

    Well let's hope such a thing does not happen. :)

  46. The Ehcache BigMemory Performance Benchmark and Results are now published on ehcache.org

    Greg Luck
    Founder and CTO Ehcache, Terracotta

    These results seem rather impressive. It would be nice to see them verified by other independent groups very soon.

    I also like the seemingly easy setup and configuration.

  47. Greg

    Thanks for sharing this information - much appreciated!

    I have a few questions in that regard:

    You mentioned at one point that you measured a 50% read / 50% write scenario:

    We used 50 threads doing an even mix of reads and writes with 1KB elements. We used the default garbage collection settings.

    However, you chose to present graphs for the 90% read / 10% write case:

    The following charts show the most common caching use case. The read/write ratio is 90% reads and 10% writes. The hot set is that 90% of the time cache.get() will access 10% of the key set.

    1. Is there a reason for that? Can you also share the results of the 50% read / 50% write test?

    2. By stating that in your test "cache.get() will access 10% of the key set", I assume it means that out of the 90% reads only 10% will access the off-heap storage and go through the de-serialization process, while all the rest will basically hit the in-memory reference - is that right?

    3. Have you tried your test against an optimized GC configuration, i.e. G1 or something similar? The reason I believe this is important is that it would help identify whether the issue is GC tuning complexity or something in the actual algorithm that makes the current GC incapable of meeting the type of workload you're representing.

    4. Have you done any comparison against the standard HashMap? The reasons I believe it could be interesting are:

    a. The main question is whether or not the current GC can handle large memory. A test against a standard HashMap would provide objective results that could validate that statement.

    b. HashMap is not vendor-specific and could serve as a good reference.

     

  48. "The main question is whether or not current GC can handle large memory....Have you tried your test against an optimized GC configuration"

    Hmm, I thought we all implicitly agreed that current GC algorithms (other than Azul's) can not solve GC SLA problem on large data sets.

  49.  

     


    1. Is there a reason for that? Can you also share the results of the 50% read / 50% write test?

    Yes, we have charts for the 50% read / 50% write scenario. This scenario does happen, but it is not as common a caching use case in my experience. The performance results for the Ehcache off-heap store in this scenario are very good.

     

    I have uploaded those charts too. See http://ehcache.org/

    2. By stating that in your test "cache.get() will access 10% of the key set", I assume it means that out of the 90% reads only 10% will access the off-heap storage and go through the de-serialization process, while all the rest will basically hit the in-memory reference - is that right?

     

    No. The 10% of the key set varies according to the data size. At the 512MB data size the hot set is 51MB, which does fit in the on-heap store, which we set to a maxElementsInMemory equating to around 200MB for the test. Then at the 1GB data size the 10% is 100MB, which also fits. Beyond 2GB it no longer does. Finally, once we get to 40GB, 10% is 4GB, and only 5% of that fits in our 200MB on-heap store.

    This is why you see average latency and throughput start off faster due to on-heap hits and then gracefully degrade down to lower but linear performance. Of course, at small sizes GC is also not a big problem.
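    (In outline, the access pattern being described is the following - a simplified sketch, not the actual test source, which is published with the results on ehcache.org:)

        import java.util.Random;
        import net.sf.ehcache.Cache;
        import net.sf.ehcache.Element;

        // 90% reads / 10% writes, with reads confined to a 10% "hot set" of the keys.
        public class WorkloadSketch {
            static void run(Cache cache, int keyCount, int iterations) {
                Random rnd = new Random();
                byte[] payload = new byte[1024];                       // 1KB elements
                int hotSet = Math.max(1, keyCount / 10);               // the 10% hot key set
                for (int i = 0; i < iterations; i++) {
                    if (rnd.nextInt(100) < 90) {
                        cache.get("key-" + rnd.nextInt(hotSet));       // read from the hot set
                    } else {
                        cache.put(new Element("key-" + rnd.nextInt(keyCount), payload));  // write anywhere
                    }
                }
            }
        }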

    3. Have you tried your test against an optimized GC configuration, i.e. G1 or something similar? The reason I believe this is important is that it would help identify whether the issue is GC tuning complexity or something in the actual algorithm that makes the current GC incapable of meeting the type of workload you're representing.

    Not specifically for this test. I like I suspect the readers of this thread, have done a lot of GC tuning. Indeed I was doing it Friday for a customer. My own experience is that I can generally get some GC magic to make things work well up to around 6GB in the 90:10 read write case. But most people use Ehcache with much smaller heaps.

    The test source is up there. I suggest you give it a spin. You may get marginally better results in the 2GB to 8GB range, but I doubt they will hold up at the higher levels.
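    If you do give it a spin with different collectors, one quick way to see how much collection time a run actually accumulates is to poll the GC MXBeans from a background thread. A minimal, self-contained sketch; the flags in the comment and the stand-in workload are illustrative assumptions, not the benchmark's settings:

        import java.lang.management.GarbageCollectorMXBean;
        import java.lang.management.ManagementFactory;
        import java.util.ArrayList;
        import java.util.List;

        // Prints cumulative GC count and accumulated collection time every few seconds.
        // Run the JVM with whichever collector you want to compare, e.g.
        // -Xms6g -Xmx6g -XX:+UseConcMarkSweepGC, and watch how the totals grow with heap size.
        public class GcWatcher implements Runnable {
            public void run() {
                try {
                    while (true) {
                        long count = 0, millis = 0;
                        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                            count += gc.getCollectionCount();
                            millis += gc.getCollectionTime();
                        }
                        System.out.println("GC collections=" + count + ", accumulated GC time=" + millis + " ms");
                        Thread.sleep(5000);
                    }
                } catch (InterruptedException ignored) {
                    // stop polling
                }
            }

            public static void main(String[] args) {
                Thread watcher = new Thread(new GcWatcher(), "gc-watcher");
                watcher.setDaemon(true);
                watcher.start();

                // Stand-in workload so the watcher has something to report; replace it with the cache test.
                List<byte[]> sink = new ArrayList<byte[]>();
                for (int i = 0; i < 5000000; i++) {
                    sink.add(new byte[1024]);
                    if (sink.size() > 10000) sink.clear();
                }
            }
        }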

    4. Have you done any comparison against the standard HashMap? The reasons I believe it could be interesting are:

    a. The main question is whether or not the current GC can handle large memory. For that, a test against the standard HashMap would provide objective results that could validate that statement.

    b. HashMap is not vendor-specific and could serve as a good reference.

    You mean java.util.HashMap? Ehcache 1.5 and lower used that with synchronization. This benchmark, which uses 50 threads, would give much worse performance against it. In Ehcache 1.6 I replaced synchronized with CAS using a variety of approaches. A 100-thread test with a mix of gets, puts and removes, which I have been using for the last 7 years, shows the dramatic performance improvements I got from replacing java.util.HashMap. See http://gregluck.com/blog/archives/2009/02/ehcache-1-6-2-orders-of-magnitude-faster/
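    For readers who haven't run into the difference, here is a contrived sketch of the two styles. It is not Ehcache's actual store code, just a comparison of a coarse-lock put-if-absent against the finer-grained putIfAbsent that java.util.concurrent's ConcurrentHashMap provides (Ehcache 1.6 used its own variations on this theme):

        import java.util.HashMap;
        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.ConcurrentMap;

        // Illustration only, not Ehcache internals.
        public class PutIfAbsentStyles {

            // Coarse-lock style (what a synchronized HashMap gives you): every get/put
            // serializes on one monitor, so 50-100 benchmark threads queue on the lock.
            static final Map<Object, Object> lockedMap = new HashMap<Object, Object>();

            static Object lockedPutIfAbsent(Object key, Object value) {
                synchronized (lockedMap) {
                    Object existing = lockedMap.get(key);
                    if (existing == null) {
                        lockedMap.put(key, value);
                        return value;
                    }
                    return existing;
                }
            }

            // Lock-striped / CAS-friendly style: ConcurrentHashMap lets readers proceed
            // without blocking and writers rarely contend on the same segment.
            static final ConcurrentMap<Object, Object> concurrentMap = new ConcurrentHashMap<Object, Object>();

            static Object casPutIfAbsent(Object key, Object value) {
                Object existing = concurrentMap.putIfAbsent(key, value);
                return existing == null ? value : existing;
            }
        }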

    Greg Luck
    Founder and CTO Ehcache, Terracotta 

     

  50. Greg

    Thanks for the detailed response; I appreciate it.

    You mean java.util.HashMap? Ehcache 1.5 and lower used that with synchronization

    I obviously meant ConcurrentHashMap, not Hashtable.

    I have a talk tomorrow during JavaOne, "NoSQL Alternatives: Principles and Patterns for Building Scalable Applications", which may be a good place to discuss some of the points from this thread. If you're around (it's at 8 AM!) I'd be happy for you to join the session and state your position in case the question comes up.

    Anyway, I'll take a look at the code and see if I have further questions.

     

     

  51. Haha... how true, this:

    Over time, that means if my magical 300G cache does actually get saturated and fragmented, it has to…

    Run... a...

    Garbage collection cycle.

     

    I've had this discussion so many times with C++ programmers who are quick to diss the garbage collector, until they come back with their tails between their legs agreeing that fragmentation is a major problem for long-running servers and that the JVM's managed heap is consistently much faster for object allocation across platforms. Cliff Click (Azul) explains why in great detail on his blog.

    I've written about it before - http://javaforu.blogspot.com/2008/05/memory-dont-forget-it-hurts.html and http://javaforu.blogspot.com/2010/01/sqlite-talk-select-from-sqliteinternals.html

    Case in point: see how many malloc replacements there are: jemalloc (used by Firefox), Google's TCMalloc, SQLite's allocator, memcached's slab allocator... the list goes on.

     

    Cheers!

    Ashwin.

  52. Like Nati said, if it's 90% reads then there is really no fragmentation. The cached data is just sitting on the heap.

    But in a system where objects are allocated at a fast rate, say 50/50 reads and writes, that's when you will notice holes getting created in the heap, which becomes a problem over time when you don't use a (ahem) GC to compact the heap. That's why memcached, Firefox, Google and the rest use custom (pre)allocators and malloc replacements in their C/C++ code.

    You will also notice that Java object allocation is faster because it does not have to make a native call to allocate memory every time, and it is all pre-allocated if -Xms and -Xmx are the same.
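    To illustrate that point (with all the usual micro-benchmark caveats, and with sizes that are assumptions rather than anything measured for this thread), compare bump-the-pointer heap allocation with repeatedly asking for native memory:

        import java.nio.ByteBuffer;

        // Naive sketch: small-object allocation on the pre-sized Java heap vs. a native
        // (direct) allocation for every request. Not a rigorous benchmark.
        public class AllocSketch {
            public static void main(String[] args) {
                final int iterations = 10000;
                final int size = 1024;

                long t0 = System.nanoTime();
                byte[] heapSink = null;
                for (int i = 0; i < iterations; i++) {
                    heapSink = new byte[size];                    // TLAB bump-the-pointer allocation
                }
                long heapNs = System.nanoTime() - t0;

                long t1 = System.nanoTime();
                ByteBuffer directSink = null;
                for (int i = 0; i < iterations; i++) {
                    directSink = ByteBuffer.allocateDirect(size); // native allocation plus zeroing
                }
                long directNs = System.nanoTime() - t1;

                System.out.println("heap: " + heapNs / 1000000 + " ms, direct: " + directNs / 1000000 + " ms");
                System.out.println(heapSink.length + directSink.capacity()); // keep the results live
            }
        }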

    For a read-mostly pattern, plain old JVMs can hold large heaps. Facebook's Hadoop NameNodes run with 50+GB heaps - http://borthakur.com/ftp/conf.tar.gz (http://hadoopblog.blogspot.com/2010/05/facebook-has-worlds-largest-hadoop.html)

     

    Of course, this is not an attempt to belittle what you guys have built - you say it's all in pure Java and I like that! It's just that JVMs already have enough FUD against them, spread by various parties for their own reasons, and I felt compelled to (attempt to) set the record straight :)

     

    Cheers!

    Ashwin Jayaprakash.

  53. Ashwin,

    Please note that, as http://www.ehcache.org/ shows, the 50/50 read/write case runs better for BigMemory than 90/10.  We chose to show 90/10 because it is more indicative of real-world cache usage.  But, yeah.  The more one writes to BigMemory, the more advantage over normal JVM GC you see, because we are never pausing, while the more one writes, the more Java will pause (or at least the more potential it has to pause).

     

    --Ari

  54. Agree with most of it[ Go to top ]

    I agree with most of what is in the article, with the exception of the claim that a GC is strictly needed when you have heterogeneous block sizes. You can work around this limitation by using a paged cache and allowing objects to be cached in non-contiguous pages. I am not claiming that this is what was done for BigMemory, just that it is not strictly needed.

    I do understand that this adds some overhead and that you will be wasting some space here and there, but if BigMemory handles a 300GB cache, then a GC cycle there would be a lot more expensive than the paged approach (see the sketch below).
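    As a rough illustration of the idea (not a claim about how BigMemory is implemented), a paged off-heap store can carve a direct buffer into fixed-size pages and let one serialized value span several non-contiguous pages, so freeing a value never leaves an oddly-sized hole to compact. The names, page size and read path omission below are all simplifications of mine:

        import java.nio.ByteBuffer;
        import java.util.ArrayDeque;
        import java.util.Deque;

        // Sketch only: fixed-size pages over one direct buffer; a value may occupy
        // several non-contiguous pages, so there are no variable-sized holes to compact.
        public class PagedStoreSketch {
            static final int PAGE_SIZE = 4 * 1024;   // illustrative page size
            static final int PAGE_COUNT = 1024;      // 4MB arena for the sketch

            private final ByteBuffer arena = ByteBuffer.allocateDirect(PAGE_SIZE * PAGE_COUNT);
            private final Deque<Integer> freePages = new ArrayDeque<Integer>();

            public PagedStoreSketch() {
                for (int i = 0; i < PAGE_COUNT; i++) freePages.push(i);
            }

            /** Writes the bytes across whatever free pages are available; returns the page indexes used. */
            public int[] write(byte[] serialized) {
                int pagesNeeded = (serialized.length + PAGE_SIZE - 1) / PAGE_SIZE;
                if (freePages.size() < pagesNeeded) throw new IllegalStateException("store full; evict first");
                int[] pages = new int[pagesNeeded];
                int offset = 0;
                for (int p = 0; p < pagesNeeded; p++) {
                    int page = freePages.pop();      // pages need not be adjacent to each other
                    pages[p] = page;
                    int length = Math.min(PAGE_SIZE, serialized.length - offset);
                    ByteBuffer slice = arena.duplicate();
                    slice.position(page * PAGE_SIZE);
                    slice.put(serialized, offset, length);
                    offset += length;
                }
                return pages;
            }

            /** Freeing just returns whole pages to the pool; no compaction pass is ever needed. */
            public void free(int[] pages) {
                for (int page : pages) freePages.push(page);
            }
        }

    The trade-off is exactly the one mentioned above: internal fragmentation inside the last page of each value, in exchange for never needing a compaction cycle.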

  55. Query AND Scale-up vs. out[ Go to top ]

    All,

    Joe's original post makes many many assumptions.  I wanted to go on record and state:

    1. Query / search on the key/value store is indeed a good idea.  It was coming for Ehcache in September and got delayed a tad for BigMemory.  Search for Ehcache as well as for Ehcache+Terracotta will be here before you know it (this year).

    2. Ehcache BigMemory does not suffer GC-style problems.  We tested to be certain of that fact.  And there is no issue when the read/write ratio gets closer to 50/50.  Joe seems to be assuming an implementation and then assuming some flaws, which is fine because he clearly states he is writing his opinion... and he is not claiming he tested and found flaws (in fact, he is asking for hardware to test with).  Anyway, the test is publicly available now, as are the results for 90/10 reads/writes and 50/50.  All are hosted over at ehcache.org.

    3. BigMemory works in a single Ehcache node as well as in the cluster.  As I have written on my blog, and as several other Terracotta folks have explained, BigMemory can be used in Terracotta servers and in Ehcache nodes.  This means you don't have to choose between scaling up and scaling out.  You can do both.  Example: 100GB of data in cache can be stored at 10GB per JVM in a 10-node Ehcache+Terracotta cluster with BigMemory, with no fear of pauses.  It can also be stored in a single JVM using just Ehcache at 100GB, with no fear of pauses.  Or it can use Ehcache + Terracotta scaled out in a more "traditional" grid at 2GB per JVM across 50 nodes.

    Thanks for all the questions and curiosity.  Come see our booth at JavaOne this week.  We have machines with 128GB of RAM and can demo the same app running on Java + GC versus Java + BigMemory (working around the GC).  Definitely worth seeing.  Our engineers will also be able to answer all your questions.

    --Ari