Terracotta's BigMemory Has Gone GA


  1. Terracotta's BigMemory Has Gone GA (9 messages)

    Terracotta's offering for managing massive Ehcache memory stores has gone GA.

    "BigMemory is Terracotta’s structural solution to the garbage collection problem. As a pure Java add-on to Ehcache, BigMemory is an in-process, off-heap cache that is not subject to garbage collection. "

    You can read more in the press release, although a more interesting read might be Himadri Singh's blog posting about how BigMemory deals with latency, throughput, and GC duration.
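
    For readers unfamiliar with the off-heap idea the quote describes, here is a minimal sketch of the underlying JDK mechanism (not BigMemory's actual implementation): a direct ByteBuffer lives in native memory outside the Java heap, so the garbage collector never scans its contents, and values must be serialized in and out on access.

        import java.nio.ByteBuffer;
        import java.nio.charset.StandardCharsets;

        // Minimal illustration of off-heap storage: a direct ByteBuffer is
        // allocated in native memory (bounded by -XX:MaxDirectMemorySize,
        // not -Xmx), so the GC never traverses what it holds.
        public class OffHeapSketch {
            public static void main(String[] args) {
                ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024);

                byte[] value = "cached-value".getBytes(StandardCharsets.UTF_8);
                offHeap.putInt(value.length);   // length prefix
                offHeap.put(value);             // serialized payload

                // Reading requires deserialization; on-heap objects are only
                // materialized on access, which is the cost of dodging GC.
                offHeap.flip();
                byte[] out = new byte[offHeap.getInt()];
                offHeap.get(out);
                System.out.println(new String(out, StandardCharsets.UTF_8));
            }
        }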

    Big Memory Goes GA Press Release

    Himadri Singh's blog posting, "Terracotta Fairy" brings BigMemory for Java users

    Threaded Messages (9)

  2. Terracotta's BigMemory Has Gone GA

    This is a scam, of sorts.

    This is a *cache*, not a Java heap replacement. They surely don't assume we have unbounded memory or want to shuffle everything into a cache.

    Sorry guys, the GC is still with us. Not that bad, actually. IT'S JUST A FRIGGIN' CACHE.

  3. Big Memory

    Well, 'scam' is a pretty harsh word.

    If you've got 300 gigs of data in your JVM cache, and someone can move 95% or 99% of it out of your heap and keep it all lightning fast, well, that's a good thing.

    No doubt there are semantic arguments to be made about the fact that the memory still lives in the JVM process, or what have you, but that doesn't take away from the performance numbers they post or the benefits their customers get.

  4. Analysis of BigMemory

    I don't know that I'd go so far as to say it's a "scam," but my thought, based on my tests and analysis, is that if BigMemory is the solution for you, your problem is poorly defined.

    I ran the same heap sizes Terracotta did (90G, although they've now published 100G heap numbers) with a standard ConcurrentHashMap. Not only were latency and throughput better by leaps and bounds (tens of millions of transactions/sec against their 200k), but I was also able to get better garbage collection results with fairly little effort.
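
    For the curious, here's a hypothetical sketch of the kind of test being described (not the actual harness, and not Terracotta's benchmark): several threads hammering a plain ConcurrentHashMap with a read-heavy mix, run under a large heap such as -Xms90g -Xmx90g plus whatever collector flags you're tuning.

        import java.util.Random;
        import java.util.concurrent.ConcurrentHashMap;

        // Hypothetical sketch of the comparison described above: several
        // threads working a plain ConcurrentHashMap with a 90/10 read/write
        // mix. Run with a big heap (e.g. -Xms90g -Xmx90g) and compare the
        // ops/sec you observe against published cache numbers.
        public class MapThroughputSketch implements Runnable {
            static final ConcurrentHashMap<Long, byte[]> MAP =
                    new ConcurrentHashMap<Long, byte[]>();
            static final int OPS_PER_THREAD = 10000000;

            public void run() {
                Random rnd = new Random();
                for (int i = 0; i < OPS_PER_THREAD; i++) {
                    Long key = Long.valueOf(rnd.nextInt(1000000));
                    if (rnd.nextInt(10) == 0) {
                        MAP.put(key, new byte[128]);   // 10% writes
                    } else {
                        MAP.get(key);                  // 90% reads
                    }
                }
            }

            public static void main(String[] args) throws InterruptedException {
                int n = Runtime.getRuntime().availableProcessors();
                Thread[] workers = new Thread[n];
                long start = System.nanoTime();
                for (int i = 0; i < n; i++) {
                    workers[i] = new Thread(new MapThroughputSketch());
                    workers[i].start();
                }
                for (Thread t : workers) {
                    t.join();
                }
                double secs = (System.nanoTime() - start) / 1e9;
                System.out.printf("%.0f ops/sec%n",
                        (double) n * OPS_PER_THREAD / secs);
            }
        }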

    Sorry, BigMemory looks like the right solution for the wrong problem, for almost every definition of problem there is. You're far better off tuning your application and distributing it properly.

  5. Context

    I'm a bit surprised by the negativity (something is really wrong with Terracotta's marketing approach). But I think we need to keep in mind that this technology exists within the context of Terracotta, a client-server technology that simply wasn't designed to work in a highly distributed environment. In that case, maxing out the heap on high-end boxes becomes a necessity (otherwise it won't scale even for simple problems).

    It is a technology that is VERY specific to Terracotta and its underlying client-server architecture. And I'm sure every other vendor will cite different results from GC optimization on large heaps (we can, and GigaSpaces does a few posts up).

    Best,

    Nikita Ivanov.

    GridGain - Cloud Computing With Zero Deployment.

  6. Context

    A few of the times I've used Ehcache, it was as a local cache on Z. In that case, it isn't distributed. For those purposes, BigMemory feels fine to me. I feel there's plenty of room for both models of cache usage.

  7. Context

    It *is* designated as a local cache, although you could distribute it via DSO - why you would do THAT is beyond me, honestly.

    But using it as a local cache gives you the performance characteristics of a distributed cache with none of the benefits.

    Let's assume you're using it as a local cache: every VM you use it with needs its own ginormous allocation of RAM. Four VMs at 90G each (for example), and that's 360G of RAM you're using. And access to it is slow; the hot set for each VM will be fast (and likely duplicated), but any data not in the hot set is slow, almost as if it were distributed in a worst-case scenario.

    At that point, you're better off USING a distributed cache. But wait! With DSO you can distribute it, right? But then you're crushing speed *again*. 

    The fact that they got BigMemory out the door is cool. That said, the range of apps it's suitable for is very limited, and most would benefit far more from investing elsewhere, in my honest and humble opinion, based on running THEIR test and comparing numbers.

  8. Context

    I would disagree with that argument as a general case. For some cases I would agree, but not the kind of cases I've encountered first hand on Z + Websphere. We used a local cache for reference data that isn't transactional. After startup, the cache data was close to 1 GB. Add to that, the production environment can spin up a new LPAR to accommodate increased load. At any given time, the number of LPARs running could change.

    I don't think anyone can say exactly how many people fit that use case. From my experience in the health insurance sector, there are a lot of companies on Z + Websphere + Db2. Distributed caches are great for scaling out, but many shops don't like that model because of their investment in IBM Z.
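
    For what it's worth, a purely local reference-data cache like the one described above might look like this with the Ehcache 2.x programmatic API (the cache name, sizing, and sample data here are illustrative, not from the actual setup being discussed):

        import net.sf.ehcache.Cache;
        import net.sf.ehcache.CacheManager;
        import net.sf.ehcache.Element;
        import net.sf.ehcache.config.CacheConfiguration;

        // Sketch of a local, non-distributed reference-data cache along the
        // lines described above. Reference data is non-transactional, so the
        // cache is marked eternal and simply re-warmed when the app starts.
        public class ReferenceDataCache {
            public static void main(String[] args) {
                CacheManager manager = CacheManager.create();

                // Illustrative name and size; eternal = entries never expire.
                CacheConfiguration config =
                        new CacheConfiguration("referenceData", 500000).eternal(true);
                Cache cache = new Cache(config);
                manager.addCache(cache);

                // Warm at startup; afterwards every read is an in-process
                // lookup with no network hop. Each LPAR carries its own copy.
                cache.put(new Element("plan:1001", "Gold PPO"));

                Element hit = cache.get("plan:1001");
                System.out.println(hit.getObjectValue());

                manager.shutdown();
            }
        }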

  9. Context

    Sure, and that's fine - but what you're describing is someone who removes their fuel injector and then demands that their car not use fuel, because it's not working.

    (Best analogy I could come up with offhand.)

    If that's what you want to do, it's all good - but you should recognize that you're choosing to define the problem in such a way that the solution fits, instead of having a problem that has other solutions.

    (I don't mean to castigate you in any way; you have your own requirements, and there's nothing wrong with that.)

  10. Context

    Honestly, I have little love for Z + Websphere + Db2. If I can avoid it, I do. If I can't, I do the best I can to work within the given environment. In an ideal world, developers would get to choose the production hardware, but that's often not the case. In the health insurance world, I find IBM is quite dominant, and many shops follow what IBM recommends.

    As much as I would like to change the problem definition, I usually don't have that power. Having lots of different products and options definitely helps, in my mind.