The comparison could be the exact same machine running 15 VMs of 20 GB each, not necessarily 15 machines with 20 GB each.
From my perspective this is not an optimal solution. Yes, it works, but if BM (BigMemory) works too, it is the better option, because in your case RAM and OS resources are wasted for every JVM instance. For example, a hello-world application (a simple program that just invokes System.in.read()) on my machine, with the default configuration, wastes about 5 MB, so a cluster of 15 JVM instances on a single machine running that hello-world application with default JVM parameters wastes 75 MB. I personally would prefer better utilization of my existing hardware resources, and BM seems to deliver that.
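For reference, a minimal sketch of the kind of idle program described above (the class name is mine); the process blocks forever, so whatever memory it holds is pure per-instance JVM overhead:

```java
// Idle "hello world": the JVM blocks on stdin, so its resident memory
// is essentially the fixed cost of one JVM instance.
public class HelloIdle {
    public static void main(String[] args) throws java.io.IOException {
        System.in.read(); // block forever; inspect the process from outside
    }
}
```

Run it with default flags and check the footprint with something like "ps -o rss -p <pid>" to reproduce the measurement on your own hardware.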
This is a straw man - you'd not need BM for "Hello, World." I still think that a pure-cache application, while it could benefit from a large read-mostly cache, would do better with a distributed heap.
More failure-tolerant: the benefit of a unicycle is that you have fewer moving parts, but the downside is that the unicycle is harder to ride and a flat tire is a "problem with finality." A distributed system, meaning one that lives on multiple machines and not just multiple JVMs on one machine, can tolerate isolation or machine failure, whereas having everything on one box cannot. If it's an orange-jumpsuit application (meaning "one whose failure can lead to you wearing a nice orange jumpsuit"), putting all your eggs in one basket ain't so wise.
Another aspect is that read-mostly caches typically want to do something with the data being cached; that means processing, and huge gouts of memory bolted onto a static amount of processing power mean there's an imbalance in your future - for example, a 64 GB cache in front of eight cores leaves each core nominally responsible for chewing through 8 GB. This assumes you already have a decent ratio of processing power to RAM; if you've lots of CPU power to spare, well, obviously the ratios change in your favor.
A third aspect, one that others have pointed out (even BigMemory, obliquely), is that most cache requests focus on a small subset of the entire dataset; in that case, you have to factor in the cost of a cache miss. If a miss isn't that big of a deal for you, then you might as well cache the things you use a lot and ignore the things you don't - see the sketch below. A read-mostly cache sized that way has negligible effect on a GC cycle in the first place.
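As an illustration of that "hot subset" idea, here's a minimal sketch in plain Java (not any particular product's API; the class name is mine): an access-ordered LinkedHashMap gives you LRU eviction, so the cache holds the frequently used entries and lets the long tail miss.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache for the hot subset: cold entries are evicted instead of
// growing the heap, so the GC has a small, stable region to manage.
public class HotSubsetCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public HotSubsetCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true -> LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the coldest entry on overflow
    }
}
```

Note this isn't thread-safe; you'd wrap it with Collections.synchronizedMap() or use a real cache library before putting it anywhere near production.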
"The lower SLA argument at this point is theoretical and i personally have lots of doubts that this is indeed lower as been presented."
I concluded that GS on a single machine with a single JVM would have a lower SLA because only Azul claims to have solved the GC problem, and to my understanding you don't have a pauseless GC implementation at the moment. In the case of a multi-JVM configuration on a single machine, see above.
Have you actually tested Azul, to see whether it's a risk to use their toolchain? Or is this FUD? Given your reasoning, I suspect FUD, although I don't think it's malicious.
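Neither side's pause claims need to stay theoretical, by the way. HotSpot will log every stop-the-world pause if you run with -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -Xloggc:gc.log, or you can poll the standard management beans from inside the process - a rough sketch (the class name is mine, and cumulative collection time is a coarser measure than individual pauses):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Periodically samples cumulative GC count and time via the standard
// MX beans - coarse, but enough to compare configurations empirically.
public class GcTimeProbe {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            long count = 0, millis = 0;
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                count += gc.getCollectionCount();
                millis += gc.getCollectionTime();
            }
            System.out.println(count + " collections, " + millis + " ms total");
            Thread.sleep(1000); // sample once a second while the workload runs
        }
    }
}
```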
"Personally i believe that the risk could work another ways as well"
Well, it is clear that we disagree on this topic, since I prefer well-proven ("ironed-out") solutions, such as Sun's JVM, which has been used in production millions of times.
Uh oh, that DOES sound like FUD.
"It sounds to me that your underestimating the effort, the complexity of doing that and therfore the alternative risk that is associated with that limitation."
This is entirely up to Terracotta (I don't care whether it is complex, only whether it works). If they deliver monitoring/profiling support, that's one more benefit I, as a user, can weigh when deciding which caching solution to use; if they don't, it is one more drawback to be considered.
Heh, if you don't care whether it's complex, only that it works, then Azul enters back into the conversation!
It's okay, honestly; IMO you've a bias towards Ehcache and Terracotta, which is fine... but your arguments sound like confirmation bias rather than standalone arguments. "It's complex, that's bad for product X, because complexity is bad" right alongside "It's complex, that's fine for product Y because I don't care as long as it works" doesn't quite hold up.
I really, really, really wish I had a machine capable of handling 64 GB of RAM - and a local cluster with a decent amount of RAM as well. I'd love to test these things out with varying read/write ratios; the sketch below is roughly what I have in mind.
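Purely hypothetical, and every name and number here is illustrative, but the harness wouldn't need to be fancy: N threads hammering a shared map at a configurable read/write ratio, with the GC logging flags from above turned on so pause behavior can be compared across heap sizes.

```java
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Toy read/write-ratio benchmark: not a rigorous test, just enough to
// see how throughput and GC behavior move as the mix changes.
public class ReadWriteRatioBench {
    public static void main(String[] args) throws InterruptedException {
        final double readRatio = 0.9;             // vary per run: 0.5, 0.9, 0.99...
        final int threads = 8, seconds = 30, keySpace = 1_000_000;
        final Map<Integer, byte[]> cache = new ConcurrentHashMap<Integer, byte[]>();
        final AtomicLong ops = new AtomicLong();
        final long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(seconds);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                Random rnd = new Random();
                while (System.nanoTime() < deadline) {
                    int key = rnd.nextInt(keySpace);
                    if (rnd.nextDouble() < readRatio) {
                        cache.get(key);                 // read path
                    } else {
                        cache.put(key, new byte[1024]); // write path: 1 KB payload
                    }
                    ops.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(seconds + 5, TimeUnit.SECONDS);
        System.out.println((readRatio * 100) + "% reads: " + ops.get() + " ops in " + seconds + "s");
    }
}
```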
Any sponsors? :)