GridGain Scales to 512 Nodes on Amazon EC2 (13 messages)
- Posted by: Eugene Ciurana
- Posted on: August 06 2008 09:30 EDT
What are the scalability characteristics of EC2? Solid numbers on the performance of EC2 had been elusive. Max Gorbunov from GridGain executed a 512-node Monte Carlo simulation to find out how well Amazon EC2 performs and shared his results. The test consisted of a custom setup based on open-source components including GridGain and Open MQ running on the default EC2 Fedora Core 8 distribution and using a custom test harness developed for this project. The test showed near-linear scalability from 2 to 512 nodes and good performance throughout the test. Max's article goes on to describe how the software was set up to work within the restrictions of the EC2 environment and how everything executed.
Threaded Messages (13)
- Small correction by Nikita Ivanov on August 06 2008 14:09 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by John Davies on August 08 2008 10:19 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Max Gorbunov on August 08 2008 13:06 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Rob Davies on August 11 2008 08:05 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Dmitriy Setrakyan on August 11 2008 16:51 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Joseph Ottinger on August 11 2008 11:57 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Dmitriy Setrakyan on August 11 2008 15:49 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Joseph Ottinger on August 11 2008 18:20 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Nikita Ivanov on August 11 2008 19:57 EDT
- Re: GridGain Scales to 512 Nodes on Amazon EC2 by Joseph Ottinger on August 12 2008 05:39 EDT
- Self-reply... because Nikita did sort of explain it by Joseph Ottinger on August 12 2008 07:33 EDT
- Re: Self-reply... because Nikita did sort of explain it by Kirk Pepperdine on August 19 2008 13:40 EDT
- Let me clarify it for you... by Nikita Ivanov on August 11 2008 16:28 EDT
Small correction
- Posted by: Nikita Ivanov
- Posted on: August 06 2008 14:09 EDT
- in response to Eugene Ciurana
Max works for Grid Dynamics, which performed this test with GridGain software. Best, Nikita Ivanov. GridGain - Grid Computing Made Simple
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: John Davies
- Posted on: August 08 2008 10:19 EDT
- in response to Eugene Ciurana
This was interesting. Can you please tell me which AMI you used, or was it your own version? How did you set up the instances? There's still, AFAIK, no way to guarantee instances on the same subnet. Were the ActiveMQ problems related to it not scaling? Thanks, -John-
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Max Gorbunov
- Posted on: August 08 2008 13:06 EDT
- in response to John Davies
This was interesting. Can you please tell me which AMI you used, or was it your own version?
We used our private AMI shared only with GridGain. We're going to make it public soon, so you can evaluate it.
How did you set up the instances? There's still, AFAIK, no way to guarantee instances on the same subnet.
There is a way to co-locate instances in a single availability zone. In either case IP multicast is unavailable, so you have to use a non-default DiscoverySPI to perform discovery. Please read the comments on our blog; you can find some answers there. Best wishes, Max
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Rob Davies
- Posted on: August 11 2008 08:05 EDT
- in response to Eugene Ciurana
Shame you used such an old version of ActiveMQ - version 5.1 is a lot better - and it's currently being used with 500+ clients per broker in production - using plain old blocking I/O - and we've scaled to thousands - and got better performance using NIO. cheers, Rob
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Dmitriy Setrakyan
- Posted on: August 11 2008 16:51 EDT
- in response to Rob Davies
Shame you used such an old version of ActiveMQ - version 5.1 is a lot better - and it's currently being used with 500+ clients per broker in production - using plain old blocking I/O - and we've scaled to thousands - and got better performance using NIO.
Hi Rob, ActiveMQ is a very nice and easy-to-use JMS implementation, and many of our users do use our JMS Grid Node Discovery implementation with ActiveMQ in production in environments where IP Multicast is not supported. I will make sure we download the latest ActiveMQ release and give it a shot. Best, Dmitriy Setrakyan GridGain - Grid Computing Made Simple
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Joseph Ottinger
- Posted on: August 11 2008 11:57 EDT
- in response to Eugene Ciurana
Okay, I'm confused. First off, it should be clear to everyone: I'm a GigaSpaces Technologies employee, and as such I'm not an entirely neutral observer. But... this is meaningless. I think it's good that they, um, "scaled up" but... I don't quite understand what's being illustrated. If it's something like "Yay, we can use 512 of the EC2 nodes," well, that's good - but since EC2 can go up to 550 nodes, why not use them all? All you're doing is verifying the claim that you can use that many instances. Farming out the jobs via ActiveMQ is good - but it's also a well-accepted and well-known method for distributing jobs to worker nodes. Monte Carlo, though, has no transactional interdependencies, so you're not actually doing anything other than spawning tasks to worker nodes. Given that methodology, a degradation of 20% over a 256x growth of the "cluster size" is rather shocking - adding consumers added that much degradation? That'd... worry me more than anything else. Also: what was gained by the use of GridGain? Why not just set up an HTTP server to farm out requests RESTfully, and accept responses the same way? That way we'd be able to claim that HTTP can support up to 512 clients... (And yes, that was meant sarcastically. Any HTTP server that can't handle 512 stateless clients needs to be taken out and shot.)
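To make the "no interdependencies" point concrete, here is a minimal Java sketch of an independent-task Monte Carlo run. It is not the Grid Dynamics test harness and uses no GridGain or JMS API: a plain JDK thread pool stands in for the worker nodes, the pi estimate stands in for whatever was actually simulated, and the class and method names (`IndependentMonteCarlo`, `countHits`, `estimatePi`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IndependentMonteCarlo {

    /** One worker's share: count random points that land inside the unit quarter circle. */
    public static long countHits(long samples, long seed) {
        Random rng = new Random(seed);
        long hits = 0;
        for (long i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) {
                hits++;
            }
        }
        return hits;
    }

    /**
     * Splits the run into independent tasks. Each task needs only its sample
     * count and a seed, shares no state with the others, and the only
     * coordination is summing the partial counts at the end.
     */
    public static double estimatePi(int workers, long samplesPerWorker, long baseSeed) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                // Sequential seeds are good enough for a sketch, not for real statistics.
                final long seed = baseSeed + w;
                Callable<Long> task = () -> countHits(samplesPerWorker, seed);
                partials.add(pool.submit(task));
            }
            long totalHits = 0;
            for (Future<Long> partial : partials) {
                totalHits += partial.get();
            }
            return 4.0 * totalHits / ((double) workers * samplesPerWorker);
        } catch (Exception e) {
            throw new RuntimeException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(8, 1_000_000L, 42L));
    }
}
```

Because the tasks never talk to each other, adding workers should in principle only add dispatch and collection overhead, which is why the reported degradation draws attention in this post.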
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Dmitriy Setrakyan
- Posted on: August 11 2008 15:49 EDT
- in response to Joseph Ottinger
Joseph, This is the usual problem with benchmarks - somebody will be unhappy. Unfortunately you seem to have lost your objective view on technology after having joined GigaSpaces... which is kind of understandable since you do now work for a competitor company. Now, if you have to ask what benefits GridGain brought to the picture, you have not visited our website and practically know absolutely nothing about our product. I suggest you do some minimal reading before posting such inflammatory comments. How about these features just to name a few:
- Automatic node discovery
- Transparent grid-enabling of Java code with @Gridify annotation
- One of the best MapReduce implementations in the industry
- Zero deployment with Peer Class Loading
- Automatic Task Topology Management
- Load Balancing
- Automatic Fail-Over
- Grid job collision resolution
- Job Stealing (from more busy nodes to less busy nodes)
- Over 50 up-to-date metrics for all grid nodes
- Elegance of design and ease of use
- Open Source under LGPL and Apache license
- Many, many more...
Now, as far as the 20% overhead... in a grid as big as 512 nodes a lot of factors come into play. Note that JMS hub needs to manage 512 clients and significant overhead comes from that. I assume that GC comes into play as well here. In any case, I will let readers form their own opinion rather than listening to a baseless rant from a competitor company. Best, Dmitriy Setrakyan GridGain - Grid Computing Made Simple
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Joseph Ottinger
- Posted on: August 11 2008 18:20 EDT
- in response to Dmitriy Setrakyan
This is the usual problem with benchmarks - somebody will be unhappy. Unfortunately you seem to have lost your objective view on technology after having joined GigaSpaces... which is kind of understandable since you do now work for a competitor company.
That's no loss - you guys didn't see me as objective before, no reason for you to see me as objective now.
Now, if you have to ask what benefits GridGain brought to the picture, you have not visited our website and practically know absolutely nothing about our product. I suggest you do some minimal reading before posting such inflammatory comments.
I know absolutely nothing? Nonsense. My point was, and is, that for this test GridGain added... nothing. It's a test that shows that you can use GridGain on 512 nodes... sort of, except that you could have done the same thing without GridGain. I was hoping to see something... more. Just because I work for GigaSpaces doesn't mean I'm not interested in the technology.
Note that JMS hub needs to manage 512 clients and significant overhead comes from that. I assume that GC comes into play as well here.
This was my point to begin with. You didn't show anything about your technology... the benchmark was just a dog and pony show.
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Nikita Ivanov
- Posted on: August 11 2008 19:57 EDT
- in response to Joseph Ottinger
You didn't show anything about your technology... the benchmark was just a dog and pony show.
Joe, This bizarre overreaction is certainly hurting your employer's image. Think about it... Everyone's got your point no matter how ridiculous, in my opinion, it is. There's a full disclosure and information about this test for everyone to see and make their own conclusions. Grid Dynamics could have performed many other tests with GridGain, of course, including with transactional data grids using JBoss Cache, Ehcache, your very own GigaSpaces, or Coherence, to name a few. But I think the choice of test was very correct as it shows a basic and simple example of how GridGain can be used to achieve massive scalability with literally a few lines of code - in the business case that is used by 100s of businesses around the globe today. Relax and take a break :) Nikita Ivanov. GridGain - Grid Computing Made Simple
Re: GridGain Scales to 512 Nodes on Amazon EC2
- Posted by: Joseph Ottinger
- Posted on: August 12 2008 05:39 EDT
- in response to Nikita Ivanov
You didn't show anything about your technology... the benchmark was just a dog and pony show.
Joe, This bizarre overreaction is certainly hurting your employer's image. Think about it... Everyone's got your point no matter how ridiculous, in my opinion, it is. There's a full disclosure and information about this test for everyone to see and make their own conclusions.
Bizarre overreaction? What overreaction? I'm speaking as myself here, not for GigaSpaces; even if that were not so, how is GigaSpaces being affected by my questioning what a test is trying to show me? I'm actually a little surprised - I didn't say anything negative about GridGain here, at all. Yet you seem to feel attacked. I don't know why. I still don't see what the test was for. If you'd care to enlighten me instead of being defensive, that'd be great.
Grid Dynamics could have performed many other tests with GridGain, of course, including with transactional data grids using JBoss Cache, Ehcache, your very own GigaSpaces, or Coherence, to name a few. But I think the choice of test was very correct as it shows a basic and simple example of how GridGain can be used to achieve massive scalability with literally a few lines of code - in the business case that is used by 100s of businesses around the globe today.
There's the rub: Monte Carlo is in use by hundreds of businesses, sure. But they don't need GridGain to get the same numbers - or better! - that the test showed. (Nor do they need GigaSpaces to get the same numbers or better... or Coherence... or anything.) That's why I wondered about the test. You didn't show me anything. I wanted to see something. This is not an attack. If you want to see it as one, fine, go ahead - it's not like I've ever been able to stop you from deciding that if it's not overwhelmingly positive, it has to be negative.
Self-reply... because Nikita did sort of explain it
- Posted by: Joseph Ottinger
- Posted on: August 12 2008 07:33 EDT
- in response to Joseph Ottinger
From Nikita Ivanov, in this thread, message 266007:
I also don't think that Grid Dynamics claims anything beyond just this test - you can simply perform computationally intensive tasks with almost linear scalability on 512-node strong Amazon EC2 cloud.
This was my original point, and I regret not seeing you confirm this more clearly when I first read your responses. Should have seen it initially. But that still goes back to my original question: linear scalability for computationally intensive tasks - especially when they're not interdependent - is not a real accomplishment. If I had 512 (okay, 514, including MQ and database hosts) servers in my own lab, I could do the same thing... with or without GridGain, with or without almost everything mentioned here. The inclusion of GridGain is important because that's what was used -- but I still haven't seen what GridGain added. Was job stealing included? Were there any node failures? Were transactions a factor at all?
Re: Self-reply... because Nikita did sort of explain it
- Posted by: Kirk Pepperdine
- Posted on: August 19 2008 13:40 EDT
- in response to Joseph Ottinger
From Nikita Ivanov, in this thread, message 266007:
I also don't think that Grid Dynamics claims anything beyond just this test - you can simply perform computationally intensive tasks with almost linear scalability on 512-node strong Amazon EC2 cloud.
This was my original point, and I regret not seeing you confirm this more clearly when I first read your responses. Should have seen it initially.
But that still goes back to my original question: linear scalability for computationally intensive tasks - especially when they're not interdependent - is not a real accomplishment. If I had 512 (okay, 514, including MQ and database hosts) servers in my own lab, I could do the same thing... with or without GridGain, with or without almost everything mentioned here. The inclusion of GridGain is important because that's what was used -- but I still haven't seen what GridGain added.
Was job stealing included? Were there any node failures? Were transactions a factor at all?
I've read the description on the website and I have to say that one of the things that I believe differentiates marketing fluff from a serious study is transparency and a methodical reporting of method and results. For example, you claim linear scalability yet you don't offer any goodness-of-fit calculation. Also, without any idea about how the Monte Carlo simulation was set up, there is little anyone can say about the value of GridGain in this message. I'm not trying to side with Joe here. What I am saying is that this has the potential to be a very cool, useful study. The last benchmarking article that I did editorial work on took 4 months to complete. It was also a potentially cool study but it needed work before (IMHO) it could be published. I would like to offer you the same editorial advice that I gave then: please rework it to give us some useful information, code, methodology and statistics. Regards, Kirk
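As a rough illustration of the kind of statistic being asked for here, the Java sketch below computes two simple numbers from (node count, throughput) pairs: the R^2 of a least-squares straight line, and parallel efficiency relative to a baseline run. The class `ScalingFit`, its method names, and the numbers in `main` are all made up for illustration; none of them are measurements from the Grid Dynamics test. The one figure taken from this thread is the arithmetic behind "20% over a 256x growth": going from 2 to 512 nodes with a 204.8x speedup corresponds to an efficiency of 0.8.

```java
public class ScalingFit {

    /** Coefficient of determination (R^2) of the least-squares line through the points (x, y). */
    public static double rSquared(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxx = 0.0;
        double syy = 0.0;
        double sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
        // For a one-variable linear fit, R^2 equals the squared correlation coefficient.
        return (sxy * sxy) / (sxx * syy);
    }

    /** Parallel efficiency: observed speedup divided by the growth in node count. */
    public static double efficiency(int baseNodes, double baseThroughput,
                                    int nodes, double throughput) {
        double speedup = throughput / baseThroughput;
        double growth = (double) nodes / baseNodes;
        return speedup / growth;
    }

    public static void main(String[] args) {
        // Placeholder numbers for illustration only; NOT results from the test discussed here.
        double[] nodes = {2, 4, 8, 16};
        double[] throughput = {2.0, 3.9, 7.6, 14.8};
        System.out.println("R^2 = " + rSquared(nodes, throughput));
        System.out.println("efficiency = " + efficiency(2, 2.0, 16, 14.8));
    }
}
```

A high R^2 alone does not prove linear scaling, since a gently curving series can still fit a line well, so reporting per-size efficiency alongside it gives a clearer picture.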
Let me clarify it for you...
- Posted by: Nikita Ivanov
- Posted on: August 11 2008 16:28 EDT
- in response to Joseph Ottinger
I don't quite understand what's being illustrated.
What is being demonstrated is something that is used by 100s of financial, banking and insurance companies daily around the globe. And it's used almost in a verbatim scenario. I also don't think that Grid Dynamics claims anything beyond just this test - you can simply perform computationally intensive tasks with almost linear scalability on 512-node strong Amazon EC2 cloud. What I do like about this test (or benchmark) is that, unlike other "tests", it was:
- independently performed
- simple to understand
- easily verifiable