Discussions

News: GridGain 1.5 - Open Source Grid Computing For Java

  1. GridGain Systems has announced the availability of the GridGain 1.5 release, an enterprise open source grid computing platform for Java. This release hardens core functionality and APIs and provides new integrations. GridGain now supports JBoss, Spring, AspectJ, WebLogic, WebSphere, Coherence, GigaSpaces, JXInsight, and Mule with native integration. This release culminates 18 months of overall development and presents a software product with a unique set of grid computing features, open source LGPL licensing, and a clear focus on high-performance Java-based development. GridGain's main features include:
    • Clear focus on computational grids
    • Developed entirely on and for Java 5
    • LGPL open source licensed
    • Out-of-the-box integration with JBoss, Spring, Mule, Coherence, GigaSpaces, WebLogic, WebSphere and JXInsight
    • Unique annotation and AOP-based grid enabling technology

    Threaded Messages (17)

  2. What are the main differences between GridGain and Globus? Wei J2EE tools
  3. Well, the differences are quite significant. Globus is a heavy beast concentrating on solving computational problems in global grids (going across firewalls). It can take days to install and configure and is absolutely not suited for day-to-day development within enterprises. It is good for running non-critical miscellaneous jobs that would otherwise take days to run. While GridGain also concentrates on solving computational problems, its main focus is Ease of Use and Enterprise Grids. Unlike Globus, GridGain can take a task that would take several minutes to run and execute it on the grid within seconds. Also, once you install GridGain you will see how easy it is to use. Our P2P class-loading and AOP-based grid-enabling make development of grid applications very transparent, in many cases without explicit deployment steps or complex configuration. In fact, you can simply start a couple of grid nodes on your local computer, then run GridGain together with your application logic directly from Eclipse or IDEA and watch your code execute on the grid. I don’t think Globus comes anywhere near such simplicity. GridGain also supports a Distributed Task Session, which allows multiple jobs within the same task to communicate and synchronize with each other, as well as pluggable scheduling and collision management, discovery, fail-over, and many other enterprise-ready features. Nikita Ivanov http://www.gridgain.com
  4. Hadoop?

    How would you compare GridGain to Hadoop? Both are essentially replicating the map-reduce paradigm for distributed computation that was developed by Google...
  5. Re: Hadoop?

    You are right. Both GridGain and Hadoop implement Google’s MapReduce paradigm. However, the approaches differ quite significantly. Hadoop’s main focus is working with very large data files (terabytes in size), so its main responsibility is splitting large data into smaller subsets for processing. GridGain, by contrast, focuses on making it extremely easy to split your logic, not data (although you can split your data too). It allows the user to “map” a computation into multiple sub-computation units and distribute these units across the node topology. The user has fine-grained control over how tasks are distributed across the node topology, scheduling, fail-over of computations to other nodes, checkpoint storage for longer computations, etc. We encourage you to download both products and see which one suits your needs better. Nikita Ivanov, http://www.gridgain.com
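    To make the "split your logic, then aggregate" idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the GridGain API: a local thread pool stands in for grid nodes, and the class and method names (SplitAggregateSketch, split, reduce, execute) are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration only: a thread pool plays the role of grid nodes.
// None of these names come from GridGain; they are hypothetical.
public class SplitAggregateSketch {

    /** "Map" step: split one task (count letters in a phrase) into one job per word. */
    static List<Callable<Integer>> split(String phrase) {
        List<Callable<Integer>> jobs = new ArrayList<>();
        for (final String word : phrase.trim().split("\\s+")) {
            jobs.add(word::length);
        }
        return jobs;
    }

    /** "Reduce" step: aggregate the partial results into one answer. */
    static int reduce(List<Integer> partials) {
        int total = 0;
        for (int partial : partials) {
            total += partial;
        }
        return total;
    }

    /** Runs all jobs on a pool of workers and aggregates their results. */
    static int execute(String phrase, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Integer> partials = new ArrayList<>();
            for (Future<Integer> future : pool.invokeAll(split(phrase))) {
                partials.add(future.get());
            }
            return reduce(partials);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(execute("Hello Grid World", 3)); // prints 14
    }
}
```

    On a real grid the jobs would be shipped to remote nodes rather than to local threads, but the shape of the computation (split, execute, reduce) stays the same.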
  6. Re: Hadoop?

    Just a slight clarification… We certainly implement the Map/Reduce concept, but the overall idea of splitting and aggregating (reducing, in MPI speak) was certainly not developed at Google. It’s been around for several decades in one way or another (parallel computing, divide and conquer, etc.). Our core API design borrowed from several sources, chiefly MPI, the Concord project, and Map/Reduce. Nikita Ivanov www.gridgain.org
  7. Re: Hadoop?

    Just a slight clarification… We certainly implement the Map/Reduce concept, but the overall idea of splitting and aggregating (reducing, in MPI speak) was certainly not developed at Google. It’s been around for several decades in one way or another (parallel computing, divide and conquer, etc.). Our core API design borrowed from several sources, chiefly MPI, the Concord project, and Map/Reduce.
    I'm glad you pointed this out. The myth that this idea originated at Google has become fairly widespread.
  8. Congrats. GridGain is a good example of how grid computing can be made simple. It is also a good example of how commercial and OSS vendors can complement each other; see Geva Perry's recent post on the matter: GridGain-GigaSpaces Integration Nati S GigaSpaces Write Once Scale Anywhere
  9. It is also a good example of how commercial and OSS vendors can complement each other...
    Another important aspect is the high degree of extensibility built into the product stack, which allows customers and vendors to replace and/or enhance aspects of the runtime. I am very happy that they have also decided not to base their Grid programming model on a single AOP runtime or IOC framework (which other grid computing vendors have done) and have instead layered the support on top of a core runtime system whilst offering integration that aligns well with the strengths of each framework. Regards, William Louth JXInsight Product Architect Blog "Performance monitoring and problem management for Java EE, SOA and Grid Computing Platforms" JINSPIRED
  10. Congratulations, Nikita .. you have certainly brought this a long way since its original conception :-) Peace, Cameron Purdy Oracle Coherence: The Java Data Grid
  11. GridGain and Terracotta

    Could you please give a simple comparison of the GridGain and Terracotta (http://www.terracotta.org/) technologies?
  12. Re: GridGain and Terracotta

    Well, I am not familiar with Terracotta at a detailed level, but at face value it looks like what they provide is a Clustered JVM – not a Grid Computing platform. I am not sure how well it scales given its master-slave approach, but in the Grid Computing arena it looks like it lacks certain key features, such as
    • Grid task deployment
    • Peer class loading
    • Fine-grained job scheduling and collision management
    • Support for non-homogeneous environments (what if one computer is twice as fast as the other?)
    • Fine-grained control over how a job gets split between grid nodes and which node gets which piece of computation.
    • Fine-grained control over job fail-over logic – when a job should fail over and to which node.
    • Support for connected tasks and a concept of task session – what if one job has to signal the others about a certain event?
    • Checkpoint management for long-running jobs.
    Having mentioned the above, I am sure Terracotta has many features that are not found in many Grid Computing products today, and it has certainly found its niche in the distributed computing market. Best, Nikita Ivanov http://www.gridgain.org
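    The "one computer is twice as fast as the other" case from the list above can be illustrated with a small, hedged sketch of weight-proportional work splitting. This is plain Java and not GridGain's scheduling API; the node names, the weights, and the WeightedSplitSketch class are assumptions invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: hypothetical names, not a GridGain API.
public class WeightedSplitSketch {

    /**
     * Assigns work units to nodes in proportion to their relative speed.
     * Units lost to rounding are handed out one at a time in map order.
     */
    static Map<String, Integer> assign(int units, Map<String, Integer> weights) {
        int total = 0;
        for (int weight : weights.values()) {
            if (weight <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            total += weight;
        }
        if (total == 0) {
            throw new IllegalArgumentException("at least one node is required");
        }

        Map<String, Integer> shares = new LinkedHashMap<>();
        int assigned = 0;
        for (Map.Entry<String, Integer> node : weights.entrySet()) {
            // Proportional share, rounded down; long math avoids int overflow.
            int share = (int) ((long) units * node.getValue() / total);
            shares.put(node.getKey(), share);
            assigned += share;
        }

        int leftover = units - assigned; // always smaller than the node count
        for (Map.Entry<String, Integer> share : shares.entrySet()) {
            if (leftover == 0) {
                break;
            }
            share.setValue(share.getValue() + 1);
            leftover--;
        }
        return shares;
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("fast", 2); // assumed to be twice as fast as "slow"
        weights.put("slow", 1);
        System.out.println(assign(10, weights)); // prints {fast=7, slow=3}
    }
}
```

    A real scheduler would likely measure node speed or load at runtime instead of relying on static weights; the static map just keeps the sketch short.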
  13. Re: GridGain and Terracotta

    Well, I am not familiar with Terracotta at a detailed level, but at face value it looks like what they provide is a Clustered JVM – not a Grid Computing platform. I am not sure how well it scales given its master-slave approach, but in the Grid Computing arena it looks like it lacks certain key features, such as
    • Grid task deployment
    • Peer class loading
    • Fine-grained job scheduling and collision management
    • Support for non-homogeneous environments (what if one computer is twice as fast as the other?)
    • Fine-grained control over how a job gets split between grid nodes and which node gets which piece of computation.
    • Fine-grained control over job fail-over logic – when a job should fail over and to which node.
    • Support for connected tasks and a concept of task session – what if one job has to signal the others about a certain event?
    • Checkpoint management for long-running jobs.
    Having mentioned the above, I am sure Terracotta has many features that are not found in many Grid Computing products today, and it has certainly found its niche in the distributed computing market.

    Best,
    Nikita Ivanov
    http://www.gridgain.org
    Since Terracotta was mentioned and answered by Nikita (who politely points out he is not familiar with our products, so no worries), I just want to clarify two things. While many grid-associated vendors believe that Terracotta cannot "do grid" because it has no API, this is a misconception. Our grid framework is located here: http://www.terracotta.org/confluence/display/labs/WorkManager. It can be used for distributed queries and partitioning of work, and it has interfaces for handling failure, etc. As a general rule, Terracotta is not against APIs, nor does it demand that people code in POJOs without frameworks. Terracotta likes to enable HA and scale-out of frameworks like Spring, web app frameworks like Wicket and Rife, or asynchronous frameworks such as SEDA and java.util.concurrent. This, BTW, suggests that a framework like GridGain could likely work ON TOP of Terracotta as opposed to being directly compared with it. Another misconception, which we admittedly caused ourselves, is that we are a "clustered JVM." We like to say that we cluster _at_ the JVM level, which means we plug in to HotSpot or the IBM VM... not that we implement our own. Thanks, --Ari
  14. Re: GridGain and Terracotta

    Ari, making an effort towards having an API is definitely a step in the right direction :) However, compliance with CommonJ is just a small part of what GridGain provides. The fact that Terracotta is trying to comply with CommonJ probably puts it in competition with Tangosol Coherence, which is also a Data Grid product with a CommonJ API. However, Coherence caches transactional data that is backed by a DB, and I am not sure whether Terracotta DSO can be backed by a DB or be transactional, for that matter. Coming back to Computation Grids, I would still say that some of the most essential features of Grid Computing are:
    1. Ability to split your task into sub-tasks and then aggregate the results.
    2. Peer Class Loading - you should not have to restart your grid nodes whenever you change your code.
    3. Grid task deployment.
    4. Comprehensive grid job scheduling routines.
    5. Connected jobs - in GridGain this feature is realized via a distributed task session, which allows all jobs within a grid task to communicate with each other.
    6. Ability to adapt to any environment with any discovery or communication protocol.
    Given that Terracotta is a JVM-level clustering product essentially without an API, I don't think such pluggability can even be supported. Best, Dmitriy GridGain - Grid Computing Made Easy
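    The "connected jobs" point (item 5 above) is easier to picture with code. The following is a rough single-JVM sketch of a session shared by the jobs of one task, where one job signals another. It is not the GridGain task session implementation: the Session class, its two methods, and the "phase" attribute are hypothetical names chosen for this illustration, and threads stand in for jobs running on different nodes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: an in-JVM stand-in for a distributed task session.
// All names here are hypothetical and not taken from GridGain.
public class TaskSessionSketch {

    /** State shared by all jobs that belong to the same task. */
    static final class Session {
        private final Map<String, Object> attrs = new HashMap<>();

        synchronized void setAttribute(String key, Object value) {
            attrs.put(key, value);
            notifyAll(); // wake any job blocked in waitForAttribute
        }

        /** Blocks until the key is set or the timeout elapses; returns null on timeout. */
        synchronized Object waitForAttribute(String key, long timeoutMs)
                throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (!attrs.containsKey(key)) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return null;
                }
                wait(remaining);
            }
            return attrs.get(key);
        }
    }

    /** Runs two jobs of one task: job B waits for a signal that job A publishes. */
    static String run() {
        final Session session = new Session();
        final String[] seen = new String[1];

        Thread jobB = new Thread(() -> {
            try {
                seen[0] = (String) session.waitForAttribute("phase", 5000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread jobA = new Thread(() -> session.setAttribute("phase", "data-loaded"));

        jobB.start();
        jobA.start();
        try {
            jobA.join();
            jobB.join(); // join makes job B's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("job B saw: " + run()); // prints "job B saw: data-loaded"
    }
}
```

    In a distributed setting the attribute update would have to travel over the grid's communication layer; the monitor-based wait/notify here only models the blocking behavior.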
  15. When utilizing a Compute Grid, a user is not necessarily interested in decomposing his task; most often he just wants to be able to use a compute resource matching some specification. The important point here is that these resources are distributed across organizational boundaries. Research compute infrastructures like DEISA or TeraGrid provide resources to real as well as virtual organisations today. Other projects like EGEE have almost 200 partners adding their resources to a common pool. Do you think it is possible to use GridGain as a framework to distribute the workload in those scenarios? What would the effort look like? Thanks, Thomas
  16. When utilizing a Compute Grid, a user is not necessarily interested in decomposing his task; most often he just wants to be able to use a compute resource matching some specification. The important point here is that these resources are distributed across organizational boundaries. Research compute infrastructures like DEISA or TeraGrid provide resources to real as well as virtual organisations today. Other projects like EGEE have almost 200 partners adding their resources to a common pool. Do you think it is possible to use GridGain as a framework to distribute the workload in those scenarios?
    What would the effort look like?

    Thanks,
    Thomas
    The short answer is YES. GridGain has a pluggable SPI architecture which, among other things, allows you to plug your own custom node discovery and communication along with comprehensive scheduling policies. So, it certainly seems like it would be possible to utilize GridGain in scenarios you have described. You can find out more about our SPI-based architecture on our Wiki at http://216.93.179.140:8080/wiki/display/GG15UG/Configuring+SPIs Best, Nikita Ivanov http://www.gridgain.org
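    As a rough picture of what a pluggable SPI architecture means in practice, here is a minimal plain-Java sketch in which node discovery and scheduling are both swappable. The interface and class names (DiscoverySpi, SchedulingSpi, StaticListDiscovery, RoundRobinScheduling, Grid) are assumptions invented for this example and should not be read as GridGain's actual SPI signatures; the wiki page linked above is the place to look for those.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: hypothetical SPI shapes, not GridGain's real interfaces.
public class SpiSketch {

    /** Pluggable node discovery: how the grid learns which nodes exist. */
    interface DiscoverySpi {
        List<String> discoverNodes();
    }

    /** Pluggable scheduling policy: which node gets a given job. */
    interface SchedulingSpi {
        String pickNode(List<String> nodes, int jobIndex);
    }

    /** One possible discovery plug-in: a fixed, preconfigured node list. */
    static final class StaticListDiscovery implements DiscoverySpi {
        private final List<String> nodes;

        StaticListDiscovery(String... nodes) {
            this.nodes = Collections.unmodifiableList(Arrays.asList(nodes));
        }

        public List<String> discoverNodes() {
            return nodes;
        }
    }

    /** One possible scheduling plug-in: plain round robin. */
    static final class RoundRobinScheduling implements SchedulingSpi {
        public String pickNode(List<String> nodes, int jobIndex) {
            return nodes.get(Math.floorMod(jobIndex, nodes.size()));
        }
    }

    /** The core depends only on the interfaces, so either plug-in can be replaced. */
    static final class Grid {
        private final DiscoverySpi discovery;
        private final SchedulingSpi scheduling;

        Grid(DiscoverySpi discovery, SchedulingSpi scheduling) {
            this.discovery = discovery;
            this.scheduling = scheduling;
        }

        String route(int jobIndex) {
            List<String> nodes = discovery.discoverNodes();
            if (nodes.isEmpty()) {
                throw new IllegalStateException("no nodes discovered");
            }
            return scheduling.pickNode(nodes, jobIndex);
        }
    }

    public static void main(String[] args) {
        Grid grid = new Grid(
                new StaticListDiscovery("node-a", "node-b", "node-c"),
                new RoundRobinScheduling());
        for (int job = 0; job < 4; job++) {
            System.out.println("job " + job + " -> " + grid.route(job));
        }
    }
}
```

    For a cross-organization scenario like the one Thomas describes, the idea would be to supply a discovery implementation that knows how to find nodes across organizational boundaries, while the rest of the code stays unchanged.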
  17. Well done Nikita, GridGain is really shaping up to be a very compelling technology. Looking forward to test-driving the Mule integration in this release. Cheers, Ross
  18. So which framework to use?

    Which one should be used when designing an architecture for a highly available and scalable system, where both computing and data need to be distributed due to size and load (and availability)? The application is fairly "Google"-like: it creates a "database" containing a lot of metadata about documents or other artifacts, which needs to be quickly accessible and scalable (to handle heavy load). Hadoop can handle the data distribution with HBase. GridGain looks easier to use and deploy, but how about the data distribution? And how about the momentum of Hadoop compared with GridGain? Even Yahoo seems to be working with Hadoop. Cheers, Nicolai