With data grids growing in importance within large organizations, each cluster risks becoming an island of data. There is increasing demand to bridge these data grids to meet some of the following requirements:
- create a global view,
- synchronise data across multiple regions,
- replicate data for failover and contingency,
- push data to local regions,
- bridge to create an event medium.
The design foundations of data grids make them ideal for deployment within a single data center: their implementations typically take advantage of network protocols optimised for “local” operations, which favour throughput at the expense of the demands of wide-area-network (WAN) communication. It is therefore inappropriate to apply the normal “local” deployment pattern when using a data grid over a WAN. As with all distributed systems, the challenges increase when a connection must be maintained over a WAN.
These general challenges are compounded by the need for the data grid to keep data synchronised, applying ordered or unordered updates in a timely manner (as fast as physically possible given the bandwidth limitations of the WAN).
There are numerous potential solutions to consider. The common approach is a hub-based pattern, in which a shared component manages the message passing; two common examples are a centralized data source and a messaging platform.
The first example is to use a centralized data source to share information between grid clusters: all nodes in each data grid cluster connect to a central, transactional version of the truth. This approach is best suited to systems with a relatively low rate of change, as the transaction rate of the database quickly becomes a limiting factor, and conflict resolution can become a complex additional challenge. As a result it does not really solve the core challenges.
The second example is to use a messaging platform between grids (such as JMS or another message broker technology), also known as a Gateway, Hub or Mirror service. This allows the data grid clusters to stay synchronised by transporting each event as a message. This approach can solve the challenges and hence is a viable option; unfortunately there are some foreseeable limitations:
- an additional software component in the application stack,
- the development cost of coding to a message broker API,
- the requirement for static lookups,
- the extra configuration and support costs of a message broker,
- the scalability challenge of passing all messages through a single broker to achieve synchronisation,
- the potential for topic explosion,
- limited support for an interactive style of communication (message brokers are essentially one-way),
- high availability: how do you horizontally scale a message broker, and how do you make it fault tolerant?
- extra hardware and monitoring requirements.
The alternative to these common hub-based solutions is to connect the data grid clusters directly within the application stack, avoiding the limitations introduced by channelling all messages through a hub. This approach has been termed “hub-less messaging”: it uses the features of a distributed caching infrastructure to provide a messaging backbone, enabling the data grid clusters to communicate directly within a single application stack and avoiding the hub's limiting factors and the associated synchronisation bottleneck. The benefits include:
- no additional software components,
- scalability identical to that of the clustered cache,
- high availability achieved using the core data grid capabilities,
- no additional costs for hardware, support and licensing.
Implementing the Hub-less Messaging Solution
The approach taken was to use the Coherence Incubator Patterns to create a hub-less messaging solution to bridge the data grids, leveraging the power of a distributed cache and the experience distilled in the patterns. The Coherence Incubator patterns do not attempt to provide a “one size fits all” solution; rather, they provide a set of building blocks for the different requirements involved in connecting data grids. Bridging data grids is solved using a combination of three of these patterns: Command, Messaging and Push Replication.
The Command pattern is a fundamental pattern in which an action to be performed is encapsulated as an object called a Command. This implementation guarantees that commands execute within their target contexts in the order in which they are received. In connecting data grids, the command pattern orchestrates the update actions as they arrive at the target context; if the requirement is that all events be ordered, the pattern can provide that ordering across the grid asynchronously.
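The essence of this pattern can be sketched in plain Java. The `Command` and `CommandExecutor` types below are illustrative stand-ins (not the Incubator API): they show updates encapsulated as objects and executed against a target context strictly in submission order.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Minimal sketch of the Command pattern: each update is encapsulated as a
// Command object, and commands are executed against a target context in
// the order in which they arrive.
public class CommandSketch {

    /** An action to perform against a context (here, a simple map). */
    interface Command {
        void execute(Map<String, Object> context);
    }

    /** Queues commands per context and drains them in FIFO order. */
    static class CommandExecutor {
        private final Map<String, Object> context = new HashMap<>();
        private final Queue<Command> pending = new ArrayDeque<>();

        void submit(Command command) {
            pending.add(command);           // arrival order is preserved
        }

        void drain() {
            Command command;
            while ((command = pending.poll()) != null) {
                command.execute(context);   // executed in submission order
            }
        }

        Object valueOf(String key) {
            return context.get(key);
        }
    }

    public static void main(String[] args) {
        CommandExecutor executor = new CommandExecutor();
        // Two updates to the same key: execution order decides the final value.
        executor.submit(ctx -> ctx.put("price", 100));
        executor.submit(ctx -> ctx.put("price", 101));
        executor.drain();
        System.out.println(executor.valueOf("price")); // prints 101
    }
}
```

In the real pattern the queue and context live inside the distributed cache, so ordering is preserved even though submitters and the target context sit in different clusters.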
The Messaging pattern is an implementation of publish-and-subscribe messaging. It provides durable subscriptions by using the data grid as a reliable store-and-forward. Each message is associated with a topic and destinations; subscribers consume messages (receive and acknowledge), which allows the infrastructure to manage their lifecycle.
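The store-and-forward idea can be illustrated with a small sketch (again a hypothetical stand-in, not the Incubator API): a topic retains each published message until a durable subscriber has received and acknowledged it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative store-and-forward topic: published messages are retained
// until each registered (durable) subscriber has received and acknowledged
// them -- the role the data grid plays in the Messaging pattern.
public class TopicSketch {
    private final List<String> log = new ArrayList<>();             // retained messages
    private final Map<String, Integer> ackedUpTo = new HashMap<>(); // subscriber -> count acked

    void subscribe(String subscriber) {
        ackedUpTo.putIfAbsent(subscriber, 0);
    }

    void publish(String message) {
        log.add(message);
    }

    /** Receive the next unacknowledged message, or null if none. */
    String receive(String subscriber) {
        int next = ackedUpTo.get(subscriber);
        return next < log.size() ? log.get(next) : null;
    }

    /** Acknowledge the last received message so the infrastructure may discard it. */
    void acknowledge(String subscriber) {
        ackedUpTo.merge(subscriber, 1, Integer::sum);
    }

    /** Messages still owed to this subscriber. */
    int retainedFor(String subscriber) {
        return log.size() - ackedUpTo.get(subscriber);
    }

    public static void main(String[] args) {
        TopicSketch topic = new TopicSketch();
        topic.subscribe("cluster-B");
        topic.publish("update-1");
        String m = topic.receive("cluster-B");
        topic.acknowledge("cluster-B");
        System.out.println(m + " retained=" + topic.retainedFor("cluster-B"));
    }
}
```

Because the acknowledgement position is tracked per subscriber, a slow or disconnected cluster simply resumes from its last acknowledged message: that is what makes the subscription durable.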
The Push Replication pattern builds upon the two previous patterns to provide a flexible and scalable infrastructure for transporting the data/messages to other locations. In this implementation the data/messages are termed Operations, which correspond to the insert, update and delete of data in the data grid. A location can be a Coherence cluster, a file system, a data store, or any component that can consume the operations. The pattern defines the semantics of sending the data and the configuration of the infrastructure, for example the batching, durability and retry rates.
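The Operation idea can be sketched as follows (an illustrative model, not the Incubator's classes): each insert, update or delete captured in the source grid is replayed against a target that can consume it, with a plain map standing in for the receiving location.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative Operation types: insert/update/delete events captured in the
// source grid and replayed against any target that can consume them (here a
// plain map standing in for a remote cluster, file system or data store).
public class OperationSketch {
    enum Kind { INSERT, UPDATE, DELETE }

    record Operation(Kind kind, String key, Object value) {
        void applyTo(Map<String, Object> target) {
            switch (kind) {
                case INSERT, UPDATE -> target.put(key, value);
                case DELETE -> target.remove(key);
            }
        }
    }

    public static void main(String[] args) {
        // A batch of operations as the replication infrastructure might
        // deliver them to a receiving location.
        List<Operation> batch = List.of(
            new Operation(Kind.INSERT, "GBPUSD", 1.27),
            new Operation(Kind.UPDATE, "GBPUSD", 1.28),
            new Operation(Kind.DELETE, "EURUSD", null));

        Map<String, Object> target = new HashMap<>();
        target.put("EURUSD", 1.09);
        batch.forEach(op -> op.applyTo(target));
        System.out.println(target); // prints {GBPUSD=1.28}
    }
}
```

Batching, durability and retry then become policies applied to how such batches are queued and delivered, rather than changes to the operations themselves.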
Using the Push Replication pattern as the building block, it is possible to create uni-directional or multi-directional replication between the data grids. The uni-directional form is “hub-and-spoke”, with one cluster as the core and the other clusters as receiving locations. Alternatively, each cluster can be defined as both a “hub” and a “spoke” to create peer-to-peer replication, with every cluster both publishing and receiving updates.
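The difference between the two topologies comes down to which clusters are given outbound publish targets. A minimal sketch, assuming clusters are identified by name and a route table maps each cluster to the clusters it pushes to (purely illustrative, not Coherence configuration):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative topology wiring: hub-and-spoke gives outbound publish
// targets only to the hub; peer-to-peer gives every cluster every other
// cluster as a target.
public class TopologySketch {
    static Map<String, List<String>> hubAndSpoke(String hub, List<String> spokes) {
        Map<String, List<String>> routes = new LinkedHashMap<>();
        routes.put(hub, List.copyOf(spokes));          // hub pushes to all spokes
        spokes.forEach(s -> routes.put(s, List.of())); // spokes only receive
        return routes;
    }

    static Map<String, List<String>> peerToPeer(List<String> clusters) {
        Map<String, List<String>> routes = new LinkedHashMap<>();
        for (String c : clusters) {
            List<String> others = new ArrayList<>(clusters);
            others.remove(c);                          // push to everyone else
            routes.put(c, others);
        }
        return routes;
    }

    public static void main(String[] args) {
        System.out.println(hubAndSpoke("LDN", List.of("NYC", "TKY")));
        System.out.println(peerToPeer(List.of("LDN", "NYC", "TKY")));
    }
}
```

Note that in the peer-to-peer form every cluster both publishes and receives, so conflict handling for concurrent updates to the same entry becomes part of the design.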
The data grid was successfully extended using the Coherence Incubator patterns, delivering the following benefits:
- Provide disaster recovery capabilities
- Use the data grid as an event medium
- Overcome the limitations of a hosting centre
- Provide global data local to the consuming application
- Include WAN monitoring capabilities
Putting together the power of Coherence with the knowledge and experience contained in the Coherence Incubator Patterns, it is possible to deliver a powerful solution which meets multiple requirements and overcomes performance, scalability and reliability challenges.
These Incubator Patterns have provided a flexible approach to overcoming the problems faced when extending a data grid. We recommend the Incubator patterns as a starting point for creating your own solutions.
About the Authors
Nicholas Gregory is an experienced engineer in distributed computing, having worked for Gemstone, with experience in Jini, and is a Sun certified JEE architect. Nicholas is a core contributor to the Coherence Incubator (patterns: Command, Messaging, Publishing and Functor) and currently works in financial services.
Stephen Price has 10 years of financial services experience, working on a diverse array of applications and technologies, most recently on projects integrating Java services and .NET clients.