In a recent article, Guillermo Tantachuco, a 17-year industry veteran, SpringOne speaker, and award-winning VMware Enterprise Architect, outlined two approaches for running Java with in-memory data grids. The article describes a 3-stage process by which SQL-based data tiers can evolve from distributed caches for legacy databases, to full-fledged data stores (i.e., OLTP), to cloud-centric data grids. Guillermo also explains two ways your Java applications can work with VMware’s vFabric SQLFire data platform:

  • As an embedded database
  • As part of a Distributed Compute Grid

Embedded database

Let’s look at a scenario where you might want to give Java access to an in-memory data grid. Suppose your Java application needs a truly scalable, distributed database with no-hop or one-hop access to data. In some architectures, many applications frequently access content held in the process heap. Session state data is one example; another is a set of external applications or clients that need real-time pricing or other mathematical calculations.

In these types of scenarios, you can embed the SQLFire engine into Java applications by including the required SQLFire libraries in your JVM. When the application initiates a connection to SQLFire, it starts a peer server that joins other peers in the same cluster. Unlike other embedded databases such as H2 or Derby, SQLFire allows several servers to store replicated and partitioned tables, persist data to disk, communicate directly with other servers, and participate in distributed queries. 
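
As a rough illustration of the embedding step, the sketch below builds a peer connection URL and opens a JDBC connection only if the SQLFire libraries are present. The driver class name `com.vmware.sqlfire.jdbc.EmbeddedDriver`, the `jdbc:sqlfire:` URL prefix, and the `mcast-port` and `host-data` properties are assumptions recalled from the SQLFire documentation rather than details given in the article, so check them against your installed version. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

public class EmbeddedPeerSketch {

    // Assumption: class name of the SQLFire embedded (peer) JDBC driver.
    static final String DRIVER = "com.vmware.sqlfire.jdbc.EmbeddedDriver";

    // Builds a URL of the form jdbc:sqlfire:;key=value;key=value
    static String peerUrl(Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:sqlfire:");
        for (Map.Entry<String, String> e : props.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    // True only when the SQLFire libraries are on the classpath.
    static boolean driverAvailable() {
        try {
            Class.forName(DRIVER);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws SQLException {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mcast-port", "33666"); // assumed discovery setting shared by all peers
        props.put("host-data", "true");   // assumed flag: this JVM stores table data

        String url = peerUrl(props);
        System.out.println("Peer URL: " + url);

        if (!driverAvailable()) {
            System.out.println("SQLFire libraries not found; skipping connection.");
            return;
        }
        // The first connection starts the peer inside this JVM and joins the cluster.
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println("Connected as peer: " + !conn.isClosed());
        }
    }
}
```

Without the SQLFire jars the program only prints the URL, which makes it easy to try the sketch before wiring it into a real cluster.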

A Distributed Compute Grid: SQLFire Stored Procedures

Traditional RDBMS stored procedures are typically used for performance because they run close to the data. In SQLFire, you can use Java-based stored procedures to implement business logic at the server level. This logic runs in the same process space as your data and in parallel on multiple SQLFire servers, which can significantly improve application performance and scalability. The capability is essentially a real-time, in-database map-reduce. You can run these procedures in several ways:

  • On all data stores in the SQLFire cluster.
  • On a group or a specific data store.
  • Only on stores that hold certain tables.
  • Only on stores that hold a subset of data in the table.
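
The four targeting options above might map onto SQL `CALL` statements roughly as follows. This is a sketch only: the `ON ALL`, `ON SERVER GROUPS`, and `ON TABLE ... WHERE` clauses are assumptions recalled from SQLFire's data-aware procedure syntax, and the procedure, server group, table, and column names are hypothetical. Targeting a single data store is modeled here as a server group with one member, which is also an assumption.

```java
public class ProcedureTargetSketch {

    // Hypothetical procedure; not a name taken from the article.
    static final String CALL = "CALL categorize_customers()";

    // Run on all data stores in the cluster.
    static String onAll() {
        return CALL + " ON ALL";
    }

    // Run on one or more server groups (a one-member group targets one store).
    static String onServerGroups(String... groups) {
        return CALL + " ON SERVER GROUPS (" + String.join(", ", groups) + ")";
    }

    // Run only on stores that host the given table.
    static String onTable(String table) {
        return CALL + " ON TABLE " + table;
    }

    // Run only on stores that hold the rows matching the predicate.
    static String onTableWhere(String table, String predicate) {
        return onTable(table) + " WHERE " + predicate;
    }

    public static void main(String[] args) {
        System.out.println(onAll());
        System.out.println(onServerGroups("pricing_group"));
        System.out.println(onTable("orders"));
        System.out.println(onTableWhere("orders", "customer_id = 42"));
    }
}
```

Each string would be passed to a JDBC `CallableStatement` in a real application; the sketch stops at building the statements so it runs without a cluster.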

When procedures run in parallel on the target members, their results are streamed back and aggregated. For example, you might review an entire order history to categorize a set of customers. Map-reduce style use cases apply as well.
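
To make the scatter-and-aggregate idea concrete, here is a plain-Java sketch that does not use any SQLFire API. Each in-memory list stands in for the order partition held by one member, a thread pool stands in for the cluster, and the customer names, order amounts, and the 1000.0 "GOLD" threshold are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ScatterGatherSketch {

    static final class Order {
        final String customer;
        final double amount;
        Order(String customer, double amount) {
            this.customer = customer;
            this.amount = amount;
        }
    }

    // "Map" step: runs next to the data, summing order totals per customer.
    static Map<String, Double> totalsOnMember(List<Order> partition) {
        Map<String, Double> totals = new HashMap<>();
        for (Order o : partition) {
            totals.merge(o.customer, o.amount, Double::sum);
        }
        return totals;
    }

    // "Reduce" step: runs every member in parallel and merges partial totals.
    static Map<String, Double> aggregate(List<List<Order>> members) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, members.size()));
        try {
            List<Future<Map<String, Double>>> partials = new ArrayList<>();
            for (List<Order> partition : members) {
                Callable<Map<String, Double>> task = () -> totalsOnMember(partition);
                partials.add(pool.submit(task));
            }
            Map<String, Double> merged = new TreeMap<>();
            for (Future<Map<String, Double>> f : partials) {
                try {
                    f.get().forEach((c, t) -> merged.merge(c, t, Double::sum));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while aggregating", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("member task failed", e);
                }
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    // Hypothetical categorization rule applied to the aggregated totals.
    static String categorize(double total) {
        return total >= 1000.0 ? "GOLD" : "STANDARD";
    }

    public static void main(String[] args) {
        List<List<Order>> members = Arrays.asList(
            Arrays.asList(new Order("alice", 600.0), new Order("bob", 200.0)),
            Arrays.asList(new Order("alice", 500.0)));
        aggregate(members).forEach((customer, total) ->
            System.out.println(customer + " -> " + categorize(total)));
    }
}
```

The point of the design is that only small per-customer totals cross member boundaries, not the full order history.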

The 3 Stages to evolve Legacy DBs into Cloud Data Grids

Guillermo goes on to explain where these two Java-based approaches fit within the greater context of a data architecture roadmap. Many existing Java applications may begin to have performance issues related to data. At the same time, these applications need a cost-effective way to migrate to a higher-performing back end.

Stage 1: Add an embedded database or a distributed cache

Stage 2: Evolve the cache to be the primary OLTP data store

Stage 3: Expand to a global data grid with real-time, in-database map-reduce capabilities

Read the entire article here.