EJB design: in-memory design
Hi, I am looking for an architecture or patterns to solve the following problem: an operations-research kind of application. Take a seaport with boats, trains, and trucks, all on schedules, where inspection crews and equipment must be available, and so on. We need to propose the best way to organize the work so that everything is processed as quickly as possible. There are a lot of assignment rules, and the input to these rules changes very often. We receive many messages; there are triggering rules, but the assignment will be triggered very often, let's say every 5 seconds. For a 3-tier, n-tier or any database-centric solution, I am afraid there will be just too much information to retrieve and too many objects to create. We will hit a performance wall very fast. We thought about an in-memory model. This is for operations and there is no need to store the information, so no database is really required. The idea itself is good, but the implementation raises questions about concurrency, isolation, and transactions (ACID). The best I could come up with is a transactional cache (JBoss). So, has anyone already implemented such an application, or does anyone have an idea how to implement it? Any comment is more than welcome. Stephan
Hi Stephan, can you please clarify why you need an in-memory database, and why plain Java objects will not do? You could do simple locking during writes to get data consistency. Should the same data be visible to multiple machines in the cluster (if this is a clustered application)? Why exactly are you looking for an in-memory database?
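To make the "plain Java objects with simple locking during writes" suggestion concrete, here is a minimal sketch for a single JVM. The class `PortModel` and its methods are hypothetical names invented for illustration, and the model (one crew per vessel) is an assumption, not something from the original problem statement.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical in-memory model: which crew is assigned to which vessel.
class PortModel {
    private final Map<String, String> crewByVessel = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive lock, so readers never see a half-applied update.
    void assign(String vessel, String crew) {
        lock.writeLock().lock();
        try {
            crewByVessel.put(vessel, crew);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers share the read lock and can run concurrently with each other.
    String crewFor(String vessel) {
        lock.readLock().lock();
        try {
            return crewByVessel.get(vessel);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Copy of the current state, so an assignment run can work on a stable view
    // while incoming messages keep updating the live model.
    Map<String, String> snapshot() {
        lock.readLock().lock();
        try {
            return new HashMap<>(crewByVessel);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

This gives consistency inside one process only. It does not address visibility across machines in a cluster, which is the follow-up question above.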
Stephan, the general scenario is a complex resource optimization model driven by many independent and possibly unpredictable event streams. If you can handle all of the data streams and the calculation CPU load in-memory in a single process, your primary concern becomes data mirroring for high availability. Modern distributed caching (or, more broadly, "Data Fabric") technologies offer you a range of options that trade cache consistency for performance. These may include pessimistic transactions, optimistic transactions, synchronous replication of each object modification, fire-and-forget optimistic mirroring, and many more shades between these. The key is that the newest technologies give you the full set of logical replication possibilities with many under-the-covers performance optimizations; the choice then depends on your specific business requirements. The commercial Data Fabric vendors are pushing towards the physical boundaries of latency, throughput, and scalability at an amazing level of detail these days. JBoss Cache is a very capable solution for simpler requirements such as application server session caching, but you have to look beyond open-source solutions to get a robust in-memory product capable of handling diverse use cases.
A couple more things to consider are the pattern-matching and push-based notification mechanisms these vendors provide. Continuous Query technology lets you express interest in a potential condition through an SQL statement against an in-memory relational data model, but without the need for polling. OQL and proprietary APIs let you do similar things in the object world with filter expressions. This often makes it much more intuitive to break down your problem into smaller pieces for distribution over a cluster, where results from one part of your model are automatically fed downstream to another.
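The push-based idea (register a filter once, get called back when matching data arrives, no polling) can be sketched in plain Java as below. This is only an illustration of the pattern within one JVM: `NotifyingStore`, `subscribe`, and `put` are hypothetical names, not the API of GemFire, JBoss Cache, or any other product, and a real product would also distribute the data and evaluate queries far more efficiently.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical store that pushes matching updates to registered listeners.
class NotifyingStore<V> {
    // A filter paired with the callback to invoke when the filter matches.
    private static final class Subscription<T> {
        final Predicate<T> filter;
        final Consumer<T> listener;

        Subscription(Predicate<T> filter, Consumer<T> listener) {
            this.filter = filter;
            this.listener = listener;
        }
    }

    private final Map<String, V> data = new ConcurrentHashMap<>();
    private final List<Subscription<V>> subscriptions = new CopyOnWriteArrayList<>();

    // Express interest in a condition once, instead of polling for it.
    void subscribe(Predicate<V> filter, Consumer<V> listener) {
        subscriptions.add(new Subscription<>(filter, listener));
    }

    // Store the value, then notify every subscription whose filter matches it.
    void put(String key, V value) {
        data.put(key, value);
        for (Subscription<V> s : subscriptions) {
            if (s.filter.test(value)) {
                s.listener.accept(value);
            }
        }
    }

    V get(String key) {
        return data.get(key);
    }
}
```

For example, a downstream part of the model could subscribe with a filter such as "delay greater than 30 minutes" and be fed only those updates, which is the kind of decomposition described above.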
If the model and data rates grow to a point where a single machine (or a small cluster) isn’t sufficient to meet your requirements, then you’ll likely need to consider a grid solution. Here the need for a robust distributed main-memory data solution becomes even more important. If you start with one now, you'll be in great shape as you need to scale. Cheers, Gideon GemFire—The Enterprise Data Fabric http://www.gemstone.com
"For a 3-tier, n-tier or any database centric solution, I am afraid there will be just too much information to retrieve and objects to create. We will hit a performance wall very fast."
Quite likely. Besides, if the data is in-memory, you can still keep a copy in the database for reporting, operational restores, etc.
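One way to keep a database copy without putting the database on the hot path is to queue changes and write them out in batches later. The sketch below assumes that approach (often called write-behind); `WriteBehindModel` and its methods are hypothetical names, and the `sink` parameter stands in for whatever code would actually write to the database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical in-memory model that records changes for later persistence.
class WriteBehindModel {
    private final Map<String, String> state = new ConcurrentHashMap<>();
    // Each pending entry is a {key, value} pair waiting to be written out.
    private final BlockingQueue<String[]> pending = new LinkedBlockingQueue<>();

    // Update memory immediately; the database write is deferred.
    void update(String key, String value) {
        state.put(key, value);
        pending.add(new String[] {key, value});
    }

    String get(String key) {
        return state.get(key);
    }

    // Meant to be called from a background thread or timer.
    // Drains all queued changes and hands them to the sink as one batch;
    // returns the number of changes handed over.
    int flush(Consumer<List<String[]>> sink) {
        List<String[]> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return batch.size();
    }
}
```

Changes that are still queued are lost if the process dies before a flush, so this suits reporting copies better than data that must survive a crash.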
"So, does anyone already implemented such application or have idea on how to implement it. Any comment is more than welcome."
I have experience with applications such as this, but only second-hand via some of our customers (e.g. logistics companies, container shippers, port operators, trucking companies). Your requirements sound very familiar, though. Check out the information available on our wiki: http://wiki.tangosol.com/ Perhaps some of the capabilities will trigger some ideas ;-)
Peace,
Cameron Purdy
Tangosol Coherence: Clustered Shared Memory for Java