Our client had a rather lengthy stored procedure to aggregate values across a series of data records and compute "scores" from these aggregates using various look-up tables. The records in the database represent essentially a collection of user input over time. Our client was facing two main issues: first, the stored procedure was becoming unwieldy due to new requirements for the support of new types of products and scores. Second, the batch-orientation of the stored procedure caused response times to be uncomfortably slow. While the data access part of the procedure was naturally efficient, more and more business logic inside the procedure started to bog down the database server.
We addressed the first problem by converting from PL/SQL to Java code, which allowed us to structure the solution using object-oriented constructs. We targeted the second problem by processing user events as they occur instead of waiting for all events to be collected first before starting to compute aggregated values. In order to allow for future flexibility we developed a set of self-contained components, called "calculators", that we could connect to one another via event channels. Each calculator would subscribe to a set of relevant event types and in response publish events of other types. This loosely coupled architecture enabled us to compose new solutions from existing calculators quite easily, sorta like Lego.
Read more: Look Ma -- No Middleware! Using Event-Driven Architectures inside a JVM.
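A minimal Java sketch of the calculator arrangement described in the excerpt, assuming a synchronous in-process channel. The names `Event`, `EventChannel`, `TotalCalculator` and `ScoreCalculator`, the string event types, and the threshold used as a stand-in for a look-up table are all made up for illustration; the article's actual classes are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** An event carrying a type name and a numeric payload (hypothetical shape). */
final class Event {
    final String type;
    final double value;
    Event(String type, double value) { this.type = type; this.value = value; }
}

/** In-process, synchronous publish/subscribe channel keyed by event type. */
final class EventChannel {
    private final Map<String, List<Consumer<Event>>> subscribers = new HashMap<>();

    void subscribe(String type, Consumer<Event> handler) {
        subscribers.computeIfAbsent(type, k -> new ArrayList<>()).add(handler);
    }

    void publish(Event event) {
        List<Consumer<Event>> handlers = subscribers.get(event.type);
        if (handlers == null) return;
        // Iterate over a copy so a handler may subscribe while we deliver.
        for (Consumer<Event> h : new ArrayList<>(handlers)) h.accept(event);
    }
}

/** Calculator: consumes "input" events and publishes the running "total". */
final class TotalCalculator {
    private double total;
    TotalCalculator(EventChannel channel) {
        channel.subscribe("input", e -> {
            total += e.value;
            channel.publish(new Event("total", total));
        });
    }
}

/** Calculator: consumes "total" events and publishes a "score" (assumed rule). */
final class ScoreCalculator {
    ScoreCalculator(EventChannel channel) {
        channel.subscribe("total", e ->
            channel.publish(new Event("score", e.value >= 100 ? 2 : 1)));
    }
}
```

Neither calculator knows about the other; each sees only the channel and the event types it cares about, which is what makes the Lego-style recomposition possible.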
Using Event-Driven Architectures inside a JVM (12 messages)
- Posted by: Dion Almaer
- Posted on: October 13 2004 11:06 EDT
Gregor Hohpe has written about a project he was recently on that took a "rather lengthy stored procedure" to an event-driven architecture. He discusses the solution, and some emerging patterns that came from it.
Threaded Messages (12)
- Interesting approach by peter lin on October 13 2004 11:32 EDT
- Event-Driven & HiveMind by Howard Lewis Ship on October 13 2004 11:54 EDT
- Using Event-Driven Architectures inside a JVM by Rashid Jilani on October 13 2004 13:21 EDT
- Using Event-Driven Architectures inside a JVM by peter lin on October 13 2004 14:30 EDT
- Using Event-Driven Architectures inside a JVM by Vincent Shek on October 14 2004 02:25 EDT
- Using Event-Driven Architectures inside a JVM by Jon Tirsen on October 13 2004 16:26 EDT
- analytics by peter lin on October 14 2004 03:03 EDT
- Using Event-Driven Architectures inside a JVM by Jacob Hookom on October 14 2004 00:49 EDT
- Using Event-Driven Architectures inside a JVM by Jacob Hookom on October 14 2004 00:57 EDT
- No Middleware... by Karl Banke on October 14 2004 04:15 EDT
- No Middleware... by Rashid Jilani on October 14 2004 11:44 EDT
- you do not like cats? just learn how to cook them right! by Alex V on October 14 2004 12:58 EDT
Interesting approach
- Posted by: peter lin
- Posted on: October 13 2004 11:32 EDT
- in response to Dion Almaer
I've used a similar approach in the past and it can work well. The primary limitation is when a calculation requires a large chunk of data, say 30 million rows. Obviously, it's not practical to load that much data and aggregate it. In those cases, I find a combination of event-driven processing and summary tables/OLAP provides a nice solution. In the more extreme cases, where a system has to aggregate a ton of data in semi-realtime (sub-minute) and the data stream is constant, the system has to recalculate incrementally. Obviously there are cases where that's not feasible. For calculations like duration and median that require the entire dataset, a distributed approach like JavaSpaces might be better. Ultimately, the size of the dataset determines whether it should be done in the database, though running a stored procedure to calculate the median constantly will likely overwhelm the poor database.
It's interesting that some people are choosing to bypass the OLAP route for aggregating data.
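To make the incremental recalculation point concrete, here is a rough Java sketch of an aggregate that is folded forward one event at a time instead of re-scanning every row. `RunningAggregate` is a hypothetical name, not something from the article. Count, sum, mean and max can be kept this way in constant time per event; a median cannot be derived from these summaries alone, which is why it is singled out above as needing the whole dataset.

```java
/** Aggregate that is updated per event instead of re-scanning all rows. */
final class RunningAggregate {
    private long count;
    private double sum;
    private double max = Double.NEGATIVE_INFINITY;

    /** Folds one new value into the aggregate in constant time. */
    void add(double value) {
        count++;
        sum += value;
        max = Math.max(max, value);
    }

    long count() { return count; }

    double sum() { return sum; }

    /** Largest value seen so far; negative infinity when nothing was added. */
    double max() { return max; }

    /** Mean of the values seen so far; NaN when nothing was added. */
    double mean() { return count == 0 ? Double.NaN : sum / count; }
}
```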
Event-Driven & HiveMind
- Posted by: Howard Lewis Ship
- Posted on: October 13 2004 11:54 EDT
- in response to Dion Almaer
Can't help but plug HiveMind (http://jakarta.apache.org/hivemind/) here.
In HiveMind, connecting services together via event notifications is built in. When building a service, you may specify another service whose events it listens to, for example:
<service-point id="EventSender" interface="...">
  ...
</service-point>
<service-point id="EventConsumer" interface="...">
  <invoke-factory>
    <construct class="...">
      <event-listener service-id="EventSender"/>
    </construct>
  </invoke-factory>
</service-point>
Only the service implementation has to implement the event listener interface; the service interface does not have to extend it.
Many other messaging patterns can be easily accomplished in HiveMind by combining configuration data with services; that is, by contributing services into configuration points. Part of the contribution can be a description of when the service can be used, for example:
<handler event-type="auction-ended" service-id="AuctionEnded"/>
You can imagine a service that receives events of different types, including "auction-ended", then looks in its configuration for a <handler> element with the matching type; the service is then passed the event.
Gregor mentions "composition ... sorta like Lego" and that's one of the hallmarks of HiveMind development. Being freed from concerns about when and how service objects are instantiated and configured (especially in a multi-threaded environment) is very liberating, making approaches like Gregor's completely practical in a wide range of situations.
Using Event-Driven Architectures inside a JVM
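The handler lookup described above (a service that receives an event, finds the `<handler>` entry with the matching `event-type`, and passes the event on) boils down to a map from type name to handler. The sketch below is plain Java and deliberately does not use any HiveMind API; `EventDispatcher`, `register` and `dispatch` are hypothetical names, and in HiveMind the registrations would come from configuration contributions rather than from code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Routes an event to the handler registered for its type name. */
final class EventDispatcher {
    private final Map<String, Consumer<String>> handlers = new HashMap<>();

    /** Stands in for one handler entry: an event type mapped to a service. */
    void register(String eventType, Consumer<String> handler) {
        handlers.put(eventType, handler);
    }

    /** Returns true if a handler for the type was found and invoked. */
    boolean dispatch(String eventType, String payload) {
        Consumer<String> handler = handlers.get(eventType);
        if (handler == null) return false;
        handler.accept(payload);
        return true;
    }
}
```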
- Posted by: Rashid Jilani
- Posted on: October 13 2004 13:21 EDT
- in response to Dion Almaer
There are lots of tricks you can pull to call a long-running stored procedure asynchronously, both in .NET and Java, but is this the right approach? I think the more elegant approach is to calculate everything in batch-oriented mode if the calculation is so demanding. Of course, it requires some redesign of the application and of how the user interacts with the system.
Using Event-Driven Architectures inside a JVM
- Posted by: peter lin
- Posted on: October 13 2004 14:30 EDT
- in response to Rashid Jilani
There are lots of tricks you can pull to call a long-running stored procedure asynchronously, both in .NET and Java, but is this the right approach? I think the more elegant approach is to calculate everything in batch-oriented mode if the calculation is so demanding. Of course, it requires some redesign of the application and of how the user interacts with the system.
From first-hand experience, batch mode fails if the granularity needs to be at the transaction level. I'll use pre-trade compliance as an example again. Say I have a constant stream of transactions to buy and sell securities.
If the system has to make sure trades do not result in violations of SEC regulations, performing the analytics in batch mode would mean the calculations are neither accurate nor timely. Processing analytics in this fashion would result in hefty fines from the SEC. Many of the current compliance systems I know of do not perform pre-trade compliance for that reason. In fact, some of the biggest firms still only run overnight batch processes for complete compliance validation. Very few shops actually do partial pre-trade compliance, due to the nature of the regulatory rules and the analytics required.
One tricky part of running fairly complex analytics with stored procedures is that it impacts the server. Say you have a rule that says, "an account cannot exceed 5% of the account's total value for any given issuer." An individual account may hold between 20 and 60 individual securities, with a mix of stocks, funds, bonds, etc.
Performing a simple calculation means the database has to perform what is called a "look through" for all funds and all fixed income holdings that have multiple issuers. These kinds of analytics can be done in stored procedures in a batch fashion, but often they take a long time: anywhere from 30 minutes to several hours, depending on how many accounts a database has and the number of securities in each account.
I know of one company that claims to do pre-trade compliance using just stored procedures, but from what I know of business process rules and business process systems using SQL, it doesn't scale well. There are many cases where it is appropriate, but for the more challenging cases, it fails miserably.
Using Event-Driven Architectures inside a JVM
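For readers unfamiliar with "look through", here is a small Java sketch of the 5% issuer rule from the example above. It assumes each holding can be described by its market value plus the fraction of that value attributable to each underlying issuer (a plain stock has a single issuer at 1.0; a fund has several). `Holding` and `IssuerLimitCheck` are hypothetical names, and real compliance engines model far more than this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A position: market value plus the issuers behind it as fractions summing to 1. */
final class Holding {
    final double marketValue;
    final Map<String, Double> issuerWeights; // a plain stock or bond has one entry at 1.0
    Holding(double marketValue, Map<String, Double> issuerWeights) {
        this.marketValue = marketValue;
        this.issuerWeights = issuerWeights;
    }
}

final class IssuerLimitCheck {
    /** Sums exposure per issuer, looking through multi-issuer holdings. */
    static Map<String, Double> exposureByIssuer(List<Holding> holdings) {
        Map<String, Double> exposure = new HashMap<>();
        for (Holding h : holdings) {
            for (Map.Entry<String, Double> w : h.issuerWeights.entrySet()) {
                exposure.merge(w.getKey(), h.marketValue * w.getValue(), Double::sum);
            }
        }
        return exposure;
    }

    /** Returns issuers whose share of total account value exceeds the limit (e.g. 0.05). */
    static List<String> violations(List<Holding> holdings, double limit) {
        double total = 0;
        for (Holding h : holdings) total += h.marketValue;
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : exposureByIssuer(holdings).entrySet()) {
            if (e.getValue() / total > limit) result.add(e.getKey());
        }
        return result;
    }
}
```

An issuer held directly at 4% of the account can still breach the 5% limit once the share held indirectly through a fund is added, which is exactly the case a check without look-through would miss.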
- Posted by: Vincent Shek
- Posted on: October 14 2004 02:25 EDT
- in response to peter lin
From first-hand experience, batch mode fails if the granularity needs to be at the transaction level. I'll use pre-trade compliance as an example again. Say I have a constant stream of transactions to buy and sell securities. If the system has to make sure trades do not result in violations of SEC regulations, performing the analytics in batch mode would mean the calculations are neither accurate nor timely. Processing analytics in this fashion would result in hefty fines from the SEC. Many of the current compliance systems I know of do not perform pre-trade compliance for that reason. ... There are many cases where it is appropriate, but for the more challenging cases, it fails miserably.
Peter, our firm is facing a similar problem, but we also have to handle multiple trading markets (though not the US market at this moment). Our analytics usually take under a minute to complete for an account with 10-20 securities, so we can afford to recalculate every time. I am curious to know how many orders your system is designed to handle and what sort of analytics are required. Our analytics are quite simple for the moment, using a simple discounted rate formula to calculate risk and exposure.
analytics
- Posted by: peter lin
- Posted on: October 14 2004 03:03 EDT
- in response to Vincent Shek
Peter, our firm is facing a similar problem, but we also have to handle multiple trading markets (though not the US market at this moment). Our analytics usually take under a minute to complete for an account with 10-20 securities, so we can afford to recalculate every time. I am curious to know how many orders your system is designed to handle and what sort of analytics are required. Our analytics are quite simple for the moment, using a simple discounted rate formula to calculate risk and exposure.
Generally, account-level analytics like total market value take less than 50 ms using OLAP, an analytics package like TIBCO, or something home-grown. The harder analytics are related to "look through", duration calculation and historical analytics. I've seen some crazy analytics for risk/compliance that compare relative delta to historical deltas. Typically 2A7 and 1940 Act rules are easy until they are applied to an entire firm. I can't really say what the real target is, but the original wish list was for 10K tps with 4-CPU servers. That was wishful thinking, and we had to run several series of benchmarks with COM+ and OLEDB to show the max throughput on a 4-CPU server.
The normal "weight" of a given issuer, GICS taxonomy or country is straightforward if the positions aren't constantly changing. The harder thing is when large batches of orders come in and the system has to consider regs, firm-wide rules and account rules within the same validation process. Some of the older systems process compliance procedurally and suffer from poor implementation. An example from 2A7:
3/4 of the account cannot exceed 10% exposure to any issuer.
What some of the older systems did was calculate the exposure for every issuer in an account, then sort the weights and check which ones exceeded 10%. Obviously, there are faster ways of checking this particular reg rule. When the rule is applied to a firm as a firm-wide exposure rule, there might be 5K issuers and 2 million rows of positions. Running this for every single transaction would be costly, to say the least :)
Using Event-Driven Architectures inside a JVM
- Posted by: Jon Tirsen
- Posted on: October 13 2004 16:26 EDT
- in response to Rashid Jilani
There are lots of tricks you can pull to call a long-running stored procedure asynchronously, both in .NET and Java, but is this the right approach? I think the more elegant approach is to calculate everything in batch-oriented mode if the calculation is so demanding. Of course, it requires some redesign of the application and of how the user interacts with the system.
This was how the app was working when ThoughtWorks was hired. In this particular case the user required direct feedback as new data became available and the calculation progressed. When all data was available, the calculation had to return almost immediately.
The design you're outlining is a valid one, but alas, it could not be used in this case.
Using Event-Driven Architectures inside a JVM
- Posted by: Jacob Hookom
- Posted on: October 14 2004 00:49 EDT
- in response to Dion Almaer
I really appreciate event-driven design. I've been playing around with some handheld development in C#, which uses combinations of 'delegates' and 'events' to basically accomplish template-based method reflection:
public void ListenForEvent0(Object source, Event evt) { /* do something */ }
public void ListenForEvent1(Object source, Event evt) { /* do something */ }
public void ListenForEvent2(Object source, Event evt) { /* do something */ }
this.EventListeners += this.ListenForEvent0;
this.EventListeners += this.ListenForEvent1;
this.EventListeners += this.ListenForEvent2;
// notify all listeners
this.EventListeners(source, evt);
I'm trying to find the best way to accomplish this in Java while injecting the Filter/Chain of Responsibility Pattern:
// action method signature:
// public (void|String) name (Event, EventContext?)
void OrderController.validate(Event, EventContext)
{
    if (!EventContext.hasRole("order")) EventContext.finish();
    else EventContext.continue();
}
void OrderController.exception(Event, EventContext)
{
    try { EventContext.continue(); } catch (Exception e) { .... }
}
void OrderController.store(StoreOrderEvent, EventContext)
{
    EventContext.finish(success ? "pass" : "fail");
}
I'm still trying to figure out the details of EventContext: whether I should make something that's more controller-specific (view behavior) or take more of an AOP approach, treating each method as an interceptor and visiting or filtering based on method signatures...
Using Event-Driven Architectures inside a JVM
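One way the filter/chain idea above could look as compilable Java, offered as a sketch under assumptions rather than a finished design. Events are plain strings to keep it short, `EventFilter` and `FilterChain` are hypothetical names, and the context method is called `proceed` because `continue` is a reserved word in Java. Each filter either ends the chain with `finish` or hands control to the next filter, so a filter can also wrap the rest of the chain in a try/catch, servlet-filter style.

```java
import java.util.ArrayList;
import java.util.List;

/** One step in the chain; it may stop the chain or pass control on. */
interface EventFilter {
    void handle(String event, EventContext context);
}

/** Tracks the position in the chain and the final outcome. */
final class EventContext {
    private final List<EventFilter> filters;
    private int next;
    private String outcome;

    EventContext(List<EventFilter> filters) { this.filters = filters; }

    /** Passes the event to the next filter, if any and if not already finished. */
    void proceed(String event) {
        if (outcome == null && next < filters.size()) {
            filters.get(next++).handle(event, this);
        }
    }

    /** Ends the chain with an outcome; later filters are not invoked. */
    void finish(String result) { outcome = result; }

    String outcome() { return outcome; }
}

final class FilterChain {
    private final List<EventFilter> filters = new ArrayList<>();

    FilterChain add(EventFilter filter) { filters.add(filter); return this; }

    /** Runs the event through the chain; returns the outcome, or null if none was set. */
    String fire(String event) {
        EventContext context = new EventContext(new ArrayList<>(filters));
        context.proceed(event);
        return context.outcome();
    }
}
```

A validate step would call `finish` with a failure outcome on bad input and `proceed` otherwise, and a store step at the end of the chain would call `finish` with its result.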
- Posted by: Jacob Hookom
- Posted on: October 14 2004 00:57 EDT
- in response to Dion Almaer
Using the architecture in the article, how would you introduce calculator inter-dependencies? A simple example would be validating event state before persisting it.
In the article's architecture, it would seem that you could determine processing order through finely grained event types, which could lead to unnecessary bloat?
Any suggestions, or did I miss something?
No Middleware...
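To illustrate what ordering by finely grained event types might look like for the validate-then-persist example, here is a hedged Java sketch. It is not taken from the article: `TypedChannel`, `OrderSubmitted`, `OrderValidated`, `Validator` and `Persister` are hypothetical names, and the channel is assumed to be synchronous and in-process. The persister subscribes only to the validated event type, so it can never run ahead of validation, at the cost of one extra event class per step, which is the bloat concern raised above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Minimal synchronous channel keyed by the event's exact class. */
final class TypedChannel {
    private final Map<Class<?>, List<Consumer<Object>>> subscribers = new HashMap<>();

    <T> void subscribe(Class<T> type, Consumer<T> handler) {
        subscribers.computeIfAbsent(type, k -> new ArrayList<>())
                   .add(e -> handler.accept(type.cast(e)));
    }

    void publish(Object event) {
        List<Consumer<Object>> handlers = subscribers.get(event.getClass());
        if (handlers == null) return;
        for (Consumer<Object> h : new ArrayList<>(handlers)) h.accept(event);
    }
}

final class OrderSubmitted {
    final int quantity;
    OrderSubmitted(int quantity) { this.quantity = quantity; }
}

final class OrderValidated {
    final int quantity;
    OrderValidated(int quantity) { this.quantity = quantity; }
}

/** Publishes OrderValidated only for orders that pass the (assumed) check. */
final class Validator {
    Validator(TypedChannel channel) {
        channel.subscribe(OrderSubmitted.class, e -> {
            if (e.quantity > 0) channel.publish(new OrderValidated(e.quantity));
        });
    }
}

/** Stores validated orders; it never sees raw OrderSubmitted events. */
final class Persister {
    final List<Integer> stored = new ArrayList<>();
    Persister(TypedChannel channel) {
        channel.subscribe(OrderValidated.class, e -> stored.add(e.quantity));
    }
}
```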
- Posted by: Karl Banke
- Posted on: October 14 2004 04:15 EDT
- in response to Dion Almaer
Hm, while the architecture itself works, I wonder about the possible consequences. In a real world example you might have various JVMs all doing the same job, some consolidation needs to take place, surely? Meaning concurrent access to persistent storage? XA Transactions? Or will data that can't be stored just get discarded? If not, where are the safeguards? At what cost? How does it scale?
No Middleware...
- Posted by: Rashid Jilani
- Posted on: October 14 2004 11:44 EDT
- in response to Karl Banke
Hm, while the architecture itself works, I wonder about the possible consequences. In a real world example you might have various JVMs all doing the same job, some consolidation needs to take place, surely? Meaning concurrent access to persistent storage? XA Transactions? Or will data that can't be stored just get discarded? If not, where are the safeguards? At what cost? How does it scale?
Let the database deal with the transaction. Isn't that the best place for it? Besides, we are talking about a fire-and-forget kind of scenario; injecting transactions at the middleware level doesn't make a lot of sense in this case.
you do not like cats? just learn how to cook them right!
- Posted by: Alex V
- Posted on: October 14 2004 12:58 EDT
- in response to Dion Almaer
The middle tier is by definition a layer between data source(s) and several clients. Analytical tasks (tons of DB data in, a small result out) and batch tasks (tons of select-update inside the DB) do not require a middle tier. Shoveling tons of data anywhere other than the data layer is wrong in most cases, both for performance (there is a network cost, even on the same machine) and for integrity.
The urge to use OO makes it even worse: you need O/R mapping while the data lives in an RDB.
At the enterprise level the middle tier is not the ONLY access path to the data, so even the greatest Java/.NET "calculators" will be inconsistent if a flat-file batch update arrives. Triggers and views with scheduled updates are the best way to handle such tasks.
Of course, if you have a single Java application as the only data access path and do not like your database, use Java solutions...
Alex V.