Discussions

News: Article: Complex Event Processing Made Simple Using Esper

  1. "Complex Event Processing Made Simple Using Esper" by Alexandre Vasseur and Thomas Bernhardt explains how Esper, an open source project for continuous real-time analysis of data, provides constant and effective access to complex events.
    Event processing has been at the heart of computing systems for more than a decade. A common challenge across industries is extracting actionable intelligence from disparate event sources as close to real time as possible. You may have experienced this yourself: you need certain information right away, not an hour later; any delay in getting the information reduces its value to you. And you need to know when certain things are not happening within a certain time period, not just when certain events are present. Among those events you may want to filter a few and run aggregate computations over some period of time. Relational databases make it really hard to deal with temporal data and real-time or continuous queries. Other well-known event processing systems have thus far focused on integrating endpoints as services, gaining abstraction over transports and protocols: EAI, MOM and, more generally, SOA.

    The missing piece for actionable intelligence is a form of event processor capable of executing continuous queries in a highly expressive event processing language, one able to express the most complex situations: those of a real-world system in which time and causality are first-class citizens. This is what Complex Event Processing (CEP) is about. CEP aims at analyzing the data (events) that flows between information systems to gain valuable information in real time. The term 'complex' means that events can exist in relationships with each other, for example through timing or causality.

    Introduction to Esper

    Esper is an open-source CEP engine written entirely in Java and fully embeddable into any Java process – custom, JEE, ESB, BPM, etc. It recently reached version 2.0, is backed by EsperTech under a professional open source / dual-license business model, and is drawing growing interest in the Java community, as well as in the .Net community with NEsper, its full .Net/C# implementation.
Both can be downloaded from http://esper.codehaus.org with complete examples and documentation.
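The continuous time-window queries described above can be sketched in plain Java. This is an illustrative sketch of the concept only, not the Esper API; the class and method names are made up for this example:

```java
import java.util.ArrayDeque;

// Minimal sketch of the core CEP idea: a continuous aggregate over a
// sliding time window, here an average price over the last N milliseconds.
// Illustrative only; Esper expresses this declaratively in its EPL.
class SlidingWindowAverage {
    private final long windowMillis;
    // Each entry is {timestampMillis, priceCents}.
    private final ArrayDeque<long[]> events = new ArrayDeque<>();
    private long sumCents = 0;

    SlidingWindowAverage(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Feed one event and return the current windowed average price.
    public double onEvent(long timestampMillis, long priceCents) {
        events.addLast(new long[]{timestampMillis, priceCents});
        sumCents += priceCents;
        // Expire events that have fallen out of the window, as a
        // time-window view in a CEP engine would do continuously.
        while (!events.isEmpty()
                && events.peekFirst()[0] <= timestampMillis - windowMillis) {
            sumCents -= events.removeFirst()[1];
        }
        return (double) sumCents / events.size();
    }
}
```

The key difference from a relational query is that the result is updated on every arriving event, rather than recomputed on demand over a table.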
  2. Better resource on CEP[ Go to top ]

    The article is a decent introduction, but a better source of knowledge is the CEP blog. The biggest limitation I see with SQL-based languages is that the semantics don't cover everything. The CEP blog and Event Processing Thinking provide detailed analysis of the pros and cons. For the last year, I've been studying event processing from a rule perspective. My biased opinion is that the current examples aren't complex, or aren't what I would consider complex. Those interested should read the CEP blog. It's going to take another 10 years before off-the-shelf event processing products are mature enough to tackle the general problem of event processing beyond the simple time-window examples. peter
  3. Re: Better resource on CEP[ Go to top ]

    The article is a decent introduction, but a better source of knowledge is the cep blog.
    Peter, of course the CEP blog is going to be a more complete resource - it's dedicated to this sort of thing and TSS isn't. :)
  4. Much better sources for sure[ Go to top ]

    Hi Peter

    There are actually numerous sources online regarding CEP, and the best centralized place is likely http://www.complexevents.com , including its forum where most of the industry experts (including the ones behind the blogs you mention) participate and share.

    I think this introductory article, geared at the Java community, does a much better job of pitching the concept in a 15-minute read with code snippets than reading some blogs ;-)

    The "complex" term in the CEP acronym does not always mean complicated but rather composite, with causality. If you want to read about more realistic use cases, check out our online use case samples at http://esper.codehaus.org/tutorials/solution_patterns/solution_patterns.html , e.g. this one from an algo trading use case that combines stream processing (time windows) and complex event processing (composition and causality):
    http://esper.codehaus.org/tutorials/solution_patterns/solution_patterns.html#triple-bottom-pattern
    A triple bottom is a reversal pattern with bullish implications, composed of three failed attempts at making new lows in the same area, followed by a price move up through resistance. This pattern is rare, but a very reliable buy signal.

    Alex
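The triple-bottom pattern described above might be sketched, in a much-simplified form, as a scan over a price series. This is an illustrative sketch only, not taken from the Esper tutorial; the tolerance parameter and all names are made up:

```java
import java.util.ArrayList;
import java.util.List;

// Much-simplified sketch of triple-bottom detection: three local lows
// in roughly the same area, followed by a final price above resistance
// (the highest price seen between the first and third low).
class TripleBottom {
    public static boolean detect(double[] p, double tolerance) {
        // Collect indexes of local minima.
        List<Integer> lows = new ArrayList<>();
        for (int i = 1; i < p.length - 1; i++) {
            if (p[i] < p[i - 1] && p[i] < p[i + 1]) lows.add(i);
        }
        if (lows.size() < 3) return false;
        int a = lows.get(lows.size() - 3);
        int b = lows.get(lows.size() - 2);
        int c = lows.get(lows.size() - 1);
        // The three failed attempts must bottom out in the same area.
        double lo = Math.min(p[a], Math.min(p[b], p[c]));
        double hi = Math.max(p[a], Math.max(p[b], p[c]));
        if (hi - lo > tolerance) return false;
        // Resistance: the highest price between the first and third low.
        double resistance = p[a];
        for (int i = a + 1; i <= c; i++) resistance = Math.max(resistance, p[i]);
        // Bullish confirmation: the latest price breaks through resistance.
        return p[p.length - 1] > resistance;
    }
}
```

A CEP engine would express the same idea as a pattern over a stream rather than a batch scan, which is what makes the declarative form attractive.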
  5. Re: Much better sources for sure[ Go to top ]

    Hi Peter
    There are actually numerous sources online regarding CEP, and the best centralized place is likely http://www.complexevents.com , including its forum where most of the industry experts (including the ones behind the blogs you mention) participate and share.

    I think this introductory article, geared at the Java community, does a much better job of pitching the concept in a 15-minute read with code snippets than reading some blogs ;-)

    The "complex" term in the CEP acronym does not always mean complicated but rather composite, with causality. If you want to read about more realistic use cases, check out our online use case samples at http://esper.codehaus.org/tutorials/solution_patterns/solution_patterns.html , e.g. this one from an algo trading use case that combines stream processing (time windows) and complex event processing (composition and causality):
    http://esper.codehaus.org/tutorials/solution_patterns/solution_patterns.html#triple-bottom-pattern
    A triple bottom is a reversal pattern with bullish implications, composed of three failed attempts at making new lows in the same area, followed by a price move up through resistance. This pattern is rare, but a very reliable buy signal.

    Alex
    Another good resource is the Yahoo! group on event processing. I find that more useful than complexevents.com. I like Opher Etzion's categorization of event processing. I looked at the Esper examples several times last year, and honestly they seem rather trivial to me. Having worked on real-time trading systems handling complex compliance rules, the usual examples feel rather simple to me. That's my biased experience with one tiny segment of financial applications. I'm sure other segments differ from order management systems and pre-trade compliance. I think many of the points Tim Bass makes about the current state of CEP are dead on. Back when I worked on order management systems, the types of trading algorithms used in real life were quite a bit more complex than the examples I've seen on Esper's website or other products' tutorials. Then again, most firms consider those algorithms their "secret sauce", so getting a realistic example of a real trading algorithm is going to be impossible. My goal was to encourage people to explore further after they read the simple example. peter
  6. Re: Much better sources for sure[ Go to top ]

    Exactly, Peter. As with any product documentation and samples (or articles), the use cases and code samples provided are always a billion times simpler than what the product will be used for in a real production system. FYI, the Yahoo! group you refer to has been invited to migrate to forum.complexevents.com by the group founders, so I expect the latter to replace the former quite soon. If you are looking for a more real-world use case, you can also check out the Esper case study delivered at TheServerSide Symposium Las Vegas this year, at http://esper.codehaus.org/tutorials/tutorial/presentations.html (again, made simpler than the real thing to fit the format of a 1h talk). Alex
  7. Re: Much better sources for sure[ Go to top ]

    Thanks for the links, but those aren't realistic to me either. Like I said before, coming up with a realistic trading scenario is not trivial. From my experience with trading systems and risk systems, events are used with historical data to determine if the event is above the noise threshold. For example, figuring out if a stock really is making a significant shift is actually quite complex. The number of parameters the system needs to look at might range from 10 to 30. To make sure a shift in price isn't due to a shift in the industry, sector or market, one has to compare it to historical data at several different levels. A very simple example is comparing the price shift and volume against the industry, sector, country, exchange and time. Stock prices tend to fluctuate more around the quarterly reports, so the noise level is constantly shifting. This means an event processing system must be able to query external databases, analytics or filters, and provide an easy way to perform multi-dimensional queries of the in-memory data. Looking at the various SQL-inspired languages, none of them provide an easy way to express multi-dimensional cubes or to plug in external components. For the record, I've done this type of thing in the past using JESS for trading systems. Achieving this kind of complexity often requires existential and negation quantifiers. They aren't always required, but they definitely make things easier and more performant. peter
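The noise-threshold idea above could be sketched as comparing a stock's move against its sector, with the threshold derived from historical residuals. This is a hypothetical simplification; as the post notes, a real system would compare against many more dimensions (industry, country, exchange, time of year):

```java
// Sketch of noise filtering: a price move only counts as significant if
// it stands out from its sector's move by more than k standard deviations
// of the historical (stock - sector) residuals. Names and the threshold
// parameter are illustrative.
class NoiseFilter {
    public static boolean isSignificant(double stockReturn,
                                        double sectorReturn,
                                        double[] historicalResiduals,
                                        double k) {
        // Estimate the noise level from past residuals.
        double mean = 0;
        for (double r : historicalResiduals) mean += r;
        mean /= historicalResiduals.length;
        double var = 0;
        for (double r : historicalResiduals) var += (r - mean) * (r - mean);
        double stdev = Math.sqrt(var / historicalResiduals.length);
        // Today's residual must exceed the (constantly shifting) threshold.
        return Math.abs(stockReturn - sectorReturn) > k * stdev;
    }
}
```

In a CEP setting the residual history would itself be a continuously maintained window, so the threshold shifts as the post describes.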
  8. The statement you make about various SQL-inspired languages is rather vague. There is currently a large set of capabilities to explore in CEP solutions, and there are indeed very significant differences between the "built atop SQL / RDBMS" approach and an "event processing language (EPL) that looks like SQL for simplicity and ease of learning" (this last category is Esper). As an FYI, Esper can do continuous joins with historical data in an RDBMS via SQL, and also with data available from virtually anything through joins with method calls (see http://esper.codehaus.org/esper-2.0.0/doc/reference/en/html/epl_clauses.html#histdata_overview and http://esper.codehaus.org/esper-2.0.0/doc/reference/en/html/epl_clauses.html#joining_method). The Esper EPL language has support for the absence of events, and more complex existential / non-existential queries can likely also be implemented with the concept of named windows – essentially a shared view on a stream that you can insert into, delete from and query, on demand or continuously. I know some other vendor added this recently to their offering. There is also support for a continuous OLAP cube (http://esper.codehaus.org/esper-2.0.0/doc/reference/en/html/epl-views.html#view-stat-multidim), and adding more dimensions to it is likely a good use case for custom extensions. There are indeed many ways to extend the Esper EPL language itself with custom views, aggregators, pattern guards, etc., which is also an interesting capability when you suspect the built-in features will fall short. Alex
  9. Thanks for the links. I wasn't aware of those features in Esper. Looking at Esper's concept of a cube, it's not the same thing as what I'm thinking of. The type of cube I'm thinking of is the MOLAP provided by Microsoft Analysis Services, or any of the other mature OLAP products on the market. I don't know if this is possible with Esper, but with the major OLAP products one can define a cube from one or more tables. A cube consists of dimensions and measures. A measure can be sum, median, mean, top, bottom, nth or distinct. Most of the OLAP products utilize bitmap indexes and provide constant query time. I don't have a license for Apama or Coral8, but looking at the material available on their websites, I don't see that kind of functionality. It's not a scientific analysis by any measure, so they could support those features and just not list them on their websites. I like SQL, but it's not an ideal event processing language to me. You can simulate existential and negation quantifiers with database tables, but the main problem is that it's rather inefficient and not scalable. If you look at how high-performance RETE engines do it, you'll see what I mean. This is my own bias, but the term "continuous join" is terrible in my mind. It's no different than pattern matching, where data entering the system traverses a discrimination network, aka a rooted acyclic graph. A discrimination network is synonymous with a query plan in this case. Over the last 6 months I've been contemplating the pros and cons of various approaches to compiling queries. The missing piece for RETE and non-RETE engines is a formal model for managing and calculating the life cycle of the facts in the engine. To address that, I've written a paper with a detailed description of how to calculate temporal distance for a given ruleset and make it easier for a rule engine to manage the data efficiently.
You can find the paper here: http://jamocha.svn.sourceforge.net/viewvc/jamocha/morendo/doc/ Thanks for taking the time to post a response and explain what Esper supports in the current version. peter
  10. I was just re-reading the definition of the cube. If I understand this correctly: "8.3.5. Multi-dimensional statistics (stat:cube) This view works similar to the std:groupby views in that it groups information by one or more event properties." I could be wrong, but that isn't the same as a MOLAP cube. It approximates a multi-dimensional cube, but it won't perform or scale as well. Most modern MOLAP products store the data in a multi-dimensional cube and use bitmap indexes. I've been working on a MOLAP extension for the RETE algorithm, which will provide an efficient way to support multi-dimensional queries. I started work on this back in 2003, so it's taken almost 5 years to arrive at a model that "should" be powerful and scalable for the general case. peter
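The bitmap-index technique mentioned above can be sketched with java.util.BitSet: one bitmap per distinct dimension value, where a multi-dimensional slice is the AND of the selected bitmaps. This is an illustrative sketch of the general idea, not Esper's or any vendor's implementation:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a bitmap-indexed cube: each distinct dimension value (e.g.
// "sector=tech") maps to a bitmap of matching row numbers; a slice is
// computed by ANDing bitmaps, and a measure is aggregated over the result.
class BitmapCube {
    private final Map<String, BitSet> bitmaps = new HashMap<>();
    private final List<Double> measure = new ArrayList<>();

    // Add a row with a measure value and its dimension values.
    public void addRow(double value, String... dimValues) {
        int row = measure.size();
        measure.add(value);
        for (String dv : dimValues) {
            bitmaps.computeIfAbsent(dv, k -> new BitSet()).set(row);
        }
    }

    // Sum the measure over rows matching ALL the given dimension values.
    public double sum(String... dimValues) {
        BitSet match = null;
        for (String dv : dimValues) {
            BitSet b = bitmaps.getOrDefault(dv, new BitSet());
            if (match == null) match = (BitSet) b.clone();
            else match.and(b); // intersect dimensions bit-by-bit
        }
        double total = 0;
        for (int i = match.nextSetBit(0); i >= 0; i = match.nextSetBit(i + 1)) {
            total += measure.get(i);
        }
        return total;
    }
}
```

The bitwise AND is what gives bitmap indexes their near-constant query cost per dimension, which is the scaling property under discussion.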
  11. I think it does what the doc says, and the source and complete product are available for you to play with. Feel free to join the user or dev community. General claims like "it won't scale (...)" start looking a lot like FUD, Peter. The use case we have here is to integrate this continuous computation and present it, continuously updated, e.g. in the form of an Excel-like pivot table or a Flex OLAPDataGrid (like here: http://labs.adobe.com/wiki/index.php/Flex_3:Feature_Introductions:_OLAPDataGrid ) Alex
  12. I think it does what the doc says, and the source and complete product are available for you to play with. Feel free to join the user or dev community. General claims like "it won't scale (...)" start looking a lot like FUD, Peter.

    The use case we have here is to integrate this continuous computation and present it, continuously updated, e.g. in the form of an Excel-like pivot table or a Flex OLAPDataGrid (like here:
    http://labs.adobe.com/wiki/index.php/Flex_3:Feature_Introductions:_OLAPDataGrid )

    Alex
    That's not FUD. It's well-established MOLAP theory and practice. I've spent the last several years studying bitmap cubes to figure out how I can use them and incorporate those ideas into a rule engine. This came out of my own work with trading systems. I studied the query compiler in Esper last year for a few weeks, but I didn't look at how Esper indexes things. I think it is a fair question, since the documentation doesn't state what kind of indexing it uses and I don't know the Esper code. My apologies if my tone was a bit harsh, but I did ask whether my interpretation was correct. If you point me to the classes in Esper that handle the indexing, I'll happily take a look. I've studied various bitmap implementations over the last 4 years, so it would be interesting to compare. If you can't tell, I'm obsessed with all things related to pattern matching, rule engines, and business rules. I'm not trying to single out Esper; I take a critical eye to all rule engines and rules-related technologies. My focus is on pushing rule technology forward, be it open or closed source. I've challenged plenty of commercial products in the past, so I have nothing against Esper. peter
  13. I think it does what the doc says, and the source and complete product are available for you to play with. Feel free to join the user or dev community. General claims like "it won't scale (...)" start looking a lot like FUD, Peter.

    The use case we have here is to integrate this continuous computation and present it, continuously updated, e.g. in the form of an Excel-like pivot table or a Flex OLAPDataGrid (like here:
    http://labs.adobe.com/wiki/index.php/Flex_3:Feature_Introductions:_OLAPDataGrid )

    Alex
    Over lunch I spent a little time looking at the current code in trunk: http://svn.codehaus.org/esper/esper/trunk/esper/src/main/java/com/espertech/esper/view/stat/ This is totally non-scientific and non-rigorous, so please excuse any misinterpretation or errors. Looking at the classes in that package and the olap package, I see Esper does have dimensions, cells, and measures. I'm not an expert in OLAP by any measure, but it looks like a faithful attempt at implementing a multi-dimensional cube. If I understand the code correctly, each cell stores a calculated value, like sum, average, stdev, etc. That conforms to my understanding of how some OLAP engines work. Not all OLAP products work this way; some delay the calculation of the measure until query time. In some of the older OLAP products that pre-calculated the entire cube, there was an issue with space explosion. The OLAP Report provides a great explanation of that issue with older products. What I can't tell is whether Esper eagerly pre-calculates all measures. Just a suggestion, but it might be a good idea to enhance the documentation to explain the feature in greater detail. For those of us who are nitpicky and anal, it helps to have a clear explanation. My first impression was that the cube was simply a materialized view, like a summary table. The other thing that isn't clear to me is whether Esper provides a way to create a cube from multiple tables. The approach I've been working on for RETE uses a lazy approach that doesn't pre-calculate all measures; instead, it just indexes the facts based on the join. peter
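The eager-versus-lazy distinction described above can be shown with a minimal sketch: an eager cell maintains its measure on every insert, while a lazy cell only stores (or indexes) the facts and computes measures at query time. All names are illustrative; neither class reflects Esper's actual classes:

```java
import java.util.ArrayList;
import java.util.List;

// Eager approach: the measure is maintained incrementally on each insert,
// so queries are O(1) but every supported measure costs work up front.
class EagerCell {
    private double sum = 0;
    private int count = 0;

    public void insert(double fact) { sum += fact; count++; }

    public double sum() { return sum; }
    public double average() { return count == 0 ? 0 : sum / count; }
}

// Lazy approach: inserts only store the raw facts; any measure is
// computed on demand, trading query-time work for cheap inserts and
// no pre-calculated space explosion.
class LazyCell {
    private final List<Double> facts = new ArrayList<>();

    public void insert(double fact) { facts.add(fact); } // no computation here

    public double sum() {
        double s = 0;
        for (double f : facts) s += f;
        return s;
    }

    public double average() { return facts.isEmpty() ? 0 : sum() / facts.size(); }
}
```

The older OLAP space-explosion problem mentioned above comes from eagerly materializing every cell of every dimension combination; the lazy form avoids it at the cost of slower queries.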
  14. I've been working on a MOLAP extension for the RETE algorithm, which will provide an efficient way to support multi-dimensional queries. I started work on this back in 2003, so it's taken almost 5 years to arrive at a model that "should" be powerful and scalable for the general case
    Peter, I would be very interested to talk to you about the integration of CEP and BRE using an OLAP-based uniform data model. Any chance you could reply to my e-mail agodin at mitre dot org directly? Respectfully, Arkady
  15. Join TowerGroup, TraderTools and e-Forex for the free Sybase-sponsored audio webinar scheduled for Tuesday, September 14 at 10 a.m. EST


    Complex Event Processing - helping to build, test, debug and deploy faster FX trading applications

    Key topics to be covered include:
    - CEP as a foundation for high speed, high frequency FX trading
    - Reconfiguring CEP platforms to assist with liquidity aggregation and FX trade venue connectivity
    - Factors influencing a customer’s choice of CEP supplier
    - Deployment time-frames involved with integrating CEP solutions into FX trading operations
    - Leveraging CEP for Retail FX applications
    - The development of “best practices” for connectivity, integration and deployment of CEP frameworks in FX

    Please register at this link to attend:
    https://eforex.omnovia.com/register/91361280165277

    Regards
    The e-Forex Digital media team