An Introduction to the Drools Project

By N. Alex Rupp

01 May 2004 | TheServerSide.com

Introduction

Part one of this article revisits an old concept and introduces a new technology for the Java Enterprise developer's utility belt. I'll discuss how Rules Engines can improve the agility of your business by helping you isolate the "logic of the bottom line" from the technical logic of your software applications. I'll also introduce the JSR-94 Rules Engine API and an Open Source product called Drools, the forerunner implementation of this up-and-coming technology. In part two, we'll revisit our examples in greater depth and take a closer look at some of the intricacies of both the Drools engine and its JSR-94 extensions.

Why use a rules engine?

The business world is full of cliches about change, how everything changes, how change is the only thing you can depend on. In the technical world, this isn't exactly the case. We've been collectively trying to solve the same set of problems in software for thirty years--sometimes more. In the last decade, IT folks have been inundated with literature about rapid/extreme/agile development ideologies that stress the importance of flexibility and change.

But business needs often change faster than development teams, their processes and the technologies they rely on can keep up. We're getting better at it, but business tacticians still find themselves crippled as they try to realign their IT departments to support the shifting needs of their business. There's a lot of friction and frustration involved in this process.

Lost in Translation

As smart as IT personnel are, they are susceptible to the "telephone" effect. IT efforts often add as much friction to the execution of a business plan as they do value. Unfortunately, by the time the development teams fully understand the rules which govern decision-making in the business and are able to capture that decision making power in software code, the rules have changed. The software is obsolete before it has even reached the market and it needs to be refactored to support the new requirements. If you're a developer, you know what I'm talking about--we call this aspect of the development cycle "feature creep". Very few things are as frustrating to developers as having to build a system on shifting soil. As a software developer, you often have to know as much about the business as the executives do--sometimes more.

Imagine for a moment that you're a business leader. Your company's success often hinges on your ability to notice changing conditions in the market and figure out a way to take advantage of the new environment before your competitors catch on. Every day you have access to more and better information about your market, but it might not matter. Bold and clever insights and the "information advantage" can easily be squandered in the 6-9 months it might take to complete a development cycle for a new product. And when the product ships, chances are it's either feature-light, over budget, past due or some combination of the three.

To make matters worse, at the end of the development lifecycle, the market conditions could be fundamentally different than they were when the project was conceived. Now you're forced to comply with new legislation, you've lost your marginal advantage, and three of the five people who designed your software system have left the company--or worse--HQ is moving the entire department overseas. You're going to need to explain the complexities of the business to people who might not share your native language. If things don't work out, you could easily find yourself saddled with a poorly documented legacy application that you don't understand, that doesn't address your immediate business needs, and all the while orders are coming down from on high to "leverage existing assets".

Where did your strategy break down? Where are the places you might have done better? Recent literature on extreme programming, agile development and other lightweight processes stresses the importance of automated unit testing and feature prioritization. There are other principles your developers are familiar with, which can help them respond to your changing needs and shorten turnaround time for their projects. Most of these principles, like system decomposition, have been around for decades and are aided by up-and-coming additions to the Java platform (like the Java Management Extensions library). Many of these principles, like Object-Orientation and Role modeling, are built right into the Java language.

But Java's still a pretty young language, and the Java platform is by no means complete. One technique that's gaining traction in the Java community is to separate the business decisions of your executives from the technical decisions of your developers, and to keep those business decisions in a central data store, where they can be managed and altered in real-time (that is, business-time). It's one strategy you might consider.

Why should your development team have to capture in code the subtle and complex rules that guide your decision making as a business executive? Can you even convey the subtleties of your reasoning to them? If so, is it prudent? Probably not. Something might be lost in translation, like the bottom line. Why take the risk that the logic governing your decisions (the executive logic) will be misrepresented in the application code or even in the testing code? If it is, how would you verify that--would you learn to program and write all the unit tests yourself, or would your customers test the software for you? It's hard to keep one eye on the markets and the other on the software code.

It makes more sense that these rules should be centrally located in a place where you can manage them directly, in an intuitive format that you can easily understand, instead of scattered throughout the application in software code where you can't get at them. If you can keep the executive logic out of your software and trust your developers to make the right technical decisions, you will notice the difference. Your project lifecycles will be shorter, and your software will be more adaptable to the needs of your business. Instead of trying to steer the Titanic, you'll feel like you're in a tri-hull racing cat.

A Standard API for Rule Engines in Java

In November, the Java Community approved the final draft of the Java Rule Engine API specification (JSR-94). This new API gives developers a standard way to access and execute rules at runtime. As implementations of this new spec ripen and are brought to the market, programming teams will be able to pull executive logic out of their applications.

A new generation of management tools will be needed to help executives define and refine the behavior of their software systems. Instead of rushing changes in application behavior through the development cycle and hoping it comes out correct on the other end, executives will be able to change the rules, run tests in the staging environment and roll out to production as often as becomes necessary.

But this is going to require some changes in the way developers approach the design of the system, and they're going to need the right tools for the job.

Separating the Executive and the Technical Concerns

Here's a very simple example of how this might look, from an executive's perspective.

You manage a contrarian investment fund. One part of your company's computer system performs fundamental analysis of stock prices, earnings, and assets per share, and alerts you when a stock warrants further inspection. The job of the computer system is to identify stocks with a low P/E ratio relative to the market, and flag those stocks for you.

Your IT staff has a Bloomberg data feed and has developed a collection of simple data objects which you can reference in your rules. Now, for the sake of the example, we're going to assume you are a fairly educated and tech-savvy manager, and you understand the basics of XML well enough that you can write and maintain a simple XML rules file. (Like I said, a whole generation of open source tools is on the way for writing and maintaining rule bases, but for now we're going to look at the raw XML.)

Your first rule might be to evaluate all stocks in the Dow Jones Industrial and pull out anything with a P/E ratio over 10 (this is a bit simplistic, but play along for now). The stocks that remain can then be used to generate a series of reports. For the sake of the example, this is what your rule file would look like (we'll revisit the structure of this file later on):

<stock:overvalued>
    <stock:index> DJIA </stock:index>
    <stock:pe> over 10.0 </stock:pe>
</stock:overvalued>

One month, you get a call from a Brazilian analyst firm that wants to hire your company to generate a series of reports on Brazilian stocks, but they have much more stringent criteria. Right now in Brazil, the average P/E ratio is in the single digits, so your threshold for picking out undervalued stocks needs to change. Also, your new client wants to cross-reference low P/E with each stock's price-to-book ratio.

You fire up your rule editor and change the rule to reflect the new conditions. Now it pulls out Brazilian stocks with a P/E over 6.5 and those with a price-to-book over 1.0, leaving the ones at or beneath both thresholds. After you're done editing the rule file, it looks like this:

<stock:overvalued>
    <stock:index> Brazil </stock:index>
    <stock:pe> over 6.5 </stock:pe>
    <stock:pb> over 1.0 </stock:pb>
</stock:overvalued>

You don't need to explain any of this to your development teams. You don't need to wait for them to write the software or test it, or anything. If your engine's semantic language is robust enough to describe the data you want to work with, you can change the rules whenever you need to.

And if the limiting factors are semantic languages and data models, you can be certain that standards will appear for both, followed closely by advanced editors and tools to simplify the task of writing, storing and maintaining the rules.

By now, I hope that the following principle has become clear: Whether or not a stock gets flagged in this example is a business decision, not a technical decision. The logic that determines which stocks get forwarded to your analysts is executive logic, the "logic of the bottom line". Executives make those decisions, and executives should be able to drive that entire portion of the app as they see fit. The rules become a sort of control panel, a powerful new type of user interface for business systems.
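To make the "control panel" idea concrete without any rules engine at all, here is a minimal plain-Java sketch. It is an illustration under assumptions, not how Drools works: the `StockBean` and `ScreenRule` classes and the `index`, `pe.max` and `pb.max` keys are all hypothetical names invented for this example. The point is only that the thresholds live in editable text rather than in compiled code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical stock bean; the field names are assumptions, not part of any real feed. */
final class StockBean {
    final String symbol;
    final String index;
    final double pe;   // price-to-earnings ratio
    final double pb;   // price-to-book ratio

    StockBean(String symbol, String index, double pe, double pb) {
        this.symbol = symbol;
        this.index = index;
        this.pe = pe;
        this.pb = pb;
    }
}

/** Screening thresholds read from editable text, so no business number is hard-coded. */
final class ScreenRule {
    final String index;
    final double maxPe;
    final double maxPb;

    ScreenRule(String index, double maxPe, double maxPb) {
        this.index = index;
        this.maxPe = maxPe;
        this.maxPb = maxPb;
    }

    /** Parses "key=value" lines; pb.max is optional and defaults to no limit. */
    static ScreenRule parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ScreenRule(
                p.getProperty("index"),
                Double.parseDouble(p.getProperty("pe.max")),
                Double.parseDouble(p.getProperty("pb.max", "Infinity")));
    }

    /** Returns the stocks that survive the screen, leaving the input list untouched. */
    List<StockBean> apply(List<StockBean> stocks) {
        List<StockBean> kept = new ArrayList<>();
        for (StockBean s : stocks) {
            if (s.index.equals(index) && s.pe <= maxPe && s.pb <= maxPb) {
                kept.add(s);
            }
        }
        return kept;
    }
}
```

In this sketch, moving from the Dow screen to the Brazilian one means editing text (`index=DJIA`, `pe.max=10.0` becomes `index=Brazil`, `pe.max=6.5`, `pb.max=1.0`) and nothing else. A real rules engine goes much further than a threshold file, but the division of labor is the same.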

Developing With Rules

If you're the developer in this scenario, your job is made somewhat easier. Once you've got a standard semantics language for stock analysis to simplify writing rules, you can take your data objects and run them through a rules engine. We'll come back to semantics languages later on. For now, we're going to return to the example.

Your system feeds a list of stock beans into the rule engine. As the rules are executed, a handful of the stock beans get flagged and your system can do whatever it needs to do with them. Perhaps it forwards them on to a report generation system. The analysts use the reports to help them in their fundamental analysis. Meanwhile, the boss has got you working on a new technical analytics package for plotting Gann angles, and using Dow Theory for predicting market tops and bottoms.

The rules make your system less complex because you don't need to capture in code why a stock gets flagged, or what strange combination of conditions makes a stock important to management. That logic never makes it into your code. You're free to focus on fleshing out the data model, or (better yet) working on the next feature platform, adding value to the application.

By now, the point should be well established that sometimes you can be more efficient by removing volatile business logic from the application code. Not always--simple applications might not benefit from having a rule system in their application. But if you're running a giant application and you've got a lot of volatile business logic, you might want to develop a strategy for incorporating a rules engine into your app. If executed properly, a good rules strategy can make these sorts of highly mutable systems much easier to implement and maintain.

Rules engines have other valuable uses apart from removing executive logic from an application. Sometimes you need to apply hundreds of thousands of rules to make a decision, and you've got to run these rules against hundreds of thousands of objects. It isn't difficult to imagine advanced artificial intelligence engines growing to that size. In this case, you're going to need an extremely fast decision-making algorithm or you're going to need some Big Iron. Big Iron doesn't come cheap, but you can have the most performant and scalable decision-making algorithm in the world for a song.

Bob McWhirter's Drools Project

At this time I'd like to introduce Drools, an "augmented implementation of Charles Forgy's Rete algorithm tailored for the Java language." Drools is an Open Source project, written by Bob McWhirter and hosted at The Codehaus. As I write this, Drools is approaching its 2.0-beta-14 release. A complete implementation of the JSR-94 Rule Engine API and supporting unit test cases is already available in CVS.

The Rete algorithm was invented by Charles Forgy in 1979, and is by far the most efficient algorithm for production systems ever written (with the exception of Rete II, which is proprietary). Rete is the only decision-making algorithm whose efficiency is asymptotically independent of the number of rules being executed. For the uninitiated, that means it can scale to incorporate and execute hundreds of thousands of rules in a manner which is an order of magnitude more efficient than the next best algorithm. Rete has been used for years in production systems, but hasn't been widely available for the Java platform in an Open Source package.
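Rete itself is well beyond the scope of a snippet, but one idea it builds on is easy to show: when several rules test the same condition, that condition is evaluated once and the result is shared. The plain-Java sketch below illustrates only that idea. It is not Rete and not Drools code, and the `SharedConditions` class, its `Condition` type and the `evaluations` counter are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoublePredicate;

final class SharedConditions {
    /** Running count of condition evaluations, kept only to make the saving visible. */
    static int evaluations = 0;

    /** A named test on a P/E value; rules that use the same test share one instance. */
    static final class Condition {
        final String key;
        final DoublePredicate test;

        Condition(String key, DoublePredicate test) {
            this.key = key;
            this.test = test;
        }

        boolean holds(double pe) {
            evaluations++;
            return test.test(pe);
        }
    }

    /**
     * Returns the names of the rules whose conditions all hold for one fact.
     * Each distinct condition is evaluated at most once, however many rules use it.
     */
    static List<String> match(Map<String, List<Condition>> rules, double pe) {
        Map<String, Boolean> results = new HashMap<>();   // condition key -> cached result
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, List<Condition>> rule : rules.entrySet()) {
            boolean all = true;
            for (Condition c : rule.getValue()) {
                Boolean r = results.get(c.key);
                if (r == null) {                          // first rule to need this condition
                    r = c.holds(pe);
                    results.put(c.key, r);
                }
                if (!r) {
                    all = false;
                    break;
                }
            }
            if (all) {
                matched.add(rule.getKey());
            }
        }
        return matched;
    }
}
```

With three rules that all start with the same "P/E over 10" test, a naive loop re-runs that test for every rule, while the shared version runs it once per fact. The real algorithm also remembers partial matches between runs, which this sketch makes no attempt to do.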

Aside from its Rete core, Open Source License (Apache-ish), 100% Java implementation and charming development community, Drools offers a number of extremely useful features. Among these are its implementation of the JSR-94 API and its innovative semantics framework, which can be used to write languages for describing rules. Right now Drools comes with three semantic modules--one for Python, one for Java and one for Groovy. The rest of this article is going to focus on using the JSR-94 API, and we'll cover the semantics framework in the second article.

As a developer using the javax.rules API, your ultimate goal is to construct a RuleExecutionSet, and then to get a RuleSession with it at runtime. In order to simplify this process, I've written a facade for the rule engine API, which can be used to parse an InputStream representing a drools DRL file and build a RuleExecutionSet.

The XML examples I used above would require a custom semantics module to be written. If I wanted to build the same functionality right now I'd be limited to one of the three existing language modules that come built into Drools. And I'd choose (surprise!) the Java language module. The first rule example from above would look like this if it were written using the Java language module:

<rule-set name="StockFlagger"
      xmlns="http://drools.org/rules"
      xmlns:java="http://drools.org/semantics/java">

  <rule name="FlagAsUndervalued">
    <parameter identifier="stock">
      <java:class>org.codehaus.drools.example.Stock</java:class>
    </parameter>
    <java:condition>stock.getIndexName().equals("DJIA")</java:condition>
    <java:condition>stock.getPE() > 10</java:condition>
    <java:consequence>
      removeObject(stock);
    </java:consequence>
  </rule>
</rule-set>

Not quite as pretty as the example above, right? Don't worry--we'll cover semantics modules in the next article. For now, notice the basic structure of the XML file. There's a rule-set containing one or more rule elements, which in turn contain parameter, condition and consequence elements. The contents of the condition and consequence blocks bear a strong resemblance to Java. Be careful--there are certain things you can and cannot do inside these blocks. Right now, Drools uses BeanShell v2.0b1 as its Java interpreter. We're not going to go too far into the details of the DRL file or the Java semantics syntax right now, but we'll return to it in another article. Our goal is to show you how to use Drools through its JSR-94 API.
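If the parameter, condition and consequence structure feels abstract, it may help to see the same shape in ordinary Java. The sketch below is an analogy only: `SimpleRule`, `Quote` and `RuleDemo` are hypothetical classes written for this article, not Drools types, and the loop is a naive stand-in for what the engine does. Each condition is a test that must pass, and the consequence runs only when all of them do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Plain-Java stand-in for one rule: conditions that must all hold, then a consequence. */
final class SimpleRule<T> {
    final String name;
    final List<Predicate<T>> conditions;
    final Consumer<T> consequence;

    SimpleRule(String name, List<Predicate<T>> conditions, Consumer<T> consequence) {
        this.name = name;
        this.conditions = conditions;
        this.consequence = consequence;
    }

    boolean matches(T fact) {
        for (Predicate<T> condition : conditions) {
            if (!condition.test(fact)) {
                return false;
            }
        }
        return true;
    }
}

/** Hypothetical fact type standing in for the Stock class named in the rule file. */
final class Quote {
    final String symbol;
    final String indexName;
    final double pe;

    Quote(String symbol, String indexName, double pe) {
        this.symbol = symbol;
        this.indexName = indexName;
        this.pe = pe;
    }
}

final class RuleDemo {
    /** Applies one rule to every fact and returns what is left in the working list. */
    static List<Quote> run(List<Quote> input) {
        final List<Quote> working = new ArrayList<>(input);
        SimpleRule<Quote> rule = new SimpleRule<Quote>(
                "RemoveOvervalued",
                Arrays.<Predicate<Quote>>asList(
                        fact -> fact.indexName.equals("DJIA"),   // first condition
                        fact -> fact.pe > 10),                   // second condition
                fact -> working.remove(fact));                   // consequence
        // Iterate over a snapshot so the consequence can safely remove from the list.
        for (Quote q : new ArrayList<>(working)) {
            if (rule.matches(q)) {
                rule.consequence.accept(q);
            }
        }
        return working;
    }
}
```

The difference with a rules engine is that the conditions and the consequence arrive from a rule file at runtime instead of being compiled into the class.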

In the CVS tree for the Drools project, among the unit tests in the drools-jsr94 module, there's an ExampleRuleEngineFacade object, based on code from Brian Topping's Dentaku project. This facade object runs through the javax.rules API, building up the most common object structure necessary for working with RuleExecutionSet and RuleSession objects. It doesn't take advantage of all the robust features of the API, or all the subtle nuances of the underlying Drools engine, but it serves as a good example for beginners looking to work with the API.

The following code snippet shows how you would use the rule engine facade to construct a RuleExecutionSet and get an execution session for use at runtime:

import java.io.InputStream;
import javax.rules.*;
import org.drools.jsr94.rules.ExampleRuleEngineFacade;

public class Example
{
    private ExampleRuleEngineFacade engine;
    private StatelessRuleSession statelessSession;

    /* place the rule file in the same package as this class */
    private String bindUri = "myRuleFile.drl";

    public Example()
    {
        /* get your engine facade */
        engine = new ExampleRuleEngineFacade();

        /* get your input stream */
        InputStream inputStream =
                Example.class.getResourceAsStream(bindUri);

        /* build a RuleExecutionSet and add it to the engine */
        engine.addRuleExecutionSet(
                bindUri,
                inputStream);

        /* don't forget to close your InputStream! */
        inputStream.close();

        /* get your runtime session */
        this.statelessSession =
                engine.getStatelessRuleSession(bindUri);
    }

    ...
}

You'll have to write IOException handlers for the InputStream in the sample above, but it conveys the general point. All you have to do is build up your InputStream and feed it into the engine facade to build a RuleExecutionSet. After that, you can get a session and use it to fire off your rules. Using that StatelessRuleSession is a fairly simple matter. In the class above, we could add a method for executing the rules against a list of objects:

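/* executeRules declares checked exceptions, so this method passes them on */
public List getUndervalued(List stocks)
        throws InvalidRuleSessionException, java.rmi.RemoteException
{
    return statelessSession.executeRules(stocks);
}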

This method would push the list of stock objects into the rule engine, and the rules would evaluate the stocks and remove any stock not considered to be undervalued. It's a simple example, but it gets the point across.
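As for the IOException handling that the constructor shown earlier leaves out, here is one way it might look. This is a sketch under assumptions: `RuleStreams` and its `StreamConsumer` callback are hypothetical helpers written for this article, and it uses the try-with-resources statement, which only exists in Java 7 and later. It opens a classpath resource, fails loudly if the file is missing, and closes the stream whether or not the callback succeeds.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

final class RuleStreams {
    /** Hypothetical callback: whatever consumes the stream, such as a call that parses a DRL file. */
    interface StreamConsumer {
        void accept(InputStream in) throws IOException;
    }

    /**
     * Opens a resource relative to the given class, hands it to the consumer,
     * and closes it afterwards even if the consumer throws.
     */
    static void withRuleFile(Class<?> owner, String name, StreamConsumer consumer)
            throws IOException {
        try (InputStream in = owner.getResourceAsStream(name)) {
            if (in == null) {
                // getResourceAsStream returns null rather than throwing when nothing is found
                throw new FileNotFoundException("rule file not on classpath: " + name);
            }
            consumer.accept(in);
        }
    }
}
```

The call that feeds the stream to the engine facade would go inside the callback. Which checked exceptions the facade itself declares isn't shown here, so the callback's signature may need to widen to match.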

Under the hood, things are a bit more complex. The engine facade builds a RuleServiceProvider object, and uses it to build RuleAdministrator, RuleExecutionSetProvider and RuleRuntime objects. The RuleExecutionSetProvider is responsible for parsing your InputStream and building a RuleExecutionSet. The RuleRuntime object is used to get a session for use at runtime, and the RuleAdministrator manages it all. Beneath this layer is the core Drools API, and at the heart of that is the Rete implementation. I'll spare you the details for the time being, but you're welcome to peek into the facade object to see how it all fits together.

By now you should see some of the business and scientific uses for rules engines, have a very basic introduction to the Drools project, and have some sense of what remains to be learned about this fascinating and powerful new technology. We have barely even scratched the surface of Drools or JSR-94, but hopefully I've been able to show some practical concepts and give you a general idea of things to come. In the next article, I'll revisit and deconstruct the DRL file structure and go over the Java semantic library so you can write your own DRL files. I'll also show you how to get started writing your own semantics modules. We'll also talk about the concepts of salience and working memory, and show how to take full advantage of the DRL syntax in your consequence blocks.

Resources
N. Alex Rupp is a freelance software architect and developer from Minneapolis, and the current JSR94 Lead for the Drools project. He'd like to thank Thomas Diesler for getting the JSR94 implementation off its feet last summer, and Bob and Brian for bringing him up to speed on Drools. Finally, if you're a financial analyst, he asks that you please forgive him for reducing your profession to a caricature and urges you to keep using Bloomberg.