New Article: A Finite State Machine for Asynchronous Services

  1. More and more business processes involve workflow: the ability to link together multiple tasks in a given sequence, and to control the flow of that sequence depending on internal and external events. Benjamin Possolo describes just such a software system implemented in Java, which also uses Hibernate to persist the state of the workflow at any given time. Read article

  2. Perhaps it is my browser, but I cannot click on the link.
  3. IMO FSM is a special case of EDA (Event Driven Architecture). There are many cases in which an FSM needs to be executed in real time. A good example of that is call centers. In that type of application, using a database to save the call state or a broker to execute the workflow is unacceptable due to the performance and scaling bottleneck.

    A pure data caching or messaging solution falls short as well, as you need a combination of the two to be able to share state and generate events per state change at the same time.

    An in-memory solution that combines the two would be a better fit. Both GigaSpaces and Terracotta seem to be better positioned in that area. Both provide a combination of state sharing and event distribution capabilities built into the core of the technology.

    The nice thing is that you can easily use the java.util.concurrent executor in a distributed fashion, which makes distributing tasks and sharing state between those tasks in a distributed system as simple as in a standalone application.

    See a snippet of how such execution would look using the new GigaSpaces executor abstraction:

    // Assumes MyCallable implements Callable<Integer>
    ExecutorService executorService = TaskExecutors.newExecutorService(gigaSpace);
    Future<Integer> future = executorService.submit(new MyCallable());
    int result = future.get();

    Full API and code can be viewed here.

    There was a case study presented by Nortel a few years ago where they illustrated their call center application based on a model similar to the one mentioned above (using the core JavaSpaces API). The online presentation (taken from one of the Jini meetings) is available here

    HTH
    Nati S.
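For readers unfamiliar with the java.util.concurrent pattern the post refers to, here is a minimal local sketch using only the JDK. It does not use GigaSpaces at all: a plain thread pool stands in for the distributed executor, and `MyCallable` and `LocalExecutorSketch` are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical task; stands in for the MyCallable mentioned in the post.
class MyCallable implements Callable<Integer> {
    @Override
    public Integer call() {
        return 6 * 7;
    }
}

class LocalExecutorSketch {
    // Submits the task to the given executor and blocks until its result is available.
    static int runTask(ExecutorService executorService) {
        Future<Integer> future = executorService.submit(new MyCallable());
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A local thread pool; a distributed ExecutorService would be swapped in here.
        ExecutorService executorService = Executors.newFixedThreadPool(2);
        try {
            System.out.println(runTask(executorService));
        } finally {
            executorService.shutdown();
        }
    }
}
```

Because the calling code depends only on the `ExecutorService` interface, the choice between a local pool and a distributed implementation stays confined to the line that creates the executor.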
  4. Some simpler alternative

    IMO FSM is a special case of EDA (Event Driven Architecture). There are many cases in which an FSM needs to be executed in real time. A good example of that is call centers. In that type of application, using a database to save the call state or a broker to execute the workflow is unacceptable due to the performance and scaling bottleneck.

    A pure data caching or messaging solution falls short as well, as you need a combination of the two to be able to share state and generate events per state change at the same time.

    An in-memory solution that combines the two would be a better fit. Both GigaSpaces and Terracotta seem to be better positioned in that area. Both provide a combination of state sharing and event distribution capabilities built into the core of the technology.

    The nice thing is that you can easily use the java.util.concurrent executor in a distributed fashion, which makes distributing tasks and sharing state between those tasks in a distributed system as simple as in a standalone application.

    See a snippet of how such execution would look using the new GigaSpaces executor abstraction:


    // Assumes MyCallable implements Callable<Integer>
    ExecutorService executorService = TaskExecutors.newExecutorService(gigaSpace);
    Future<Integer> future = executorService.submit(new MyCallable());
    int result = future.get();


    Full API and code can be viewed here.

    There was a case study presented by Nortel a few years ago where they illustrated their call center application based on a model similar to the one mentioned above (using the core JavaSpaces API). The online presentation (taken from one of the Jini meetings) is available here

    HTH
    Nati S.
    Valid approach. But a simpler approach is just to wrap stateful session beans around the nodes in a workflow. It provides state, txn, security and distributed capabilities. And you can configure how long you want the state to be maintained in memory before writing to disk/db (passivation time). For asynchronicity, you can invoke it via JMS, Quartz, etc.
  5. Re: Some simpler alternative

    a simpler approach is just to wrap stateful session beans around the nodes in a workflow. It provides state, txn, security and distributed capabilities. And you can configure how long you want the state to be maintained in memory before writing to disk/db (passivation time).
    I fail to see how an approach that requires so many moving parts can be made simpler. Nati S.
  6. You seem confused


    a simpler approach is just to wrap stateful session beans around the nodes in a workflow. It provides state, txn, security and distributed capabilities. And you can configure how long you want the state to be maintained in memory before writing to disk/db (passivation time).


    I fail to see how an approach that requires so many moving parts can be made simpler.

    Nati S.
    And I fail to see why using x number of systems to solve y number of moving parts automatically makes it the better solution. Just because it has many moving parts does not mean it can't be made simpler. "Simpler" is a relative word. It does not imply something is simple, merely less complex than what it is being compared to. And just because you fail to see it does not mean it does not exist, nor that it is not viable. I'd prefer to evaluate and make a decision rather than simply hand-wave alternatives away. Of course that is just me; you are more than welcome to do whatever you want.
  7. Re: You seem confused

    And I fail to see why using x number of systems to solve y number of moving parts automatically makes it the better solution.
    Steven,

    It looks like we have a potential communication gap. Can you elaborate on why you assume that there are x number of systems involved?

    My point was that the term FSM is usually associated with real-time systems. For example, a typical FSM is a telco switch. You could argue that other systems, such as billing systems and trading systems, have similar requirements as well. In many RT systems, a solution that relies on a central database, a transaction coordinator (2PC) and a workflow engine is not going to work.

    The approach that I was referring to provides a similar level of workflow and state management using a pure in-memory cluster that leverages event-driven programming to trigger state transitions and workflow. Because it is in-memory and lightweight in nature (no moving parts but a single in-memory cluster), it is better suited to RT systems.

    Now the nice thing is that you can wrap most of that logic in the same way you would write a SessionBean, if that is the preferred programming model. The following workflow example shows how that model works in practice. It demonstrates a simple transaction flow that goes from an incoming state to a matching state and on to a completion state, and how that state transition can happen outside of your code.

    Nati S.
    GigaSpaces
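The incoming, matching and completion flow described in this post can be pictured with a small single-JVM sketch. This is not the GigaSpaces example the post links to; it is a plain-Java illustration in which the names `TxState` and `EventDrivenFsm` are hypothetical. The business code never sets the state itself. An event source calls `onEvent()` and the transition table decides what comes next.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical states for the transaction flow described in the post.
enum TxState { INCOMING, MATCHING, COMPLETED }

class EventDrivenFsm {
    // Transition table: each state maps to the state that follows it.
    private static final Map<TxState, TxState> NEXT = new EnumMap<>(TxState.class);
    static {
        NEXT.put(TxState.INCOMING, TxState.MATCHING);
        NEXT.put(TxState.MATCHING, TxState.COMPLETED);
    }

    private TxState state = TxState.INCOMING;
    private final List<TxState> history = new ArrayList<>();

    EventDrivenFsm() {
        history.add(state);
    }

    // Called by the event source when the work for the current state is done.
    // Returns false when the machine is already in its final state.
    synchronized boolean onEvent() {
        TxState next = NEXT.get(state);
        if (next == null) {
            return false;
        }
        state = next;
        history.add(next);
        return true;
    }

    synchronized TxState state() {
        return state;
    }

    synchronized List<TxState> history() {
        return new ArrayList<>(history);
    }

    public static void main(String[] args) {
        EventDrivenFsm fsm = new EventDrivenFsm();
        // Deliver events until the machine reports that no transition is left.
        while (fsm.onEvent()) {
            System.out.println("now in " + fsm.state());
        }
    }
}
```

In a clustered setup the state and the events would live in shared memory rather than in one object, but the shape of the transition logic would be similar.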
  8. Perhaps..

    Steven
    It looks like we have a potential communication gap.
    Can you elaborate on why you assume that there are x number of systems involved?
    Because you seem to imply as much in your first post. You paint a picture of using x number of systems (caching, transaction, etc.) before saying it can all be done with system y (GigaSpaces/Terracotta). My first reply to that was to say you don't even need to paint the picture of x number of systems, ergo using system y, when simple stateful session beans give you all of those features. Why overcomplicate things and then propose a simpler solution, when you can start off with a simpler solution from the get-go? I question that line of logic, not the systems you propose.
    My point was that the term FSM is usually associated with real-time systems.
    That is an incorrect assumption and a generalization. Without going off topic, the article discusses using an FSM for business process workflow purposes. Not all business process workflows have a real-time requirement. Many companies across many industries use workflow to dispatch and queue tasks among the organizational hierarchy in mid/back office environments for various business processes. There is no real-time requirement for most of these. But I don't mind limiting the discussion to the real-time requirements that you are focusing on.
    For example, a typical FSM is a telco switch. You could argue that other systems, such as billing systems and trading systems, have similar requirements as well. In many RT systems, a solution that relies on a central database, a transaction coordinator (2PC) and a workflow engine is not going to work.

    The approach that I was referring to provides a similar level of workflow and state management using a pure in-memory cluster that leverages event-driven programming to trigger state transitions and workflow. Because it is in-memory and lightweight in nature (no moving parts but a single in-memory cluster), it is better suited to RT systems.

    Now the nice thing is that you can wrap most of that logic in the same way you would write a SessionBean, if that is the preferred programming model. The following workflow example shows how that model works in practice. It demonstrates a simple transaction flow that goes from an incoming state to a matching state and on to a completion state, and how that state transition can happen outside of your code.

    Nati S.
    GigaSpaces
    I don't think we disagree here. Having a DB to load state to handle real-time demands is not going to be feasible. However, I was pointing out that stateful session beans are in-memory; you only need to hit the DB when you passivate, which can be configured and customized. Although it is not in the J2EE spec, most containers provide means for distributed state, clustering, security, and distributed transactions. So you get that without piling additional systems/frameworks on top. Again, I don't disagree with the system you propose. I merely question the need to complicate matters and then go the proprietary (however open it may be) route when the problem can be easily solved with the prevalent standard system. Yes, I'm assuming that since J2EE is the prevalent standard for Java enterprises, there are already existing J2EE systems in most Java enterprise shops. Perhaps I should have made this assumption clear up front; my bad for not doing that. Obviously, if you are setting up a new shop to cater for a specific niche such as the one you focused on, then perhaps the proprietary systems you describe are a better fit. You do realize there are other dimensions to consider when deciding which technology to adopt?
  9. Real-time

    Nati, you are correct. This framework does not even attempt to provide any sort of solution for systems with strict temporal requirements... but then again, I think the same can be said for BPM solutions (or really, anything that interacts with a database) as well.
  10. Another asynchronous alternative

    Hmm... The link to the article doesn't seem to work. Try this: Read Article. Anyway, other than spawning threads yourself, using WorkManager or using JMS, another alternative is to use scheduling frameworks like Quartz. They are generally asynchronous by nature. You can dispatch your job and return immediately for asynchronous behaviour. Also, you get all the customizability like retries, persistence of state, transactions, job management, etc. That said, I wouldn't recommend using the Quartz state to store application state: a handle to it probably, but not the full state.
  11. Quartz

    Steven, yes, you are correct, Quartz does provide asynchronous behaviour. I haven't researched it too much (so correct me if I'm wrong) but I was under the impression that it simply serializes Jobs and one has very little control over exactly what/how things in a Job are persisted. If you are persisting massive amounts of data, then reloading a Job and its huge state multiple times would be extremely inefficient. Additionally, Jobs created for Quartz are not portable across other asynchronous services. One of the goals of my framework is that it allows one to define a process that can be executed by an ExecutorService, by a WorkManager, or, if you wanted, by a Quartz Job. The main goal of this framework was to provide a model for designing a process as a finite state machine that is persistent. The process's state is disjoint from the process itself, and this allows one to coherently identify exactly what/how/when things are being persisted.
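The portability goal in this post can be illustrated with a short sketch. It is not the framework's actual API: `DownloadState`, `DownloadProcess` and `PortableProcessSketch` are hypothetical names, and the "download" is a stand-in counter. The point is only that a process written as a plain `Runnable`, with its state held in a separate object, can be handed to an `ExecutorService` or to a bare `Thread` without change.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical state holder, kept separate from the process logic
// so that it could be persisted or inspected on its own.
class DownloadState {
    private int filesDone;

    synchronized void fileDone() {
        filesDone++;
    }

    synchronized int filesDone() {
        return filesDone;
    }
}

// Hypothetical process: a plain Runnable, so any asynchronous service can run it.
class DownloadProcess implements Runnable {
    private final DownloadState state;
    private final int totalFiles;

    DownloadProcess(DownloadState state, int totalFiles) {
        this.state = state;
        this.totalFiles = totalFiles;
    }

    @Override
    public void run() {
        while (state.filesDone() < totalFiles) {
            state.fileDone(); // stand-in for downloading one file
        }
    }
}

class PortableProcessSketch {
    // Runs the process on an ExecutorService and returns the final file count.
    static int runWithExecutor(int totalFiles) {
        DownloadState state = new DownloadState();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.submit(new DownloadProcess(state, totalFiles)).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return state.filesDone();
    }

    // Runs the same process on a bare Thread and returns the final file count.
    static int runWithThread(int totalFiles) {
        DownloadState state = new DownloadState();
        Thread worker = new Thread(new DownloadProcess(state, totalFiles));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state.filesDone();
    }

    public static void main(String[] args) {
        System.out.println(runWithExecutor(3));
        System.out.println(runWithThread(3));
    }
}
```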
  12. Workflow

    More and more business processes involve workflow
    I think if you read this article with "workflow engine" in mind and how it compares to another workflow engine you may consider using, it'll help understand what the author is developing. Of course, a finite state machine is a workflow. How the background (batch) execution of workflows is performed, factoring in workflow throttling capabilities and workflow storage repositories, is paramount for consideration when you care about managing many workflows running in the background. Transaction handling, how you handle threads in a JEE environment, persistence of workflow user data, etc are all things that you either need to trust that your workflow engine handles for you or you need to take care of yourself. As Steven mentioned above, you have batch scheduling frameworks to consider using as well in addition to java.util.concurrent.Executor, JEE's WorkManager, and workflow engines that provide support for background execution of workflows and throttling. Much to consider. Much prior work to compare against, since that's the easiest way for potential users to decide whether one solution meets their needs over a different solution. Cheers, David Flux - Java Job Scheduler. File Transfer. Workflow.
  13. FSM?

    Perhaps I didn't read your article with enough attention, but AFAIK FSM are NOT powerful enough to model workflows. (Maybe petri nets could provide a more powerful and flexible framework) Regards, Sandro
  14. It has all been done before...

    I see that the author explains how a state machine (Mealy and Moore machines, to be precise) can execute the business flows, along with persistence. Aren't these problems already being addressed by BPEL and BPM-related technologies such as WebLogic JPDs? Also, these technologies offer some pluses: a consistent, XML-ish language to define business processes (=> a framework as opposed to design guidelines). -sipayi
  15. BPM solutions

    Siplin, BPM solutions are slightly different from what I've tried to create here. Perhaps if I explain a bit about my requirements I can provide some insight into the differences.

    I have a collocated JSE application that needed some long-running processes that I could easily start and stop on user command. The application manages some financial market data feeds and involves retrieving very large files from an FTP server and processing them locally. Downloading the files is done as a background process in various phases, and I wanted to be able to interrupt the download process and then restart it at any given time. Local processing is also a background process and can take a while too, and I wanted to be able to break it up so it could be paused at any time. There are also some other long-running tasks that I occasionally do. Additionally, some of my processes communicate with each other via transactional/persistent queues. Finally, I had a need for wait states in some of my processes.

    I didn't want to depend on an application/messaging server because it would be overly complicated for my customers (who are mostly finance guys) to set up, configure and run a full-blown app server on their machines, which could potentially be on anywhere from a few hours per day to months at a time.

    So naturally I decided to define some formal "processes" to do all these tasks (and in a concurrent/multi-threaded fashion). I was originally modeling them as FSMs and eventually wanted to define the various state changes as atomic operations that could be retried, but I didn't want to bog everything down with transaction demarcation and "retry logic". From there came the framework.

    There are lots of other process management frameworks out there like jBPM and Pyshun (a recent article on TheServerSide was written about this as well), but I didn't like the idea of separating the business logic for a process across a dozen files (i.e. process definition, actions, transitions, conditions, etc.). Granted, these other frameworks have some really fancy features, but they were simply too complex and I wanted a process developer to have a bit more control over what exactly is being persisted. By defining a concrete "ProcessState" with your persistent fields, it becomes, IMO, much clearer.

    The major difference between a process within my framework and a jBPM-style process is that the jBPM-style process is used for defining a workflow that is driven by external threads of execution. A jBPM process uses a graph-based structure to define control flow. External threads of execution send signals to the process to change states. Transactions are demarcated by the external threads of execution and usually begin before a signal is sent to a process and conclude once the call to that signal has completed (which occurs when control within the process has encountered a wait state). Once a wait state has been reached, the process (i.e. its state variables) can be persisted.

    In this framework's context, a process is associated with a single thread of execution. The thread imperatively executes the steps in a process's run() method. Oftentimes, a process must communicate with external components or resources, for example, to receive some sort of input data or approval. Wait states can be implemented using traditional Condition variables and await()/signal(), or using some of the other constructs provided by the java.util.concurrent package such as CyclicBarrier and CountDownLatch. This is not possible within a jBPM process (as is mentioned directly in its reference document) because all execution occurs within the context of a transaction.

    The entity that represents a process's state is encapsulated within a ProcessState class, which is explicitly associated with a Process via class generics. Since a ProcessState instance is disjoint from a Process, it can be easily persisted and manipulated by its owning Process or inspected by other components. Interacting with the ProcessState object is done within a Transition (which is associated with a single transaction). Thus, outside of a transition, the thread executing the process can do anything it wants (for example, sleep for 20 hours). I hope this clarifies things a bit.
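The relationship between a process, its state, transitions and a wait state might look roughly like the sketch below. All type names (`Transition`, `SketchProcess`, `ApprovalState`, `ApprovalProcess`, `WaitStateSketch`) are hypothetical and are not taken from the framework's source. Transaction handling and persistence are reduced to a comment, since the post does not show how the framework implements them.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical: one unit of work applied to the process state.
interface Transition<S> {
    void apply(S state);
}

// Hypothetical state class holding the fields that would be persisted.
class ApprovalState {
    boolean requested;
    boolean approved;
}

// Hypothetical base class tying a process to its state type via generics.
abstract class SketchProcess<S> implements Runnable {
    protected final S state;

    SketchProcess(S state) {
        this.state = state;
    }

    // Runs one transition. A real implementation would begin a transaction,
    // apply the change, persist the state and commit; that part is omitted here.
    protected void transition(Transition<S> t) {
        t.apply(state);
    }
}

class ApprovalProcess extends SketchProcess<ApprovalState> {
    private final CountDownLatch approval = new CountDownLatch(1);

    ApprovalProcess(ApprovalState state) {
        super(state);
    }

    // Called by an external component to release the wait state.
    void approve() {
        approval.countDown();
    }

    @Override
    public void run() {
        transition(s -> s.requested = true);
        try {
            // Wait state: outside any transition, so the thread may block here.
            if (!approval.await(5, TimeUnit.SECONDS)) {
                return; // no approval arrived in time
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        transition(s -> s.approved = true);
    }
}

class WaitStateSketch {
    // Starts the process on its own thread, approves it, and returns the final state.
    static ApprovalState runOnce() {
        ApprovalState state = new ApprovalState();
        ApprovalProcess process = new ApprovalProcess(state);
        Thread worker = new Thread(process);
        worker.start();
        process.approve();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state;
    }

    public static void main(String[] args) {
        ApprovalState state = runOnce();
        System.out.println(state.requested + " " + state.approved);
    }
}
```

Because the latch is awaited between transitions rather than inside one, no transaction would be held open while the process waits.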
  16. The 3 transitions are enclosed in a single transaction context at the beginning of the article, which suggests they have to be committed or rolled back all at once during batch processing. However, they are split into 3 transactions in the rest of the article, meaning that if transition 2 fails, there is no way to roll back the committed transitions 0 and 1. In short, the state machine is not transactional at all.
  17. transaction

    In the first example, the three transitions are grouped within a single transaction because that particular example is not modeled as an FSM. If you were to wrap each transition in its own transaction, and stop the process between transition 1 and transition 2, for example, then the next time the process is started, it would immediately attempt to execute transition 1 again. It has no recollection of where it terminated. The example is later extended to define that same process as an FSM, where it can be stopped at any time and later restarted. This allows one to partition the large transaction into smaller, atomic units of work.
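The difference between the two versions of the example can be sketched as follows. This is an illustration only, not code from the article: a `Map` stands in for the database, `ResumableFsmSketch` is a hypothetical name, and "committing" a transition is reduced to recording the index of the next transition alongside the work.

```java
import java.util.HashMap;
import java.util.Map;

class ResumableFsmSketch {
    static final int TRANSITIONS = 3;

    // Runs at most maxSteps transitions for the given process, recording
    // progress in the store after each one. Returns the index of the next
    // transition still to run (TRANSITIONS when the process is finished).
    static int run(Map<String, Integer> store, String processId, int maxSteps, StringBuilder log) {
        // Resume from the last recorded position, or start from transition 0.
        int next = store.getOrDefault(processId, 0);
        for (int steps = 0; next < TRANSITIONS && steps < maxSteps; steps++) {
            log.append("t").append(next).append(' '); // the work of this transition
            next++;
            // Stand-in for a commit: the work and the new position are saved together.
            store.put(processId, next);
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new HashMap<>();
        StringBuilder log = new StringBuilder();
        run(store, "p1", 2, log);  // process is stopped after two transitions
        run(store, "p1", 10, log); // restarted later: continues where it left off
        System.out.println(log);
    }
}
```

Each transition is its own atomic unit, so a failure in a later transition leaves the earlier ones committed. The recorded position is what lets a restart skip them instead of repeating them.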
  18. Project on Google Code

    I have put the project up on Google Code for any that are interested. A link to the project can be found here: http://code.google.com/p/javenue-process-framework/ The SVN related access information is available under the Source tab: http://code.google.com/p/javenue-process-framework/source/checkout