I've thought about this design problem and searched the J2EE space. The design itself is appealing outside of J2EE, but I would like some feedback about how to achieve it with EJBs or J2EE in general.
The simple model is a work queue (the typical example in a CompSci tutorial on concurrent programming). The items in the queue are jobs. They essentially contain a Command object and some other info about the request being made.
Clients request work by enqueueing jobs. Workers perform work in the form of dequeued jobs. If you like, these are the Producer and Consumer in the usual jargon.
Multiple "Workers" (when they are free) may check the queue for pending jobs and dequeue and perform the job. The jobs may take minutes to hours to complete. Users don't wait.
The queue full of jobs must be persistent so that enqueued requests will not be forgotten after a reboot. And transactional access to this shared resource is desirable.
Now, please nobody jump in and say that JMS or MDB magically solves the problem: I've tried to see how that would work and concluded that it does not. I do know that JMS is implemented using a similar model, that of message queues.
Firstly, consider the situation where the queue always has jobs in it and workers are always working. The workers are "busy" most of the time, but every now and then they finish a job and may check the queue for more work to do. The jobs themselves know nothing of being in a queue.
Jobs can be cancelled.
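To make the model concrete, here is a minimal in-memory sketch in plain Java, outside any container. All names (Command, Job, JobQueue, WorkQueueSketch) are hypothetical, and it deliberately ignores the persistence and transaction requirements; it only shows the shape of enqueue, dequeue, cancel and listing, with the queue knowing its jobs and not the other way round.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** The unit of work; it knows nothing about queues. (Hypothetical name.) */
interface Command {
    void execute();
}

/** A queued request: a Command plus some info about the request. */
final class Job {
    final long id;             // handle used for cancellation, no back-reference to a queue
    final String description;
    final Command command;

    Job(long id, String description, Command command) {
        this.id = id;
        this.description = description;
        this.command = command;
    }
}

/** In-memory FIFO queue. Assumption: not persistent, not transactional. */
final class JobQueue {
    // LinkedHashMap keeps insertion order, so iteration order is FIFO.
    private final Map<Long, Job> pending = new LinkedHashMap<>();
    private long nextId = 1;

    /** Called by clients (producers); returns an id usable with cancel(). */
    synchronized long enqueue(String description, Command command) {
        long id = nextId++;
        pending.put(id, new Job(id, description, command));
        notifyAll(); // wake any worker blocked in dequeue()
        return id;
    }

    /** Called by workers; blocks until a job is available. */
    synchronized Job dequeue() throws InterruptedException {
        while (pending.isEmpty()) {
            wait();
        }
        return removeOldest();
    }

    /** Non-blocking variant; returns null when the queue is empty. */
    synchronized Job poll() {
        return pending.isEmpty() ? null : removeOldest();
    }

    /** Removes a job that has not been dequeued yet; false if it is already gone. */
    synchronized boolean cancel(long id) {
        return pending.remove(id) != null;
    }

    /** Snapshot of the jobs still waiting, oldest first. */
    synchronized List<Job> pending() {
        return new ArrayList<>(pending.values());
    }

    private Job removeOldest() {
        Iterator<Job> it = pending.values().iterator();
        Job oldest = it.next();
        it.remove();
        return oldest;
    }
}

final class WorkQueueSketch {
    public static void main(String[] args) throws InterruptedException {
        JobQueue queue = new JobQueue();
        queue.enqueue("render report", () -> System.out.println("rendering"));
        long doomed = queue.enqueue("rebuild index", () -> System.out.println("indexing"));
        queue.cancel(doomed);
        for (Job job : queue.pending()) {
            System.out.println("pending: " + job.description);
        }
        queue.dequeue().command.execute();
    }
}
```

Cancellation here only covers jobs that are still pending; stopping a job that a worker has already started is a separate problem this sketch does not address.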
I have considered representing the queue with a SessionFacade, but this would only allow me one queue (unless the dequeue() and enqueue() methods were parameterised by queue identity), and it may mean I need to add back-references for queue position (and queue id) into the jobs (which could be implemented as Entity Beans). The ForeignKeyMapping Pattern suggests this for persisting collections. This seems OK, but I'd like to bind queues and jobs in only one direction (the queue knows the jobs but not vice versa).
Anyone got some thoughts?
Look back into JMS and MDB. It solves your problem; I am not too sure what you are missing. I run a similar case with no issues. In fact, it has been running for the past year and a half.
I'd be interested to know why you think otherwise.
I suppose you could have Job and Stack entity beans, with a one to many unidirectional CMR from Stack to Job. But I don't see how you can get away from JMS if each worker is to run concurrently. How are you planning to manage worker threading?
JMS does solve your problem with Message driven beans.
You create your Queue using JMS and then you have message driven beans subscribe to the queue.
You can have a servlet or some other app publish messages to the JMS queue. Every time a message is put in the queue, the container hands it to an instance of your message-driven bean (usually taken from a pool), which does the work you want it to do...
I've followed up the references provided and re-read a fair bit of JMS material. Thank you. I'm now closer to my answer, and JMS/MDB provides more options than I was aware of. However, there are still gaps (probably in my understanding).
I would love JMS to be a magic bullet, but my JobQueue and Job types do not appear to map cleanly to MessageQueue and Message.
I assume the idea of using JMS is that I would have a MessageQueue full of Messages (serialized Jobs) and message consumer(s) doing the work. What's missing? So far I can see two things:
1. JMS doesn't appear to guarantee order. Or does it? I'm not sure about this. I need to ensure that jobs are dequeued in the same order as they are enqueued. This requirement can bend (I have a high tolerance for time variations). Note that this requirement assumes the queue is a FIFO queue. I am probably not going to need a priority queue (where items are dequeued in priority order), but my non-distributed, non-J2EE prototype can have a priority queue dropped in with minimal change. This flexibility is a design strength to me that seems absent from the "just use JMS" answer. Before anyone suggests using Message Metadata: I do not want to simply leave the priority or custom metadata choices to the message producer. In my view it's a schedule administration issue.
2. Correct me if I'm wrong, but you can't cancel a message. I need to be able to cancel jobs. I also need to be able to see the list of jobs sitting in the queue.
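On the ordering point, the kind of flexibility described in item 1 might look like the sketch below: the queue takes its ordering policy as a Comparator chosen by whoever administers the schedule, and producers only submit work. This is plain Java, not JMS or EJB, and every name in it (QueuedJob, PolicyQueue, Policies, favour) is made up for illustration.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** A pending job as the scheduler sees it. (Hypothetical type.) */
final class QueuedJob {
    final long sequence;     // arrival order, assigned by the queue
    final String submitter;  // who asked for the work
    final String name;

    QueuedJob(long sequence, String submitter, String name) {
        this.sequence = sequence;
        this.submitter = submitter;
        this.name = name;
    }
}

/** Ordering policies picked by the schedule administrator, not by producers. */
final class Policies {
    /** Strict first-in, first-out. */
    static final Comparator<QueuedJob> FIFO = Comparator.comparingLong(j -> j.sequence);

    /** Jobs from one submitter jump ahead; everything else stays FIFO. */
    static Comparator<QueuedJob> favour(String submitter) {
        // false sorts before true, so favoured jobs come first.
        return Comparator.comparing((QueuedJob j) -> !j.submitter.equals(submitter))
                .thenComparing(FIFO);
    }
}

/** A queue whose dequeue order is decided entirely by the supplied policy. */
final class PolicyQueue {
    private final PriorityBlockingQueue<QueuedJob> jobs;
    private final AtomicLong nextSequence = new AtomicLong();

    PolicyQueue(Comparator<QueuedJob> policy) {
        this.jobs = new PriorityBlockingQueue<>(11, policy);
    }

    /** Producers supply only the request; they have no say in its position. */
    void enqueue(String submitter, String name) {
        jobs.add(new QueuedJob(nextSequence.getAndIncrement(), submitter, name));
    }

    /** Returns the next job under the policy, or null if none is waiting. */
    QueuedJob poll() {
        return jobs.poll();
    }
}

final class PolicySketch {
    public static void main(String[] args) {
        PolicyQueue queue = new PolicyQueue(Policies.favour("bob"));
        queue.enqueue("alice", "a1");
        queue.enqueue("bob", "b1");
        queue.enqueue("alice", "a2");
        for (QueuedJob job = queue.poll(); job != null; job = queue.poll()) {
            System.out.println(job.name);
        }
    }
}
```

Swapping Policies.FIFO for another Comparator changes the schedule without touching producers or workers, which is the "drop-in" property the prototype has.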
I hope I'm being clear here: I'll most likely use a message-driven asynchronous request for starting the actual work, but understand that I need to control (and observe) pending jobs prior to commencing work on them. It's a scheduler!
Am I still just being slow here?
Another thing: if I used MDBs I would want to ensure that I limited the number of instances of MDBs that were created to service the queue. Is the opportunity and mechanism for this container-implementation-specific?
This is a pretty well developed area of knowledge. Posting design or implementation examples here would be overkill - just go to www.google.com and search for the keywords ThreadPool and ThreadWorker. In fact, all J2EE servers use this approach internally to serve all kinds of incoming requests. Also, java.sun.com has extensive examples on the subject, and Apache commons has some toy implementations.
As for particularities of the implementation running inside an appserver, there are things that you have to keep in mind to avoid hard to track problems:
1. Never instantiate such thread pools from an EJB - the best place for this is the init() method of a dedicated servlet that has load-on-startup set - that will guarantee that the thread pool starts up at the right time.
2. If you are going to supply an SLSB interface to such a service - make sure that the business methods of that SLSB are not transactional.
Personally I'd try to leverage existing J2EE infrastructure (JMS/MDB) before going the way described above.
Hope this helps.
Actually I probably only want one worker thread per cpu. The jobs take several hours of full CPU utilization to complete and they use a lot of RAM. Multithreading that on a single CPU is not appropriate.
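For the "one worker per CPU" idea, a rough standalone sketch using java.util.concurrent might look like this. CpuBoundWorkers, workerCount and runAll are made-up names, and the code assumes it runs outside the EJB container (for example started from a servlet's init(), as suggested above), not inside a bean.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Sizes a worker pool to the number of CPUs for long, CPU-bound jobs. */
final class CpuBoundWorkers {

    /** One worker per processor reported by the JVM, never fewer than one. */
    static int workerCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    /**
     * Runs every job on a fixed pool of workerCount() threads and returns
     * how many jobs finished. Jobs beyond the pool size wait their turn,
     * so no more than one job per CPU is ever running.
     */
    static int runAll(List<Runnable> jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(workerCount());
        AtomicInteger completed = new AtomicInteger();
        for (Runnable job : jobs) {
            pool.execute(() -> {
                job.run();
                completed.incrementAndGet();
            });
        }
        pool.shutdown(); // accept no new work, let queued jobs finish
        try {
            // The one-hour limit is an arbitrary choice for this sketch.
            pool.awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            pool.shutdownNow();
        }
        return completed.get();
    }
}

final class CpuBoundSketch {
    public static void main(String[] args) {
        List<Runnable> jobs = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            int n = i;
            jobs.add(() -> System.out.println("job " + n + " on " + Thread.currentThread().getName()));
        }
        System.out.println("workers: " + CpuBoundWorkers.workerCount());
        System.out.println("finished: " + CpuBoundWorkers.runAll(jobs));
    }
}
```

A fixed pool caps concurrency but does nothing about memory; if each job needs a lot of RAM, the worker count may have to be lower than the CPU count.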