Is anyone developing production batch processes in J2EE these days? I'm having to build a lot of them, and there isn't much information around.
I would like feedback on the approaches outlined in:
http://www.devx.com/Java/Article/20791
Any comments on scalability, recoverability, etc. would be greatly appreciated.
Batch Processing in J2EE - strategies, pros & cons (5 messages)
- Posted by: Lara D'Abreo
- Posted on: April 28 2004 19:44 EDT
Threaded Messages (5)
- Batch Processing in J2EE - strategies, pros & cons by Paul Strack on April 28 2004 21:00 EDT
- Batch Processing in J2EE - strategies, pros & cons by Lara D'Abreo on April 28 2004 21:54 EDT
- Batch Processing in J2EE - strategies, pros & cons by Niel Eyde on May 04 2004 09:33 EDT
- Batch Processing in J2EE - strategies, pros & cons by Lara D'Abreo on May 05 2004 07:53 EDT
- Cons by Joel Boeder on March 04 2005 15:33 EST
Batch Processing in J2EE - strategies, pros & cons
- Posted by: Paul Strack
- Posted on: April 28 2004 21:00 EDT
- in response to Lara D'Abreo
One "gotcha" I have seen for J2EE-based batch processing is using Entity beans. In other words, don't. Entities are optimizations for caching and small transactions, and these optimizations break down in large batch jobs. Use JDBC for better control (or maybe a JDBC-based ORM tool like JDO/Hibernate).
As for the rest of it ... I have to admit it is outside my area of expertise.
Batch Processing in J2EE - strategies, pros & cons
- Posted by: Lara D'Abreo
- Posted on: April 28 2004 21:54 EDT
- in response to Paul Strack
One "gotcha" I have seen for J2EE-based batch processing is using Entity beans. In other words, don't. Entities are optimizations for caching and small transactions, and these optimizations break down in large batch jobs. Use JDBC for better control (or maybe a JDBC-based ORM tool like JDO/Hibernate).
I can't comment on Hibernate or other persistence mechanisms under load, but in my experience any sort of caching of the current operating data (whether it be entities or not) is wasted. This is because batch processes tend to process once and forget. A particular piece of data is never revisited, so caching doesn't add any value.
Call centres have the same problem. High rate of calls, little or no common data from call to call.
For batch processing I've tended to use pure JDBC so that I can use performance optimisations like batched updates.
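As a rough illustration of the batched updates mentioned above, here is a minimal sketch in plain JDBC. The class name, the table and column names (`accounts`, `processed`, `id`) and the choice to commit after every flush are assumptions made up for this example, not something taken from the article or from this thread. It relies only on standard `java.sql` calls (`addBatch`, `executeBatch`, `commit`, `rollback`).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchedUpdateSketch {

    /**
     * Marks the given ids as processed, sending the statements to the
     * database in groups of batchSize instead of one round trip per row.
     * Returns how many times a batch was flushed.
     */
    static int markProcessed(Connection conn, List<Long> ids, int batchSize)
            throws SQLException {
        // Hypothetical table and columns, for illustration only.
        String sql = "UPDATE accounts SET processed = 1 WHERE id = ?";
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        int flushes = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Long id : ids) {
                ps.setLong(1, id);
                ps.addBatch();                 // queue the row locally
                if (++pending == batchSize) {
                    ps.executeBatch();         // one round trip for the whole group
                    conn.commit();             // assumed: commit per flush
                    pending = 0;
                    flushes++;
                }
            }
            if (pending > 0) {                 // flush the final partial group
                ps.executeBatch();
                conn.commit();
                flushes++;
            }
        } catch (SQLException e) {
            conn.rollback();                   // discard the uncommitted group
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return flushes;
    }
}
```

Committing per flush keeps each transaction small and makes a restart after failure cheaper, at the cost of the job no longer being all-or-nothing. Whether that trade-off is acceptable depends on the job.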
Batch Processing in J2EE - strategies, pros & cons
- Posted by: Niel Eyde
- Posted on: May 04 2004 09:33 EDT
- in response to Lara D'Abreo
By enabling the J2EE processes (session beans) as web services, you can pretty easily invoke them from the command line. There are situations where I have used Windows Scheduler to invoke J2EE-based batch processes.
Batch Processing in J2EE - strategies, pros & cons
- Posted by: Lara D'Abreo
- Posted on: May 05 2004 07:53 EDT
- in response to Niel Eyde
Any bean can be invoked from the command line, and I don't need WS to execute a bean. Given that the problem is not to interoperate or distribute across different technologies and systems but rather to scale on one platform (J2EE), I don't understand what WS provides, or perhaps what you're proposing.
As a side note, web services are fairly verbose, they aren't transactional (i.e. guaranteed), and specifications from OASIS and W3C like WS-ReliableMessaging are yet to be ratified.
Cons
- Posted by: Joel Boeder
- Posted on: March 04 2005 15:33 EST
- in response to Lara D'Abreo
I programmed batch processes with J2SE and used the operating system for control (CRON jobs, Windows Scheduler, or whatever your OS supports) and this worked fine.
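A minimal sketch of that kind of OS-scheduled J2SE job might look like the following. The class name, the exit-code values and the line-counting "work" are all placeholders invented for this example. The one real point is that the job reports success or failure through its process exit code, which is what cron or Windows Scheduler can see.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

class NightlyBatchJob {

    // Exit codes the scheduler can act on (values are an assumption).
    static final int OK = 0;
    static final int FAILED = 1;
    static final int BAD_USAGE = 2;

    /** Runs the job and returns the exit code instead of calling System.exit. */
    static int run(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java NightlyBatchJob <input-file>");
            return BAD_USAGE;
        }
        try {
            long records = process(Paths.get(args[0]));
            System.out.println("processed " + records + " records");
            return OK;
        } catch (IOException e) {
            System.err.println("batch failed: " + e);
            return FAILED;
        }
    }

    // Stand-in for the real work: counts the non-empty lines of the input.
    static long process(Path input) throws IOException {
        try (Stream<String> lines = Files.lines(input)) {
            return lines.filter(line -> !line.isEmpty()).count();
        }
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

A crontab entry such as `0 2 * * * java -cp /opt/batch NightlyBatchJob /data/in/today.txt` (both paths hypothetical) would then run it nightly at 02:00.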
I don't think the J2EE servers are ready for batch processing yet. I played around with using J2EE for batch a little two years ago and had all sorts of problems. This article didn't mention a lot of practical problems such as:
- EJB pools are not sized correctly for batch processing. On some app servers the min and max number of objects are just suggestions. In batch processing the maximum number of batches processing at a time needs to be a hard limit.
- EJBs passivate at the wrong intervals for batch processing, and passivation cannot be shut off.
- Entity beans are not efficient for batch.
- Alarm beans are only implemented on the newest versions of some servers.
- Memory constraints on session beans could be problematic for large batch processes.
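For comparison, outside the container a hard cap on concurrent batches (the first point in the list above) is easy to enforce. The sketch below uses a fixed-size thread pool from `java.util.concurrent`. The class name, the sleep that stands in for real work and the peak-tracking counters are assumptions added for illustration, and this is plain J2SE code rather than anything an EJB container offers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedBatchRunner {

    /**
     * Runs jobCount jobs with at most maxConcurrent executing at once and
     * returns the highest concurrency that was actually observed.
     */
    static int runAll(int jobCount, int maxConcurrent) throws Exception {
        // A fixed pool never creates more than maxConcurrent worker threads,
        // so the limit is a hard one; extra jobs wait in the queue.
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < jobCount; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // stand-in for real batch work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for completion and surface any job failure
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }
}
```

Calling `BoundedBatchRunner.runAll(10, 3)` runs ten jobs, and the returned peak can never exceed 3.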
One thing that bugged me about this article is that it assumed all of your batch processes could be split into tiny little subprocesses. In high-throughput batch processing this is almost never done, because you want to achieve maximum throughput (it can never go fast enough) and the quickest method is always to leave everything in one process. What they really seemed to be describing is a messaging architecture, which is similar to batch processing but is still somewhat different.
The main issue I see is that application servers assume an online environment, and until the vendors (BEA and IBM) put out a version of their server specifically for batch, the online environment assumption will get in the way. This article had a lot of nice points, but it's all theory until one of the vendors supports it.