Opinion: Tying non-blocking IO to J2EE


  1. Opinion: Tying non-blocking IO to J2EE (49 messages)

    Mike Spille was on a New York subway train when he had an idea: tying non-blocking NIO to J2EE inside an application server to get great performance with very high resource-usage efficiency. Mike has been working on switching the I/O and threading models of various Java-based servers from the old traditional "one thread per socket" method to NIO and pools of worker threads. Can this be done in J2EE to help us out?

    Some of his thoughts:

    "A lot of my recent work for the past few months has been concentrated on switching I/O and threading models for various Java-based servers from the old traditional "one thread per socket" method to using NIO and pools of worker threads. The main benefit of doing this is to have control over threading - we can configure thread pools to our needs, and if we wish we can easily support thousands of clients on a single server w/out swamping the server with thousands of threads at the same time. Some direct performance benefits can be realized here as well, but with "legacy" code this is often minimal, most often due to usage of Java serialization - serialization is so slow that it's almost not worth optimizing anything else. Cut out serialization, and now you can make some NIO optimizations. Beyond the obvious speed ups of using NIO artifacts like buffers and channels, you can get fancy with non-blocking I/O and various specialized buffer types to really make things zippy.

    But I digress. While doing all this NIO work, I've run into an old problem: we want to have decoupled I/O services which aren't aware of each other, but at the same time we want to centralize control of I/O events, scheduling servicing of them, and assigning thread pools to actually deal with them. My solution is to have a centralized object called the DispatchingSelector, on which NIO channels are registered along with a Worker factory class. The DispatchingSelector manages all of the nuts and bolts of NIO selectors and channels, creates Thread pools of various worker types, and as read/write/accept events occur within the Selector, the event gets dispatched out to an appropriate Worker to deal with it.

    To people who are old hats at high-throughput I/O this is no big deal - people have been doing this for years. But the problem here is really the same sort of one that J2EE addresses for other areas: the baseline JVM and Java library facilities get the job done, but they're low level and difficult for less seasoned developers to use in a productive fashion - and each individual solution is unique and won't play well with others. I expect lots of people are writing their own NIO libraries or frameworks, but none of them are standard, and probably many of them are sub-standard or riddled with bugs, and certainly none of them work together.

    It surely would be nice to have a J2EE-standard NIO mechanism. A higher-level framework built on top of NIO which does all of the niggling bookkeeping, solves all the threading issues, and simplifies channel management so that Plain Old Developers can write high performance non-blocking I/O with little muss or fuss, and have it all managed by Someone Else (the holy grail of software development - have someone else worry about it!)."
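
    The DispatchingSelector Mike describes might look roughly like the sketch below. This is purely illustrative - the names, and the use of java.util.concurrent for the worker pool, are assumptions for illustration rather than Mike's actual code - but it shows the basic shape: one selector thread demultiplexes read/write/accept events and hands each one to a pooled worker.

        import java.io.IOException;
        import java.nio.channels.SelectableChannel;
        import java.nio.channels.SelectionKey;
        import java.nio.channels.Selector;
        import java.util.Iterator;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;

        public class DispatchingSelector implements Runnable {

            // A Worker services one ready channel; in the scheme described above these
            // would be produced by a per-channel Worker factory.
            public interface Worker { void service(SelectionKey key); }

            private final Selector selector;
            private final ExecutorService pool;

            public DispatchingSelector(int workerThreads) throws IOException {
                selector = Selector.open();
                pool = Executors.newFixedThreadPool(workerThreads);
            }

            // Channels are switched to non-blocking mode and ride on the shared selector.
            // (Production code usually queues registrations and performs them on the
            // selector thread, since register() can contend with an in-progress select().)
            public void register(SelectableChannel channel, int ops, Worker worker) throws IOException {
                channel.configureBlocking(false);
                selector.wakeup();
                channel.register(selector, ops, worker);   // the worker rides along as the key attachment
            }

            public void run() {
                while (selector.isOpen()) {
                    try {
                        selector.select();
                        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                        while (it.hasNext()) {
                            final SelectionKey key = it.next();
                            it.remove();
                            if (!key.isValid()) continue;
                            key.interestOps(0);            // don't re-select this channel until the worker is done
                            pool.execute(new Runnable() {
                                public void run() {
                                    // the worker does the read/write/accept, then re-registers
                                    // interest on the key and wakes the selector when it's done
                                    ((Worker) key.attachment()).service(key);
                                }
                            });
                        }
                    } catch (IOException e) {
                        // real code would log this and decide whether to continue or shut down
                    }
                }
            }
        }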


    Read Mike Spille's Thinking on NIO and J2EE

    There were responses to Mike's thoughts, and these had him writing a "Part Two". In it he explains more ideas, and delves into what we have today with technologies such as JMS and XA.

    "Another example is XA, 2PC distributed transactions. Right now, when the 2PC lifecycle gets triggered, pretty much every application server hits each resource involved in a serial, synchronous fashion. Let's say you've got a JMS publisher and a JDBC connection linked in via XA. At prepare() time, the Application Server will hit the JMS publisher() with prepare, and wait. After a response, it hits the JDBC connection with prepare(), and waits. Then it hits its own transaction log with a forced disk write (and waits for that to complete). Then the cycle repeats again for the commit() side. Now, it's really a shame that the prepare() and commit() calls go out serially - each added resource elongates the transaction time directly.

    Imagine a world where this is done asynchronously in a non-blocking fashion, and where the 2PC transaction isn't tied to a particular thread. The app server could issue prepare() calls to all resources in parallel, queueing them for external writes, and then that thread can go away (with a timeout stuck in a timer somewhere, just in case). As each call completes, we set up for a read event on each associated external channel, and then go away. As each read event comes back in, we check if we've hit all resources. If so, we trigger the local transaction log disk write, then do the same thing with the commit() cycle.

    In this sort of scenario, the length of the 2PC cycle would be [length of longest prepare] + local disk force + [length of longest commit]. In the app servers of today, the cost is [sum of all prepares] + local disk force + [sum of all commits]. You have to wait for all of these operations to complete, and in the meantime you're tying up a thread to do it."


    Read more at NIO and J2EE, Part II

    In related news:

    842 Technology has recently announced Engine/J which "decouples request processing resources from client-server I/O and uses non-blocking I/O to bring greater scalability to the HTTP and AJP request processing components of existing J2EE servers."

    See more at http://www.842technology.com

    Threaded Messages (49)

  2. J2EE IO x J2EE NIO x C/C++[ Go to top ]

    Does anybody know of any comparisons between IO and NIO in a real J2EE implementation? I read somewhere that a Java web server with NIO was faster than the C/C++ equivalent.

    I want to believe, but some additional proof would be useful too ...
  3. J2EE IO x J2EE NIO x C/C++[ Go to top ]

    In general, Java with NIO will not be faster than the optimized C/C++ equivalent. The difference is that Java can approach the efficiency of the C/C++ equivalent with far less work and far less risk (security, system crashes, etc.)

    The examples where Java is faster than C/C++ are typically comparing "apples and oranges," or comparing a C/C++ library that is statically optimized against a JVM that dynamically optimizes (for example, some C/C++ encryption libraries are 10x slower than the Java ports of the same libraries, because HotSpot can optimize and inline as far up and down as it sees fit.)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  4. Matt Welsh is the guy who did the Java/NIO comparison to C-based Apache. Matt's analysis is available at http://citeseer.nj.nec.com/welsh00staged.html. If you dig into the stuff at that link you'll find some quantification of Matt's claim that he was able to leverage NIO to build a "Staged Event-Driven Architecture" (his name) framework that was able to outperform Apache. *More* importantly, Matt demonstrated that his framework was able to condition the server response to varying request loads (look at some of the graphs in his stuff and you'll see what I mean).

    In my opinion, if it were possible to build J2EE servers the way Matt suggests, then it would be much easier for customers to buy J2EE server capacity (software + hardware), as the throughput per server would be much more predictable. Too often customers underbuy or overbuy capacity because they don't have a good way to predict throughput per server.
  5. Frameworks on top of NIO[ Go to top ]

    Matt's approach of building scalable services that adapt to the load is neat.
    But in general it's very difficult to use an event-based programming approach when you are building a service which is itself synchronous. For example, an FTP server will be difficult to build using this approach (I already tried and I'm still trying :))

    Regarding higher-level frameworks on top of NIO, there are a couple of patterns:

    Acceptor and Connector: Design Patterns for Initializing Communication Services
    Reactor: An Object Behavioral Pattern for Concurrent Event Demultiplexing and Dispatching

    By using these patterns it's possible to build higher-level I/O handlers which sit on top of NIO.
    Good example: Reattore HTTP server
  6. Leader/Followers (2000)
  7. Frameworks on top of NIO[ Go to top ]

    Matt's approach of building scalable services that adapt to the load is neat.

    > But in general it's very difficult to use an event-based programming approach when you are building a service which is itself synchronous. For example, an FTP server will be difficult to build using this approach (I already tried and I'm still trying :))
    >
    > Regarding higher-level frameworks on top of NIO, there are a couple of patterns:
    >
    > Acceptor and Connector: Design Patterns for Initializing Communication Services
    > Reactor: An Object Behavioral Pattern for Concurrent Event Demultiplexing and Dispatching
    >


    More generally, see Pattern-Oriented Software Architecture, Volume 2 (review discussing its relationship to Java: http://www.adtmag.com/java/article.asp?id=357&mon=3&yr=2001 ); of particular interest are patterns like Half-Sync/Half-Async. Java, however, suffers tremendously from its lack of async I/O.
  8. Frameworks on top of NIO[ Go to top ]

    /Giedrius Trumpickas/
    But in general it's very difficult to use an event-based programming approach when you are building a service which is itself synchronous. For example, an FTP server will be difficult to build using this approach (I already tried and I'm still trying :))
    /Giedrius Trumpickas/

    Using non-blocking I/O and event-driven techniques involves a lot of nitpicking attention to detail, but it's not what I would consider "difficult". For something like an FTP server, all that's required is that I/O operations be interruptible and restartable. To do that, you need to track where the I/O was interrupted (because the next I/O call would have blocked), reinsert it into the event system, and then when more data is available be able to pick up where you left off. There's a lot of detail in such an approach to track the state, but it's not really difficult.

        -Mike
  9. Frameworks on top of NIO[ Go to top ]

    In this case I misused a word to describe the complexity.
    The FTP server case is not as simple as you are describing, because the protocol is not a typical request/reply protocol and it has more than one connection with the client: one for commands, another for data transfers. In addition you can get multiple responses from the server to one client request depending on the current data transfer state. The client can abort a data transfer at any time, and so on.
    From a high level it looks simple: just implement suspend/resume for asynchronous I/O operations and you have it - but in this case the evil is in the protocol details.
    So I believe you are underestimating the complexity in this case.
  10. Frameworks on top of NIO[ Go to top ]

    \Giedrius\
    ... one for commands, another for data transfers. In addition you can get multiple responses from the server to one client request depending on the current data transfer state. The client can abort a data transfer at any time, and so on.
    From a high level it looks simple: just implement suspend/resume for asynchronous I/O operations and you have it - but in this case the evil is in the protocol details.
    So I believe you are underestimating the complexity in this case.
    \Giedrius\

    Perhaps I am, but I don't think so. I've worked with more complex protocols than the one you're describing above. The solution is to use a state machine encapsulated in one or more objects which hold both the async I/O progress and any higher-level semantics - clean and simple, but admittedly with niggling details. Pretty much all of my non-blocking I/O code works as a state machine at the lower levels for exactly this reason. You only get into trouble if you avoid a state machine model.
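
    To make the state machine idea a little more concrete, here is a minimal sketch of a reader for a length-prefixed message that can be interrupted by a would-block condition at any point and resumed later. The names are illustrative only - this is not Mike's actual code:

        import java.io.EOFException;
        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.channels.SocketChannel;

        // One instance per connection. The selector calls onReadable() every time the
        // channel has data; the object remembers exactly where the last read left off.
        public class FrameReader {

            private static final int READING_LENGTH = 0, READING_BODY = 1, COMPLETE = 2;

            private int state = READING_LENGTH;
            private final ByteBuffer lengthBuf = ByteBuffer.allocate(4);
            private ByteBuffer bodyBuf;

            // Returns true when a whole frame is in hand; false means "nothing more to do
            // for now - re-register for OP_READ and call again when more data arrives".
            public boolean onReadable(SocketChannel channel) throws IOException {
                if (state == READING_LENGTH) {
                    if (channel.read(lengthBuf) == -1) throw new EOFException();
                    if (lengthBuf.hasRemaining()) return false;      // interrupted mid-length
                    lengthBuf.flip();
                    bodyBuf = ByteBuffer.allocate(lengthBuf.getInt());
                    state = READING_BODY;
                }
                if (state == READING_BODY) {
                    if (channel.read(bodyBuf) == -1) throw new EOFException();
                    if (bodyBuf.hasRemaining()) return false;        // interrupted mid-body
                    state = COMPLETE;
                }
                return true;
            }

            // Valid once onReadable() has returned true; call it once per frame.
            public ByteBuffer frame() {
                bodyBuf.flip();
                return bodyBuf;
            }
        }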

        -Mike
  11. Matt Welsh is the guy who did the Java/NIO comparison

    > to C-based Apache. Matt's analysis is available
    > at http://citeseer.nj.nec.com/welsh00staged.html.
    > If you dig into the stuff at that link you'll find some quantification
    > of Matt's claim that he was able to leverage NIO to build a "Staged Event-Driven
    > Architecture" (his name) framework that was able to
    > outperform Apache. *More* importantly, Matt demonstrated that
    > his framework was able to condition the server response to varying
    > request loads (look at some of the graphs in his stuff and you'll see
    > what I mean).
    >

    Matt's SEDA project is available here:

    http://sourceforge.net/projects/seda
  12. RE: Matt Welsh, SEDA, async I/O, and WLS[ Go to top ]

    For those of you who haven't seen it, I highly recommend reading Matt Welsh's papers and code regarding SEDA (the Staged Event-Driven Architecture that was his PhD thesis) and high-throughput I/O systems in general. They're all collected conveniently at:

    http://www.eecs.harvard.edu/~mdw/proj/seda/

    He has working (and open source) code there that he cites as being used in several high throughput (although non J2EE) servers.

    However, before you dismiss the use of asynchronous stages as being untenable for J2EE, think about the way Weblogic implements its I/O handling. If you've ever run WLS on Windows, you probably have noticed that during startup, the log reads something like "Allocated 4 NT reader threads", where the number is something like half the execute threads you configure for the container. While I can't be sure, it seems like a significant part of WLS' phenomenal ability to push I/O comes from some layer of workers that bridge the synchronous behavior of Java I/O streams and some kind of asynchronous native socket API. (As a side note, Welsh did speak at BEA in 2000, but I can't really infer much without more information.)

    From testing, this seems to give WLS a huge advantage in its ability to push a large amount of data across a large number of sockets when used directly as a web server (e.g. no Apache/IIS/iPlanet + plugin in front of it).

    Does anyone have any more data or insight about what is happening here, and how this architecture may be used more broadly?
  13. As a side-note, in a lot of tests, Weblogic can now serve static content faster than Apache.

    Not that I'd suggest paying $90k a CPU for a web server, or whatever the gripe-du-jour is about Weblogic. (The web container called "Weblogic Express" is actually about $300 per CPU if I remember correctly.)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Clustered JCache for Grid Computing!
  14. Weblogic Express pricing[ Go to top ]

    Weblogic Express costs $640/CPU (and not $300), as listed in many news posts.
    But I believe that if it has the same performance as the Enterprise version, then it might be worth going for, what with a lot of projects NOT using EJBs (especially Entity EJBs).

    Is there a possibility that Weblogic would consider releasing a NEW combo version with just Servlet/JSP AND ONLY Stateless Session EJBs? That would be pretty interesting to many projects, including ours.
  15. I read all links on NIO.

    Is there a book on NIO or more links on NIO, please?
    tia,
    .V
  16. java.nio[ Go to top ]


    > Is there a book on NIO or more links on NIO, please?

    http://java.sun.com/j2se/1.4.2/docs/guide/nio/index.html

    http://otn.oracle.com/oramag/webcolumns/2003/techarticles/hunter_j2se1.html

    http://www.oreilly.com/catalog/javanio/
  17. Improving 2PC[ Go to top ]

    Interesting idea about 2PC.

    However, usually the application will want to wait for the outcome of the commit before performing more work, i.e. was my save successful?

    The other point is that the throughput of the application server will increase as the threads are used more efficiently; however, the throughput of your application may not improve appreciably, as the application would still have to wait for the data to become available.

    I also posted on Mike's blog but this is the better place since I don't read that blog. Sorry for the duplication.
  18. Improving 2PC[ Go to top ]

    \Riad Mohammed\
    Interesting idea about 2PC.

    However usually the application will want to wait for the outcome of the commit before performing more work ie was my save successful.

    The other point is the throughput of the application server will increase as the threads are used more efficiently however the throughput of your application may not improve appreciably as the application would still have to wait for the data to become available.
    \Riad Mohammed\

    Asynchronous dispatch of the 2PC lifecycle pieces (specifically, async dispatch on prepare() and commit()) can have significant gains even for single-threaded access on the client side. For example, imagine sample timings like these:

       Resource 1 prepare(): 20 millis
       Resource 2 prepare(): 25 millis
       Transaction manager log: 15 millis
       Resource 1 commit(): 15 millis
       Resource 2 commit(): 20 millis

    In most application servers, the above 2PC lifecycle will take 95 milliseconds. Now imagine that prepare() and commit() calls are dispatched in parallel. In that case, the 2PC lifecycle would take about 60 milliseconds - 25 for the longest prepare(), 15 for the log force, and 20 for the longest commit() - plus a small overhead to coordinate threads. That cuts more than a third off the overall transaction time.

    An aggressive use of asynchronicity and NIO in an application server not only can improve overall throughput and efficiency, but it really can have a direct positive impact on individual request/transaction times.

        -Mike
  19. Improving 2PC[ Go to top ]

    \Riad Mohammed\

    > Interesting idea about 2PC.
    >
    > However usually the application will want to wait for the outcome of the commit before performing more work ie was my save successful.
    >

    This isn't necessarily true. In many systems, once you enter the termination protocol, the termination protocol is done once it *decides* to commit (or rollback). Barring heuristic outcomes, it's a matter of time until the outcome is available, but the transaction may be considered to be a success at this point. So it may be that throughput considerations cause the application to care only about the ultimate rather than the immediate visibility of data/state changes under control of the transaction.

    > The other point is the throughput of the application server will increase as the threads are used more efficiently however the throughput of your application may not improve appreciably as the application would still have to wait for the data to become available.
    > \Riad Mohammed\
    >
    > Asynchronous dispatch of the 2PC lifecycle pieces (specifically, async dispatch on prepare() and commit()) can have significant gains even for single-threaded access on the client side. For example, imagine sample timings like these:
    >
    >    Resource 1 prepare(): 20millis
    >    Resource 2 prepare(): 25millis
    >    Tranmanager log: 15 millis
    >    Resource 1 commit(): 15 millis
    >    Resource 2 commit(): 20 millis
    >
    > In most application servers, the above 2PC lifecycle will take 95 milliseconds. Now imagine that prepare() and commit() calls are dispatched in parallel. In that case, the 2PC lifecycle would take about 60 milliseconds - 25 for the longest prepare(), 15 for the log force, and 20 for the longest commit() - plus a small overhead to coordinate threads. That cuts more than a third off the overall transaction time.
    >

    Sometimes the gains aren't quite so dramatic. In any case, if you assume that most two-phase transactions are interacting against a history of database queries, sometimes the completion protocol is noise. I think the area this is most likely to have a noticeable impact is when less efficient protocols and bridging between trust domains are in play, e.g. web services, rather than J2EE.
  20. Improving 2PC[ Go to top ]

    \Greg Pavlik\
    Sometimes the gains aren't quite so dramatic. In any case, if you assume that most two phase transactions are interacting against a history of database queries, sometimes the completion protocol is noise. I think the area this is most likely to have a noticeable impact is when less efficient protocols and bridging between trust domains is in play, eg, web services, rather than J2EE.
    \Greg Pavlik\

    Mileage may of course vary, but in my experience the 2PC cost is often 25%-33% of the total transaction cost. In cases where one or more of the XA resources is doing a fairly trivial operation (but not quite touching on read only semantics), the 2PC cost can drive well over 50% of the total cost.

    My own motivator for wanting more asynchronicity specifically on the XA side is observing real systems where each prepare() and commit() call to each resource is taking 15-30 milliseconds a piece on average. Given that the actual "work" involved in these transactions (e.g. all the work minus the 2PC protocol) only takes around 80 milliseconds, doing the prepares and commits in parallel would result in a very large time savings.

    The true bonus is that adding in a 3rd resource would _not_ significantly increase transaction time (at least as far as 2PC is concerned). Using the synchronous approach that most app servers use today, each added XAResource is incredibly painful. In fact one can argue that almost nobody uses more than 2 resources in 2PC because the cost of hitting more than 2 resources synchronously is just too high. Reduce that cost, and you might see more people doing it.

         -Mike
  21. Improving 2PC[ Go to top ]

    \Greg Pavlik\
    This isn't necessarily true. In many systems, once you enter the termination protocol, the termination protocol is done once it *decides* to commit (or rollback). Barring heuristic outcomes, it's a matter of time until the outcome is available, but the transaction may be considered to be a success at this point.
    \Greg Pavlik\

    I can't think of any application uses where you could do that. If I'm understanding you, everyone can vote "yes" to prepare, and then you could return back to the application with a success indicator (and presumably do the commit() in the background). This would totally violate transactional guarantees - someone might insert() something and then try to read it in the next transaction, and if the background commit() hasn't completed, they wouldn't see it yet. This would break a lot of application logic all over the place.

        -Mike
  22. 40 msec is pretty good[ Go to top ]

    But by the bye - the apps I've worked on are usually heavy-duty processing intranet enterprise apps, so I deal with tens of threads in the server concurrently working HARD rather than hundreds that are messing about doing simple reads.

    Saving 30 to 40 msec per transaction on XA commits would be nice. Since the maximum number of threads for efficient throughput is often limited by database blocking for apps with a lot of updates, and since blocking increases exponentially with transaction length, an app server that cut 3 to 4% off my 1-sec transactions could give me an increase of more like 6 to 8% in throughput.

    A noticeable gain, which would be nice to get for free from the app server.

    Sean
  23. Improving 2PC[ Go to top ]

    \Greg Pavlik\

    > This isn't necessarily true. In many systems, once you enter the termination protocol, the termination protocol is done once it *decides* to commit (or rollback). Barring heuristic outcomes, it's a matter of time until the outcome is available, but the transaction may be considered to be a success at this point.
    > \Greg Pavlik\
    >
    > I can't think of any application uses where you could do that. If I'm understanding you, everyone can vote "yes" to prepare, and then you could return back to the application with a success indicator (and presumably do the commit() in the background). This would totally violate transactional guarantees - someone might insert() something and then try to read it in the next transaction, and if the background commit() hasn't completed, they wouldn't see it yet. This would break alot of application logic all over the place.
    >
    >     -Mike

    In general, this is no different from any other case. In fact, this is the essence of why transactions work: what's visible to other transactions (including subsequent transactions) is dictated by the isolation constraints on the system that are typically enforced by locks. The state of the uncommitted but prepared branches is the same as if the transaction was still in flight. Of course, you can wind up in a blocking state because the system still holds locks that won't be released until the (first) transaction completes, but in this case you're no worse off than you would have been had you waited for the first transaction to complete all commits. The serial processing order between subsequent transactions is preserved because the prepared state is not visible (beyond the relaxed isolation that exists for other transactions).

    Greg
  24. Improving 2PC[ Go to top ]

    \Greg Pavlik\
     In fact, this is the essence of why transactions work: what's visible to other transactions (including subsequent transactions) is dictated by the isolation constraints on the system that are typically enforced by locks. The state of the uncommitted but prepared branches is the same as if the transaction was still in flight.
    \Greg Pavlik\

    This sounds good in theory, but I don't think it will fly in the real world (at least not in the J2EE real world). Often, people purposely design the RDBMS so they can safely make assumptions in their own little world that may not hold in a more generic transaction world. For example - an app may "know" that it's safe to insert into certain tables, and if a commit therein succeeds that they can safely assume that that data is there (e.g. an insert-only strategy into certain tables). That is, they are aware of what they're doing to the database, and take certain "OK" responses to commits to mean that they can make certain assumptions. Some may consider this poor programming, but people do it just because it works, and in other cases people do this as a performance optimization (e.g. I _know_ I don't need these X number of IDs via selects, 'cuz I just inserted those PKs myself...).

        -Mike
  25. Improving 2PC[ Go to top ]

    /Mike Spille/
    > This sounds good in theory, but I don't think it will fly in the real world (at least not in the J2EE real world). Often, people purposely design the RDBMS so they can safely make assumptions in their own little world that may not hold in a more generic transaction world.
    /Mike Spille/

    It's fine as long as it's only "their own little world" but if I'm interacting with their world then I certainly hope that they follow standard transaction behaviour as it would make my life easier.

    Trick question: what does SERIALIZABLE mean to Oracle vs. SQLServer, Sybase and DB2?

    Riad
  26. Improving 2PC[ Go to top ]

    /Mike Spille/
    This sounds good in theory, but I don't think it will fly in the real world (at least not in the J2EE real world). Often, people purposely design the RDBMS so they can safely make assumptions in their own little world that may not hold in a more generic transaction world.
    /Mike Spille/

    /Riad Mohammed/
    It's fine as long as it's only "their own little world" but if I'm interacting with their world then I certainly hope that they follow standard transaction behaviour as it would make my life easier.

    Trick question what does SERIALIZABLE mean to Oracle vs SQLServer, Sybase and DB2?
    /Riad Mohammed/

    We have the SQL standard, but such standards do not always prevent differing implementations. Nowhere is this more evident than in isolation levels, which certainly do differ between databases. This is one reason database independence is difficult to achieve.
  27. Improving 2PC[ Go to top ]

    \Greg Pavlik\

    >  In fact, this is the essence of why transactions work: what's visible to other transactions (including subsequent transactions) is dictated by the isolation constraints on the system that are typically enforced by locks. The state of the uncommitted but prepared branches is the same as if the transaction was still in flight.
    > \Greg Pavlik\
    >
    > This sounds good in theory, but I don't think it will fly in the real world (at least not in the J2EE real world). Often, people purposely design the RDBMS so they can safely make assumptions in their own little world that may not hold in a more generic transaction world. For example - an app may "know" that it's safe to insert into certain tables, and if a commit therein succeeds that they can safely assume that that data is there (e.g. an insert-only strategy into certain tables). That is, they are aware of what they're doing to the database, and take certain "OK" responses to commits to mean that they can make certain assumptions. Some may consider this poor programming, but people do it just because it works, and in other cases people do this as a performance optimization (e.g. I _know_ I don't need these X number of IDs via selects, 'cuz I just inserted those PKs myself...).
    >
    >     -Mike

    But that's how things work in the real world! I took this example from a real TP monitor... If a person hacks in back doors without knowing the full consequence of what they are doing, that's a recipe for very subtle problems. Transactional systems are about applying rules to get correct outcomes; they assume people play by the rules.

    The best thing I can recommend is that people need to understand the behavior of the platform because it affects design choices.

    Greg
  28. Improving 2PC[ Go to top ]

    Forgot to mention: this is an example from something done by Tandem in one of
    their old TP monitors.

    > \Greg Pavlik\
    > This isn't necessarily true. In many systems, once you enter the termination protocol, the termination protocol is done once it *decides* to commit (or rollback). Barring heuristic outcomes, it's a matter of time until the outcome is available, but the transaction may be considered to be a success at this point.
    > \Greg Pavlik\
    >
    > I can't think of any application uses where you could do that. If I'm understanding you, everyone can vote "yes" to prepare, and then you could return back to the application with a success indicator (and presumably do the commit() in the background). This would totally violate transactional guarantees - someone might insert() something and then try to read it in the next transaction, and if the background commit() hasn't completed, they wouldn't see it yet. This would break alot of application logic all over the place.
    >
    >     -Mike
  29. Improving 2PC[ Go to top ]

    \Riad Mohammed\

    > > Interesting idea about 2PC.
    > >
    > > However usually the application will want to wait for the outcome of the commit before performing more work ie was my save successful.
    > >
    >\Greg Pavlik\
    > This isn't necessarily true. In many systems, once you enter the termination protocol, the termination protocol is done once it *decides* to commit (or rollback). Barring heuristic outcomes, it's a matter of time until the outcome is available, but the transaction may be considered to be a success at this point. So it may be that throughput considerations cause the application to care only about the ultimate rather than the immediate visbility of data/state changes under control of the transaction.
    \Greg Pavlik\

    Sure, if the prepare phase is successful then you can assume that the resources will commit. It comes down to the application, which is why I said "usually" - that comes from my limited experience. I certainly hope that my bank cares about heuristic outcomes.

    OK, so my question would be: if you ignore heuristic outcomes, what do you do if one actually happens after you've done other work?
  30. Improving 2PC[ Go to top ]

    \Riad Mohammed\

    > > > Interesting idea about 2PC.
    > > >
    > > > However usually the application will want to wait for the outcome of the commit before performing more work ie was my save successful.
    > > >
    > >\Greg Pavlik\
    > > This isn't necessarily true. In many systems, once you enter the termination protocol, the termination protocol is done once it *decides* to commit (or rollback). Barring heuristic outcomes, it's a matter of time until the outcome is available, but the transaction may be considered to be a success at this point. So it may be that throughput considerations cause the application to care only about the ultimate rather than the immediate visbility of data/state changes under control of the transaction.
    > \Greg Pavlik\
    >
    > Sure if the prepare phase is successful then you can assume that the resources will commit. It comes down to the application which is why I said "usually", which is comes from my limited experience. I certainly hope that my bank cares about heuristic outcomes.
    >
    > Ok so my question would be if you ignore heuristic outcomes, what do you do if one actually happens after you've done other work.

    Well, I meant to suggest that we could ignore heuristics to make a point, so you're really reacting to a strawman. But, let me turn the question around. Let's assume that we block the application for the entire execution of the termination protocol. All resource managers get the commit message, but a heuristic was made at one of the resources: what the heck is an application supposed to do with a heuristic mixed exception? It doesn't know what RMs are involved, or which one failed, and can't do anything about it anyway. It is the responsibility of the administrator of the system at which the heuristic occurred to resolve it. For your bank example, it's the bank's problem to make it right (consistent across both accounts), hopefully before you notice it. What really needs to happen is the coordinator and the participant need to set off an alarm so that manual intervention can occur quickly.

    Now we're encroaching on religious territory that has been argued over for years, so let's not get stuck in a rathole. I'm just trying to point out that heuristics raise the need for offline techniques, and you, as the application, generally can't deal with them in meaningful ways. So even if you consider heuristics, it's still not necessarily true that you *need* to block the application past the log of the commit message.
  31. Improving 2PC[ Go to top ]

    \Greg Pavlik\
    > Now we're encroaching on religious territory that has been argued over for years, so let's not get stuck in a rathole. I'm just trying to point out that heuristics raise the need for offline techniques, and you, as the application, generally can't deal with them in meaningful ways. So even if you consider heuristics, it's still not necessarily true that you *need* to block the application past the log of the commit message.
    \Greg Pavlik\

    You're right that usually manual intervention is required.

    I was thinking that it would be better to wait for the transaction outcome before continuing any more work, but the data would be available anyway before the outcome is returned.

    Thanks for the explanation.

    Riad
  32. Both approaches are valid[ Go to top ]

    \Greg Pavlik\
    > Now we're encroaching on religious territory that has been argued over for years, so let's not get stuck in a rathole. I'm just trying to point out that heuristics raise the need for offline techniques, and you, as the application, generally can't deal with them in meaningful ways. So even if you consider heuristics, it's still not necessarily true that you *need* to block the application past the log of the commit message.
    \Greg Pavlik\

    You are correct that it is possible to return prior to the actual commit. However, some applications will want to wait until the outcome is complete (because they intend to immediately query and find the data they just input). So what would work best in this scenario is really two commit methods (or a commit mode on the transaction). Something like:

    commit(CommitMode mode);

    CommitMode.RETURN_BEFORE_OUTCOME
    CommitMode.RETURN_AFTER_OUTCOME

    This would satisfy both desired modes of operation.
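
    A rough sketch of what such an API might look like - purely hypothetical, since nothing like this exists in JTA today, and the names are invented for illustration:

        import javax.transaction.HeuristicMixedException;
        import javax.transaction.HeuristicRollbackException;
        import javax.transaction.RollbackException;
        import javax.transaction.SystemException;

        // Hypothetical UserTransaction-style commit that lets the caller choose when to return.
        public interface ModalTransaction {

            int RETURN_BEFORE_OUTCOME = 0;   // return as soon as the commit decision is durably logged
            int RETURN_AFTER_OUTCOME  = 1;   // block until every resource has acknowledged the commit

            // e.g. tx.commit(ModalTransaction.RETURN_AFTER_OUTCOME);
            void commit(int commitMode)
                    throws RollbackException, HeuristicMixedException,
                           HeuristicRollbackException, SystemException;
        }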
  33. Improving 2PC[ Go to top ]

    BEA's WebLogic app-server already performs XA prepare and XA commit phases in parallel, and I can confirm previous posts that WebLogic has long used asynchronicity for servicing sockets.

    That said, I too feel that some apps could benefit from more J2EE standards based asynchronous capabilities. Work manager threading JSR 237 and J2EE 1.4 asynchronous connectors are a step in the right direction.

    Tom Barnes, BEA
  34. WebLogic does async 2PC?[ Go to top ]

    Wow - never knew that, good to know. In what version did Weblogic start doing this?

       -Mike
  35. Improving 2PC[ Go to top ]

    As I understand it, in 2PC the voting phase should be executed first (all resource managers should agree to commit or rollback and perform some bookkeeping in their tx logs) before moving to the commit phase.
    So in this case it's impossible to execute prepare and commit in parallel.

    There are a couple of approaches to minimize 2PC's blocking nature:

    2PC commit optimizations

    2PC commit optimization

    If you are interested there are tons of these papers on www.acm.org or http://citeseer.nj.nec.com

    I believe J2EE app server vendors should get subscriptions to www.acm.org and start reading what people did in research a long time ago :).
  36. Improving 2PC[ Go to top ]

    As I understand it, in 2PC the voting phase should be executed first (all resource managers should agree to commit or rollback and perform some bookkeeping in their tx logs) before moving to the commit phase.

    > So in this case it's impossible to execute prepare and commit in parallel.
    >
    > There are a couple of approaches to minimize 2PC's blocking nature:
    >
    > 2PC commit optimizations
    >
    > 2PC commit optimization
    >
    > If you are interested there are tons of these papers on www.acm.org or http://citeseer.nj.nec.com
    >
    > I believe J2EE app server vendors should get subscriptions to www.acm.org and start reading what people did in research a long time ago :).

    I don't think you'll see presumed commit in an application server anytime soon ;-)
  37. Improving 2PC[ Go to top ]

    \Giedrius Trumpickas\
    As I understand it, in 2PC the voting phase should be executed first (all resource managers should agree to commit or rollback and perform some bookkeeping in their tx logs) before moving to the commit phase.
    So in this case it's impossible to execute prepare and commit in parallel.
    \Giedrius Trumpickas\

    The idea isn't to do prepare and commit in parallel, but to do all prepares in parallel, determine the outcome, and then do all commits in parallel. Like
    this:

       - Fire prepare at all xaResources in non-blocking mode.
       - Gather responses
       - If all "yes"
          - outcome is commit
       - If any single one "no"
          - outcome is rollback
       - If outcome commit:
          - Record intent-to-commit durably in App Server transaction log
       - * Fire outcome at all xaResources in non-blocking mode.
       - Reap responses
       - ** Return response to caller

    * - Of course, track state properly to eliminate RO resources, etc.

    ** - Of course, deal with failures appropriately.

    This is as opposed to how most do it (except, apparently Weblogic - good to know):

       - Foreach XAResource:
         - fire prepare
         - wait for vote
         - If no, outcome is rollback
       - If all "yes"
          - outcome is commit
       - If outcome commit:
          - Record intent-to-commit durably in App Server transaction log
       - Foreach XAResource:
         - fire outcome
         - wait for result
       - Return response to caller

    The difference between the two is waiting for the longest prepare() and commit() versus waiting for the sum of all prepare() and commit() calls.
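
    For what it's worth, here is a bare-bones sketch of the parallel flavor using a thread pool rather than the fully non-blocking dispatch described above - the timing benefit (longest rather than sum) is the same. The class is illustrative only; a real transaction manager also has to handle timeouts, heuristic outcomes, the transaction log force and recovery:

        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.Callable;
        import java.util.concurrent.ExecutionException;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Executors;
        import java.util.concurrent.Future;
        import javax.transaction.xa.XAResource;
        import javax.transaction.xa.Xid;

        public class ParallelTwoPhaseCommit {

            private final ExecutorService pool = Executors.newCachedThreadPool();

            // Returns true if the transaction committed, false if it rolled back.
            public boolean complete(final Xid xid, List<XAResource> resources) throws Exception {

                // Phase 1: fire all prepare() calls concurrently, then gather the votes.
                List<Future<Integer>> votes = new ArrayList<Future<Integer>>();
                for (final XAResource r : resources) {
                    votes.add(pool.submit(new Callable<Integer>() {
                        public Integer call() throws Exception { return r.prepare(xid); }
                    }));
                }

                boolean allYes = true;
                List<XAResource> toFinish = new ArrayList<XAResource>();
                for (int i = 0; i < resources.size(); i++) {
                    try {
                        if (votes.get(i).get() == XAResource.XA_OK) {
                            toFinish.add(resources.get(i));    // XA_RDONLY voters are already finished
                        }
                    } catch (ExecutionException e) {
                        allYes = false;                        // a failed prepare() counts as a "no" vote
                    }
                }

                // ... force the intent-to-commit (or rollback) record to the transaction log here ...

                // Phase 2: fire all commit()/rollback() calls concurrently; wait only for the slowest.
                final boolean commit = allYes;
                List<Future<Object>> outcomes = new ArrayList<Future<Object>>();
                for (final XAResource r : toFinish) {
                    outcomes.add(pool.submit(new Callable<Object>() {
                        public Object call() throws Exception {
                            if (commit) { r.commit(xid, false); } else { r.rollback(xid); }
                            return null;
                        }
                    }));
                }
                for (Future<Object> o : outcomes) {
                    o.get();    // failures here need real handling (retries, heuristic reporting)
                }
                return commit;
            }
        }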

        -Mike
  38. make threading more efficient?[ Go to top ]

    Blocking on a socket in a thread is as fast as you can get.
    You are operating at the correct priority and
    can dispatch immediately.

    If the problem is too many threads then perhaps
    it is the threading implementation that sucks.
    A high performance web server has been built
    on top of Erlang's process concept. That may
    be a better model going forward (http://yaws.hyber.org/).
  39. make threading more efficient?[ Go to top ]

    \Thoff squared\
    Blocking on a socket in a thread is as fast as you can get.
    You are operating at the correct priority and
    can dispatch immediately.
    \Thoff squared\

    If you are considering one thread, you're correct.

    Consider a thousand threads, and now you're talking about a different story.

    \Thoff squared\
    If the problem is too many threads then perhaps
    it is the threading implementation that sucks.
    A high performance web server has been built
    on top of Erlang's process concept. That may
    be a better model going forward (http://yaws.hyber.org/).
    \Thoff squared\

    I disagree with the above. Things can be done to optimize thread implementations, but the fact is that threads will always have a fair amount of 'weight' attached to them. The fundamental problem is that a thread switch is always going to involve a certain amount of context switching that can't be avoided.

    A secondary advantage of an NIO-based approach is that using thread pools on I/O gives developers finer-grained control over throttling than is possible today. You can set up thread pool models which can prevent your server from being overloaded with requests, and which gracefully elongate response time rather than just start dumping connections. Thread-per-connection models like the one Java has lived with have a tendency to end up in thread-thrashing modes when you're talking about thousands of clients, and the server can easily be overloaded without the developers having any control over this beyond starting to refuse connections (either explicitly or, more likely, implicitly).
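
    As a concrete (and purely illustrative) example of that kind of throttling, a bounded pool with a bounded queue pushes back on the dispatching thread instead of spawning unbounded threads or dropping connections outright (java.util.concurrent is used here for brevity):

        import java.util.concurrent.ArrayBlockingQueue;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.ThreadPoolExecutor;
        import java.util.concurrent.TimeUnit;

        public class ThrottledWorkerPool {

            // The selector/dispatcher thread submits ready I/O events here. The bounded queue
            // absorbs bursts; when it fills, CallerRunsPolicy makes the submitting thread run
            // the task itself, which slows the rate at which new events are accepted - response
            // times stretch gracefully instead of connections being dumped.
            public static ExecutorService create(int workerThreads, int maxQueuedEvents) {
                return new ThreadPoolExecutor(
                        workerThreads, workerThreads,
                        0L, TimeUnit.MILLISECONDS,
                        new ArrayBlockingQueue<Runnable>(maxQueuedEvents),
                        new ThreadPoolExecutor.CallerRunsPolicy());
            }
        }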

        -Mike
  40. make threading more efficient?[ Go to top ]

    Mike, did you read anything on erlang processes?
    Performance against apache is at http://www.sics.se/~joe/apachevsyaws.html.
    Thousands of clients aren't a problem.

    If your threads are OS threads then there is very
    little you can do to optimize threads.

    Threads in real-time systems, for example, are very efficiently
    scheduled. Low microseconds. There is little weight.
    Threads on OSs like Linux and Windows are a different
    story, which is why I say threads should be fixed, because
    a thread per concurrent activity is the simplest and most
    responsive architecture.

    Thread pools have their own disadvantages. Thread pools
    require context switching. The number of items
    in the pool is difficult to size correctly. There's
    no way to make certain work happen on a priority scheme.
    There's no way to guarantee latency.
    You can starve new work if existing work takes too long.
    You can starve for very long periods of time while
    dealing with node failures.
    You can set up deadlocks because applications running in pool threads
    can deadlock on shared locks. You are subject to priority inheritance
    and latency issues because of shared locks between applications.
    You cannot use per-thread policies like thread-local
    variables. You get poor flow control when pool threads just
    dump work to other threads.
  41. make threading more efficient?[ Go to top ]

    With the Jetty HTTP Server, we experimented for some time with
    a NIO based non-blocking listener/connector. The idea was that
    with persistent HTTP connections, the majority of connections were
    idle at any given time and those idle connections should not have
    a thread allocated to them (wasting resources etc).

    The approach that we took was to put all idle connections into
    non-blocking state and to include them in a select set. Only when
    input was available on a connection would a thread be allocated, the
    connection switched to blocking semantics and servlet handling
    allowed to continue as normal.
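
    A stripped-down sketch of that idle-connection handling might look like the following - illustrative only, not Jetty's actual code. Idle channels wait in a Selector; on readiness each one is flipped back to blocking mode and handed a thread for normal servlet processing:

        import java.io.IOException;
        import java.nio.channels.SelectionKey;
        import java.nio.channels.Selector;
        import java.nio.channels.SocketChannel;
        import java.util.ArrayList;
        import java.util.Iterator;
        import java.util.List;
        import java.util.concurrent.ExecutorService;

        public class IdleConnectionWatcher implements Runnable {

            private final Selector idle;
            private final ExecutorService handlers;

            public IdleConnectionWatcher(Selector idle, ExecutorService handlers) {
                this.idle = idle;
                this.handlers = handlers;
            }

            // Called when a request/response cycle ends and the connection goes idle.
            public void watch(SocketChannel channel) throws IOException {
                channel.configureBlocking(false);
                idle.wakeup();
                channel.register(idle, SelectionKey.OP_READ);
            }

            public void run() {
                while (idle.isOpen()) {
                    try {
                        idle.select();
                        List<SocketChannel> ready = new ArrayList<SocketChannel>();
                        Iterator<SelectionKey> it = idle.selectedKeys().iterator();
                        while (it.hasNext()) {
                            SelectionKey key = it.next();
                            it.remove();
                            key.cancel();                       // take the channel out of the select set
                            ready.add((SocketChannel) key.channel());
                        }
                        idle.selectNow();                       // flush cancelled keys so the channels may block again
                        for (final SocketChannel channel : ready) {
                            channel.configureBlocking(true);    // back to ordinary blocking semantics
                            handlers.execute(new Runnable() {
                                public void run() {
                                    // run normal (blocking) servlet handling on 'channel', then hand
                                    // the connection back to watch() when it goes idle again
                                }
                            });
                        }
                    } catch (IOException e) {
                        // log and continue
                    }
                }
            }
        }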

    This approach was very successful at reducing the thread requirements
    of the server by about 90% for the same load. However, it reduced the
    maximum throughput of the server by about 30%. It turned out that the
    extra costs of manipulating the SelectSets and of switching modes of
    the connections were much more than the savings by reducing the threads
    allocated to the server.

    Our conclusion was that the true benefits of the non-blocking model
    would only be obtained once there was a non-blocking style of
    servlet API that allows content to be generated without using blocking
    semantics. I do not see how this can easily happen with the current
    servlet model, where the servlet has control and pushes content to the
    container.

    The other interesting observation, was that JVMs are getting very
    good at handling thousands of threads, while many operating systems
    require a lot of tuning before they can handle thousands of TCP/IP
    connections. The per-connection resources on a system are likely
    to remain high and the increment imposed by allocating a Java thread
    may not be the dominant factor in limiting throughput on a given server.
  42. make threading more efficient?[ Go to top ]

    \Greg Wilkins\
    However, it reduced the
    maximum throughput of the server by about 30%. It turned out that the
    extra costs of manipulating the SelectSets and of switching modes of
    the connections were much more than the savings by reducing the threads
    allocated to the server.
    \Greg Wilkins\

    Greg, your numbers don't jibe well with mine. Taking a Selector approach will induce some inefficiency, no doubt about it. But in my own testing, using a Selector and a thread pool only incurred about a 5% overhead over plain old thread-per-socket.

    To give you an idea where I'm coming from, in my own JMS NIO server side stuff, the raw I/O event dispatcher can service around 6,000 requests per second on an HP-UX L3000 4 CPU machine when running with a NOP worker thread (e.g. the worker doesn't do anything). This was with small messages around 100 bytes or so.

    Using a thread per socket gets around 7,500 requests per second. This is obviously more than a 5% difference, but this is with NOP workers. Add in real workloads on the server and the actual throughput difference drops down to around 5%.

    Is it possible that your thread pool or Selector dispatcher was maybe a tad inefficient? 30% is a really big number. In my own work, the dispatcher is optimized so that the code path from popping out of the Selector to executing the first line of the Worker thread is only about 40 lines of Java code and one thread context switch. Some optimizations are thrown in as well to take advantage of the fact that we're doing JMS, where you can have high incoming request rates per connection - the worker can optionally suck in "N" requests (if they're there and can be had in a non-blocking manner) before handing the connection back to the Selector dispatcher.

    \Greg Wilkins\
    Our conclusion was that the true benefits of the non-blocking model
    would only be obtained once there was a non-blocking style of
    servlet API that allows content to be generated without using blocking
    semantics. I do not see how this can easily happen with the current
    servlet model, where the servlet has control and pushes content to the
    container.
    \Greg Wilkins\

    There I agree. The reduction in threads is nice, but the real win is when higher level code can take advantage of non-blocking semantics.

    \Greg Wilkins\
    The other interesting observation, was that JVMs are getting very
    good at handling thousands of threads, while many operating systems
    require a lot of tuning before they can handle thousands of TCP/IP
    connections. The per connection resources on a system are likely
    to remain high and the increment imposed by allocating a java thread
    may not be dominant factor in limiting throughput on a given server.
    \Greg Wilkins\

    Um, I'd say no and yes. Certainly just about every OS has very low defaults for handling TCP/IP connections, but threads are still pretty heavy weight. No matter how efficient JVMs get with threading, they still all rely on OS threads. And every OS I've worked with really wasn't designed with the idea of individual processes having thousands of threads. JVMs have gotten better, but "better" in this context doesn't equate to "as good as the alternatives". You pay for those threads in big memory hits, the OS scheduler, etc.

    Go back to one of my original propositions - improving efficiency to be able to run on more modest hardware. A thread per connection is a horrible model in this light. In that model you're pounding on the hardware to little effect.

    FYI none of this is new - people have been writing socket servers in C and C++ for quite some time :-) There's a reason why people abandoned a thread (or process!) per connection in that realm - it's just too heavy.

    That said I think we agree that the biggest bang for the buck will really come from introducing optional non-blocking semantics to certain J2EE bits.

         -Mike
  43. make threading more efficient?[ Go to top ]

    I have another beef with NIO:
    #1 it doesn't support MulticastSockets (it does support DatagramSockets)
    #2 SSLSockets are not supported, to do this you have to create your own socket factory
    #3 You can't select from files and sockets in the same selector

    I'm sure about #1, not sure about #2/#3. It would be interesting to hear from other people using NIO.
    Cheers,
    Bela
  44. re: make threading more efficient?[ Go to top ]

    \Mike Spille\
    Taking a Selector approach will induce some inefficiency, no doubt about it. But in my own testing, using a Selector and a thread pool only incurred about a 5% overhead over plain old thread-per-socket.
    \Mike Spille\

    When well tuned, we've seen about the same 5% NIO overhead in testing Engine/J, and that includes an extra parse of request headers. But we have had to be very careful about sizing each stage of the pipeline (read/execute/write). NIO selectors and keysets can be cpu-heavy and drag on throughput - particularly on Windows. I suspect that NIO cross-platform abstractions could have a real cost relative to the underlying OS facility. In the end we've found that resource allocation (threads, mostly) for each stage of the pipeline needs to be dynamic and adapt to the size and makeup of current client load in the context of the specific application (as demonstrated by the SEDA project). Our next release (probably January) is more adaptive and should work well in a wider variety of environments.


    \Greg Wilkins\
    Our conclusion was that the true benefits of the non-blocking model
    would only be obtained once there was a non-blocking style of
    servlet API that allows content to be generated without using blocking
    semantics. I do not see how this can easily happen with the current
    servlet model, where the servlet has control and pushes content to the
    container.
    \Greg Wilkins\

    We've been able to wrap most servlet containers with non-blocking I/O for http/ajp, to the point where servlet threads will almost never block reading requests or writing responses. For responses that means buffering most or all of the data, though the average response size is much less than a thread's stack size - yielding a net win or wash in memory because the container is handling more than 1 connection per thread. Engine/J also provides a "shortcut" for static content that uses NIO apis to move data directly from the file system cache to socket buffers, so buffering large static responses is not an issue and static content is served with fewer data copies.
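
    The static-content "shortcut" described here maps naturally onto FileChannel.transferTo(), which on most platforms lets the kernel move bytes straight from the file system cache to the socket. A minimal sketch (not Engine/J's actual code; it assumes the socket is in blocking mode):

        import java.io.FileInputStream;
        import java.io.IOException;
        import java.nio.channels.FileChannel;
        import java.nio.channels.SocketChannel;

        public class StaticFileSender {

            // Streams a file to the client with as few user-space copies as the platform allows.
            public static void send(String path, SocketChannel socket) throws IOException {
                FileInputStream in = new FileInputStream(path);
                try {
                    FileChannel file = in.getChannel();
                    long position = 0;
                    long size = file.size();
                    while (position < size) {
                        // transferTo may move fewer bytes than requested, so loop until done
                        position += file.transferTo(position, size - position, socket);
                    }
                } finally {
                    in.close();
                }
            }
        }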
  45. re: make threading more efficient?[ Go to top ]

    \Tim Craycroft\
    NIO selectors and keysets can be cpu-heavy and drag on throughput - particularly on Windows. I suspect that NIO cross-platform abstractions could have a real cost relative to the underlying OS facility.
    \Tim Craycroft\

    The bulk of my testing has been on Unix server variants, with a heavy emphasis on HP-UX, which uses a pretty vanilla Sun JVM. NIO seems to work very well there. Can't say much about Windows, except that I've never used it in a serious server capacity, and I thank my personal deity every night for it :-)

        -Mike
  46. Weblogic has been doing non-blocking I/O since the beginning, so the benefits, AFAICT, are already there.

    The problem is not so much putting it in the spec, it's getting people to use it. There's a reason why everybody nowadays uses pre-emptive multi-tasking: writing synchronous code is cheaper than writing asynchronous code.
  47. Programmers are definitely more comfortable with synchronous
    calls. On several projects I've started with async and
    was forced to move to sync. Then of course we always end
    up still having to worry about latency, priority, starvation,
    interrupting high priority work, handling more things at once.
  48. \Guglielmo Lichtner\
    The problem is not so much putting it in the spec, it's getting people to use it. There's a reason why everybody nowadays uses pre-emptive multi-tasking: writing synchronous code is cheaper than writing asynchronous code.
    \Guglielmo Lichtner\

    Think in terms of the J2EE solution of choice that you use today. Do you care how much it "costs" the J2EE provider to use synchronous I/O, asynch I/O, or carrier pigeons? Do you care how hard the Weblogic, Websphere, JBoss, Jonas, or whoever programmers work?

    No, of course not - all you care about is the sticker price, performance, and feature set. If IBM throws another $50 million in development into going a massive asynch I/O route and it makes my application go faster on equivalent hardware, I don't care how much IBM spent to achieve it.

    In fact, you can use a lot of asynchronous techniques and NIO "behind the covers" in J2EE today and get efficiency and performance gains right now, and some do. This is one of the reasons why JBoss is near the bottom of the pack in performance - IMHO they put a higher value on developer convenience than on customer needs, whereas a BEA or IBM is used to biting off a big, complex development effort to get better performance (and hence a better market share).

    All I'm pushing for is to open the door in J2EE a bit so that asynchronicity and non-blocking I/O don't have to be so deeply hidden. This may make it harder for J2EE component providers, but that doesn't really concern me. The important part is that you can get better efficiency and still allow for pluggability of J2EE bits from different vendors (tough to do right now if you're going an aggressive non-blocking I/O route). In addition, for some applications having such hooks exposed can be a real boon. Oh, for 95% of the J2EE world it wouldn't matter, but for the remaining 5% of us it can make a huge difference if you've got thousands of clients and big throughput requirements.

         -Mike
  49. Great idea. Doomed, but great.[ Go to top ]

    Mike,

    As far as I can tell, you are proposing that the J2EE spec be changed so that certain core details (asynchronicity and non-blocking I/O) are less deeply hidden, so that app server providers can make use of them.

    I think it's a great idea.

    I also think there's no chance in hell it will happen.

    I'm at my most cynical here (which is very cynical indeed), but it seems to me that the one key, core, fundamental philosophy underwriting the J2EE specs is "we will do everything for people: and so we will not let them do things for themselves". In theory this is about encapsulation of low-level details to make life easier for everyone. But in practice it disempowers any developers who want to stray a millimetre from the norm.


    Sean
    PS: Unless, of course, IBM and BEA get together to sponsor it, in which case it'll whizz through the JCP process in 3 months. Which, now that I think about it, is probably the nicest thing I could possibly say about IBM and BEA.
  50. Great idea. Doomed, but great.[ Go to top ]

    Sean - agreed, agreed, and agreed. I think in theory this is a very interesting idea, and I believe it could be made to work. It's got potential and I think it's practical. But like you, I seriously doubt it would ever happen. There's way too much momentum in the "ease of use" direction, not only for users but for implementors as well.

         -Mike