Proposed Asynchronous Servlet API

  1. Proposed Asynchronous Servlet API (29 messages)

    Greg Wilkins, lead developer on the Jetty web container, has blogged about the growing need for the Servlet Specification to address asynchronous communication between client and server. This need has been driven by the growing popularity of AJAX, and especially "comet-style" interactions, where clients continually re-issue an outstanding request to the server to allow the backend to update the client with asynchronous event data. Many servlet containers are addressing this need with proprietary solutions, such as Jetty's Continuation mechanism. In his blog entry Greg argues that now is the time for Specification contributors (himself amongst them) to get together and take a unified approach, and proposes one potential solution for discussion. His outlined use cases:
    • Non-blocking input - The ability to receive data from a client without blocking if the data is slow arriving. This is actually not a significant driver for an asynchronous API, as most requests arrive in a single packet, or handling can be delayed until the arrival of the first content packet. Moreover, I would like to see the servlet API evolve so that applications do not have to do any IO.
    • Non-blocking output - The ability to send data to a client without blocking if the client or network is slow. While the need for asynchronous output is much greater than asynchronous input, I also believe this is not a significant driver. Large buffers can allow the container to flush most responses asynchronously and for larger responses it would still be better to avoid the application code handling IO.
    • Delay request handling - The comet style of Ajax web application can require that request handling be delayed until either a timeout or an event has occurred. Delaying request handling is also useful if a remote/slow resource must be obtained before servicing the request, or if access to a specific resource needs to be throttled to prevent too many simultaneous accesses. Currently the only compliant option to support this is to wait within the servlet, consuming a thread and other resources (a minimal sketch of this blocking approach follows these lists).
    • Delay response close - The comet style of Ajax web application can require that a response is held open to allow additional data to be sent when asynchronous events occur. Currently the only compliant option to support this is to wait within the servlet, consuming a thread and other resources.
    • 100 Continue Handling - A client may request a handshake from the server before sending a request body. If this is sent automatically by the container, it prevents the mechanism from being used meaningfully. If the application is able to decide whether a 100 Continue is to be sent, then an asynchronous API would prevent a thread from being consumed during the round trip to the client.
    I am still not exactly sure how a standard solution should look, but I'm already pretty sure how it should NOT look:
    • It should not be an API on a specific servlet. By the time a container has identified a specific servlet, much of the work has been done. Moreover, as filters and dispatchers give the ability to redirect a request, any asynchronous API on a servlet would have to follow the same path.
    • It probably will not be based on Continuations. While Continuations are a useful abstraction (and will continue to be so), a lower level solution can offer greater efficiencies and solve additional use-cases.
    • It should not expose Channels or other NIO mechanisms to the servlet programmer. These are details that the container should implement and hide, and NIO may not be the actual mechanism used.
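
    To make the cost in the two "delay" use cases concrete, here is a minimal sketch of the only spec-compliant option today - blocking inside the servlet, pinning one container thread per waiting client for the life of the long poll. The class name and event queue are illustrative, not part of any proposal:

      import java.io.IOException;
      import java.util.concurrent.BlockingQueue;
      import java.util.concurrent.LinkedBlockingQueue;
      import java.util.concurrent.TimeUnit;
      import javax.servlet.ServletException;
      import javax.servlet.http.HttpServlet;
      import javax.servlet.http.HttpServletRequest;
      import javax.servlet.http.HttpServletResponse;

      // Illustration only: today's spec-compliant way to delay a response.
      public class BlockingCometServlet extends HttpServlet {
          // Illustrative application-side queue of asynchronous events.
          private final BlockingQueue<String> events = new LinkedBlockingQueue<String>();

          protected void doGet(HttpServletRequest request, HttpServletResponse response)
                  throws ServletException, IOException {
              String event;
              try {
                  // The container thread sits here, idle, for up to a minute.
                  event = events.poll(60, TimeUnit.SECONDS);
              } catch (InterruptedException e) {
                  throw new ServletException(e);
              }
              response.setContentType("text/plain");
              response.getWriter().println(event == null ? "timeout" : event);
          }
      }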

    Threaded Messages (29)

  2. Async connections with a web server are a bad idea. The HTTP protocol, and the way popular browsers handle it, is the main culprit. I have seen a variety of attempts in this realm and they all circle around the idea of polling. The problem with polling is that it slams web servers and eats bandwidth. If you HAVE to have this technology I would suggest you check out DWR version 2. A better way to do this is to use an applet (egad) that polls a (non-web) server with UDP packets, IMHO.
  3. Async connections with a web server are a bad idea. It depends on what you are doing - it's certainly not necessarily the case. With long polling & Comet it's the server that decides who gets data and when, not the client.
    If you HAVE to have this technology I would suggest you check out DWR version 2.
    I thought DWR was using long poll / Comet too?
    A better way to do this is to use an applet (egad) that polls a (non-web) server with UDP packets, IMHO.
    I see plenty of firewall hell in that approach - but sure, whatever floats your boat. Whether it's HTTP or not you still have to get through firewalls and deal with the actual network traffic, whatever approach you use. Though you absolutely can do asynchronous messaging over HTTP efficiently, particularly with HTTP keepalive & pipelining. The long poll / Comet approach works well - e.g. check out the ActiveMQ & Jetty based Ajax support. So it's probably simplest and most efficient to go with a small Ajax library and HTTP with a Comet server like Jetty/ActiveMQ, rather than an applet, a custom transport and some kind of open port in your firewalls to deal with raw UDP. James LogicBlaze Fuse: the Open Source SOA runtime
  4. I thought DWR was using long poll / Comet too?

    Long poll and Comet are just polling. Polling doesn't scale. Polling hurts the server due to open connections. For example: 1000 connected clients, all doing polling/long poll/Comet back to the server means the server has 1000 open connections. Add to this that many servers, and some operating systems, reclaim timed out connections rather lazily and you have a recipe for disaster. Apache and LightTPD are both looking into solutions for this.

    check out the ActiveMQ & Jetty based Ajax support.

    More polling.
  5. I thought DWR was using long poll / Comet too?

    Long poll and Comet are just polling. Polling doesn't scale.
    Sure it does. The only real issue is how many boxes you need :). Depending on your userbase size and technology choices it might be too expensive, but it'll scale if you can afford it :) e.g. we're working with a customer who wants 100,000 concurrent users subscribed to 100-200 different topics, all via an Ajax client with JMS publish/subscribe on the other side. The main problem is that you need each process to handle lots of sockets (as well as deal efficiently with NIO and threads like Jetty/ActiveMQ do); configuring your Linux kernel or using Solaris allows you to support 10,000 users per process/box. So we might need a few blades to deal with this load, but then that's true of most things ;).
    Polling hurts the server due to open connections. For example: 1000 connected clients, all doing polling/long poll/Comet back to the server means the server has 1000 open connections.
    Number of open connections != load. They are just sockets. If long poll only gets a request once per hour for each of 1000 connections, that's a pretty light load.
    Add to this that many servers, and some operating systems, reclaim timed out connections rather lazily and you have a recipe for disaster.
    Why? Any Comet client in Ajax will automatically re-poll if a socket times out, so it's no biggie - the only downside is it reduces the long-poll timeout.
    Apache and LightTPD are both looking into solutions for this.

    check out the ActiveMQ & Jetty based Ajax support.

    More polling.
    Can you point to another solution which works on the internet across firewalls and which doesn't use long polling (Comet)?

    FWIW there's not a huge difference, from a network perspective, between using a good Comet server such as Jetty/ActiveMQ and using a native JMS client or custom messaging server (assuming no firewalls); sure, JMS is faster & can handle higher loads, but the basic architecture is similar - there is a server side with a socket per client. With HTTP pipelining & keepalive, the flow of packets with Comet is quite similar to regular JMS; the main difference is the asynchronous and unnecessary GET requests that travel from the client to the server, which are only there to comply with the HTTP protocol - other than that it's a pretty efficient and firewall-friendly protocol.

    BTW you do understand the difference between long polling / Comet and just simple polling right? With long polling if there is no message available, things just suspend until a message is available up to some timeout (such as 1 hour) so there's zero load on the server (assuming it deals with threads and NIO nicely like Jetty does) other than a socket is used up. James LogicBlaze Fuse: the Open Source SOA runtime
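
    The client-side difference is easy to see in code. A minimal sketch, assuming a hypothetical /events endpoint that holds each GET open until data arrives or the long-poll timeout expires:

      import java.io.BufferedReader;
      import java.io.InputStreamReader;
      import java.net.HttpURLConnection;
      import java.net.URL;

      // Illustrative long-poll loop: one outstanding request at a time,
      // re-issued immediately after each response. An idle client costs
      // the server one open socket, not a request every few seconds.
      public class LongPollClient {
          public static void main(String[] args) throws Exception {
              URL url = new URL("http://example.com/events"); // hypothetical endpoint
              while (true) {
                  HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                  conn.setReadTimeout(65 * 60 * 1000); // outlast the server's long-poll timeout
                  BufferedReader in = new BufferedReader(
                          new InputStreamReader(conn.getInputStream()));
                  for (String line; (line = in.readLine()) != null; ) {
                      System.out.println(line); // handle the pushed event
                  }
                  in.close();
                  // No sleep here - the server, not the client, decides when data flows.
              }
          }
      }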
  6. BTW you do understand the difference between long polling / Comet and just simple polling right? With long polling if there is no message available, things just suspend until a message is available up to some timeout (such as 1 hour) so there's zero load on the server (assuming it deals with threads and NIO nicely like Jetty does) other than a socket is used up.
    James - remember too that modern (recent) operating system versions and JDK version (1.5) deal with this *MUCH* better than we saw even two or three years ago. Basically, NIO was unworkable (e.g. scatter/gather bugs) before JDK 1.5 and TCP/IP had severe limits until the Linux 2.6 kernel. Nowadays, with the right configs, 10,000 connections is no problem for a server, and I'd be very interested in seeing what the actual hard limits are. For historical reference, see: http://www.volano.com/benchmarks.html Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  7. BTW you do understand the difference between long polling / Comet and just simple polling right? With long polling if there is no message available, things just suspend until a message is available up to some timeout (such as 1 hour) so there's zero load on the server (assuming it deals with threads and NIO nicely like Jetty does) other than a socket is used up.


    James - remember too that modern (recent) operating system versions and JDK version (1.5) deal with this *MUCH* better than we saw even two or three years ago. Basically, NIO was unworkable (e.g. scatter/gather bugs) before JDK 1.5 and TCP/IP had severe limits until the Linux 2.6 kernel. Nowadays, with the right configs, 10,000 connections is no problem for a server, and I'd be very interested in seeing what the actual hard limits are.

    For historical reference, see:

    http://www.volano.com/benchmarks.html

    Peace,

    Cameron Purdy
    Tangosol Coherence: The Java Data Grid
    I totally agree. Modern JVMs deal with both large numbers of threads and NIO *much* better than they used to. I've often found BIO to be faster on modern Linux and Java 5 - up to a few thousand clients - than NIO, as threads seem pretty cheap nowadays; though there's definitely a tipping point where NIO starts becoming better, which depends on your OS and JVM. At some point the cost of the threads and the context switching starts to make thread pooling and NIO a better option - but it does depend on your OS/JVM. James LogicBlaze Fuse: the Open Source SOA runtime
  8. Hey James, Cameron, did either of you happen to attend the JavaOne session on Sun's Grizzly NIO framework? This is the underpinning of their Glassfish app server and is being offered up as an open source reusable NIO framework: BOF-0520 (08:30p-09:20p): "Customizing the Grizzly NIO Framework" – Jean-Francois and friends wrote the low level HTTP handlers in Glassfish that provide really great performance. Here's last year's slide presentation: https://developers.sun.com/learning/javaoneonline/2005/webtier/TS-3227.pdf Grizzly can integrate with the Apache Tomcat HTTP Connector architecture and thus can be dropped in as a replacement for the Coyote HTTP connector. Here's a link to some benchmarking comparisons: Can a Grizzly run faster than a Coyote? http://weblogs.java.net/blog/jfarcand/archive/2006/03/can_a_grizzly_r.html Besides raw performance comparisons, though, I'd be curious to see the difference in resource consumption as connections mount up into the thousands. On Windows a runtime stack is created for each new thread, so one would expect to consume a couple of pages of memory, minimum, for every thread that's alive. An NIO executor thread pool approach would only need enough active threads to service whatever channels are active. (If a benchmark test keeps all connections constantly active, though, then that would not be a very realistic simulation of real world behavior, where the majority of connections would be idle at any given time.)
  9. Hey James, Cameron, did either of you happen to attend the JavaOne session on Sun's Grizzly NIO framework?

    This is the underpinning of their Glassfish app server and is being offered up as an open source reusable NIO framework:

    BOF-0520 (08:30p-09:20p): "Customizing the Grizzly NIO Framework" – Jean-Francois and friends wrote the low level HTTP handlers in Glassfish that provide really great performance.

    Here's last year's slide presentation:

    https://developers.sun.com/learning/javaoneonline/2005/webtier/TS-3227.pdf

    Grizzly can integrate with the Apache Tomcat HTTP Connector architecture and thus can be dropped in as a replacement for the Coyote HTTP connector.

    Here's a link to some benchmarking comparisons:

    Can a Grizzly run faster than a Coyote?
    http://weblogs.java.net/blog/jfarcand/archive/2006/03/can_a_grizzly_r.html

    Besides raw performance comparisons, though, I'd be curious to see the difference in resource consumption as connections mount up into the thousands. On Windows a runtime stack is created for each new thread, so one would expect to consume a couple of pages of memory, minimum, for every thread that's alive. An NIO executor thread pool approach would only need enough active threads to service whatever channels are active. (If a benchmark test keeps all connections constantly active, though, then that would not be a very realistic simulation of real world behavior, where the majority of connections would be idle at any given time.)
    I didn't make the session but I've taken a look at Grizzly. There are a few NIO frameworks around; we've been using Jetty for a while, which in version 6 has a good NIO layer. It might be interesting to compare the performance of Grizzly with Jetty 6 :) Though AFAIK Grizzly hasn't yet implemented the Continuation feature of Jetty 6, which is crucial for efficient Ajax/Comet and asynchronous messaging on the web. James LogicBlaze Fuse: the Open Source SOA runtime
  10. jetty continuations

    Though AFAIK Grizzly hasn't yet implemented the Continuation feature of Jetty 6, which is crucial for efficient Ajax/Comet and asynchronous messaging on the web.

    James
    LogicBlaze
    Fuse: the Open Source SOA runtime
    Hmm. The jetty continuation was somewhat less than thrilling to me. Well, it amounts to a hack - the whole business of relying on throwing an exception, etc. Plus, it's an explicit API that has to be coded to in a servlet.

    In my own NIO-based AJAX/JMS bridge I'm not even going to mess with servlets at all - they're a waste of time from my perspective. I just want my AJAX client to be plugged directly into JMS with none of the HTTP/servlet baggage in the way - or the baggage of existing jetty/tomcat implementations.

    So I'll have a special "pipe" of sorts that exists between the NIO front-end of the bridge and a conventional JMS connection to the JMS server at the back-end of the bridge.

    One end of this pipe (the client-side facing end) is an NIO channel and is registered with a selector. So when a JMS message is published to the client, the selector will sense the new data on the channel and dispatch an executor worker thread that will start reading and then writing it to the appropriate destination client channel connection. The other end of the pipe (the JMS-side facing end) will support a conventional stream IO interface and permit a dedicated JMS session thread to remain attached to it. Of course the "pipe" provides memory buffering and synchronization control of this JMS-side thread.

    I guess the difference of perspective is that I'm thinking in terms of just a basically conventional JMS bridge - instead of an NIO front-end to an HTTP server. All my business logic will take place in message driven beans - which will never be aware that their messages got bridged over from AJAX clients. No need for crappy servlets or any other crappy HTTP nonsense. And if I go Flex 2 for the client, I'll even be dumping traditional crappy HTML/JavaScript and instead will be marshalling objects back and forth between ActionScript and Java. Perhaps in ten to fifteen years HTTP/HTML/JavaScript will just be a bad memory from primitive bygone days.
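
    As an aside, the JDK already ships a primitive close to this "pipe": java.nio.channels.Pipe gives a blocking-writable sink for the dedicated JMS session thread, while its source end registers with the front-end selector. A rough sketch under those assumptions (JMS wiring and the client-facing copy loop elided; names illustrative):

      import java.io.IOException;
      import java.io.UnsupportedEncodingException;
      import java.nio.ByteBuffer;
      import java.nio.channels.Pipe;
      import java.nio.channels.SelectionKey;
      import java.nio.channels.Selector;
      import javax.jms.JMSException;
      import javax.jms.Message;
      import javax.jms.MessageListener;
      import javax.jms.TextMessage;

      // Illustrative bridge endpoint: one instance per connected AJAX client.
      public class JmsToNioBridge implements MessageListener {
          private final Pipe pipe;

          public JmsToNioBridge(Selector frontEndSelector) throws IOException {
              pipe = Pipe.open();
              pipe.source().configureBlocking(false);
              // The front-end selector now sees JMS traffic as an ordinary
              // readable channel; a worker thread copies it to the client socket.
              pipe.source().register(frontEndSelector, SelectionKey.OP_READ, this);
          }

          // Runs on the dedicated JMS session thread (the stream-style end).
          public void onMessage(Message message) {
              try {
                  ByteBuffer buf = ByteBuffer.wrap(
                          ((TextMessage) message).getText().getBytes("UTF-8"));
                  while (buf.hasRemaining()) {
                      pipe.sink().write(buf); // blocks when full: built-in flow control
                  }
              } catch (JMSException e) {
                  throw new RuntimeException(e);
              } catch (UnsupportedEncodingException e) {
                  throw new RuntimeException(e);
              } catch (IOException e) {
                  throw new RuntimeException(e);
              }
          }
      }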
  11. Re: jetty continuations

    Though AFAIK Grizzly hasn't yet implemented the Continuation feature of Jetty 6, which is crucial for efficient Ajax/Comet and asynchronous messaging on the web.

    James
    LogicBlaze
    Fuse: the Open Source SOA runtime


    Hmm. The jetty continuation was somewhat less than thrilling to me. Well, it amounts to a hack - the whole business of relying on throwing an exception, etc. Plus, it's an explicit API that has to be coded to in a servlet.

    In my own NIO-based AJAX/JMS bridge I'm not even going to mess with servlets at all - they're a waste of time from my perspective. I just want my AJAX client to be plugged directly into JMS with none of the HTTP/servlet baggage in the way - or the baggage of existing jetty/tomcat implementations.

    So I'll have a special "pipe" of sorts that exists between the NIO front-end of the bridge and a conventional JMS connection to the JMS server at the back-end of the bridge.

    One end of this pipe (the client-side facing end) is an NIO channel and is registered with a selector. So when a JMS message is published to the client, the selector will sense the new data on the channel and dispatch an executor worker thread that will start reading and then writing it to the appropriate destination client channel connection. The other end of the pipe (the JMS-side facing end) will support a conventional stream IO interface and permit a dedicated JMS session thread to remain attached to it. Of course the "pipe" provides memory buffering and synchronization control of this JMS-side thread.

    I guess the difference of perspective is that I'm thinking in terms of just a basically conventional JMS bridge - instead of an NIO front-end to an HTTP server. All my business logic will take place in message driven beans - which will never be aware that their messages got bridged over from AJAX clients. No need for crappy servlets or any other crappy HTTP nonsense. And if I go Flex 2 for the client, I'll even be dumping traditional crappy HTML/JavaScript and instead will be marshalling objects back and forth between ActionScript and Java. Perhaps in ten to fifteen years HTTP/HTML/JavaScript will just be a bad memory from primitive bygone days.
    I totally hear you. We went through a similar thought process a while ago on the ActiveMQ project, trying to figure out how to do an effective Ajax <-> JMS bridge. You absolutely could use a pure NIO framework, but there is significant, non-trivial code required to deal with the HTTP side of things. So I'd definitely recommend starting with an NIO based HTTP framework first and bridging that to JMS. While you may think the servlet exception thing is a bit of a hack, the great thing is we can use the servlet deployment model and either work in a regular servlet container, whether BIO or NIO (which eats up threads but still works), or work in a continuation-based NIO container like Jetty. So we should be able to deploy the ActiveMQ Ajax bridge in Tomcat and Grizzly too, though Jetty is our preferred container due to the continuation support. James LogicBlaze Open Source SOA
  12. The proposal I have made is not to give async IO in servlets. Rather it is to give an async API for processing servlet requests. This may or may not be related to async IO and NIO. The reason many people don't like Jetty continuations is that they happen within a servlet and have to back out and retry the request to the servlet. Thus my proposal is to create an API that allows some async handling to happen before the servlet stack is called. Think of it simply as an event that a request has arrived. The event handler is required to either immediately schedule the servlet container - or it may wait for other events and/or resources before scheduling the servlet container for that request - or it may handle the response itself. The IO would still be via the standard servlet API and the application would not be involved in how the bytes are read, only in the scheduling of when the processing is done.
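
    Sketched in code, the shape of such an event API might look like this (every name is invented for illustration - the message above describes behaviour, not a concrete API):

      // Hypothetical sketch: the container fires an event when a request
      // arrives, *before* the servlet stack runs. Nothing here is part of
      // any actual specification.
      public interface RequestHandler {

          // Called once the container has parsed an incoming request. The
          // handler must do exactly one of the following:
          //  1. call scheduler.dispatch() now, running the filter/servlet chain;
          //  2. keep the scheduler and call dispatch() later, when a timeout,
          //     JMS message or other event fires - no thread is consumed
          //     while the request waits;
          //  3. write and finish the response itself via scheduler.complete().
          // IO stays behind the standard servlet API either way: the
          // application controls only *when* processing happens, not how
          // bytes are read.
          void requestReceived(javax.servlet.ServletRequest request,
                               RequestScheduler scheduler);

          interface RequestScheduler {
              void dispatch();   // schedule the servlet container for this request
              void complete();   // finish the response without the servlet stack
          }
      }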
  13. Here's a link to some benchmarking comparisons:

    Can a Grizzly run faster than a Coyote?
    http://weblogs.java.net/blog/jfarcand/archive/2006/03/can_a_grizzly_r.html
    I suppose it's the big idea of publishing benchmarks: people will read them and for some reason believe them, even if they are not done properly (99% of vendors' published benchmarks are like that; you should look at third party benchmarks IMO). For starters, the configuration used for the Tomcat connectors is bad (to start with the obvious: the poller size is way smaller than the number of expected keepalive connections - sigh ...), while the Grizzly one seems to be tuned (and apparently has HTTP caching built right into the connector, which IMO is not a very good idea, although it will help during benchmarks, that's for sure). The claim that large file serving with java.io can perform just as well as sendfile is also a bit funny (with an abysmally low throughput of 10MB/s for the results ...). Maybe this NIO implementation is good. I don't know, as I did not try it (other Tomcat developers did, and they claim performance is a bit disappointing). For example, it could have problems scaling to the large numbers of threads that are sometimes needed in Servlet land. BTW, as was said many times in earlier postings, "continuations" is not an appropriate name for the feature that is discussed in this thread, since it simply reinvokes the service method of the servlet.
  14. Java needs Asynchronous API

    I do think Java needs an asynchronous API like the one C# provides. An asynchronous API is almost a "MUST" in rich client applications. Btw, we have been using NIO since JDK 1.4.2, and it works well. Daikei Architech co, Ltd http://www.archi-tech.info/
  15. With ICEfaces we're making extensive use of what we've been calling "blocking HTTP" connections. This is the same as what others are referring to as "long polling" or "hanging HTTP" (a "hang" is a bad thing, so I hope that term doesn't catch on). We use this to perform application-initiated updates of the page via AJAX (a process some are calling "COMET"). Modern operating systems really can handle tens of thousands of sockets; the problem, as James is pointing out, is the servlet API. It causes blocking connections to eat threads, but with non-blocking IO, there's no fundamental reason why this should be the case for a Java HTTP server, as witnessed by Jetty. The ICEfaces Enterprise Edition provides an asynchronous HTTP server that runs alongside any existing application server to provide scalable application-initiated update capability, but what is needed long term is standardization in the servlet API so that advanced AJAX applications can run within any Servlet container.
  16. Modern operating systems really can handle tens of thousands of sockets; the problem, as James is pointing out, is the servlet API. It causes blocking connections to eat threads, but with non-blocking IO, there's no fundamental reason why this should be the case for a Java HTTP server, as witnessed by Jetty. Stateful firewalls and load balancers, even for a major public web site, don't handle more than a few thousand simultaneous connections. If you ratchet that up to a hundred thousand open connections, you will feel serious pain. Saving threads in the container is the least of your worries.
  17. Stateful firewalls and load balancers, even for a major public web site, don't handle more than a few thousand simultaneous connections. If you ratchet that up to a hundred thousand open connections, you will feel serious pain. Saving threads in the container is the least of your worries.
    Why would you need stateful firewalls in front of a public site? Blackhole everything except :80 and :443, etc. Let the load balancer hand off the connection management to the servers behind it (it doesn't have to stay "in the middle"), and *voila* no problems. True? Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  18. I take it you've never managed a major public site before. Attempts to compromise are a constant onslaught. For one, DoS and other attacks can't be handled in this fashion. For another, Linux (and certainly Windows) servers can still be vulnerable, even if they are only accessible on ports 80 and 443. Finally, public sites aren't just simple static html servers. If you hang your public proxy or content or whatever servers out there, you've restricted your choices when it comes to the backend plumbing.
  19. The main problem is you need each process to handle lots of sockets (as well as deal efficiently with NIO and threads like Jetty/ActiveMQ do); so configuring your linux kernel or using solaris allows you to support 10,000 users per process/box. So we might need a few blades
    From 2001: http://www.kegel.com/c10k.html
    And computers are big, too. You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients. (That works out to $0.08 per client, by the way. Those $100/client licensing fees some operating systems charge are starting to look a little heavy!) So hardware is no longer the bottleneck.
  20. What is COMET?

    I didn't know, this is the best I could find: http://www.irishdev.com/NewsArticle.aspx?id=2166 I dislike the name...and since a request/response can handle most intermittent updates, such as employed by GMail, I don't know how useful this stuff is. I guess the goal is real-time responsiveness like a full IM over HTTP. Doesn't that sound a lot like square-peg in round-hole? I hope GTalk integration in GMail isn't supposed to be a good example of COMET. It really sucks compared to real IM services. I've used it a couple of times, the latency is awful, but maybe it was the firewalls between me and my talkee. If COMET is looking to GTalk to be what GMail/GMaps was for AJAX, keep looking...
  21. Re: What is COMET?

    This is quite a good link... http://alex.dojotoolkit.org/?p=545
    I didn't know, this is the best I could find:

    http://www.irishdev.com/NewsArticle.aspx?id=2166

    I dislike the name...and since a request/response can handle most intermittent updates, such as employed by GMail, I don't know how useful this stuff is. I guess the goal is real-time responsiveness like a full IM over HTTP. Doesn't that sound a lot like square-peg in round-hole?

    I hope GTalk integration in GMail isn't supposed to be a good example of COMET. It really sucks compared to real IM services. I've used it a couple of times, the latency is awful, but maybe it was the firewalls between me and my talkee. If COMET is looking to GTalk to be what GMail/GMaps was for AJAX, keep looking...
    FWIW I often use Gmail with IM and it works pretty well for me. Note Comet is not designed as a replacement for XMPP, though; it's more for asynchronous messaging with Ajax, so that you can do things like implement Gmail itself as opposed to the Chat piece - but GMail shows it can do both pretty well across firewalls using any Ajax client. If you have an XMPP server and an XMPP client and the firewalls are all allowing it then sure, use XMPP, which is a little more efficient than Comet. The beauty of Comet isn't the performance - it's that it's the best solution today, with current web infrastructure & firewalls, for messaging to a browser without a plugin - i.e. it works nicely with HTTP and current browsers & firewalls. James LogicBlaze Fuse: the Open Source SOA runtime
  22. Re: What is COMET?

    Seriously, the amount of architectural gymnastics done to accommodate the fact that firewalls only allow port 80 and 443 traffic (and often limit the protocol used on them to HTTP) is pretty amazing. Why does XML SOA exist? For the exact same reason. It's all an attempt by developers to grab back flexibility from the security nazis that took away all our ports and protocols.
  23. I can't wait

    I don't see why not. I am sure there are lots of systems out there that would end up using this. It shouldn't be too hard - just switch the API from using InputStream/OutputStream to NIO. Guglielmo Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering .. or the World's First Pure-Java Terminal Driver
  24. Let's see if I'm on track here.

    First, you have the Servlet API and its current implementation within modern servers. Typically, this is implemented via a thread-per-connection model. It's not clear to me how asynchronous IO helps mitigate this problem; I don't see how you can break the thread/connection coupling using only an async IO implementation. In the past, the goal of most async implementations was to reduce the number of threads or processes used to handle requests, as they were considered fairly heavyweight - particularly for simple static content. Using async IO for static content has proven to be very efficient, but that doesn't translate to generic random request processing. With async IO based content serving, you have a process simply responding to events based on content availability from the OS kernel. At a high level you could consider these IO requests "threads", but they're obviously lower level than that. Now perhaps I'm mistaken as to what the goal of async IO is with regards to the Servlet API.

    Also, Sun's Glassfish uses Grizzly, which is an NIO based Connector for Tomcat (and embedded within Glassfish and SJAS 8). I'm curious how this doesn't meet up with the goals of the proposed changes to the API (mind, I'm not against them, I'm just curious how Grizzly doesn't address the goals). My only thinking is that if you're using a buffered response, you can buffer the entire response and then serve that buffer asynchronously, freeing threads up earlier than if you were not using an asynchronous response. But this has an obvious impact on memory. I don't know if there's anything in the Servlet spec that says the OutputStream made available through the HTTPServletResponse is actually a direct connection to the socket representing the client.

    Next, we have discussions about, essentially, persistent connections to servers through HTTP. The issue here being that with firewalls, we simply can't have servers opening connections to client programs for occasional event notifications, so the clients make a persistent connection to the server and essentially leave the socket open waiting for content. The link provided about "Comet" seems to describe a browser specific technique where you have the server sending down Javascript code that is executed "on the fly" by the browser using a "never closing" connection. Otherwise it seems that you have the client continually polling the servers for updates, with assorted different definitions of "continually". The common wisdom in these cases is that the client connects to the server, and the server blocks the connection for some "reasonable" (several minutes) length of time until a request is ready for the client. The idea here is that the client doesn't make continual and rapidly repeating connection requests, with the connections closing if no new information is available.

    We're looking at this exact problem currently, but we're looking at it through Web Services colored glasses. Web Services tend to be pretty "request/response" oriented rather than streaming, so we're basically planning on doing the above technique where the client calls the server, blocks, and then either it times out or the server bundles up some chunk of response data and sends it down, and then "closes" the request. After which the client "calls back". It gets chatty when it gets busy, but quiets down with no activity.

    As for scaling, there are a couple of issues. One, on the server end, yes, you need to manage a lot of sockets. But servers are cheap. When you hear about one of the new T2000s handling 70000 FTP requests, that's a boatload of sockets. And, frankly, as long as the bandwidth lasts, this does scale, specifically because I can throw servers at the problem of handling the front door to the main servers. But it does become more of a problem when I need a thread for each pending request. Threads are more expensive than sockets. This is the part that is unclear to me. Obviously async IO helps facilitate this specific problem, but since the Servlet API is a thread based API, I don't see how async IO within the Servlet model can help us, even if all of the threads are simply blocking on a socket (in our case a JMS listener). So, we may have to create a custom server and not use the Servlet API for this application to handle this specific case.

    But the other concern I have that seems to be handwaved away is: what about mid-line proxy servers? If a gazillion clients that were using generic HTTP requests in the past are all of a sudden opening persistent connections, and these connections are being opened behind/through proxy servers, then the load is not simply on our hosting servers, but on a bunch of "innocent", third party proxies. Does anyone have any thoughts on the impact these designs have on that kind of infrastructure?

    My understanding of keep-alive and pipelining is basically that the browser makes a single connection to the host, and uses that connection to pump down requests (like, for example, the images on a page), and the host responds by piggybacking responses down the same connection for several requests. But after all is said and done, and the page is loaded, the keep-alive connection is closed and the server resources (and the proxy resources) are freed up to serve other clients. So even proxies supporting keep-alive connections don't endure the same burden as the long-term connections being discussed here. Am I mistaken in my understanding of keep-alives?
    First, you have the Servlet API and its current implementation within modern servers. Typically, this is implemented via a thread-per-connection model. It's not clear to me how asynchronous IO helps mitigate this problem; I don't see how you can break the thread/connection coupling using only an async IO implementation.
    Just to be clear, there is a big difference between NIO (e.g. non-blocking I/O) and Asynchronous I/O (completion ports, etc.)
    In the past, the goal of most async implementations was to reduce the number of threads or processes used to handle requests, as they were considered fairly heavyweight
    With Java NIO, you can theoretically manage 10,000 read, write and read/write connections with a total of one thread.
    My only thinking is that if you're using a buffered response, you can buffer the entire response and then serve that buffer asynchronously, freeing threads up earlier than if you were not using an asynchronous response. But this has an obvious impact on memory.
    Right, but only for responses that have been generated but not yet "cleared" to the client. Open sockets just lying around in "long polls" don't use any significant amount of memory (i.e. anywhere from hundreds of bytes up to the max TCP/IP buffer size).
    My understanding of keep-alive and pipelining is basically that the browser makes a single connection to the host, and uses that connection to pump down requests (like, for example, the images on a page), and the host responds by piggybacking responses down the same connection for several requests. But after all is said and done, and the page is loaded, the keep-alive connection is closed and the server resources (and the proxy resources) are freed up to serve other clients.
    This is up to the client and the server. They can both choose to leave the socket open indefinitely. BTW This helps to explain some of the benchmarks you may have seen with a specific number of clients (i.e. one less than the total number of keep-alives supported ;-). Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  26. With Java NIO, you can theoretically manage 10,000 read, write and read/write connections with a total of one thread.
    Yes, but it's not clear to me how you can manage this under the Servlet API. With a custom non-servlet server, then sure, assuming the request processing for each channel is lightweight enough.

    But if I have an NIO front end on the server and it calls a generic servlet, the servlet is going to consume a thread and can conceptually "do anything it wants", like block on another server resource (say, a JMS synchronous getMessage() call, or a particularly long SQL query).

    Basically, the problem I see is I can't see the advantages that an NIO based servlet container can provide, save serving up static resources. Servlets kind of have carte blanche as to what they can do. Even if the Servlet wanted to do NIO itself, it still has some thread of execution essentially dedicated to it to manage that, and can't easily release that back to the container and have it "call back" when "something interesting" happens.

    When the container calls "doGet", the servlet isn't going to be coming back until it's "done". The nature of NIO systems is to treat the sockets and file channels as event streams. Handle each event in turn, then start over with a new list of events. Even a single threaded system can do that.

    If the NIO server needs to "read" something from something else, it adds the new channel to its list and keeps moving, treating it like any other channel.

    But in the servlet model when the server dispatches to the actual servlet code, there's no guarantee it's EVER coming back, especially in the "keep the socket open forever" modes we're discussing here.

    If we want the servlet code to participate in the overall sharing that is necessary in an NIO system, it needs to be aware that it's in one, and actively contribute, like getting file channels that the NIO kernel is aware of, etc.

    So, in fact, aren't we essentially going back to "cooperative" multi-tasking scenarios, where the code itself is "in on it" regarding scheduling, and needs to free itself up to the container in order to enable running multiple servlets simultaneously on the same thread?
  27. An advantage of NIO is where you want to let clients maintain a persistent socket connection to the server so that, say, it is feasible to do server-side push of notification messages to clients, i.e., bi-directional messaging (inclusive of publish/subscribe, etc). NIO makes it possible to scale the server to handle up to ~10,000 simultaneous persistent client connections. Of course they will not all be active at the same time, so the single thread that is handling the NIO select() call will just hand off active NIO channels to a Java 5 Concurrency executor. So a few threads in a thread pool could handle many, many more socket connections. The executor worker thread would need to parse the data stream from the channel so that it could detect when it has fully buffered an HTTP request (or JMS message). Different worker threads from the pool could process data from the same channel before a given such item is fully read into an intermediate buffer. At that point, though, the completed buffer could be handed off to a servlet or Message-Driven-Bean for processing (either of which will need to execute on precisely the same thread for the entire time it processes the item). The NIO approach is more complicated, but again its advantage is that it can sustain a very large number of persistent connections so as to enable bi-directional messaging. In an HTTP/AJAX context, the client could use HTTP Streaming (the old Netscape server push protocol for HTTP, which is still used by some commercial video encoders to stream simple MJPEG video) as the means to maintain a persistent connection from its end. (For the IE browser, the connection would have to be re-established after each message received as a server-side push - Firefox/Mozilla/Netscape, though, will maintain the connection between server push messages.)
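
    A condensed sketch of the select-and-dispatch loop described above (illustrative only; a real server adds buffer management, write interest, timeouts and error handling):

      import java.net.InetSocketAddress;
      import java.nio.channels.SelectionKey;
      import java.nio.channels.Selector;
      import java.nio.channels.ServerSocketChannel;
      import java.nio.channels.SocketChannel;
      import java.util.Iterator;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      // One selector thread watches every connection; a small pool
      // handles only the currently active ones.
      public class SelectLoop {
          public static void main(String[] args) throws Exception {
              ExecutorService workers = Executors.newFixedThreadPool(8);
              Selector selector = Selector.open();
              ServerSocketChannel server = ServerSocketChannel.open();
              server.socket().bind(new InetSocketAddress(8080));
              server.configureBlocking(false);
              server.register(selector, SelectionKey.OP_ACCEPT);

              while (true) {
                  selector.select(); // one thread sleeps here for all the sockets
                  Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                  while (it.hasNext()) {
                      final SelectionKey key = it.next();
                      it.remove();
                      if (key.isAcceptable()) {
                          SocketChannel client = server.accept();
                          client.configureBlocking(false);
                          client.register(selector, SelectionKey.OP_READ);
                      } else if (key.isReadable()) {
                          key.interestOps(0); // park the key: one worker at a time
                          workers.execute(new Runnable() {
                              public void run() {
                                  // read from key.channel() and accumulate bytes
                                  // until a full HTTP request (or JMS frame) is
                                  // buffered, then hand it to the application;
                                  // finally restore key.interestOps(SelectionKey.OP_READ)
                                  // and call key.selector().wakeup().
                              }
                          });
                      }
                  }
              }
          }
      }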
  28. With Java NIO, you can theoretically manage 10,000 read, write and read/write connections with a total of one thread.


    Yes, but it's not clear to me how you can manage this under the Servlet API. With a custom non-servlet server, then sure, assuming the request processing for each channel is lightweight enough.
    Note that the Servlet API extensions that Jetty has implemented could be standardised, so that standard servlets work efficiently with an NIO server as Jetty already does. See the title of this thread :)
    But if I have an NIO front end on the server and it calls a generic servlet, the servlet is going to consume a thread and can conceptually "do anything it wants", like block on another server resource (say, a JMS synchronous getMessage() call, or a particularly long SQL query).

    Basically, the problem I see is I can't see the advantages that an NIO based servlet container can provide, save serving up static resources. Servlets kind of have carte blanche as to what they can do. Even if the Servlet wanted to do NIO itself, it still has some thread of execution essentially dedicated to it to manage that, and can't easily release that back to the container and have it "call back" when "something interesting" happens.

    When the container calls "doGet", the servlet isn't going to be coming back until it's "done". The nature of NIO systems is to treat the sockets and file channels as event streams. Handle each event in turn, then start over with a new list of events. Even a single threaded system can do that.

    If the NIO server needs to "read" something from something else, it adds the new channel to its list and keeps moving, treating it like any other channel.

    But in the servlet model when the server dispatches to the actual servlet code, there's no guarantee it's EVER coming back, especially in the "keep the socket open forever" modes we're discussing here.

    If we want the servlet code to participate in the overall sharing that is necessary in an NIO system, it needs to be aware that it's in one, and actively contribute, like getting file channels that the NIO kernel is aware of, etc.

    So, in fact, aren't we essentially going back to "cooperative" multi-tasking scenarios, where the code itself is "in on it" regarding scheduling, and needs to free itself up to the container in order to enable running multiple servlets simultaneously on the same thread?
    So I think the part you're missing is the Continuation feature Jetty has implemented, and that Greg is trying to get standardised; it allows a Servlet developer to suspend a servlet and resume it when it can complete. So rather than the servlet blocking, it suspends the servlet call, leaving the container to carry on doing something else - then later on, when an event occurs, it informs the container to continue with the servlet. We use this feature in ActiveMQ's Ajax support to provide publish-subscribe in Ajax to a federated JMS network on the server side.

    How it works is basically: when the servlet executes, if there is a message immediately available in the user's subscription in ActiveMQ, we return it to the Ajax client. If there is no message, we don't block the servlet but throw a continuation and add a listener for when a message is available. Then later on, when a message arrives, we resume the servlet and carry on as before. i.e. the servlet works in a non-blocking way and the client blocks until a message arrives, but we don't tie up server-side resources (other than the socket and a few POJOs for the request/response, which are all pretty cheap). James LogicBlaze Fuse: the Open Source SOA runtime
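
    Pieced together from the description above, the pattern looks roughly like this (based on Jetty 6's org.mortbay.util.ajax Continuation API; exact signatures may differ between versions, and the JMS wiring is elided):

      import java.io.IOException;
      import javax.jms.JMSException;
      import javax.jms.Message;
      import javax.jms.MessageConsumer;
      import javax.servlet.ServletException;
      import javax.servlet.http.HttpServlet;
      import javax.servlet.http.HttpServletRequest;
      import javax.servlet.http.HttpServletResponse;
      import org.mortbay.util.ajax.Continuation;
      import org.mortbay.util.ajax.ContinuationSupport;

      public class AjaxMessageServlet extends HttpServlet {
          private MessageConsumer consumer; // illustrative: the user's subscription

          protected void doGet(HttpServletRequest request, HttpServletResponse response)
                  throws ServletException, IOException {
              try {
                  Message msg = consumer.receiveNoWait();
                  if (msg == null) {
                      // No message yet: suspend. Under Jetty this throws the
                      // internal retry exception and frees the thread; doGet()
                      // is re-invoked from the top when resume() is called or
                      // the timeout expires. In a plain BIO/NIO container it
                      // degrades to a blocking wait.
                      Continuation c = ContinuationSupport.getContinuation(request, this);
                      c.suspend(60 * 1000);
                      msg = consumer.receiveNoWait(); // retry after resume/timeout
                  }
                  response.setContentType("text/plain");
                  response.getWriter().println(msg == null ? "" : msg.toString());
              } catch (JMSException e) {
                  throw new ServletException(e);
              }
              // Elsewhere, a javax.jms.MessageListener holding the same
              // Continuation calls resume() when a message arrives.
          }
      }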
  29. AIO v NIO

    Just to be clear, there is a big difference between NIO (e.g. non-blocking I/O) and Asynchronous I/O (completion ports, etc.)
    Cameron, I was under the impression that NIO was a portable interface to platform AIO APIs - you know, AIO for Java... Please set me straight. Cheers, Steve.
  30. Grizzly currently supports asynchronous request processing. Just take a look at [1]. This is far from perfect, but I'm planning to implement continuations on top of it based on what the Servlet EG decides. BTW, an interesting implementation of [1] can be found in openESB (look at the section on the HTTP SOAP binding component) [2]. -- Jeanfrancois [1] http://weblogs.java.net/blog/jfarcand/archive/2006/02/grizzly_part_ii.html [2] http://weblogs.java.net/blog/jfarcand/archive/2006/05/after_javaone_l_1.html