Special Report:

Server-Centric AJAX Frameworks & GAE

By Jose Maria Arranz

TheServerSide.com

Google App Engine (GAE) is a very attractive Java hosting service for web applications: it combines price (the first level is free), ease of deployment, manageability and “infinite” scalability. Having said that, it also has many limitations, such as a very long list of blacklisted Java classes, limited threading and only one database technology. However, there is another very important limitation that needs to be addressed: only replicated sessions are supported; that is, sticky sessions (a.k.a. session affinity) do not work in GAE.


In theory, in GAE any web request can be routed to any node of the grid (a computer, or a JVM instance to be more exact) where your application is running, and GAE is responsible for automatically synchronizing session data between nodes. As you can imagine, GAE serializes the changed session data on the last requested node so that it can be shared between nodes, using memcache and GAE persistent storage under the hood. When a different node receives a request, the session data is read again and GAE deserializes it, rebuilding the Java objects on that node. Because serialization is the mechanism used to synchronize Java objects between nodes, the Java objects you save in sessions must be serializable.
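As a rough illustration of that requirement, the following sketch pushes a session-style object through standard Java serialization and back, which is conceptually what happens when session data moves from one node to another. The class names (UserCart, SessionRoundTripDemo) are made up for this example, and the code says nothing about how GAE stores the bytes internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical session attribute: it implements Serializable so it can be replicated.
class UserCart implements Serializable {
    private static final long serialVersionUID = 1L;
    final String userName;
    final int itemCount;

    UserCart(String userName, int itemCount) {
        this.userName = userName;
        this.itemCount = itemCount;
    }
}

class SessionRoundTripDemo {
    // Conceptually what the "old" node does: turn the object into bytes.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("object is not serializable", e);
        }
        return bytes.toByteArray();
    }

    // Conceptually what the "new" node does: rebuild the object from bytes.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cannot rebuild object", e);
        }
    }

    public static void main(String[] args) {
        UserCart onNodeA = new UserCart("alice", 3);
        UserCart onNodeB = (UserCart) deserialize(serialize(onNodeA));
        // The second node gets an equivalent copy, not the same instance.
        System.out.println(onNodeB.userName + " has " + onNodeB.itemCount + " items");
        System.out.println("Same instance? " + (onNodeA == onNodeB));
    }
}
```

If UserCart held a field of a non-serializable type, writeObject would fail with a NotSerializableException, which is why every object reachable from a session attribute has to be serializable.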

By using replicated sessions GAE provides elastic scalability: when your application is too busy, GAE can automatically add more nodes, and existing users can use these new nodes automatically when they send new requests. But elastic scalability has a price, because serialization and deserialization are time-consuming tasks and transporting data between nodes consumes internal network bandwidth inside Google. When there is significant data in the session, elastic scalability may result in a severe performance penalty if session data is frequently transported between nodes.

To mitigate this problem GAE is not that elastic; that is, most of your consecutive requests are, fortunately, sent to the same node. When you have not used your web site for a while (some minutes) and you send requests again, these new requests are usually sent to a different node. You “feel” this node change because your GAE application suffers a very significant delay.

One year ago the mysterious “StringBuffer” blogger ran an experiment to understand how GAE dispatches requests between nodes. The results showed that GAE almost never switches requests from the same user between nodes when these requests are sent in quick succession. Although this experiment does not represent the real-world behavior of people (one second per request is a very short interval), it goes a long way to “prove” that in the real world GAE works most of the time as if in “sticky sessions” mode, because in a web application (a web site may be different) users execute several requests within short intervals.

This behavior pattern opens new hope for AJAX-intensive server-side Java frameworks, that is, frameworks that hold and manage the view state mainly on the server. Any approach has pros and cons. Of course, keeping and managing view state on the server consumes more memory and a bit more server CPU for view logic, but there are some important advantages: better security and lower client load because most of the application is on the server, more speed when complex view logic is executed (Java is by far more 'performant' than JavaScript), easier coding with no need for custom client/server bridges (because view and data are in the same memory space) and the goodness of Java coding (static typing, IDE support, strong OOP…).

Server-side frameworks run fine in sticky sessions mode because serialization is not mandatory; if the framework (and the web application) is serializable, serialization only happens for failover or when the servlet container reloads.

As you already know, AJAX-intensive server-side frameworks promote Single Page Interface (SPI) applications (or applications with a small number of page transitions); these applications are very stateful from the point of view of the view state.

In GAE serialization is mandatory for data in sessions, so when there is a lot of view state on the server, serialization and node switching may seriously affect performance. A simple AJAX request that makes a very minor change to the view means the view has changed, and to notify GAE about this change the session attribute holding the view state must be set again; GAE detects this update and serializes the view. If the next request involves node switching, the cost is even higher because of transport and deserialization. Of course some kind of view partitioning in session, or delta strategies, are possible, but they could be really complex.
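The "set the attribute again" step can be pictured with a toy model. The sketch below is not the GAE or servlet API: ToyReplicatedSession is a made-up stand-in that, by assumption, only notices a change when setAttribute is called, which mirrors the behavior described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for a replicated session (NOT the GAE or servlet API).
class ToyReplicatedSession {
    private final Map<String, Object> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    // Assumption of this model: only this call marks an attribute as changed.
    void setAttribute(String name, Object value) {
        attributes.put(name, value);
        dirty.add(name);
    }

    Object getAttribute(String name) {
        return attributes.get(name);
    }

    // End of request: report which attributes would be serialized, then clear the marks.
    Set<String> flush() {
        Set<String> toSerialize = new HashSet<>(dirty);
        dirty.clear();
        return toSerialize;
    }
}

class DirtyAttributeDemo {
    public static void main(String[] args) {
        ToyReplicatedSession session = new ToyReplicatedSession();
        List<String> view = new ArrayList<>();
        session.setAttribute("viewState", view);
        session.flush(); // first request: the whole view is serialized

        view.add("row 1"); // in-place change, the session is not told about it
        System.out.println("after in-place change: " + session.flush());

        session.setAttribute("viewState", view); // set again to signal the change
        System.out.println("after setting again: " + session.flush());
    }
}
```

In this model even a one-row change puts the whole "viewState" attribute back in the set to serialize, which is the cost the paragraph above is concerned with.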

There is an alternative in GAE: hybrid applications, stateful and stateless at the same time.
The idea is simple: DO NOT SAVE THE VIEW STATE IN SESSION; save only the user data that is going to be shared between nodes (usually non-persistent or cached data). This may sound crazy because we are talking about server-centric (stateful) AJAX-intensive web applications: when GAE dispatches an AJAX request to a different node, no view state is going to be there (only user data saved in session)…

What if we could rebuild the view state in this new node?

This is the key to this apparently crazy proposal: if our AJAX event contains enough information to rebuild the view state on the server, we fully avoid the burden of serialization, transport between nodes and deserialization. The easiest approach is to load the client web page again in the new state that was expected when the AJAX event was sent. If we can rebuild the view state on the server from the state information in the client, our application is now stateless from the point of view of the view.
Hybrid applications can be real right now.
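To make the idea concrete, here is a minimal sketch of a view that is rebuilt purely from state information sent by the client. Everything in it is an assumption made for illustration: the class name, the "section=...&page=..." state format and the rendered markup are invented, and a real framework would rebuild a full DOM tree rather than a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical view builder: it keeps no per-user view state in fields or in the session.
class StatelessViewBuilder {
    // Parses a made-up state string such as "section=products&page=2".
    static Map<String, String> parseState(String state) {
        Map<String, String> params = new LinkedHashMap<>();
        if (state == null || state.isEmpty()) {
            return params;
        }
        for (String pair : state.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    // Rebuilds the view from the client-provided state only; invalid values fall back to defaults.
    static String render(String state) {
        Map<String, String> params = parseState(state);
        String section = params.getOrDefault("section", "home");
        if (!section.matches("[a-z]+")) {
            section = "home";
        }
        int page;
        try {
            page = Integer.parseInt(params.getOrDefault("page", "1"));
        } catch (NumberFormatException e) {
            page = 1;
        }
        return "<div id=\"content\">" + section + ", page " + page + "</div>";
    }

    public static void main(String[] args) {
        String stateFromClient = "section=products&page=2";
        String onNodeA = render(stateFromClient);
        String onNodeB = render(stateFromClient); // a different node gets the same input
        System.out.println(onNodeA);
        System.out.println("Same view on both nodes? " + onNodeA.equals(onNodeB));
    }
}
```

Because render depends only on its argument, any node can produce the same view without anything being serialized or transported between nodes.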

Using the ItsNat web framework, I have partially cloned as SPI the conventional page-based e-commerce web site of a very big Spanish retailer, following techniques and code similar to those shown in this tutorial. Spanish law says that the web sites of big companies must be fully accessible following the old WAI approach, that is, fully working with JavaScript disabled; as anyone can understand, working with JavaScript disabled and being SEO friendly are two very similar objectives. The tutorial, following the spirit of The Single Page Interface Manifesto, showed how an SPI web site can be page based at the same time; these ideas and technical background were applied to this new SPI challenge.

The resulting web site has:

  • Navigation with no reload.
  • Bookmarking of states the same as pages.
  • SEO compatible (try to disable JavaScript to understand how it is "seen" by web crawlers).
  • Back/Forward support (browser's history navigation in general) with NO reload.
  • Fully working with JavaScript disabled.
  • Layout exactly the same as the original site.
  • Remote view/control of other users using the web site (typical "free" bonus of ItsNat).

This link points to the demo running on a single node of a conventional Java hosting service (please read the information in the overview page, including the terms of use and what part has been cloned as SPI).

The next challenge: how can it run in GAE?

The capability of also working with JavaScript disabled is the key, because the URLs in the client identify the view state to be rendered.

ItsNat supports GAE and replicated sessions by calling ItsNatServletContext.setSessionReplicationCapable(boolean) with the parameter true; in this mode any received request causes an internal session attribute holding the view state to be set again. When session replication capable is set to false (the default), no view state is saved in sessions; in this case sessions are only used for security (client browser identification) and are virtually empty.

In spite of the built-in DOM caching techniques provided by ItsNat to save server memory for the static parts of the web site (these parts are kept on the server as plain markup shared between users), the view is usually complex enough to be a performance problem when serialized, transported and deserialized. So I decided not to enable session replication capable for the GAE version either…

When an AJAX event is sent to a different node, there is no view state in the session; in ItsNat terminology, the ItsNatDocument containing the DOM tree is not there (ItsNat uses an internal local unique number to recognize whether the client document matches the document on the server). ItsNat offers a chance to recognize and process this orphan AJAX event: if a global EventListener was registered by the developer, it receives the event, and ItsNatEvent.getItsNatDocument() returns null. When our global EventListener detects this orphan event, it changes the URL state to the new expected state (provided in the event as a parameter) and reloads the page in that state.
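The decision taken by such a listener can be sketched without the framework. The code below deliberately does not use the ItsNat API: ToyAjaxEvent, its fields, the "PROCESS"/"RELOAD" result strings and the "state" URL parameter are all hypothetical stand-ins, chosen only to show the orphan check and the reload target.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Toy stand-in for an AJAX event (NOT the ItsNat API).
final class ToyAjaxEvent {
    final Object document;      // null when this node holds no view state for the client page
    final String expectedState; // state the client expected to reach, sent with the event

    ToyAjaxEvent(Object document, String expectedState) {
        this.document = document;
        this.expectedState = expectedState;
    }
}

class OrphanEventSketch {
    // Returns "PROCESS" for a normal event, or a reload instruction for an orphan one.
    static String handle(ToyAjaxEvent event, String pageUrl) {
        if (event.document != null) {
            return "PROCESS"; // the view state is on this node: handle the event as SPI
        }
        // Orphan event: no view state here, so ask the client to reload in the expected state.
        // URLEncoder.encode(String, Charset) requires Java 10 or later.
        String encoded = URLEncoder.encode(event.expectedState, StandardCharsets.UTF_8);
        return "RELOAD " + pageUrl + "?state=" + encoded;
    }

    public static void main(String[] args) {
        ToyAjaxEvent normal = new ToyAjaxEvent(new Object(), "product detail");
        ToyAjaxEvent orphan = new ToyAjaxEvent(null, "product detail");
        System.out.println(handle(normal, "/shop"));
        System.out.println(handle(orphan, "/shop"));
    }
}
```

The reload URL carries the expected state, so the page that comes back is the one the user was asking for, only built from scratch on the new node.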


In summary, our web site hosted in GAE will work as SPI if the user sends frequent requests. When no request is sent for a few minutes, the next request is usually dispatched by GAE to another node; this node detects the AJAX request as orphan and reloads the page in the new expected state, and upcoming requests are again handled as SPI. Furthermore, if JavaScript is disabled, the application will work as page based.

Link to GAE version

Notes:

  1. This application is running on the free level of GAE, and the quality of service of the free level is not the best. For instance, when a new node is requested, launching the web application takes some time; if the application is already launched on that node (the JVM is warm or other users are using the application), this time should be lower.
  2. Because the web site is only partially cloned, many links point to the original web site, and an alert tells you that you are leaving the page or reloading. I recommend navigating the non-GAE version first to recognize the cloned SPI part. In the GAE version, go to the SPI part and wait two or three minutes; GAE will then dispatch the next request to a new node and the page will reload (this is the expected behavior described in this article). An alert of “leaving the page” is shown (in this case to reload); of course this alert is not needed in a normal web application.
  3. The current ItsNat version (0.7.0.6) has a minor issue with session instances in GAE (fixed in the upcoming version): GAE may change the session instance between requests, so the HttpSession object must be obtained from HttpServletRequest.getSession() on each request.
  4. Remote Control does not work correctly in GAE.

 

06 Jun 2010