If you really mean that an application is placing 1MB in the HttpSession, I would suggest periodically saving part of that information to a database or file and cleaning up the memory; otherwise, I can hardly believe that any system like that would scale (500 concurrent users would imply 500MB just in the HttpSession, going back and forth between nodes... NO GOOD, no matter what replication mechanism you use, even if you use Cameron's :-) ). It would be, simply, a bad design.
Again, you're missing the point. The application server has to provide the API, but it's up to the application server vendor how smart the implementation is behind the API. If you kept all of the HTTP sessions in memory, then yes, you'd run out of memory if they were all 1MB+.
However, you do have choices, such as monitoring the number / sizes of sessions, and rolling them out to a DB or just to a local disk (which is much faster and less expensive and more scalable than a database.)
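To make the "roll them out to local disk" option concrete, here is a minimal sketch of the idea, not a description of how any particular app server does it. The class name SpillingSessionStore, its methods, and the byte budget are all hypothetical; sessions are assumed to be already serialized to byte arrays, and the least recently used ones are written to a local directory once the in-memory budget is exceeded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: keeps serialized sessions in memory up to a byte budget and spills the rest to local disk. */
final class SpillingSessionStore {
    private final long maxBytesInMemory;
    private final Path spillDir;
    // accessOrder = true, so iteration starts with the least recently used session.
    private final LinkedHashMap<String, byte[]> inMemory = new LinkedHashMap<>(16, 0.75f, true);
    private long bytesInMemory;

    SpillingSessionStore(long maxBytesInMemory, Path spillDir) {
        this.maxBytesInMemory = maxBytesInMemory;
        this.spillDir = spillDir;
    }

    /** Convenience factory for experiments: spills into a fresh temporary directory. */
    static SpillingSessionStore inTempDir(long maxBytesInMemory) {
        try {
            return new SpillingSessionStore(maxBytesInMemory, Files.createTempDirectory("sessions"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized void put(String sessionId, byte[] data) {
        try {
            Files.deleteIfExists(fileFor(sessionId)); // any copy on disk is now stale
            byte[] old = inMemory.put(sessionId, data);
            if (old != null) {
                bytesInMemory -= old.length;
            }
            bytesInMemory += data.length;
            // Spill least recently used sessions until we are back under budget.
            Iterator<Map.Entry<String, byte[]>> lru = inMemory.entrySet().iterator();
            while (bytesInMemory > maxBytesInMemory && lru.hasNext()) {
                Map.Entry<String, byte[]> eldest = lru.next();
                Files.write(fileFor(eldest.getKey()), eldest.getValue());
                bytesInMemory -= eldest.getValue().length;
                lru.remove();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the session bytes, reloading them from disk if they were spilled; null if unknown. */
    synchronized byte[] get(String sessionId) {
        byte[] data = inMemory.get(sessionId);
        if (data != null) {
            return data;
        }
        Path file = fileFor(sessionId);
        if (!Files.exists(file)) {
            return null;
        }
        try {
            data = Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        put(sessionId, data); // bring it back into memory; this may spill another session
        return data;
    }

    synchronized boolean isOnDisk(String sessionId) {
        return Files.exists(fileFor(sessionId));
    }

    synchronized long bytesInMemory() {
        return bytesInMemory;
    }

    private Path fileFor(String sessionId) {
        // Encode the id so that it is always a safe file name.
        String name = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(sessionId.getBytes(StandardCharsets.UTF_8));
        return spillDir.resolve(name + ".session");
    }
}
```

With a 2KB budget and three 1KB sessions, the oldest one ends up on disk and comes back transparently on the next get(). A real implementation would also need expiry, cleanup of the spill directory, and something smarter than a single lock, none of which is shown here.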
To say "it's a bad design" is a cop-out. If an app server can't handle even 1MB of user data, can you please tell me why someone would pay thousands of dollars per CPU for that app server? You can't just say "well, the customers are stupid to do it that way." If the customers (in this case, application developers) need to store 1MB in a session, then you need to find a way to support it, or you rightly won't sell much software.
Sorry for the rant, but I think that Oracle has enough resources to solve the problem, instead of blaming the customer. ;-)
Cameron-> I am sure that you are totally aware that even worse scalability issues can arise when using a synchronous model for session replication. Among others: worse performance for "normal" sessions and coupling between nodes... These need to be handled with great care, since they affect not only big sessions but any session being replicated.
I'm not sure what design you are assuming, but a latency-implicit operation (such as backing up data onto another node in a cluster) is certainly not a "scalability issue". In fact, it is quite similar to doing something with a JDBC connection (high latency but very low resource utilization on the client,) which is why web applications that use a database typically have more threads (since at any time, most will be in a blocking state waiting for the database.)
As far as the size of the session, that does not necessarily have to impact performance. If you think about it, it is the amount of data being changed during a request that affects the minimum theoretical cost of making sure that a session has at least one up-to-date backup in the cluster. (We do have customers with 1MB+ sessions, and we explicitly designed to support such situations, including the obvious abilities to roll out to disk and/or databases.)
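The point about changed data can be illustrated with a small sketch. This is only an illustration under my own assumptions, not how Coherence or any app server actually implements it: DeltaTrackingSession and its methods are hypothetical names, attribute values are assumed to be serialized byte arrays, and only attributes written through setAttribute() since the last backup are shipped to the backup node.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: a session that remembers which attributes changed since the last backup. */
final class DeltaTrackingSession {
    private final Map<String, byte[]> attributes = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    synchronized void setAttribute(String name, byte[] value) {
        attributes.put(name, value);
        dirty.add(name); // only explicit writes are tracked in this sketch
    }

    synchronized byte[] getAttribute(String name) {
        return attributes.get(name);
    }

    /** Returns only the attributes changed since the previous call, then resets the tracking. */
    synchronized Map<String, byte[]> drainDelta() {
        Map<String, byte[]> delta = new HashMap<>();
        for (String name : dirty) {
            delta.put(name, attributes.get(name));
        }
        dirty.clear();
        return delta;
    }

    /** Total payload bytes that would have to be sent to the backup node. */
    static long sizeOf(Map<String, byte[]> delta) {
        long total = 0;
        for (byte[] value : delta.values()) {
            total += value.length;
        }
        return total;
    }
}
```

A session holding a 1MB attribute costs 1MB to back up once; a later request that only touches a 200-byte attribute produces a 200-byte delta. The sketch ignores attribute removal and cannot see a value that is mutated in place without calling setAttribute().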
If an application really needs to store a lot of information in the session (which I think can be avoided most of the time) there are several better solutions that application servers provide (directly or through frameworks) and that you can take advantage of.
Again, I'm not sure why you would suggest recoding an application to use some proprietary framework when there is a standard API that you (the app server vendor) get to implement called javax.servlet.http.HttpSession. That is your personal invitation to provide "a better solution." ;-)
Saving data to a database (like I suggest in the article) will scale and guarantee no data loss much better than any in-memory mechanism.
No, saving session data to a database won't scale any better than the database scales, and the database is already typically the single-point-of-bottleneck, and it is almost always the most expensive point in the application infrastructure. Storing non-transactional non-persistent data into an expensive transactional + persistent data store seems like an ideal way to waste money and slow an application down.
However, you don't have to take my word for it. There are load-testing tools available to test the relative merits of different solutions. If you provide the 100-server cluster and the Oracle licenses, I'll donate the Coherence licenses to do a scalable performance test. ;-)
BTW - if you're located in the Bay Area, I'll be in the East Bay tomorrow (Tuesday) night at the East Bay BEA user group. I'd be glad to discuss the topic further.
Peace,
Cameron Purdy
Tangosol, Inc.
Coherence: Shared Memories for J2EE Clusters