Performance and scalability: Architectural advice requested
We are a couple of smart and educated but j2ee-inexperienced developers, and we've found ourselves handling a large project of our own design without much external assistance. We have a general architectural question, and we're hoping that some kind j2ee experts out there could at least point us in some direction, or tell us about resources (e.g. books) we could consult. We have an application that we hope will ultimately support thousands of persistent connections from clients. The connections use the adobe flash protocol RTMP, which is built on top of tcp/ip. We're using bits and pieces from the Red5 project (http://osflash.org/red5) to handle these connections, which in turn uses apache MINA (http://directory.apache.org/subprojects/mina/index.html). On the server we have hundreds of stateful objects which, for for the sake of discussion, we'll call "Programs". Each Program will be associated with hundreds of clients (connected through the Red5/MINA setup just described) at any given point in time. Each Program receives a type of RPC from its clients where the RPCs are encoded in Actionscript Message Format. They also issue events, where a server-side observer translates the events to RPC calls back to the clients. We are particularly worried about scalability and some other architectural issues. For example, how do we ensure that our application, given this architecture, can support enormous numbers of clients? Meaning, how can we run such an application on a server cluster? I mentioned that we're j2ee-inexperienced but, more importantly, we're very imaginitive and aggressive when it comes to implementing our ideas. If anyone could please direct us to any resources that might help us, we would be very grateful. Alternatively (or, in addition, rather) we would be interested in hiring any J2EE experts on a consulting basis to give us advice and direction. We're located in Chicago. Please contact me (aduston at uchicago dot edu) if you're interested.
- Posted by: adam duston
- Posted on: September 08 2006 00:21 EDT
Hi Adam, When you talk about scaling applications, there are two basic ways - Vertical Scaling (using bigger machines) or horizontal scaling ( more machines). Historically, vertical scaling has not been extremely successful as a single JVM instance is usually not able to utilize a large number of CPUs. Hence you are looking at horizontal scaling. The most common application style is to break the application into two physical tiers, 1st Tier - Connection Tier - Manages Connections to clients 2nd Tier - Business Logic Tier - does Actual Processing Lets look at both in detail. In the first tier, all you are worried about is having a stateful communication with the clients. So it takes care of client connectivity and state management. It doesnot do any processing, but passes thru the requests to a stateful "busines logic Tier". If you scale connection tier, i.e. have multiple instances of it, your servers will be listening to multiple ip address and port combinations. So the problem is how do you publish a single URL to the clients, and still distribute requests between these different servers. There are multiple ways of doing this - from round-robin DNS servers to OS clusters to Hardware or Software Load balancers. In my experience, Hardware or Software load balancers are usually preferred. These load balancers give a single Virtual IP and port, which in turn distributes connection to different servers on the back end. These load balancers support "sticky" settings, which returns the client back to the same server all the time, hence if the connection layer is stateful, the state wont be lost. If you can make the connection layer stateless, or store state in Database ( probably less scalable), you dont need to worry about stickyness. Now problem here is that although you can ensure that each server get an equal or proportional number of clients, but since the load imposed on the system by each client is different, the servers could still be unevenly loaded and while one server might be struggling at 100% utilization, other might be idling. Hence, you should have another physical Busiess Logic layer ( or App Tier) and ideally this should be stateless. Another load balancer ( or your custom code) between Connection Tier and App Tier can ensure that the requests are distributed evenly between app tier machines. These are generic architecture concepts and no J2EE features are reuired for this. Now, If the App tier is stateful - and state is required only for a user session, you could have stickie settings between connection and app tier as well ( Or you could put both on the same machine). Now, if your statefullness is across clients as you seem to indicate, you will need to consider some way of distributing different kind of objects on different machiens and either use seperate connection for those objects or use a JMS to distribute these messages to appropriate servers. Hence you put request for different objects on different queues and the server having that object will listen to that queue. Any clarifications - I will be glad to answer. Pranshu http://prashujain.wordpress.com
Dear Pranshu, It is very kind of you to reply to us. You are awesome and you've given us some very valuable advice. I should also mention that you're correct about these not being j2ee-specific concerns; I misspoke. Please replace the term j2ee" in my original post with the term "huge, mega-multiuser, distributed application". So, we thank you once again for sharing your distributed application experience and expertise with us. There is one thing that is not entirely clear from your response. Or, perhaps more accurately, there is a question that I failed to ask. Our Program objects each last for about twenty or thirty minutes. During this time they are associated with hundreds of different clients. During this twenty or thirty minutes they need to send regular messages to the clients every two minutes, and they need to send other messages even more frequently than that since we need to notify clients of new clients joining or departing from the Program. We are wondering about statefulness. The alternative to keeping our Program objects in main memory during their 20-30 minute lifespan would be, of course, to persist them to the database in between significant events. The problem with doing this is that each Program object would have to be "re-awoken" from the database every minute or two, and during each "re-awakening" it would have to be re-associated with its set of associated connections. I am speaking from a point of very little experience and partial blindness, but this seemed untenable in our design. It seemed more reasonable to keep the Program objects in main memory during their lifespans. Would you please comment on this persistence topic in particular? I should mention that each Program object does not have a direct reference to objects representing connections; we use at least a couple levels of indirection. In other words, each Program has an an associated set of abstractions called "ProgramClients" and each ProgramClient knows how to talk to the connection layer, or how to talk to something else which talks to the connection layer; you get the idea. I believe that you are correct: we will have to figure out how to distribute different Program objects on different machines. Thus far I have been following Martin Fowler's First Law of Distributed Object Design: "Don't distribute your objects". But I'm afraid that the requirements of our project might demand distributing objects. You mention in your final paragraph "...use separate connection for those objects...". This would seem to involve reshuffling connections at different times: unsticking and the re-sticking them, so to speak. This way, we could choose some nominal scheme for distributing our Programs on different business logic tier machines, and then shuffle the connections around. Is this mode of operation inadvisable for any reason? Is it better to keep the connections stuck to the same connection tier machines and then manipulate message queues to make the correct mappings between connection and Program object? One final bit of information: we are currently building this application to run on one machine. I have been particularly worried because of scalability. But it is beginning to sound as if the concerns of application logic and scalability are somewhat orthogonal. Am I right? Should we treat them as as orthogonal concerns? We really, truly thank you for your attention. We have been wandering around blindly in a desert. Please let us know if there are any books you can recommend also, since we can easily exhaust the generosity of experts such as yourself. Once again, we would like to hire a consultant such as yourself for a few hours per week. We are not rich (at least not yet! just wait until we finish building our application! ha ha) but we might be able to meet your hourly rate, if that interests you. Thanks again, Adam
Hi Adam, I too believe that keeping the object in main memory will be better than serializing and deserializing the object all the time, and hence should be avoided if possible. I am struggling to understand this without the aid of a diagram, so excuse me while I over-simplify things. 1) The number of "objects" is small enough for all of these to be kept in memory. 2) There is a "publish subscribe" relationship between the object and the clients, whatever the message server sends, all clients recieve. 3) Clients have some way to connect to an existing object , so they get to see a list of existing agents, choose to get which one to connect to and connect. 4) Do the clients also send messages to object ? Possibly not. If the data flow is one-way, between object to client, this becomes a classic Publish subscribe model for a JMS. There will be one topic per object, the client choses which topic it wants and subscribes to it. Now the question is about the life-cycle of the object. This depends on who instanciates it and what terminates it. One way could be to have these objects created in a pool ( the pool preventing these from being garbage collected)and terminated on an event -like timer. Once you use a JMS, you dont need to worry about client connected to the same object. Now the only remaining problem is how to distribute these objects across machines. Whatever is the method for creating these objects Needs to instanciate these in different machines. The only way is to make the object creation method/service as stateless. That way this service, which is load balanced, can be invoked from the connection layer. The load balancing can be done by a hardware load balancer or a JMS queue . So re-stating the architecture: Client ==> Hardware Load Balancer (sticky)==> Connection server farm (stateful)==> JMS ==> App server farm (stateless services) Before you walk this path, do consider the cost of a cluster-able JMS server. There definitely are multiple ways to address this problem, but in my opinion, this approach is simpler. Pranshu