Hi everyone. I am seeking help thinking through a performance comparison between Stateless session beans and Servlets, with respect to scalability, clustering, etc.
There seems to be a resounding idea in the industry that EJB is for large-scale systems (larger than servlets can handle). For the sake of discussion, let's not make that assumption; instead, let's try to prove it.
First, I will make the statement that stateless session beans are essentially equivalent to Servlets marked as SingleThreadModel, purely from a performance and scalability perspective. Thus, both are pooled, both execute on behalf of only one thread at a time.
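To make the pooling claim concrete, here is a minimal sketch of what "pooled, one thread at a time" amounts to, in plain Java. It is not taken from any real servlet engine or EJB container; `InstancePool`, `invoke` and `idleCount` are hypothetical names chosen for illustration. A caller borrows an instance, uses it exclusively, and hands it back, which is roughly the contract both a SingleThreadModel servlet pool and an SLSB pool present.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical container-style pool: each instance serves one caller at a time. */
final class InstancePool<T> {
    private final BlockingQueue<T> idle;

    InstancePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Borrows an instance, runs the call on it, and always returns it to the pool. */
    <R> R invoke(Function<T, R> call) {
        T instance;
        try {
            instance = idle.take(); // blocks while every instance is busy
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an instance", e);
        }
        try {
            return call.apply(instance);
        } finally {
            idle.add(instance); // a slot is always free for a borrowed instance
        }
    }

    /** Number of instances currently not serving a caller. */
    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        InstancePool<StringBuilder> pool = new InstancePool<>(2, StringBuilder::new);
        Integer length = pool.invoke(worker -> worker.append("request").length());
        System.out.println("length=" + length + ", idle=" + pool.idleCount());
    }
}
```

Growing or shrinking such a pool is just a matter of adding or removing idle instances, which is why neither model has an obvious edge here.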
Again, from the performance perspective then, what makes Stateless Session Beans more scalable than Servlets?
- Clustering? No, you can cluster a servlet engine too.
- Activation/passivation? No, because there is nothing to passivate in a stateless session bean. The pool of SSBs can be dynamically grown and shrunk, but then so can a pool of servlets.
So again, from the performance/scalability perspective, what makes EJB better than servlets for building large scale apps?
I think in your post you somehow missed the point of the comparison: the comparison is between EJB and Servlets, not stateless session beans and Servlets.
Stateless session beans, as a part of the EJB architecture, can be used to build more scalable applications. Among the major benefits are:
- strong transaction processing engines, optimized local transactions, etc.
- optimized resource management: pooling, caching, etc.
- component architecture: EJB is a component architecture, and as such can generally produce stronger decoupling, leading to better scalability. You can use component models such as JavaBeans in Servlets, but they themselves do not support scalable distribution of components like EJB.
Have I missed any key advantages (regarding scalability)?
...stronger decoupling, leading to better scalability
Care to elaborate how a stronger "decoupling" leads to a better scalability?
To me, decoupling means fewer headaches in code maintenance, and makes the application easier to extend and enhance when new business requirements are added. However, it generally says nothing about how the actual system would process and respond to requests under increased load. The only case where "decoupling" may lead to perceived system performance retention under high load is the use of MOM.
I agree with the rest of your statements, which explain how EJB, of which SLSB is an integral part, is a component model and as such separates logical tiers and facilitates code reuse.
In my view, the difference between good performance and good scalability, is that a scalable design allows you to address bottle-necks locally rather than globally.
To take an extreme example, consider a single-tiered, non distributed web application. You can say it is "scalable" because you can add more and more web-servers in order to handle high load. But this kind of scalability is not likely to be cost efficient: you need to duplicate everything, while usually certain parts of the system will bottleneck much faster than others. A scalable design allows you to scale these parts without duplicating the entire system.
Perhaps the most common example of this concept is the multi-tiered architecture: an application is designed as a set of decoupled tiers, each of which can be scaled to a different degree. The same idea holds in inter-component design. When components are decoupled you can deploy different components on different hardware, allowing you to reduce a bottleneck caused by a specific component without duplicating the entire component tier.
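A rough back-of-the-envelope sketch of this argument, in Java. All numbers and names (`ScalingEstimate`, the "web" and "search" components, their capacities) are invented for illustration, and it assumes each component instance occupies a fixed slice of hardware with a fixed request capacity. It compares scaling each component on its own against duplicating the whole system until the slowest component keeps up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical capacity estimate; the figures are illustrative, not measured. */
final class ScalingEstimate {

    /** Instances needed for one component: ceil(load / capacityPerInstance). */
    static int instancesFor(int load, int capacityPerInstance) {
        return (load + capacityPerInstance - 1) / capacityPerInstance;
    }

    /** Decoupled design: each component is scaled only as far as it needs. */
    static int scaledPerComponent(int load, Map<String, Integer> capacities) {
        int total = 0;
        for (int capacity : capacities.values()) {
            total += instancesFor(load, capacity);
        }
        return total;
    }

    /** Duplicate everything: every replica carries one instance of every component. */
    static int duplicatedWhole(int load, Map<String, Integer> capacities) {
        int replicas = 0;
        for (int capacity : capacities.values()) {
            replicas = Math.max(replicas, instancesFor(load, capacity));
        }
        return replicas * capacities.size();
    }

    public static void main(String[] args) {
        Map<String, Integer> capacities = new LinkedHashMap<>();
        capacities.put("web", 500);    // assumed requests/second per instance
        capacities.put("search", 100); // assumed: the bottleneck component
        int load = 1000;               // assumed target requests/second
        System.out.println("per component: " + scaledPerComponent(load, capacities));
        System.out.println("duplicate all: " + duplicatedWhole(load, capacities));
    }
}
```

With these assumed figures the decoupled layout needs 2 + 10 = 12 component instances, while duplicating the whole system needs 10 replicas of both components, 20 instances in total. The gap grows as the components' capacities diverge, and disappears if they bottleneck at the same rate.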
Is this one of the most hotly contested topics from the "EJB for J2EE Architects" course? What is the standard answer from The Middleware Company? Any other interesting discussion from the course that you want to share with the J2EE community?
Stateless Session Beans(SLSB) support a distributed architecture better than servlets which is important when looking at scalability. For example you can deploy a group of SLSBs that provide certain functionality on one server and another group on another server. Using JNDI to lookup the beans means they can be easily located. Although the servlet architecture supports clustering it is not as flexible as using JNDI/SLSBs.
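As a hedged illustration of the JNDI point, the sketch below uses only the standard `javax.naming` API that ships with the JDK. The `ServiceLocator` class, the `lookupOrDefault` helper and the name `ejb/SearchService` are hypothetical; in a real deployment the container's naming provider would be configured and the caller would go on to narrow the result to the bean's home interface. The client code stays the same wherever the bean is actually deployed, which is the flexibility being described.

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;

/** Hypothetical helper: resolves a component by name instead of by location. */
final class ServiceLocator {

    /**
     * Looks the name up through JNDI. If no naming provider is configured,
     * or the name is not bound, the supplied fallback is returned instead.
     */
    static Object lookupOrDefault(String jndiName, Object fallback) {
        try {
            return new InitialContext().lookup(jndiName);
        } catch (NamingException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // "ejb/SearchService" is an assumed name, not one from any real server.
        Object service = lookupOrDefault("ejb/SearchService", "no provider configured");
        System.out.println(service);
    }
}
```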
In a word, nothing. Nothing makes SLSBs more scalable than servlets. However, your site will perform better if you can separate the code that is CPU intensive from that which isn't.
Example: Say you have a site which has 5 pages. 3 pages are just your plain old "about us", "contact us" etc. pages, while the other 2 actually do something like producing a chart, running a search etc. Users of the simple pages of a well-architected site shouldn't notice a hit in performance when other users are thrashing the non-simple pages. That is, you should still expect to see the contact details quickly even though someone is searching through the Encyclopaedia Britannica, requesting a count of the number of times the word "the" appears.
To ensure that performance is maintained you are going to need to physically separate the tough stuff from the simple stuff. Clustering servlet engines wouldn't work as performance on individual servers can still be slow. This is because there is nothing stopping a huge search running on the same CPU as 5 threads that just want to load a simple page. Those 5 threads will be murdered by the tough job.
So, this is where SLSB's come into their own. By extracting out the CPU-intensive code to a remote server and leaving the web server to just serve up the HTML, you will have an architecture that will ensure that overall performance of a site is not impaired when complex jobs are run.
Now, you could just have a separate servlet engine that runs the complex stuff and is proxied by the main web-server. This could then cluster and scale just like an EJB tier. However, servlet engines and the servlet framework are not designed to execute business logic. They don't have the built-in transactions, pools, security etc. that come with an EJB app server. You could build all this yourself, but why bother? It is much more economical to just buy it from BEA, IBM etc. Another benefit of separating your business logic out into SLSBs is reuse of code. Once you reduce the need for state you will increase the amount of code reuse, as SLSB methods can be called from a multitude of clients, not just web browsers.
Good question. More of these discussions are welcome.
Thanks for your responses, my comments:
I think the comparison is with SLSB's, since Servlets cannot be compared with Entity beans or SFSB's. In reference to some of your points:
Not sure if it's fair to say that EJB provides optimized resource management, since a good servlet engine should do that as well. Also, the transaction processing engine can only be as fast as the transactional stores it works with (typically your JDBC driver), which is also available to a servlet engine.
Your point on a component architecture is interesting, and seems to be a consensus among other posts. I'll respond below...
I believe this is discussed in the architects course, but I am not sure what the answer is in that class. I actually work on TheServerSide full time, and haven't attended too many training classes. :) The best way to find out more is to take a class, hehehe.
Darren, and Adell,
You guys seem in agreement with Gal on the benefits of being able to move around java beans as components to different servers.
This is a great hypothesis, thanks for this. It conflicts with my previous belief that co-location benefits of putting your EJB-engine in the same VM as the servlet engine outweigh the other alternatives.
It seems to me that the performance impact of separating the EJB-Layer on a separate machine from the Servlet Layer is very high.
Isn't it better then to simply scale horizontally (duplicating everything as you say) than to scale vertically? It would seem that the NET performance benefits of co-location are better.
Again, the purpose of my starting this discussion is to be able to refine the argument for and against the notion that servlets are equivalent to stateless session beans, from a performance perspective.
I agree with Floyd 100% WRT the supposed merits of physically separating the web tier from the EJB tier. It sounds good on paper but is not practical at all. With the EAR deployment format and the Ant build tool, how do you separate and deploy these tiers onto separate machines in an easy way? Is it worthwhile with today's low hardware cost? I would like to see some real cases to be convinced.
>> It seems to me that the performance impact of
>> separating the EJB-Layer on a separate machine from the
>> Servlet Layer is very high.
I have to agree here. My experience has been that the cost of making RMI calls is inordinately high. It works great for "toy" problems but it's a real pain for real-world stuff. I've run into a few scenarios where the benefit of a dedicated compute server overrode the RMI costs, but not many.
>> Is it worthwhile with today's low hardware cost?
There are some huge web sites out there deployed on 6-8 Sun E-10000 machines. I think that you underestimate the amount of money you can save on a site like that with just a little tuning.
EJB will never win a performance or efficiency argument. That argument is just too subjective. People who look at the architecture will find all of the inefficiencies of EJB -- EB caching is worthless, RMI is slow, etc... People who look at overall performance measurement will find wildly conflicting measurements. In some cases they are nearly equivalent while in others servlets are several orders of magnitude more efficient. What you won't find is very many comparisons where EJB outperforms servlets.
You can't make a case for EJB's without talking about concurrency of the development process and the value of autonomous components. What's the point?
"I think the comparison is with SLSB's, since Servlets cannot be compared with Entity beans or SFSB's"
I did not say that you should compare Servlets to entity beans or SFSBs either. I said you should compare them, if anything, to EJB as a single unit. You can't break down EJB into components and compare them to Servlets. It's like trying to compare a hammer (Servlets) with my arm: my arm can't do anything without the rest of my body (well, at least some of it). Such a comparison has no validity in abstract terms, and is also completely irrelevant in practical terms (when do you have an SLSB without the rest of EJB?).
"Not sure if it's fair to say that EJB provides optimized resource management, since a good servlet engine should do that as well"
I'm not trying to be fair. EJB provides this functionality, in a standard manner, by integrating with the connector architecture. Servlet engines "could" do it. Servlet engines "could" provide all the functionality you would ever want. If you're going to compare the standards, you should only compare what's readily available using nothing but the standard: and the Servlet standard does not provide a way to get a controlled resource.
I don't think Darren's point about component architecture is the same as mine. I disagree with Darren about the performance implications: I don't think remote components *should* be used to increase performance, and in practice I don't think they will. The remoteness overhead is too big. I also don't like Darren's example: I've seen many pure-servlet deployments where Apache was used to serve static content and another server (say JRun) was used to run servlets. I also think that in high-end servers the CPU overhead of serving static data is very small anyway, since the strong buses allow most of the data transfer to go directly between the network card and the memory. I also don't like Darren's definition of scalability, which seems rather arbitrary to me.
"Isn't it better then to simply scale horizontally (duplicating everything as you say) than to scale vertically? It would seem that the NET performance benefits of co-location are better"
Definitely, many times it is. These applications do not require a very scalable design. They do not require multiple tiers. If you can put everything (except for the DB, which is a different case altogether) in a box and duplicate that box to handle all your load, and still remain cost efficient, you don't need a design as scalable as what EJB has to offer. This doesn't change the fact that EJB, as a component architecture, does lead to better scalability, redundant as it may be. It is one more benefit that EJB brings. If you don't need that, and you don't need the rest of what EJB has to offer, don't use EJB. If you need it, or you need other things that EJB has to offer, you can sleep well at night knowing your application can scale over several different axes.
I think Floyd's comparison is still valid. To rework your metaphor, EJB is a toolbox and SSB is one tool in that box. You can compare a hammer (the servlet) to that tool pretty easily. You can use SSB without using entity beans or SFSBs and a lot of times people do. Obviously SSB won't work without the infrastructure of "EJB" as a whole, but by the same token a servlet class won't work without the servlet engine's framework. Floyd's comparison was between an object in one container framework versus an object in another container framework.
You are right, I did not mean to compare Servlets *without* their framework to EJBs *with* their framework. Servlets without their framework are just useless classes: they don't have anybody to direct calls to them, manage their sessions, etc. On the same note, SLSBs do not exist in a vacuum. Without their container, which provides transactional infrastructure, resource management, etc., they are useless.
You'll note that in my post I did not specifically address anything other than SLSBs and the container infrastructure. However, I still have to say that while this comparison is possible, I don't know how relevant it is. EJB comes as a complete pack. I find it hard to imagine any situation where you would be confined to using nothing but SLSBs. Therefore I think that the really interesting question is "why are EJB-based applications more scalable than pure-servlet based applications?".
Thanks for being so active on this thread. I agree with your point on resource management: you are describing a development feature provided by EJB (standardized access to resources via JCA), which is a very important feature.
Here is my reason for beginning this thread in the first place. We are all familiar with the age-old 'should we use EJB' argument. The reasons for using EJB tend to boil down to three categories (feel free to modify):
1) Performance and Scalability.
- pooling, load balancing, clustering
2) Development Model and Framework Features.
- built in transactions, security, JCA support
- other time saving development model goodies
- distribution (multiple client support)
3) Proprietary Value Adds.
- Management and monitoring facilities
- QoS guarantees
In starting this thread, I am trying to specifically determine whether item #1 is a relevant point of comparison. In doing so, it will allow us to have more intelligent conversations about EJB and let project managers better decide whether to use it or not to use it. If performance has nothing to do with it, everyone should know this, so that they can make better decisions.
My position on this from watching this thread is that performance/scalability should not be a factor in EJB's favour, since the performance-related properties of EJB are also present in servlet engines.
Gal, I would classify your points about Resource Management and Component Architecture as part of category #2 from above. Do you agree? If so, is it correct then to say that performance/scalability is not a valid reason for choosing to use EJB on a project?
ps - note that I am in no way putting down the development model features of EJB, I love those!
I understand why you started this thread, and I think it is indeed important to break some of the myths that have been going around about EJB.
In my opinion, anyone who picks up EJB in order to increase overall performance is probably headed for a big disappointment. EJB may have a few good performance points, but overall I think the performance hits of tier separation (remote calls, copying of arguments, container mediation) will outweigh them.
However, I do think EJB leads to a more scalable design, particularly because it is a distributed component architecture. IMO this has implications for (1), in addition to the obvious implications for (2). As I mentioned in my earlier post, I define scalability as the ability to scale a system in a cost-efficient way. EJB allows greater flexibility with regard to the physical topology of the deployment of components. Thus, EJB can be used to scale better by deploying certain components on stronger boxes, or on more boxes, than others. This is the key to scalability in any n-tier architecture. Therefore, I do not agree that category #1 is irrelevant: the performance part is, but the scalability part isn't.
I'm looking forward to hearing the community's opinion on this.
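To show what "copying of arguments" means in practice, here is a small sketch that imitates the pass-by-value step of a remote call using standard Java serialization. `RemoteCopyDemo` and `copyByValue` are hypothetical names; a real RMI or EJB stack does this marshalling internally and adds network and container costs on top, none of which are modelled here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo of the copy a remote call makes of each non-remote argument. */
final class RemoteCopyDemo {

    /** Deep-copies a value by serializing it to bytes and reading it back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyByValue(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(List.of("a", "b"));
        ArrayList<String> received = copyByValue(original);
        received.add("c"); // the callee's change never reaches the caller's list
        System.out.println("caller sees " + original + ", callee sees " + received);
    }
}
```

A co-located call would simply pass the reference, so the whole serialize and deserialize round trip above is extra work paid on every remote invocation, growing with the size of the argument graph.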
This discussion is bringing up some very interesting points.
I think grouping performance & scalability together in point #1 is not valid. They should be looked at separately. Scalability (I am referring to vertical scalability here) is definitely a reason for selecting an EJB architecture over servlets. Although increasing scalability can lead to decreased performance, the benefits of increased scalability can outweigh the benefits of the increased performance. I am not saying performance is not important, it's just that performance is relative and you can still have satisfactory performance in some systems with a properly distributed (distributed by functionality) architecture.
I tend to work more on very large enterprise systems, where a componentized architecture is very important and beneficial in terms of both development & deployment. In several instances duplicating functionality that needs to be shared, on several servers is not an option and not the most efficient (cost & logistics) solution.
I tend to agree with Gal's last comments: you cannot look at performance on its own when selecting an EJB solution, you have to look at the overall benefits.
In my opinion, for small/medium sized systems, vertical scalability may not be necessary, it may be just as easy to duplicate JavaBeans across servers in which case using servlets & java beans would be sufficient. But in very large systems where some sub-systems and/or shared components are a whole system on their own, it is neither practical nor cost efficient to use horizontal scaling. In some systems I have worked on, certain systems require different security levels and are placed behind different firewalls, putting other shared components on these servers is not practical. Also, duplicating components for large systems makes both deployment and maintenance much harder.
In summary, I still think that by selecting stateless-session beans you can get both good performance and other EJB benefits such as improved functional distribution and scalability over using a web-based (servlet/javabean) architecture.
I think for project managers we need to make a distinction between scalability and efficient use of hardware resources (memory, CPUs, machine footprint etc.).
From the business point of view scalability concerns are:
Is the application scalable (can it support X times the # of users in future, and how much will it cost)? Development, deployment and maintenance cost, time to develop/market, etc.
When we say one technology is more scalable than the other, the underlying assumption is that, in comparison, the scalable technology will require fewer hardware resources to support X # of users.
However, there is usually a significant development cost for making an application more scalable (multi-tier etc.).
I think the cost of scalability (software vs hardware) needs to be quantified for the project when picking the technology, instead of just picking it based on scalability alone ("efficient use of hardware resources").
To make things interesting the hardware and software cost are constantly changing...
With EJBs and additional services provided by containers the cost of developing multi-tier applications is coming down.
At the same time the cost of hardware is also coming down significantly with new innovations.
Unfortunately, in most organizations the software development cost/feature data is anecdotal, so it is hard to justify cost savings in development efficiency, whereas decreasing the hardware cost and footprint can be easily quantified by CIOs. Due to this, in my opinion, some organizations tend to optimize the efficient use of hardware resources, which in some cases leads to significant additional development cost and budget overruns to tweak the system. Don't get me wrong, there are systems where making an application more scalable can reduce overall cost significantly. When considering cost savings one also needs to consider the development effort and deployment complexity/cost.
You guys have finely elucidated the point of scalability vs. performance as related to EJB. Thanks for that. I understand there is a larger class of problems at work here for which a distributed object system like EJB is the only solution. Cool!
Now I was wondering, would you guys be interested in participating in a write-up of one or two projects you have worked on in which this type of clustering was used? I want to start writing 'real world project stories' on TheServerSide, and this sounds like a great story to begin with.