When we say "the application/architecture should be scalable", what exactly do we mean? What is the benchmark for measuring the scalability of an application?
It refers to the ability to add CPUs or server boxes at any one of your tiers and see the application handle a proportionately larger load.
For instance, if you add a second CPU to a single-CPU box you are using as the front end, you should expect to handle roughly twice as much front-end traffic, provided the middle-tier and back-end boxes can handle the increased load. Typically you will only see a real jump when doubling CPUs, say from 2 to 4, or 4 to 8, and so forth.
Scalable design means that you can determine which tier is the bottleneck, add an identically configured server box to the overloaded tier, and have the hardware, software, and application automatically recognize it and distribute the load evenly across that tier. This gives you greater performance throughout the application.
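As a minimal sketch of "the load will be distributed evenly at that tier", here is a round-robin dispatcher over a list of identically configured boxes. The `RoundRobinBalancer` class and the server names are illustrative, not from any particular product; real load balancers also handle health checks and failover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin load distribution at one tier.
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinBalancer(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // Adding another identically configured box to the list is all that
    // is needed for it to start receiving its share of the requests.
    void addServer(String server) {
        servers.add(server);
    }

    // Each call hands the next request to the next server in turn,
    // so over time every box at this tier sees an equal share of load.
    String pickServer() {
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```

With two boxes, consecutive requests alternate between them; adding a third immediately cuts each box's share to a third.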
Clustering is one example of a technology that lets systems be grouped to appear as "one system", and thus become scalable. But to be truly scalable, the application itself must be able to maintain state across all tiers, or appear to be stateless.
If a J2EE server is truly scalable, all that is needed is the addition of extra processors. The difference between scalable and clusterable is big. The ability to scale linearly is what you want as a customer. Let's say you have 10,000 requests/hour and your system is running fine. If you expect the load on your site to increase to 20,000 requests/hour, wouldn't you want to know how many more CPUs you would have to buy?
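The back-of-the-envelope arithmetic behind that question can be sketched as follows, assuming perfectly linear scaling (each extra CPU adds the same throughput as the existing ones). The method name and the example figure of 4 current CPUs are made up for illustration:

```java
// Capacity estimate under the linear-scaling assumption from the text:
// double the requests/hour requires double the CPUs.
class CapacityEstimate {
    // Derive per-CPU throughput from what you measure today, then
    // round up to the whole number of CPUs needed for the target load.
    static int cpusNeeded(int targetRequestsPerHour,
                          int currentRequestsPerHour,
                          int currentCpus) {
        double requestsPerCpu = (double) currentRequestsPerHour / currentCpus;
        return (int) Math.ceil(targetRequestsPerHour / requestsPerCpu);
    }
}
```

For the 10,000 to 20,000 requests/hour case on a hypothetical 4-CPU system, this predicts 8 CPUs, i.e. 4 more to buy. In practice scaling is rarely perfectly linear, so such an estimate is a lower bound you would verify with load testing.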
Remember, if you need additional boxes, you will also need additional RAM, NICs, and so on. If the system is truly scalable, it would only require more CPUs.