Tips on Performance Testing and Optimization
By Floyd Marinescu, Senior Architect
The purpose of this document is to explain how to go about scalability testing,
performance testing, and optimization in a typical Java 2 Enterprise Edition
(J2EE) environment.
Definitions:
Response Time - The time between the initial request and the complete download
of the response (rendering of the entire web page).
Load - A measurement of the usage of the system. A server is said to experience
high load when the application it supports is being heavily trafficked.
Scalability - A scalable application has a response time that increases
linearly as load increases. Such an application can process more and more
volume by adding hardware resources in a linear (not exponential) fashion.
Automation testing tools - Tools (Silk from Segue Software, WebLoad, etc.) used
to simulate a user by requesting pages or going through a pre-programmed
workflow on your site.
Load testing tools - Most automation testing tools, such as WebLoad, can also be
used as load testing software. These tools simulate any number of users using
your site and provide you with important data such as average response times.
Profiler - A program that examines your application as it runs. It provides
useful run-time information such as time spent in particular code blocks,
memory / heap utilization, the number of instances of particular objects in
memory, etc.
A Process for Performance Testing
1) Functional Testing. Most applications begin testing by first completing
functional tests, that is, ensuring that all the use cases / workflows in your
application work.
2) Load and Scalability Testing. Load and scalability testing has two forms:
- Testing response time as you increase the size of your database
- Testing response time as you increase the number of concurrent users
3) Interpreting the results. After measuring response time at varied database
sizes and loads, you can interpret the average response times of these tests
and the resource utilization of the server during the tests.
4) Optimization. After identifying problems in the previous step, you track
down their causes and optimize accordingly.
Load and Scalability Testing
The purpose of load and scalability testing is to ensure that your application
will have a good response time during peak usage. You can also test how your
application will behave over time, as your website accumulates more and more
data in its database. To begin testing, write some testing scripts that populate
your database with an average amount of data. Run your performance tests and
measure your response time. Then populate your database with an extreme amount
of data (3 to 4 times more data than you can foresee having in 3 years) and run
your performance tests again. If response times are significantly larger for
the second test, then something is wrong.
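The comparison described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name, the method names, and the `fakePageRequest` stand-in are hypothetical, and a real test would issue requests against your own server once with the moderate data set loaded and once with the extreme one.

```java
final class ResponseTimeComparison {

    /** Runs the request repeatedly and returns the mean elapsed time in milliseconds. */
    static double averageMillis(Runnable request, int iterations) {
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            request.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / iterations;
    }

    /** How many times slower the large data set is than the baseline. */
    static double slowdown(double baselineMillis, double largeMillis) {
        return largeMillis / baselineMillis;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for one page request; a real test would issue
        // an HTTP request against the server under test.
        Runnable fakePageRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // First measurement: moderate data set loaded.
        double moderate = averageMillis(fakePageRequest, 10);
        // Second measurement: repeat after loading the extreme data set.
        double extreme = averageMillis(fakePageRequest, 10);
        System.out.printf("moderate: %.1f ms, extreme: %.1f ms, slowdown: %.2fx%n",
                moderate, extreme, slowdown(moderate, extreme));
    }
}
```

What counts as "significantly larger" is a judgment call for your project; the ratio simply makes the two runs easy to compare.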
To run your performance tests, you will want to simulate server usage at
different loads. As a rule of thumb, I simulate low load (1-5 concurrent
users), medium load (10-50 concurrent users), high load (100 concurrent users)
and extreme load (1000+ concurrent users). Note that these numbers are arbitrary
and depend on your business needs. Also, simulating 10 concurrent users with
load testing software isn't representative of 10 people, since each robot in the
load test may wait just milliseconds before hitting the server again. Thus,
using a load tester to simulate 10 users is probably more representative of the
web surfing patterns of 30-40 people.
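A dedicated load testing tool is the right choice for real tests, but the idea of simulated concurrent users can be sketched with plain Java threads. Everything named here (`LoadSimulator`, `averageResponseMillis`, the `thinkTimeMillis` parameter, the fake request) is an assumption made for illustration. The think time models the pause between requests: the shorter it is, the more real people each simulated user stands for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LoadSimulator {

    /**
     * Starts one thread per simulated user. Each user issues requestsPerUser
     * requests with a pause of thinkTimeMillis after each one. Returns the
     * mean response time in milliseconds across all requests.
     */
    static double averageResponseMillis(int users, int requestsPerUser,
                                        long thinkTimeMillis, Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Future<Long>> perUserNanos = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                Callable<Long> simulatedUser = () -> {
                    long nanos = 0;
                    for (int r = 0; r < requestsPerUser; r++) {
                        long start = System.nanoTime();
                        request.run();
                        nanos += System.nanoTime() - start;
                        Thread.sleep(thinkTimeMillis); // pause is not counted as response time
                    }
                    return nanos;
                };
                perUserNanos.add(pool.submit(simulatedUser));
            }
            long totalNanos = 0;
            for (Future<Long> f : perUserNanos) {
                totalNanos += f.get();
            }
            return totalNanos / 1_000_000.0 / ((long) users * requestsPerUser);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("load test interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("simulated user failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a page request against the server under test.
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(2);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int users : new int[] {1, 10, 50}) {
            System.out.printf("%d users: %.1f ms average%n",
                    users, averageResponseMillis(users, 3, 10, fakeRequest));
        }
    }
}
```

Running the same request at each load level gives one average per level, which is the data the rest of this process works from.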
Once you have tested at each load level, you can compare average response
times to see whether your system scales, that is, whether the response time
increases linearly.
Interpreting the results
The fun part of this process is interpreting the results of your load testing.
Let us examine some of the different possibilities:
- Response time increases too much when the database is overpopulated
Response time should not increase much if you move from a database with 100
rows in its tables to one with 50,000. Database indexing makes finding a row in
a table take a matter of milliseconds, even if there are hundreds of thousands
of rows. Thus, if your response time increases too much after moving from a
moderately populated database to an overpopulated one, then you probably
haven't indexed the appropriate columns yet.
- Response time increases exponentially as load increases
If your system becomes unusable as you increase concurrent users, then your
system is not scalable. Interpreting these results is difficult, as the problem
could be with hardware, deployment configuration, architecture, etc. Make sure
you watch the server resources during the tests:
- Watch memory requirements
- Watch CPU usage
If the CPU is overused, you need a faster processor or more processors. If the
CPU is underused, then the problem is probably input/output (I/O) related. Check
your database connections, your running thread count, and the network
configuration of your test boxes.
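Operating system tools are the usual way to watch memory and CPU, but a rough snapshot can also be taken from inside a JVM with the standard `java.lang.management` API. The sketch below is an assumption-laden illustration: `ResourceSnapshot` and `cpuLooksSaturated` are hypothetical names, the snapshot only describes the JVM and machine it runs on (so it would have to run inside the server process), and comparing load average to processor count is a crude heuristic rather than a rule from this paper.

```java
import java.lang.management.ManagementFactory;

final class ResourceSnapshot {
    final long usedHeapBytes;
    final long maxHeapBytes;
    final int processors;
    /** System load average for the last minute; negative if the platform does not report it. */
    final double systemLoadAverage;

    private ResourceSnapshot(long usedHeapBytes, long maxHeapBytes,
                             int processors, double systemLoadAverage) {
        this.usedHeapBytes = usedHeapBytes;
        this.maxHeapBytes = maxHeapBytes;
        this.processors = processors;
        this.systemLoadAverage = systemLoadAverage;
    }

    /** Captures heap usage and CPU load for the JVM this code runs in. */
    static ResourceSnapshot take() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        return new ResourceSnapshot(used, rt.maxMemory(), rt.availableProcessors(), load);
    }

    /** Rough heuristic (an assumption): more runnable work than processors. */
    boolean cpuLooksSaturated() {
        return systemLoadAverage > processors;
    }

    public static void main(String[] args) {
        ResourceSnapshot s = take();
        System.out.printf("heap: %d of %d bytes, load average: %.2f on %d processors%n",
                s.usedHeapBytes, s.maxHeapBytes, s.systemLoadAverage, s.processors);
        System.out.println("CPU looks saturated: " + s.cpuLooksSaturated());
    }
}
```

Sampling this periodically during a load test gives a simple record to set beside the response times: slow responses with a low load average point toward I/O rather than CPU.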
If, after checking your configuration, verifying that the slowdown is not a
hardware bottleneck, and looking over your architecture for code to optimize,
you still have not found the cause, it's time to run a code profiler.
Optimization
The database, your architecture, your configuration and your hardware will all
need to be optimized. As mentioned in the previous section, the easiest way to
fail to scale is to have a database that isn't tuned. A database administrator
(DBA) is always a vital person to have on any dev team, but if you don't have
one, here is what you can do:
Look through your EJBs and verify that your database isn't doing linear searches
for any of the SQL queries that you have coded. To do this, copy the SQL from
your code and, in your database's SQL window, run it as an EXPLAIN statement:
EXPLAIN SELECT * FROM table WHERE tablefield = somevalue
Although the EXPLAIN syntax differs from database to database, there is always
something similar. After running this statement, your DB will tell you whether
it is searching an index or doing a linear search. Verify that every piece of
SQL in your application is using your DB's indexes, and if not, create the
indexes.
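If you have many queries to check, the EXPLAIN step can be scripted over JDBC. The following is a minimal sketch that assumes PostgreSQL, whose plan output labels a linear search as "Seq Scan"; other databases use different keywords and output formats. The class and method names are hypothetical, and the SQL is concatenated directly, so only pass in trusted statements copied from your own code.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class ExplainCheck {

    /** Runs EXPLAIN on the given SQL and returns the plan, one line per row. */
    static List<String> explain(Connection connection, String sql) throws SQLException {
        List<String> plan = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery("EXPLAIN " + sql)) {
            while (rows.next()) {
                plan.add(rows.getString(1)); // PostgreSQL returns the plan as a single text column
            }
        }
        return plan;
    }

    /** True if any plan line reports a sequential (linear) scan, in PostgreSQL's wording. */
    static boolean hasSequentialScan(List<String> planLines) {
        for (String line : planLines) {
            if (line.contains("Seq Scan")) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would obtain a `Connection` from its own driver or data source, pass each query through `explain`, and flag those for which `hasSequentialScan` returns true as candidates for a new index.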
After optimizing the database and your hardware configuration (as discussed in
the previous section), the next step is optimizing your code, and this is done
with a profiler.
A profiler is a program that analyzes your application as it runs. A profiler
provides you with information you could not otherwise get access to, such as:
- How many objects of each class are in memory, and garbage collection behaviour
  - This information can help you identify classes that should be pooled.
  - It can help you tune your Java heap.
- How much time your application is spending in particular classes
  This is the most important feature. Your profiler will point its finger and
  show you which classes are the bottlenecks.
One such program that really helped me is called Optimize-It. Optimize-It can be
used with any Java program or any Java-based application server. Configuration
with WebLogic is easy, and Optimize-It can be used to profile an application on
a remote server.
Optimizing your architecture is extremely project specific, but here are some
tips:
- Make sure you have minimized your network calls, especially database calls
- It is better to make one large database call than many small ones.
- Make sure ejbStore isn't storing anything for read-only operations.
- Use Details Objects to get entity bean state.
- Make sure to take advantage of caching where possible
  Your app server probably allows you to cache entity beans in memory. Make
  sure you take advantage of this, as it will dramatically reduce database
  calls and speed up data access.
- Make sure you are using session beans as a façade to your entity beans.
  You can encapsulate the workflow of one entire use case in one network call
  to one method on a session bean (and one transaction).
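The effect of the façade and Details Object tips can be illustrated without any EJB container. In the sketch below, plain Java classes stand in for an entity bean and a session bean, and a counter stands in for network round trips; all names (`FacadeSketch`, `MessageEntity`, `MessageFacade`, `MessageDetails`) are hypothetical and this is not how any particular application server implements remote calls.

```java
import java.io.Serializable;

final class FacadeSketch {

    /** Counts simulated remote (network) calls made by the client. */
    static int remoteCalls = 0;

    /** Details Object: a serializable snapshot of the entity's state. */
    static final class MessageDetails implements Serializable {
        final String subject;
        final String author;
        final String body;

        MessageDetails(String subject, String author, String body) {
            this.subject = subject;
            this.author = author;
            this.body = body;
        }
    }

    /** Stand-in for an entity bean whose getters are each a remote call. */
    static final class MessageEntity {
        private final String subject;
        private final String author;
        private final String body;

        MessageEntity(String subject, String author, String body) {
            this.subject = subject;
            this.author = author;
            this.body = body;
        }

        String getSubject() { remoteCalls++; return subject; }
        String getAuthor()  { remoteCalls++; return author; }
        String getBody()    { remoteCalls++; return body; }

        /** Called only on the server side by the facade, so not counted as remote. */
        MessageDetails snapshot() { return new MessageDetails(subject, author, body); }
    }

    /** Stand-in for a session bean: one coarse-grained method per use case. */
    static final class MessageFacade {
        private final MessageEntity entity;

        MessageFacade(MessageEntity entity) { this.entity = entity; }

        MessageDetails viewMessage() {
            remoteCalls++; // the single client-to-server call
            return entity.snapshot();
        }
    }

    public static void main(String[] args) {
        MessageEntity entity = new MessageEntity("Hello", "floyd", "First post");

        remoteCalls = 0;
        entity.getSubject();
        entity.getAuthor();
        entity.getBody();
        System.out.println("fine-grained getters: " + remoteCalls + " remote calls");

        remoteCalls = 0;
        MessageDetails details = new MessageFacade(entity).viewMessage();
        System.out.println("facade: " + remoteCalls + " remote call for " + details.subject);
    }
}
```

Reading three fields through the getters costs three simulated round trips, while the façade returns the same state in one, and the gap widens with every additional field or entity a page needs.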
Testing and Optimizing TheServerSide.com
TheServerSide.com experienced numerous scalability problems before it was
launched. Using the tips suggested in this paper, we fixed all the problems,
making TheServerSide one of the fastest Java-based portals out there.
The first step in testing TheServerSide was populating the database with test
data. After populating it first to a moderate amount and then to an extreme
amount (adding 16,000 messages and 40,000 users to our database), we found a
serious problem. The response time for our top-level pages jumped from 2 seconds
to 12 seconds with a single user.
Having not read this document, we made the most common of mistakes: we first
doubled the CPU speed and memory on our box. This only brought response time
down to 8 seconds, so hardware was not the cause of the bottleneck.
The problem we had indicated that something was wrong in the database. After
checking how our database handled our queries, we discovered that our primary
key columns (and others) were not being indexed properly. This meant that the
database had to do linear searches, even for ejbFindByPrimaryKey(), which is
the most common of calls. After making changes to the database and our primary
key strategy (our PostgreSQL database was buggy in handling 8-byte integer
indexing), we were able to index all the appropriate columns and bring the
response time down to 3 to 4 seconds.
Once we had optimized the database, we began running proper load testing. We
used WebLoad, a powerful load testing tool. The evaluation copy allows testing
with 12 concurrent users (probably representative of 30-40 real people). After
running the tests at the maximum (12 concurrent users), we found that our site
was extremely unscalable. The response time jumped from 3 to 4 seconds with a
single user to 15 to 20 seconds per page under heavier load. These numbers were
obviously not good enough.
Having optimized the database and upgraded the hardware, we now turned to
examine our architecture. Minor modifications were made, but we still couldn't
find the cause of the problem. Once we began using a profiler, that all changed.
After running Optimize-It remotely (I had a window on my local machine showing
me the stats of the server running at our ISP), I discovered the cause of our
problem: 30% of the CPU time was being spent in socket communications with our
database. Optimize-It allowed me to trace back and see which objects/methods
were initiating these calls, and I identified a design problem that was causing
us to query the database for a count every time we wanted to display a message
on TheServerSide.com. After fixing this silly problem, I brought the number of
database calls invoked to display a page down from 15+ to 1, and suddenly our
response time went down to about 1 second. This was exactly what we wanted.
Conclusion
Performance testing and optimizing your application can be pretty
challenging. Luckily, there are tools on the market that can make the process
easier. By using these tools and following the simple steps in this paper,
you should be able to effectively track down the bottlenecks in your system.
Special thanks go to Mark Turnbull,
the best DBA I know.
Floyd Marinescu is Senior Architect at The
Middleware Company (http://www.middleware-company.com),
which helps firms by training and consulting with them in XML, EJB and J2EE
technologies. Floyd designed and implemented TheServerSide.com, the new J2EE
community based on an EJB architecture, and enjoys participating with the
community and tracking the business aspects of the Middleware
industry.
Copyright 2000 The Middleware Company
Published exclusively on TheServerSide.com