April 2004
Introduction
|
John Goodson is a JDBC Expert Panel Member and VP of Research and Development at DataDirect Technologies.
Check out his Tech Talk on TheServerSide.com, where he discusses Database Access Technologies.
|
Developing performance-oriented JDBC applications is not easy. JDBC drivers
do not throw exceptions to tell you when your code is running too slow.
This series of performance tips presents some general guidelines for improving JDBC
application performance that have been compiled by examining the JDBC implementations
of numerous shipping JDBC applications. These guidelines include:
- Using Database MetaData methods appropriately
- Retrieve only required data
- Select functions that optimize performance
- Manage connections and updates
Following these general rules will help you solve some common JDBC system performance
problems.
Using Database Metadata Methods
Because database metadata methods that generate ResultSet objects are slow
compared to other JDBC methods, their frequent use can impair system performance.
The guidelines in this section will help you optimize system performance when
selecting and using database metadata.
Minimizing the Use of Database Metadata Methods
Compared to other JDBC methods, database metadata methods that generate ResultSet
objects are relatively slow. Applications should cache information returned
from result sets that generate database metadata methods so that multiple executions
are not needed.
Although almost no JDBC application can be written without database metadata
methods, you can improve system performance by minimizing their use. To return
all result column information mandated by the JDBC specification, a JDBC driver
may have to perform complex queries or multiple queries to return the necessary
result set for a single call to a database metadata method. These particular
elements of the SQL language are performance expensive.
Applications should cache information from database metadata methods. For example,
call getTypeInfo once in the application and cache away the elements of the
result set that your application depends on. It is unlikely that any application
uses all elements of the result set generated by a database metadata method,
so the cache of information should not be difficult to maintain.
Avoiding Search Patterns
Using null arguments or search patterns in database metadata methods results
in generating time-consuming queries. In addition, network traffic potentially
increases due to unwanted results. Always supply as many non-null arguments
to result sets that generate database metadata methods as possible.
Because database metadata methods are slow, applications should invoke them
as efficiently as possible. Many applications pass the fewest non-null arguments
necessary for the function to return success.
for example:
ResultSet WSrs = WSc.getTables (null, null, "WSTable", null);
should be:
ResultSet WSrs = WSc.getTables ("cat1", "johng", "WSTable",
"TABLE");
In the first getTables() call, the application probably wants to know if the
table WSTable exists. Of course, a JDBC driver takes the call literally and
interprets the request differently. A JDBC driver interprets the request as:
return all tables, views, system tables, synonyms, temporary tables, or aliases
that exist in any database schema inside any database catalog that are named
‘WSTable’.
The second call to getTables() more accurately reflects what the application
wants to know. A JDBC driver interprets this request as: return all tables that
exist in the ‘johng’ schema in the current catalog where the name
is ‘WSTable’.
Clearly, a JDBC driver can process the second request much more efficiently
than it can process the first request.
Sometimes, little information is known about the object for which you are requesting
information. Any information that the application can send the driver when calling
database metadata methods can result in improved performance and reliability.
Using a Dummy Query to Determine Table Characteristics
Avoid using getColumns() to determine characteristics about a table. Instead,
use a dummy query with getMetadata().
Consider an application that allows the user to choose the columns that will
be selected. Should the application use getColumns() to return information about
the columns to the user or instead prepare a dummy query and call getMetadata()?
Case 1: GetColumns Method
ResultSet WSrc = WSc.getColumns (... "UnknownTable" ...);
// This call to getColumns() will generate a query to
// the system catalogs... possibly a join
// which must be prepared, executed, and produce
// a result set
. . .
WSrc.next();
string Cname = getString(4);
. . .
// user must retrieve N rows from the server
// N = # result columns of UnknownTable
// result column information has now been obtained
Case 2: GetMetadata Method
// prepare dummy query
PreparedStatement WSps = WSc.prepareStatement
("SELECT * from UnknownTable WHERE 1 = 0");
// query is never executed on the server - only prepared
ResultSetMetaData WSsmd=WSps.getMetaData();
int numcols = WSrsmd.getColumnCount();
...
int ctype = WSrsmd.getColumnType(n)
...
// result column information has now been obtained
In both cases, a query is sent to the server. But in Case 1, the query must
be prepared and executed, the result description information must be formulated,
and a result set of rows must be sent to the client. In Case 2, a simple query
must be prepared and only result description information must be formulated.
Clearly, Case 2 is the better performing model.
To somewhat complicate this discussion, let us consider a DBMS server that
does not natively support preparing a SQL statement. The performance of Case
1 does not change, but the performance of Case 2 increases slightly because
the dummy query must be evaluated instead of only prepared. Because the Where
clause of the query always evaluates to FALSE, the query generates no result
rows and should execute without accessing table data. For this situation, method
2 still outperforms method 1.
In summary, always use result set metadata to retrieve table column information
such as column names, column data types, and column precision and scale. Only
use getColumns() when the requested information cannot be obtained from result
set metadata (i.e. table column default values).
Look out for next month's performance tip, on 'Retrieving Only Required Data'.
PRINTER FRIENDLY VERSION
|