Java Development News:
Retrieving Only Required Data
By John Goodson
01 May 2004 | TheServerSide.com
John Goodson is a JDBC Expert Panel Member and VP of Research and Development at DataDirect Technologies. Check out his Tech Talk on TheServerSide.com, where he discusses Database Access Technologies.
Developing performance-oriented JDBC applications is not easy. JDBC drivers do not throw exceptions to tell you when your code is running too slow.
This series of performance tips presents some general guidelines for improving JDBC application performance that have been compiled by examining the JDBC implementations of numerous shipping JDBC applications. These guidelines include:
- Using Database MetaData Methods Appropriately
- Retrieving only Required Data
- Selecting Functions that Optimize Performance
- Managing Connections and Updates
- Designing and Running Benchmarks for Performance
In the last performance tip, we discussed using database metadata appropriately. These general rules about retrieving data should help you solve some common JDBC system performance problems.
To retrieve data efficiently, return only the data that you need, and choose the most efficient method of doing so. The guidelines in this section will help you to optimize system performance when retrieving data with JDBC applications.
Retrieving Long Data
Unless it is necessary, applications should not request long data because retrieving long data across a network is slow and resource-intensive.
Most users don’t want to see long data. If the user does want to process these result items, then the application can query the database again, specifying only the long columns in the select list. This method allows the average user to retrieve the result set without having to pay a high performance penalty for network traffic.
Although the best method is to exclude long data from the select list, some applications do not formulate the select list before sending the query to the JDBC driver (that is, some applications send select * from <table name> ...). If the select list contains long data, most JDBC drivers must retrieve that data at fetch time, even if the application does not ask for the long data in the result set. When possible, the developer should attempt to implement a method that does not retrieve all columns of the table.
For example, consider the following JDBC code:
ResultSet rs = stmt.executeQuery ( "select * from Employees where SSID = '999-99-2222'"); rs.next(); string name = rs.getString (4);
Remember that a JDBC driver is not intuitive. It has no idea what result columns an application might be trying to retrieve when the query is executed. A driver only knows that an application can request any of the result columns. When the JDBC driver processes the rs.next() request, it will most likely return at least one (if not more) result rows across the network from the database server. In this case, a result row will contain all the column values for each row – including an employee picture if the Employees table happens to contain such a column. Limiting the select list to contain only the name column results in decreased network traffic and a faster performing query at runtime.
Additionally, although the getClob() and getBlob() methods allow the application to control how long data is retrieved in the application, the developer must realize that in many cases the JDBC driver emulates these methods due to the lack of true LOB locator support in the DBMS. In such cases, the driver must retrieve all of the long data across the network before exposing the getClob() and getBlob() methods.
Reducing the Size of Data Retrieved
Sometimes long data must be retrieved. When this is the case, remember that most users might not want to see 100 KB (or more) of text on the screen.
To reduce network traffic and improve performance, you can reduce the size of any data being retrieved to some manageable limit by calling setMaxRows(), setMaxFieldSize(), and the driver-specific SetFetchSize(). Another method of reducing the size of the data being retrieved is to decrease the column size. If the driver allows you to define the packet size, use the smallest packet size that will meet your needs.
Remember: be careful to return only the rows and columns you need. If you return five columns when you only need two columns, performance is decreased – especially if the unnecessary rows include long data.
Choosing the Right Data Type
Retrieving and sending certain data types can be expensive. When you design a schema, select the data type that can be processed most efficiently. For example, integer data is processed faster than floating-point data or decimal data. Floating-point data is defined according to internal database-specific formats, usually in a compressed format. The data must be decompressed and converted into a different format so that it can be processed by the database wire protocol.
Retrieving Result Sets
Most JDBC drivers cannot implement scrollable cursors because of limited support for scrollable cursors in the database system. Unless you are certain that the database supports using a scrollable result set, (for example, rs), do not call rs.last() and rs.getRow() methods to find out how many rows the result set has. For JDBC drivers that emulate scrollable cursors, calling rs.last() results in the driver retrieving all results across the network to reach the last row. Instead, you can either count the rows by iterating through the result set or get the number of rows by submitting a query with a COUNT column in the SELECT clause.
In general, do not write code that relies on the number of result rows from a query because drivers must fetch all rows in a result set to know how many rows the query will return.