Discussions

News: JDBC Tips: Managing Connections and Updates

  1. JDBC Tips: Managing Connections and Updates (14 messages)

    In this JDBC Performance Tip, Part 4 of the JDBC Peformance Series, John Goodson looks at how to manage connections and updates to improve system performance for your JDBC applications. He shows you how to manage transactions, commits in transactions, choose the right transaction model, use updateXXX methods and getBestRowIdentifier().

    Read Part IV - Managing Connections and Updates

    Threaded Messages (14)

  2. PreparedStatement caching[ Go to top ]

    Is the PreparedStatement cached at the connection level or is it at the datasource level? For example, in the following code:

    conn1 = ds.getConnection();
    ps1 = conn1.prepareStatement();
    ..use ps1
    ps1.close();
    conn2 = ds.getConnection(); // assume that this is not the same conn as conn1
    ps2 = conn2.prepareStatement();

    Is ps2 the same as ps1 (retrieved from PreparedStatement cache)? Or is it a brand new PreparedStatement since the connection is a different connection (one for which this query has not yet been cached)?

    Thanks
  3. the PreparedStatement cached at the connection level.

    So, ps2 is not the same as ps1 .
  4. Cached statements[ Go to top ]

    My understanding is database server caches the query plan when you use a prepared statement therefore it shouldnt matter what connection you get. Also, microsoft say that there is no difference between prepared and regular statements as sql server caches the query anyway...
  5. Cached statements[ Go to top ]

    2 years ago, while monitoring our application to finetune the database access, we found out that our rate of reparsing and planing PreparedStatements was about 70% in Oracle8i. That seemed to be because the PreparedStatement is cached at Session level in the database. Please, correct me if I'm wrong as this would mean that using PreparedStatements can WORSEN your performance in a tipical web-app.
  6. Cached statements[ Go to top ]

    unless I am mistaken, a database session corresponds to a JDBC connection (at least that's the case for Informix). In that case, for typical web applications where you just grab a connection from the pool, PreparedStatements (PS) are actually bad, since the rate of PS Cache reuse is probably low, unless you keep a small pool size so that the likelihood of getting the same connection is high. So, to summarize, the lesson is that if you are doing a batch job where there's just one connection from start to end and the same sql is going to be run many times, then use a PS. If you are gonna use a connection from a large pool, then do
      not
    use a PS since the likelihood of PS cache reuse is small.
  7. the rate of PS Cache reuse is probably low
    This is precisely what statement pooling solves. See section 11.6 of the JDBC 3.0 spec. In a nutshell, PreparedStatements are transparently pooled/shared across Connections.

    God bless,
    -Toby Reyelts
  8. JDBC 3.0 spec[ Go to top ]

    I just read section 11.6 and nowhere does it explicitly require the vendor to keep PreparedStatement pools across connections. In fact, there is text in 11.6.2 that leads one to believe that the PreparedStatements are associated with a connection, not across connections (the datasource):

    "A pool of statements is associated with a PooledConnection object, whose behaviour is determined by the properties of the ConnectionPoolDataSource object that produced it."

    I do not see how this is any different from current implementations. I wish the spec unambigously required that PreparedStatements had datasource cache scope instead of connection scope. I suppose this feature is database vendor dependant and hence cannot be standardized in the spec.
  9. JDBC 3.0 spec[ Go to top ]

    [...] In fact, there is text in 11.6.2 that leads one to believe that the PreparedStatements are associated with a connection, not across connections [...]
    Figure 11-2 also clearly indicates this.
  10. JDBC 3.0 spec[ Go to top ]

    I do not see how this is any different from current implementations.
    It looks like I was mistaken. My apologies.

    God bless,
    -Toby Reyelts
  11. Cached statements[ Go to top ]

    It depends on parameter values and database size too. Cached statement uses "average" query plan because it can not select the "best" index or scan without parameter values. Cached statement can be executed more slow than reparsed statenment, it is common for range queries.


    SELECT price FROM MY_TABLE WHERE price > 0

    Table scan is faster than index scan in this case and optimizer knows it too.

    SELECT price FROM MY_TABLE WHERE price > ?

    Index scan can be faster, but optimizer do not know it.
  12. With Oracle, and perhaps most DBMSs (can't speak for MS SQL), if you use a Statement instead of a PS then in many circumstances you will prevent the DBMS from gaining an advantage with caching.

    e.g.
      SELECT name, address FROM customer WHERE id = 4

    This query will only get reused if the same customer id is used.

    If it were a bind variable the DBMS would cache the query and substitute values for the bind variables
      SELECT name, address FROM customer WHERE id = ?

    This query is identical for all customer instances and so may get reused frequently.

    You also get problems with special characters if you dont use bind vars.
    e.g. if the user looks for a person's name of O'Donnell in Oracle you need to change occurrances of ' with ''. Ampersands also cause problems.
      SELECT id FROM customer WHERE name LIKE 'O''Donnell'


    In my experience it's a no-brainer, always use PreparedStatements unless there is a very good reason to use a Statement.

    Chris.
  13. Based on an Oracle white paper the cache of prepared statements is not kept at the connection level: "The hard parse is performed just once, by the session that happens to be the first to call the parse function, all subsequent parses from any session of the same SQL statement will be a soft parse, that will utilize the information stored in the library cache." "A session is loosely speaking a connection to the Oracle server authenticated by a specific username and password. All SQL processing takes place within a session. Using connection pooling, it is possible that multiple sessions use the same connection; this distinction is however not relevant for the discussions in this white paper." Paulo Merson
  14. For the best system performance, design the application to run under a single Connection object.
    Now this sounds weird. I suppose in some scenarios this is true (a desktop app for example). But in a web app, for example, (with let's say 100 simultaneous hits)I don't think that serializing all DB queries on a single connection would mean performance. Or would it ? Isn't a well tuned connection pool the most appropriate solution for this ?
  15. For the best system performance, design the application to run under a single Connection object.
    Now this sounds weird. I suppose in some scenarios this is true (a desktop app for example). But in a web app [...] I don't think that serializing all DB queries on a single connection would mean performance.
    In context, I read this simply as "don't close connections and reconnect needlessly". Taken at face value it is indeed nonsense. Databases thrive on concurrency (up to a point) and don't come anywhere near their maximum throughput when you serialize all access through a single Connection.

    In a web application or EJB environment, best leave this stuff to the connection pool.