Jakarta Team announces Commons Pool 1.1 and Commons DBCP 1.1

Discussions

News: Jakarta Team announces Commons Pool 1.1 and Commons DBCP 1.1

  1. The Jakarta Commons team has announced the release of version 1.1 of the Commons Pool and Commons DBCP components.

    Commons-Pool provides a generic object pooling interface, a toolkit for creating modular object pools, and several general purpose pool implementations.

    There have been a lot of changes since the 1.0.1 release on 12 Aug 2002.

  2. A lot of corner cases were fixed.

  3. Performance improvements through optimized pool synchronization: the critical code paths now hold fewer locks, while additional synchronization was added where it was actually needed.

  4. New minIdle feature: the minimum number of objects allowed in the pool before the evictor thread (if active) spawns new objects.

  5. New maxTotal feature: a cap on the total number of instances controlled by a pool. This applies only to GenericKeyedObjectPool, where maxActive caps the number of active instances per key.

  6. UML Class & sequence diagrams.

  7. This release contains bug fixes to all known issues.

  8. View more at http://jakarta.apache.org/commons/pool

    Commons-DBCP provides database connection pooling services. Together with Commons-Pool it is the default JNDI datasource provider for Tomcat.

    There have been a lot of changes since the 1.0 release on 12 Aug 2002.

  9. All existing features can now be configured by JNDI Context providers (Tomcat).

  10. The double close() of a pooled connection is more effectively blocked.

  11. Prepared statement pooling is now implemented in BasicDataSource (a minimal configuration sketch follows this list).

  12. Access to the underlying connection is blocked by default.

  13. New minIdle parameter for a minimum number of idle connections ready for use.

  14. New connection default properties: defaultCatalog & defaultTransactionIsolation.

  15. A missing driverClassName will now produce the error "No suitable driver".

  16. Bad validationQuery will produce a meaningful SQLException.

  17. UML Class & sequence diagrams, configuration documentation.

  18. This release contains bug fixes to all known issues.

  19. View more at http://jakarta.apache.org/commons/dbcp
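
    For illustration, a minimal programmatic sketch tying the new options together. This is only a sketch, assuming the BasicDataSource property names documented for DBCP 1.1; the driver class, JDBC URL and credentials below are placeholders:

    import java.sql.Connection;
    import java.sql.SQLException;
    import org.apache.commons.dbcp.BasicDataSource;

    public class DbcpConfigSketch {
        public static void main(String[] args) throws SQLException {
            BasicDataSource ds = new BasicDataSource();
            ds.setDriverClassName("org.hsqldb.jdbcDriver");   // placeholder driver
            ds.setUrl("jdbc:hsqldb:mem:test");                // placeholder URL
            ds.setUsername("sa");
            ds.setPassword("");
            ds.setMaxActive(20);                    // cap on active connections
            ds.setMinIdle(2);                       // new minIdle parameter
            ds.setValidationQuery("SELECT 1");      // a bad query now yields a meaningful SQLException
            ds.setTestOnBorrow(true);
            ds.setPoolPreparedStatements(true);     // new prepared statement pooling
            ds.setDefaultTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

            Connection con = ds.getConnection();
            try {
                // use the connection
            } finally {
                con.close();                        // returns the connection to the pool
            }
        }
    }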

    Jakarta Commons/Net
    The Jakarta Commons team is also pleased to announce the release of version 1.1.0 of the Jakarta Commons/Net component.

    Commons/Net is an Internet protocol suite Java library which supports Finger, Whois, TFTP, Telnet, POP3, FTP, NNTP, SMTP, and some miscellaneous protocols like Time and Echo as well as BSD R command support.

    Release notes, documentation, and download links are available on the Commons/Net project site:

    http://jakarta.apache.org/commons/net/

Threaded Messages (32)

  • JCA / JMX

    Why don't you provide a framework instead, that makes writing JCA-compatible connectors easier / more convenient?

    What about manageability?

    Holger Engels
  • It is nice to see that Jakarta Commons Pool is a good open source connection pool.
    For example, in version 1.1 (I am not sure about previous versions) all potentially time-consuming tasks (allocating a new object/connection, validating an object/connection, destroying an object/closing a connection) are done outside the synchronisation blocks, so the pool will not be blocked for other requests during these tasks. This means that the scalability of Jakarta Commons DBCP should be practically unlimited (although actual tests should be performed to be sure). The same can't be said for some other open source connection pools.

    But still, there is room for improvement:

    1. The following JDBC 3.0 configuration parameters are missing:
    -maxStatements - Maximum number of cached prepared statements per physical connection,
    -initialPoolSize - Number of physical connections to allocate when the pool is created,
    -minPoolSize - Minimum number of physical connections that should be allocated at any time.

    2. Some configuration parameters are not named as per JDBC 3.0 spec (Jakarta vs JDBC 3.0):
    -maxActive vs maxPoolSize
    -timeBetweenEvictionRunsMillis vs propertyCycle

    3. Some useful configuration parameters are missing:
    -initQuery - SQL query to perform when a physical connection is allocated,
    -maxPendingCount - Maximum number of connections that can be allocated in parallel at the same time. This protects the pool from unnecessarily allocating a lot of connections when a lot of clients request connections at the same time,
    -maxBusyTime - Maximum time a connection can be borrowed from the pool. This protects the pool from errors in code where a developer forgets to close (return to the pool) a connection and the connection is not garbage collected,
    -maxLiveTime - Maximum time a connection can live, whether busy or idle. Keeping connections that live for more than a couple of hours is useless, because allocating new connections every couple of hours will not affect performance. Without this parameter connections can live indefinitely, and in 24x7 environments they can hold database resources that are not needed any more. There can also be other problems with long-living connections (bugs, memory leaks either on the DB or JDBC driver side...). A rough sketch of such a check appears after this list.
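
    (The rough sketch mentioned above: how a maxLiveTime check might look in a pool's housekeeping code. The class, field and helper names here are hypothetical, not part of DBCP.)

    // Hypothetical helper, assuming each pooled entry records when its physical
    // connection was created; not part of DBCP.
    public class MaxLiveTimeCheck {

        private final long maxLiveTimeMillis;

        public MaxLiveTimeCheck(long maxLiveTimeMillis) {
            this.maxLiveTimeMillis = maxLiveTimeMillis;   // e.g. 4 * 60 * 60 * 1000 for four hours
        }

        public boolean expired(long createdTimeMillis) {
            return System.currentTimeMillis() - createdTimeMillis > maxLiveTimeMillis;
        }
    }

    // The evictor thread would close idle connections for which expired() is true,
    // and the pool would do the same when a busy connection is returned.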

    So, for our projects we developed a home-grown connection pool that has all of the features of Commons DBCP 1.0.1 and other open source connection pools, plus the corrections mentioned above. We look forward to implementing the features and new ideas introduced in DBCP 1.1!

    Open source is great: not only can people use good software, they can also learn a lot from the source!

    Mileta
  • -maxLiveTime - Maximum time a connection can live, whether busy or idle. Keeping connections that live for more than a couple of hours is useless, because allocating new connections every couple of hours will not affect performance. Without this parameter connections can live indefinitely, and in 24x7 environments they can hold database resources that are not needed any more. There can also be other problems with long-living connections (bugs, memory leaks either on the DB or JDBC driver side...).


    I second that -- an important feature! Databases like MySQL and Oracle kill their connection handles after a few hours: The next client that fetches such a connection from the pool will be the one to discover that it is broken. The MySQL driver supports implicit auto-reconnect to avoid that effect, but AFAIK the Oracle driver doesn't.

    So automatically closing connection handles after a maximum life time makes sense. Other pools like Resin's and Proxool support that feature too. Else, one needs to resort to test-on-borrow validation queries to avoid getting timed-out connections: arguably overkill just for avoiding that particular timeout effect.

    Generally, I'm pleased to see the Commons DBCP is actively developed! It's been quite some time since the last public release, especially given a number of bugs that should really have been addressed earlier.

    BTW, DBCP's BasicDataSource is a nice way to define a local DataSource within an application, as an alternative to using a container-defined JNDI DataSource. Its bean-style configuration allows for nice integration into bean-centric IoC containers like the Spring Framework.

    Juergen
  • So automatically closing connection handles after a maximum life time makes sense. Other pools like Resin's and Proxool support that feature too. Else, one needs to resort to test-on-borrow validation queries to avoid getting timed-out connections: arguably overkill just for avoiding that particular timeout effect.


    You can make the test-on-borrow validation overhead smaller if you introduce another parameter, say 'testTime'. When a connection is about to be borrowed, if less than testTime has elapsed since the last time the connection was successfully returned to the pool, then the test-on-borrow validation is skipped and the connection is assumed to be valid. So if a physical connection is borrowed frequently, there will be few test-on-borrow validations. This way a client can still get a bad connection, but the chance of that is small. You can trade this chance for performance by tuning the 'testTime' parameter.
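
    A small sketch of that idea, under the assumptions above; testTime and lastReturnedMillis are hypothetical pool-internal names, not existing DBCP parameters:

    // Hypothetical helper; testTime and the last-returned timestamp are
    // pool-internal bookkeeping, not existing DBCP parameters.
    public class TestTimeCheck {

        private final long testTimeMillis;

        public TestTimeCheck(long testTimeMillis) {
            this.testTimeMillis = testTimeMillis;
        }

        // True only if the connection was last returned long enough ago
        // to be worth a validation round trip.
        public boolean needsValidation(long lastReturnedMillis) {
            return System.currentTimeMillis() - lastReturnedMillis >= testTimeMillis;
        }
    }

    // On borrow:
    //   if (check.needsValidation(entry.lastReturnedMillis)) {
    //       runValidationQuery(con);   // e.g. "SELECT 1"
    //   }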
  • Comparison..?

    How would people compare this to something like C3P0 (http://sourceforge.net/projects/c3p0), which seems to have some traction in other projects? Any good or bad points between the two?
  • Comparison..?

    How would people compare this to something like C3P0

     
    There's also Proxool (http://proxool.sourceforge.net) which seems to have quite a lot of reliability parameters. Currently it is implemented as a java.sql.Driver, but Proxool 0.8 will offer a javax.sql.DataSource a la DBCP's BasicDataSource.

    *sigh* Why do we still need to care about such basic things as JDBC connection pools in 2003? Why is there no stable and proven open source connection pool around but just rather recent projects?

    Juergen
  • there is one....

    ---Why is there no stable and proven open source connection pool around but just rather recent projects? ---


    http://www.bitmechanic.com/projects/jdbcpool/

    In my previous project it has been working since the last century. Also, SQL2JAVA, one of the first free O/R mappers, comes from there. Thanks to James Cooper et al.

    Alex V.
  • So, for our projects we developed a home-grown connection

    > pool that has all of the features of Commons DBCP 1.0.1
    > and other open source connection pools, plus the corrections
    > mentioned above.
    > Open source is great: not only can people use good software,
    > they can also learn a lot from the source!

    Can I ask you then why you haven't contributed these patches to the DBCP project, but instead made your own home-grown proprietary pool?
  • Can I ask you then why you haven't contributed these patches to the DBCP project, but instead made your own home-grown proprietary pool?


    We developed our connection pool three and a half years ago, before the Jakarta pool existed.

    Mileta
  • Can I ask you then why you haven't contributed these patches to the DBCP project, but instead made your own home-grown proprietary pool?


    >> We developed our connection pool three and a half years ago, before the Jakarta pool existed.

    I forgot to say that it has evolved a lot through that time, adding original features and borrowing ideas from other connection pools.

    Mileta
  • Can I ask you then why you haven't contributed these patches to the DBCP project, but instead made your own home-grown proprietary pool?


    Any pool that implements javax.sql.DataSource is not proprietary. If you provide a JNDI ObjectFactory for it, then your pool can be used by any app server or other tool/framework (like Hibernate) that supports the DataSource interface and/or JNDI.
  • Does anyone know if they fixed the load problems of the dbcp package?
    See this paper for more information:
    http://stealthis.athensgroup.com/presentations/White_Papers/Jakarta_Pooling.doc
    We had the same problems described there and found no workaround other than using proprietary classes (in this case Oracle's).
  • Does anyone know if they fixed the load problems of the dbcp package?

    > See this paper for more information:
    > http://stealthis.athensgroup.com/presentations/White_Papers/Jakarta_Pooling.doc
    > We had the same problems described there and found no workaround other than using proprietary classes (in this case Oracle's).



    1. Double Connection Use
     User error

    2. Failing Silently
     It is a bug, I think it is fixed.

    3. REALLY Closing Connections
      It is not a feature; an application must not depend on the pool implementation and
      must work without a pool too (I assume the pool is an optimization only).

    4. Resource Leaks
       User error

    5. Finalizer Bug
     The same as 4, isn't it?

    I think this way:

    1. Do not trust any pool and close JDBC resources yourself; most of the workarounds in a pool break transactions, and the app will not scale if connections are closed by a timer or the GC.

    2. A pool must not break an application that scales without a pool; you must test the application both with the pool and without it. If the pool optimizes "connect and authenticate", it does everything you need.

    3. Use frameworks like Spring to manage resources.
    A pool is not a major optimization if the application opens a connection once per thread with fast authentication configured (LAN, or client and DB on the same machine). It can be more scalable without a pool plus workarounds (synchronization, auto-reconnection, timers, validation, broken transactions...).
     

    A good pool is just a pool and nothing more: not a timer, an auto-reconnector or a resource manager. It is not a good idea to use a pool as a workaround for bugs in an application.

    BTW, it is not a very big problem to find this kind of bug in an application, or to remove all "close()" calls from JDBC code and let AOP do it; I did it myself a few times and it was no pain. It takes more time to find and download a pool with workarounds, and that does not solve the problems anyway.
  • Can anyone explain why we need to rely on self-written queries to validate a connection? This seems to be quite a common task, so why is there no JDBC API for this which could do it with a much smaller performance impact?
  • Can anyone explain why we need to rely on self-written queries to validate a connection?


    Simply because there is no JDBC API for this task.

    >> so why is there no JDBC API for this which could do that with a much smaller performance impact?

    To validate a connection you need a round trip to the DB server. A possible JDBC API call for validating connections would also execute a query statement under the hood. Different databases need different validation queries:

    Oracle: SELECT * FROM DUAL
    Sybase, SQL Server: SELECT 1

    and so on...
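
    For illustration, a minimal sketch of what such a test-on-borrow check typically boils down to, with the DB-specific query passed in as configuration. This is a generic example, not DBCP's actual implementation:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class ConnectionValidator {

        // Returns true if the validation query (e.g. "SELECT 1" or "SELECT * FROM DUAL")
        // executes successfully and returns at least one row.
        public static boolean isValid(Connection con, String validationQuery) {
            try {
                Statement st = con.createStatement();
                try {
                    ResultSet rs = st.executeQuery(validationQuery);
                    try {
                        return rs.next();
                    } finally {
                        rs.close();
                    }
                } finally {
                    st.close();
                }
            } catch (SQLException e) {
                return false;
            }
        }
    }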
  • Different databases need different validation queries:

    >
    > Oracle: SELECT * FROM DUAL
    > Sybase, SQL Server: SELECT 1

    Yes, and presumably the driver implementation knows which database it is written for? There is no layer better suited to verifying whether a connection is valid than the driver itself. The application itself is probably the worst layer to do it in.

    IMHO...
  • Yes, and presumably the driver implementation knows which database it is written for? There is no layer better suited to verifying whether a connection is valid than the driver itself. The application itself is probably the worst layer to do it in.


    Of course, but whether you implement connection checking in the JDBC layer or in the connection pool layer does not matter if you use a connection pool. You only benefit from connection checking in the JDBC layer if you write fat clients that do not use connection pools.
  • Of course, but whether you implement connection checking in the JDBC layer or in the connection pool layer does not matter if you use a connection pool. You only benefit from connection checking in the JDBC layer if you write fat clients that do not use connection pools.


    That would depend on the implementation of the pool. One common pool implementation for DB2 doesn't do any connection checking, nor does it reconnect on failure; instead it throws a StaleConnectionException to the client.
  • Can anyone explain why we need to rely on self-written queries to validate a connection?

    >
    > Simply because there is no JDBC API for this task.

    Connection.isClosed() can be used to test a connection (most drivers implement it as "return this.closed;"), but auto-reconnect only makes sense with autocommit, and I do not think it is a very useful feature in a pool; it does not manage transactions.
  • Connection.isClosed() can be used to test a connection (most drivers implement it as "return this.closed;")


    I would hesitate to use Connection.isClosed() to check a connection in the pool, as it may not make a round trip to the database to actually check the TCP connection. But it could be used just before the validation query is issued, in order to save us the attempt at a round trip to the DB if the connection is actually closed.
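
    A small sketch of that combination; looksUsable() is a hypothetical helper, and ConnectionValidator.isValid() refers to the validation-query sketch shown earlier in the thread:

    import java.sql.Connection;
    import java.sql.SQLException;

    public class BorrowCheck {

        // Cheap local check first; only attempt the validation-query round trip if it passes.
        public static boolean looksUsable(Connection con, String validationQuery) {
            try {
                if (con.isClosed()) {
                    return false;   // no point in trying a round trip
                }
            } catch (SQLException e) {
                return false;
            }
            return ConnectionValidator.isValid(con, validationQuery);
        }
    }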

    >> but auto-reconnect only makes sense with autocommit, and I do not think it is a very useful feature in a pool; it does not manage transactions.

    I am not sure you fully understand connection checking in the connection pool layer. When a client requests a connection from the pool, the connection is checked just before it is returned to the client and before any statements are created.
    This is usually done by issuing a validation query against the database. A validation query is a simple query that does nothing and is usually DB-specific.
    It tests the physical TCP connection as well as the logical database connection.
    If the validation query completes successfully, the connection is returned to the client; otherwise the connection is closed and removed from the pool, and another connection or a new one is returned to the client.

    Commonly, there is no auto-reconnection in the pool after a statement execution fails because of a network error, although you could implement it.

    Btw, Juozas, I am now just playing with CGLIB and it is great!

    Regards,
    Mileta
  • I am not sure you fully understand connection checking in the connection pool layer. When a client requests a connection from the pool, the connection is checked just before it is returned to the client and before any statements are created.

    > This is usually done by issuing a validation query against the database. A validation query is a simple query that does nothing and is usually DB-specific.
    > It tests the physical TCP connection as well as the logical database connection.
    > If the validation query completes successfully, the connection is returned to the client; otherwise the connection is closed and removed from the pool, and another connection or a new one is returned to the client.
    >
    > Commonly, there is no auto-reconnection in the pool after a statement execution fails because of a network error, although you could implement it.
    >

    Yes, it will not break transactions if the connection is validated before a transaction starts and there is no attempt to reconnect within a transaction, but I am not sure the pool will optimize "connect" if it executes validation queries and runs background threads; some of my applications run in production without pools and without performance problems (I just use a memory cache to solve performance problems).
    A web application does not need to connect to the DB more than once per request, and that is not a problem for performance; I assume nobody uses an internet connection and very "clever" authentication for web applications to connect to the DB.
    So I think it is better to go without a pool than to use a pool with workarounds.

    > Btw, Juozas, I am now just playing with CGLIB and it is great!
    >
    > Regards,
    > Mileta
  • but I am not sure the pool will optimize "connect" if it executes validation queries and runs background threads


    If a connection pool is done right, it has only one background thread, for performing tasks that cannot be done synchronously with client requests. Checking connections can be done in the client thread just before the connection is returned to the client. To prevent the performance problems of frequent connection checks, you can have a checkTime parameter which tells the pool not to check a connection if less than checkTime has elapsed since the last time the connection was successfully returned to the pool. Yes, this way a client can still get a bad connection, but the probability of that is small and can be tuned.

    >> So I think it is better to go without a pool than to use a pool with workarounds.

    If you have an application that has 'long-running queries' (say 5-15 seconds for some analytic query) or more than 10-20 simultaneous users, a pool is a must.
    If the pool is written well (there are no time-consuming tasks (connecting, checking and closing connections) in synchronized blocks, there is only one background thread for housekeeping, and there is configurable connection checking (in my opinion the best time for connection checking is just before the connection is returned to the client)), there are no workarounds; it is just ensuring that the client gets a valid connection. If you have only one connection, you still have to ensure that this connection stays valid all the time.

    Mileta
  • but I am not sure the pool will optimize "connect" if it executes validation queries and runs background threads

    >
    > If a connection pool is done right, it has only one background thread, for performing tasks that cannot be done synchronously with client requests. Checking connections can be done in the client thread just before the connection is returned to the client. To prevent the performance problems of frequent connection checks, you can have a checkTime parameter which tells the pool not to check a connection if less than checkTime has elapsed since the last time the connection was successfully returned to the pool. Yes, this way a client can still get a bad connection, but the probability of that is small and can be tuned.
    >
    > >> So I think it is better to go without a pool than to use a pool with workarounds.
    >
    > If you have an application that has 'long-running queries' (say 5-15 seconds for some analytic query) or more than 10-20 simultaneous users, a pool is a must.
    >

    I used pools for some projects a year ago, DBCP being one of them, but
    I have used the following approach for my last two web applications:

    1. Popular MVC design pattern.
    2. Lazy ThreadLocal connections and transactions.
    3. No Connection pool.
    4. Pluggable JDBC decorators for logging, monitoring and performance tuning.

    ThreadLocal connections work this way:
    A static method opens a *new* connection and sets a ThreadLocal field on the first
     "getConnection()" for the thread/request.
    The controller *closes* the connection at the end of the request if the ThreadLocal connection is open (I use java.lang.ThreadLocal to implement it).

    I cannot forget to close a connection (there is a single place in the app that closes it), requests never block (I do not need to wait for connections), I never have too many open connections (the thread pool / maxProcessors setting on the web server limits it), and I do not have broken connections, no extra thread, no validation and no workarounds. Possibly I will need a pool in the future, but I do not think so after having tested the simplest way (connections and authentication are very fast on our servers). The cache helps too; ~90% of requests do not need to open a connection.
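
    A minimal sketch of the approach described above; ConnectionHolder and its method names are illustrative, not an existing library:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // Lazily opens one connection per thread/request; the controller (or a filter)
    // closes it at the end of the request.
    public final class ConnectionHolder {

        private static final ThreadLocal current = new ThreadLocal();

        public static Connection getConnection(String url, String user, String password)
                throws SQLException {
            Connection con = (Connection) current.get();
            if (con == null) {
                con = DriverManager.getConnection(url, user, password);  // opened on first use
                current.set(con);
            }
            return con;
        }

        public static void closeIfOpen() throws SQLException {
            Connection con = (Connection) current.get();
            if (con != null) {
                current.set(null);
                con.close();   // the single place in the app that closes the connection
            }
        }
    }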

    BTW, it must be possible to optimize 5-15 second analytic queries; there are a few ways to do it:
    1. Generate static content on data import.
    2. Cache query results in memory.
    3. Add more indexes and optimize the query itself.
    4. Use Lucene for search queries.

    I think you know a lot of good ways yourself and never execute this kind of query for each request; it is a much better optimization than a connection pool.

    A pool is a good optimization if your connection is very "slow", but I do not think you use a slow internet connection for JDBC; XML-RPC is more popular for that use case, isn't it?

    Test both ways yourself. I am not sure it will work for you, but it works for me, and it works better.




    > If the pool is written well (there are no time-consuming tasks (connecting, checking and closing connections) in synchronized blocks, there is only one background thread for housekeeping, and there is configurable connection checking (in my opinion the best time for connection checking is just before the connection is returned to the client)), there are no workarounds; it is just ensuring that the client gets a valid connection. If you have only one connection, you still have to ensure that this connection stays valid all the time.
    >
    > Mileta
  • ThreadLocal connections work this way:

    A static method opens a *new* connection and sets a ThreadLocal field on the first
     "getConnection()" for the thread/request.
    The controller *closes* the connection at the end of the request if the ThreadLocal connection is open (I use java.lang.ThreadLocal to implement it).

    If I understand it right, this way you are opening a new connection for every request. This may not be a problem in your environment, but for some it could be: opening a connection can sometimes take a couple of seconds.
    I like your idea of holding the connection in a ThreadLocal variable. I was thinking about that for some time, but I intended to get the connection from a pool instead of from the JDBC driver. This would prevent leaking connections from the pool when the call to con.close() is missing and GC is not run (con.close() is executed in the connection wrapper's finalizer). Also, this way bad connections can be detected earlier and removed from the pool, instead of being returned to the pool and only later caught by connection checking.

    Mileta
    I have tried to run this test in my development environment: LAN, DB running on a very old PC (a very good server for testing :)

    // Measure the cost of opening a brand new physical connection per iteration.
    int COUNT = 100;
    long time = System.currentTimeMillis();
    for (int i = 0; i < COUNT; i++) {
        Connection con = DriverManager.getConnection(url, user, password);
        con.close();
    }
    System.out.println("CONNECT " + ((System.currentTimeMillis() - time) / COUNT) + " ms.");

    // Measure the cost of a typical validation query on a single open connection.
    time = System.currentTimeMillis();
    Connection con = DriverManager.getConnection(url, user, password);
    try {
        for (int i = 0; i < COUNT; i++) {
            PreparedStatement ps = con.prepareStatement("SELECT 1");
            try {
                ResultSet rs = ps.executeQuery();
                try {
                    rs.next();
                    rs.getInt(1);
                } finally {
                    rs.close();
                }
            } finally {
                ps.close();
            }
        }
    } finally {
        con.close();
    }
    System.out.println("VALIDATE " + ((System.currentTimeMillis() - time) / COUNT) + " ms.");

    Output is :

    CONNECT 33 ms.
    VALIDATE 1 ms.

    It is a good optimization, but I do not think it would be very good if I added more workarounds. It would save 1-2 ms for 2%-10% of the requests on my production server, but it can produce a lot of problems; I will increase the cache size if I have a performance problem.
  • Why use DBCP?

    Why would one use DBCP? Don't most application servers provide connection pooling by default?
  • Why use DBCP?

    Why would one use DBCP? Don't most application servers provide connection pooling by default?


    If connection pooling is the only feature a project needs from an app server, why should you use an app server?

    Mileta
  • Why use DBCP?

    Why would one use DBCP? Don't most application servers provide connection pooling by default?


    And maybe your app server's connection pool does not have a feature that you need?

    Mileta
  • Why use DBCP?

    Why would one use DBCP? Don't most application servers provide connection pooling by default?


    It's often preferable to keep everything local to the application for easier setup and deployment. If you define a local DataSource within your application, you can literally just drop your WAR into Tomcat or Resin or JBoss etc and you're done.

    With a container-managed JNDI DataSource, you first have to deal with the target container's specific mechanisms for setting up JNDI resources. And afterwards, you have to care about the specific quirks of the target container's connection pool -- quirks that may not have occurred with the pool that you developed and tested on.

    Of course, container-managed DataSources are required for distributed transactions that span more than one database. They also make sense if multiple applications need to share access to common databases. But for purely local usage in a single application, you have the choice to use a local DataSource instead.

    BTW, IoC containers like the Spring Framework allow you to expose a DataSource dependency in your components and pass in a specific DataSource instance at initialization time. That way, switching from a local DataSource to a JNDI one, or vice versa, is just a matter of configuration.
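
    A small sketch of the wiring this enables; the DAO class and property name below are illustrative, not an existing API:

    import javax.sql.DataSource;

    // The component depends only on javax.sql.DataSource; whether a local
    // BasicDataSource or a JNDI-obtained DataSource gets passed in is purely
    // a matter of container configuration.
    public class AccountDao {

        private DataSource dataSource;

        public void setDataSource(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // ... JDBC code obtains connections via dataSource.getConnection() ...
    }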

    Juergen
  • Why use DBCP?

    What do you recommend for pooling with Tomcat 4.1 and an SQL database?
    Which driver do I use?
    Regards
  • Why use DBCP?

    What do you recommend for pooling with Tomcat 4.1 and an SQL database?

    > Which driver do I use?
    > Regards

    I can recommend PostgreSQL if you need an RDBMS, or MySQL if your tables are "small" and your data is something like the content on this site. Try to implement a trivial connection factory yourself; you can plug in a pool later if you need it. I use http://voruta.sourceforge.net myself, but I cannot recommend it for you at this time (you can try it if you are not afraid to experiment). A good choice would be Hibernate plus the Spring Framework: it is a very active community and you will get good support on the forums, along with good documentation, design patterns and examples.
  • Why use DBCP?

    Hi Juergen,

    > Of course, container-managed DataSources are required for distributed transactions that span more than one database. They also make sense if multiple applications need to share access to common databases. But for purely local usage in a single application, you have the choice to use a local DataSource instead.

    From the above, do I understand that DBCP cannot be used for distributed systems but only for single applications, because it doesn't support concurrency?!