Discussions

News: Designing and Testing JDBC Code

  1. Designing and Testing JDBC Code (28 messages)

    Simon Brunning solicited advice on how to design your DAO code, and Bob Lee stepped up to the plate. This piece goes over his solution, where he uses the command pattern for JDBC logic. He shows how it works in code, and how he is able to write tests that work on an in-memory instance of HSQLDB (allowing unit tests to run in no time).

    Excerpt: Testing Commands

    "Connecting to a remote database from tests can be slow and difficult to maintain, especially if you share a database with other developers. I prefer to test against hsqldb, a lightweight 100% Java database engine that can run completely in memory. I can set up and destroy the entire database within the scope of a single test. Keeping database agnostic (so I can test against hsqldb and still use Oracle in production) can be trying at times but is well worth the effort. My tests set up a minimal database (i.e. only what's needed by the test) and execute commands against it:

      public void testDeleteCommand() throws Exception {
        Connection c = createConnection();
        c.createStatement().execute(CREATE_DATABASE_SQL);

        new DeleteCommand("testId").execute(c);

        // assert that the command succeeded.
        ...
        c.close();
      }

      Connection createConnection() throws Exception {
        // create a new in-memory database.
        Class.forName("org.hsqldb.jdbcDriver");
        return DriverManager.getConnection("jdbc:hsqldb:.", "sa", "");
      }

    Read Bob Lee in Design and Test JDBC Code

    Threaded Messages (28)

  2. I’m already running out of memory and CPU, and now I have to create a database instance in memory every time I need to test something? Sounds like nonsense to me. A database connection might be slow, but it's still faster than creating a new database in memory. If a free alternative is needed, MySQL would solve the problem. Anyway, it is better to test code in development against the same DB that is deployed in the production environment.

    I might be missing something, please advise.
  3. Designing and Testing JDBC Code

    > I’m already running out of memory and CPU, and now I have to create a database instance in memory every time I need to test something? Sounds like nonsense to me. A database connection might be slow, but it's still faster than creating a new database in memory. If a free alternative is needed, MySQL would solve the problem. Anyway, it is better to test code in development against the same DB that is deployed in the production environment.

    > I might be missing something, please advise.

    hsqldb is *very* light. You should try it. Like I said, all of my test cases for a DAO run in under a second. This is ideal for my development style: make a small change, compile, run tests, repeat. I need quick turnaround, and the tests shouldn't require any manual setup.

    You still need to run integration tests against your production database. I've run into a couple of bugs in Oracle's driver myself, but I can usually work around these independently of my application logic.
  4. Also, writing database-independent code may not be achievable at all. If my company bought an expensive database (e.g., Oracle) or is simply strongly committed to a solution (e.g., PostgreSQL), I don't see why I shouldn't write JDBC code that relies on specific database features.
  5. > I’m already running out of memory and CPU and now I have to create a database instance in memory every time I need to test something? Sounds like nonsense to me

    If your test cases run out of memory and CPU, it seems to me that there may be a problem with the granularity or scope of your tests. We are talking about unit tests here - they should be small and there should be many of them, each running independently.

    > Anyway, it is better to test the code in development against the same DB which is deployed in the production environment.
    In most of the applications I did, over 90% of the DAO code could have been tested using any SQL database ... the remaining 10% indeed needed to be tested against the target DB. Also, system tests and acceptance tests should use the target DB, target OS, and target drivers; I don't see such a need for unit tests, usually.
  6. > I’m already running out of memory and CPU and now I have to

    > create a database instance in memory every time I need to test something?
    > Sounds like nonsense to me

    If you're running short of memory and CPU launching Hypersonic, then you have larger problems than whether to use an in-memory database or not. That said, I agree that it's a good idea to test on the database that you'll eventually be deploying on. What you can do in that case is make sure you don't commit your SQL during a test, and then roll back in the tearDown().
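    The rollback-in-tearDown() idea can be sketched as a small base class for DB test cases. This is only an illustrative sketch, not code from the thread: it has no JUnit dependency, and openConnection() is a placeholder for however your tests actually obtain a connection to the target database.

```java
import java.sql.Connection;

// Sketch of the "never commit, roll back in tearDown()" approach.
// openConnection() is a hypothetical hook; a real subclass would get a
// connection from DriverManager or a pool configured for the target DB.
public abstract class RollbackTestCase {

    protected Connection connection;

    protected abstract Connection openConnection() throws Exception;

    public void setUp() throws Exception {
        connection = openConnection();
        // Make sure nothing the test does is committed by accident.
        connection.setAutoCommit(false);
    }

    public void tearDown() throws Exception {
        // Undo everything the test did, then release the connection.
        connection.rollback();
        connection.close();
    }
}
```

    Every statement a test executes is then discarded in tearDown(), so a shared development database stays clean between runs.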
  7. I've never seen the point of putting SQL in constants/property files - how often do you change a SQL statement without changing the code that uses it? Separating them just means you have to edit stuff in two places rather than one.
  8. Reasons for separated SQL

    1. Database changes, such as renamed tables or columns, can be handled independently of the code.

    2. If you use an OR-mapping tool such as iBATIS DBLayer, you map your objects directly to (probably) domain objects through introspection. No explicit mapping in the Java code is needed. As a matter of fact, with good naming conventions the mapping is handled automatically, thereby completely avoiding the "rewrite the SQL whenever the object model changes" problem.

    3. If you use a fairly good OR tool, separation of the SQL statements makes it possible to import and inherit from other statements, thereby avoiding massive duplication of code.

    4. It's ugly to have SQL in the Java code. I feel almost as bad when I see that as I do when I see a servlet pump out HTML through println() calls. I admit that this last point is more emotional than logical.
  9. > I've never seen the point of putting SQL in constants/property files - how often do you change a SQL statement without changing the code that uses it? Separating them just means you have to edit stuff in two places rather than one.


    Oh yeah, one thing I forgot.

    5. When you just call an SQL map with a symbolic name, you can ideally change all your database calls from directly executed SQL into, for example, stored procedures *without changing a single line of compiled code*. If your DBA is into that sort of thing.
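    As a rough illustration of that SQL-map idea (the key names and SQL below are invented, and java.util.Properties stands in for a real mapping tool):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch of an "SQL map": statements live in an external
// properties file keyed by symbolic name, so a query can be swapped for,
// say, a stored procedure call without recompiling any Java code.
public class SqlMap {

    private final Properties statements = new Properties();

    public SqlMap(Reader source) throws IOException {
        statements.load(source);
    }

    public String get(String name) {
        String sql = statements.getProperty(name);
        if (sql == null) {
            throw new IllegalArgumentException("No SQL statement named " + name);
        }
        return sql;
    }

    public static void main(String[] args) throws IOException {
        // In real code this would come from a file on the classpath.
        String external =
            "user.delete=DELETE FROM users WHERE id = ?\n" +
            "user.load=SELECT id, name FROM users WHERE id = ?\n";
        SqlMap map = new SqlMap(new StringReader(external));
        System.out.println(map.get("user.delete"));
    }
}
```

    The calling code asks only for "user.delete"; whether that name resolves to inline SQL or a stored procedure call is decided entirely in the external file.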
  10. > 5. When you just call an SQL map with a symbolic name, you can ideally change all your database calls from directly executed SQL into, for example, stored procedures *without changing a single line of compiled code*. If your DBA is into that sort of thing.


    For me the main and by far the most important reason to separate Java code and SQL is the human factor. If you work in a big organization, there is a good chance there will be a DBA group that will want to see your SQL statements before allowing you to connect to the production database. There may be an SQL code audit that is independent from the Java one (and done by people who don't know Java). Finally, there may be someone who has no idea about Java but is really good at optimizing SQL statements (inserting all those /* hints */ in Oracle) - this may be a necessity when you hit performance issues.

    Cheers,
    Krzysztof
  11. I prefer SQL in code too. I have used external files for SQL myself, but I was not very happy with it: it is not a very friendly way to do refactoring and code review. String constants in code are not very clear either.
    I use a compromise: SQL in Javadoc. I have not found any big problems with this approach so far. It looks like the author is its only user :), but see http://voruta.sourceforge.net - you will probably like it too.
  12. There is one very good reason to place SQL in separate files:
    - Generating dynamic SQL. If you store SQL in an XML file or template (Velocity, ...), you can deal with dynamic SQL statements, such as search filters and metadata-based queries, much more easily and maintainably.

    And long SQL statements in Java code look very ugly.

    MC
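    A dependency-free sketch of that dynamic-SQL idea: the statement skeleton lives outside the Java code (the template string below stands in for an XML or Velocity template), and only the filter clause is generated from whatever criteria the caller supplied. The ${where} placeholder syntax and the column names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of template-driven dynamic SQL for a search filter:
// only the criteria the user actually supplied end up in the WHERE clause.
public class DynamicSql {

    public static String expand(String template, Map<String, Object> criteria) {
        String where = "";
        if (!criteria.isEmpty()) {
            List<String> clauses = new ArrayList<String>();
            for (String column : criteria.keySet()) {
                // The values themselves should be bound through
                // PreparedStatement parameters, never concatenated in.
                clauses.add(column + " = ?");
            }
            where = " WHERE " + String.join(" AND ", clauses);
        }
        return template.replace("${where}", where);
    }

    public static void main(String[] args) {
        String template = "SELECT id, name FROM users${where}";
        Map<String, Object> criteria = new LinkedHashMap<String, Object>();
        criteria.put("name", "smith");
        criteria.put("city", "london");
        System.out.println(expand(template, criteria));
    }
}
```

    With no criteria at all, the template expands to the plain SELECT; each supplied filter adds one bound clause.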
  13. I have to agree that SQL in Java looks very ugly and is hard to maintain. We are also thinking of storing SQL in XML instead.

    Bob's advice makes sense to me if your DAOs are db-agnostic and you can easily switch from one database to another. My experience is that ~10% of the DAO methods end up having custom SQL queries that are not supported by HSQLDB.
  14. > My experience is that ~10% of the DAO methods end up having custom SQL queries that are not supported by HSQLDB.


    Select queries are not a very big problem; it is easy to mock the result set processing code without a database.
    But it can be a good idea to use HSQLDB as a mock for updates.
  15. BTW,
    6. for your motivation:
    It is very easy to validate SQL from an external file. Sometimes it is meaningless to write tests manually just for SQL validation, if the test checks nothing more.
    It is also very useful to trace the query plan for performance tuning, which is easier if the SQL is externalized.

    But refactoring is an enemy of this approach.
  16. Keeping Database Agnostic??

    This strikes me as one of those ideal-world solutions. In an ideal world, creating database-agnostic code would be okay, but in the real world many organizations commit to and invest in an RDBMS expecting that its features will get used. What is the point of investing in Oracle when you are only going to use the lowest common denominator of features across all systems? The thing that worries me most about J2EE is the almost compulsive need to be everything-agnostic, which results in applications that fail to take advantage of the environment they are running in. A good architecture will still allow DB-specific features to be used without coupling other layers of the application to them.

    As for testing, I think that you should test against the database that you intend to deploy on, in case of any deviations in functionality.
  17. Designing and Testing JDBC Code

    I think the mentioned approach has some problems that make its value doubtful.

    First, databases still speak different dialects of SQL. When testing, one is interested in being as close as possible to the actual application environment. Things that work for HSQLDB may not work for Oracle or MS SQL, and vice versa. Using a DB engine that is different from the actual one decreases the value of the tests. It's easy to set up a remote DB server used for testing so that each engineer runs in a dedicated schema or database, and unit tests run by different engineers don't interfere. Unit tests normally don't put much load on a DB server, so sharing a DB is not a problem.


    Second, it reinvents the wheel in a poor way. There is a very useful open source project, DBUnit, that allows fixtures to run against a defined test database content (I'm not affiliated with DBUnit). Then it becomes a matter of placing unit test setUp code in a base class that all unit tests extend:
    <code>
      // Requires the DBUnit classes org.dbunit.database.IDatabaseConnection,
      // org.dbunit.database.DatabaseConnection, org.dbunit.dataset.xml.FlatXmlDataSet
      // and org.dbunit.operation.DatabaseOperation.
      protected void setUp() throws Exception {
        super.setUp();

        // Reset the database content before each test.
        IDatabaseConnection dbUnitConn = null;
        Connection conn = null;
        FileInputStream datasetInputStream = null;
        try {
          // Run DBUnit to put the database into a known state.
          conn = getConnection();
          dbUnitConn = new DatabaseConnection(conn);
          datasetInputStream = new FileInputStream(System.getProperty("test.dataset"));
          DatabaseOperation.CLEAN_INSERT.execute(dbUnitConn, new FlatXmlDataSet(datasetInputStream));
          conn.commit();
        } finally {
          // Release everything even if initialization fails.
          if (datasetInputStream != null) {
            datasetInputStream.close();
          }
          closeDBUnitConnection(dbUnitConn);
          closeConnection(conn);
        }
      }
    </code>
  18. Designing and Testing JDBC Code

    In my experience, most DAO problems found in unit tests are a result of the application being out of sync with the database structure.

    I fail to see how this boundary is going to get tested against a database whose structure does not necessarily reflect that of the actual deployment database.

    As a side note, this may be a way to test non-DAO code; however, it's better to design an application such that data access and business logic are separated, so business logic can be tested independently of a database.
  19. Designing and Testing JDBC Code

    > In my experience, most DAO problems found in unit tests are a result of the application being out of sync with the database structure.

    >
    > I fail to see how this boundary is going to get tested against a database who's structure does not necessarily reflect that of the actual deployment database.
    >

    It is very trivial to find these kinds of problems without writing test code;
    I use "EXPLAIN ..." to execute the queries on app startup.
     
    > As a side note, this may be a way to test non-DAO code, however it's better to design an application such that data access and business logic are separated so business logic can be tested independently of a database.

    Yes, it is a good way, but sometimes it is not very easy to separate data access and business logic without losing performance; DB-level data validation is very dependent on business logic.
    It is possible to mock some DAO code for tests, but I am not sure it can help a lot. Does somebody use it?
  20. Designing and Testing JDBC Code

    > In my experience, most DAO problems found in unit tests are a result of the application being out of sync with the database structure.


    Jeff,

    I solved this problem pretty neatly. The DB schema is kept in source control. Every time one runs a server-side test, the schema is recreated. DBUnit populates it with data, so every fixture runs against a fresh, current schema populated with the same known data. Works very well.

    For Oracle, separating developers can be done by dedicating schema names that match developers' machine names.

    Regards,

    Slava Imeshev
  21. Answer:

    Don't -- use ORM.
  22. This code is quite nasty

    from Bob's blog entry:

    class DeleteCommand implements DatabaseCommand {

        String id;

        DeleteCommand(String id) {
          this.id = id;
        }
      
        public void execute(Connection c) throws SQLException {
          PreparedStatement ps =
            c.prepareStatement(DELETE_SQL);
          ps.setString(1, id);
          ps.execute();
        }
      }

    Err, Bob, ever thought about closing your statements? In a finally clause, perhaps?

    I know this is just a sketch of how you might do things but you just know someone's going to copy it and use it in a production app and then someone else is going to have to go hunting for resource leaks.

    The Spring framework has a much nicer pattern for this kind of thing where the 'command' object just gets passed a ResultSet and the framework handles actually executing the query and safely disposing of resources...
  23. This code is quite nasty

    > Err, Bob, ever thought about closing your statements? In a finally clause, perhaps?


    Well, my DAO calls Connection.close() in a finally clause which implicitly closes the other resources.

    > The Spring framework has a much nicer pattern for this kind of thing where the 'command' object just gets passed a ResultSet and the framework handles actually executing the query and safely disposing of resources...

    I prefer to work at Connection granularity so that I have the flexibility to work with multiple Statements and ResultSets per command (which I do exercise). Also, under those circumstances I would explicitly close such resources.
  24. This code is quite nasty


    > Well, my DAO calls Connection.close() in a finally clause which implicitly closes the other resources.
    >

    I hope someone can correct me on this, but it isn't my understanding that closing a connection implicitly closes statements. Sure, any statements that were created for a given connection will not be usable after the connection has been closed, but that's not the same thing as saying that closing the connection is guaranteed to release any database resources that the statement is using. In the case of Oracle, what you most often see is a "too many open cursors" error. (I guess this kind of maybe perhaps implies that you might get away with doing a delete without closing the statement, but I wouldn't bank on it.)

    This can give rise to some really elusive bugs that may only manifest themselves under load, or when a particular other error condition applies, or even appear to be a problem in some other application that happens to share a database with yours. To my mind it always pays to play safe and make sure you close everything under all conditions, and any framework or pattern gets my vote if it keeps connection objects away from your business code (and your junior developers).
  25. This code is *not* quite nasty

    > I hope someone can correct me on this, but it isn't my understanding that closing a connection implicitly closes statements. Sure, any statements that were created for a given connection will not be usable after the connection has been closed, but that's not the same thing as saying that closing the connection is guaranteed to release any database resources that the statement is using. In the case of Oracle, what you most often see is a "too many open cursors" error. (I guess this kind of maybe perhaps implies that you might get away with doing a delete without closing the statement, but I wouldn't bank on it.)


    First sentence from the Javadocs for Connection.close(): "Releases this Connection object's database and JDBC resources immediately instead of waiting for them to be automatically released."

    That said, it depends on the driver and connection pool. They're both supposed to clean up their resources when you close the connection, but unfortunately some buggy ones don't.

    If I was working with a driver that failed to clean up resources properly, I certainly wouldn't code "Statement.close()" and "ResultSet.close()" in finally blocks over and over everywhere I wrote JDBC code. That would be error prone and redundant.

    I would write a Connection proxy class that would keep track of the JDBC resources and then close them all when I called Connection.close(). My DAO or utility class could wrap the connection before passing it to my command object.
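    That proxy idea might look roughly like this (a sketch, not Bob's actual implementation; a dynamic proxy avoids having to implement the whole Connection interface by hand):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Remembers every Statement the wrapped connection hands out and closes
// them all before closing the underlying connection, as a guard against
// drivers or pools that don't clean up on Connection.close().
public class TrackingConnection implements InvocationHandler {

    private final Connection target;
    private final List<Statement> openStatements = new ArrayList<Statement>();

    private TrackingConnection(Connection target) {
        this.target = target;
    }

    public static Connection wrap(Connection target) {
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            new TrackingConnection(target));
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if ("close".equals(method.getName())) {
            // Close tracked statements first; a buggy driver or pool
            // might not do this for us.
            for (Statement s : openStatements) {
                try { s.close(); } catch (Exception ignored) { }
            }
            openStatements.clear();
        }
        Object result;
        try {
            result = method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
        // createStatement, prepareStatement and prepareCall all return
        // some subtype of Statement; remember whatever comes back.
        if (result instanceof Statement) {
            openStatements.add((Statement) result);
        }
        return result;
    }
}
```

    A DAO could then call TrackingConnection.wrap(...) on the connection it gets from the pool before handing it to its command objects, and close everything with a single Connection.close() in a finally clause.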
  26. I've seen this problem often enough, and recently enough, to believe that it's reasonably common for connections not to cascade the cleanup, despite what the Javadocs say. I don't think this behaviour was explicitly mentioned in the JDBC specs until JDBC 3.0 (although it was implemented by a lot of drivers). I think you're quite right to mention connection pools - my guess is there are a fair number around that don't keep track of the resources created by the connections they hand out.
    Call me old-fashioned, but I've been bitten enough times to believe that it's essential to aggressively release database resources yourself and not rely on the connection to do it for you. Even if the driver and connection pool you are using right now do the right thing, do you really want to have to debug a resource leak at some time in the future if you change them?
    I agree completely that you don't want duplicate finally blocks littered all over your code. Proxying the connection is one way to achieve this, but then you'll still need to remember to have finally blocks within which you call close on your (proxied) connection. I'd rather have something like Spring's JdbcTemplate that I can construct on a DataSource and call with some SQL and a callback object. That way my application-specific code is completely separated from both acquisition and disposal of database resources.
  27. > I've seen this problem often enough, and recently enough, to believe that it's reasonably common for connections not to cascade the cleanup, despite what the Javadocs say. I don't think this behaviour was explicitly mentioned in the JDBC specs until JDBC 3.0 (although it was implemented by a lot of drivers). I think you're quite right to mention connection pools - my guess is there are a fair number around that don't keep track of the resources created by the connections they hand out.


    You're right. Drivers and pools not cleaning up resources properly is a cause for concern. You've inspired me to check my current setup.

    I'm not sold on passing a ResultSet to my command object, though I do agree it's a good and safe solution. Like I said, I can use a Connection proxy and be just as safe, and have the ability to work with multiple ResultSets at once and to reuse the same Connection between multiple child commands without going back to the connection pool every time and relying on it to give me back the same connection (do connection pools typically even do this without a tx context?).
  28. This code is *not* quite nasty

    Hi,
    Another argument for closing result sets and statements is to avoid this scenario:
    You have a connection pool and there are a lot of queries to the database. The connections are shared between users and are closed very rarely, only when unused for a long period of time. This is the case in most application servers. If the result sets are not closed, you end up with a lot of open cursors on the database.
    Eventually, you will get a "Maximum open cursors ..." exception.
    Mircea.
  29. This code is *not* quite nasty

    > You have a connection pool and there are a lot of queries to the database. The connections are shared between users and are closed very rarely, only when unused for a long period of time. This is the case in most application servers. If the result sets are not closed, you end up with a lot of open cursors on the database.


    As we said, it's the connection pool's responsibility to make sure the resources get cleaned up before it gives the connection to another client.