Pragmatic Project Automation
By Greg Ostravich
01 Oct 2004 | TheServerSide.com
Dave Thomas and Andy Hunt, who brought you “The Pragmatic Programmer”, have designed a series of books to introduce readers in a hands-on way to CVS, JUnit, and project automation. The books are designed as recipe books for beginners and as a way for veteran Java developers or architects on a project to convince others of the value of project automation, unit testing, and version control. This review is for the Project Automation book in the series. I have read the JUnit book but have never automated builds for any project I have worked on. I have, however, attended talks at my local Java User Group on project automation, so I understand what it is and know some of the tools (CruiseControl, Anthill) that are out there. I haven’t read the CVS book in the series because I use MS Visual SourceSafe as my configuration management tool at work. As a result of reading this book, I look forward to automating the builds for some of my projects at work and to knowing which tools are best for my environment.
The book starts out with a great fictional success story where the lead developer on a project is going on a family vacation before a big demo and forgets to check in his file for the build. Because the project is so automated, he realizes his mistake and leaves a message for the remainder of the team. Because many of the aspects of the project have been automated, it is very easy to isolate and fix the problems. This makes the demo a success. In the past, I have seen large builds go up in smoke during integration and I can see where there is value in automatically creating a build to make sure everything is still working.
Mike Clark treats commanded automation, scripting something that used to be done manually, as the basis of all automation. On top of that, you have triggered automation and scheduled automation. Triggered automation runs a build after a file has been checked in; scheduled automation runs it periodically, through what I assume must be a cron job or an ‘at’ job depending on your operating system. The things required for automation are some sort of version control system, automated tests, scripting, and communication devices to let you know the status of a build. The book mentions a cool Lava Lamp system for the communication device, but if you check the Pragmatic blog there are also plug-ins for your system tray that can be effective as well. The reasons to automate are to free your time from the mundane tasks of the build process and to make each build consistent and repeatable. The book also discusses when to automate – be pragmatic – and how often to automate. Mike Clark includes an automation road map that covers various monitoring tools: visual tools (Lava Lamps, system tray icons), RSS, log4j, and cell phones or pagers.
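To make the difference between triggered and scheduled automation concrete, here is a minimal sketch in Java. It is not taken from the book or from CruiseControl; the class name `BuildTrigger` and both method names are made up for illustration, and a real tool would get the check-in time from the version control system.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch: decides whether an automated build should start now. */
final class BuildTrigger {

    /** Triggered automation: build only if something was checked in after the last build. */
    static boolean triggeredBuildDue(Instant lastBuild, Instant lastCheckIn) {
        return lastCheckIn.isAfter(lastBuild);
    }

    /** Scheduled automation: build once the fixed interval has elapsed, whether or not anything changed. */
    static boolean scheduledBuildDue(Instant lastBuild, Instant now, Duration interval) {
        return !now.isBefore(lastBuild.plus(interval));
    }

    public static void main(String[] args) {
        // Sample timestamps, invented for the demonstration.
        Instant lastBuild = Instant.parse("2004-10-01T08:00:00Z");
        Instant lastCheckIn = Instant.parse("2004-10-01T08:30:00Z");
        Instant now = Instant.parse("2004-10-01T09:00:00Z");

        System.out.println("triggered build due: " + triggeredBuildDue(lastBuild, lastCheckIn));
        System.out.println("scheduled build due: "
                + scheduledBuildDue(lastBuild, now, Duration.ofHours(1)));
    }
}
```

Either check could be polled from a loop or a cron-style job; the point is only that one reacts to check-ins and the other to the clock.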
The first kind of build the book talks about is a one-step build. Mike Clark argues that the one-step build is necessary for consistency, not just efficiency, and gives us an acronym to describe it: CRISP, which stands for Complete, Repeatable, Informative, Schedulable, and Portable. Without going into too much detail on the acronym, the analogy the book gives is that it is like baking a cake. You start from scratch every time you bake it, using the exact same recipe and ingredients. You have feedback while you’re baking your cake – a thermometer, for example, to make sure the temperature is right, or poking the cake with a fork to see if it is done. Portability means you can bake it in any type of oven and the same recipe still comes out OK. The book doesn’t necessarily extend this analogy to Windows versus UNIX versus OS X builds, but says the build on a particular UNIX box should work on any UNIX box, and the IDE and other machine-specific details shouldn’t matter. One item in the acronym, Schedulable, isn’t included in the analogy. I suppose a baking timer that kicks off automatically to cook when you’re gone might be analogous to scheduling builds. Scheduling builds gives the developers confidence in their product and one less headache to deal with – why did the build fail this time? You can more easily narrow down when a build might have failed – who checked code in last? It is also easier to debug a failed build because there is not as much pressure on the developer. There is less pressure because the developer receives immediate feedback on a failed build. Otherwise, the developer waits to do the build before a demo, and when it fails they try to remember what they might have done to break the build in the many changes up to that point. The book also points out that a ‘build’ from an IDE (usually just a compile, not really a build) has some drawbacks, including difficulty in scheduling it for automation.
Project Build using Ant
The book has a good suggestion for the directory structure for your project. This was covered in the JUnit book as well, but it definitely needs to be in each book of the series since they each stand alone. In the discussion of Ant, the book gives you code snippets so you can build your file one piece at a time. This accurately reflects how most people I know learn Ant. Instead of starting with a fully loaded pre-existing build file, they take the skeleton and slowly add to it until it becomes a full-scale build. I realize that may change with the advent of tools like AppFuse that build everything for you. The nice thing about Mike’s book is that it contains some information about JUnit and sample Ant builds that was hard for me to find elsewhere in the past. Even though I own an awesome Ant book by Erik Hatcher and Steve Loughran called “Java Development with Ant”, I really appreciated the examples and explanations in Mike’s book. One example of this is the <batchtest> element of Ant’s <junit> task, which runs all the unit tests in a directory. In the past, I’ve had a class that specified my unit tests explicitly. If you use <batchtest>, you can automatically create a suite of unit tests using a fileset.
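The idea behind collecting tests with a fileset, rather than listing them by hand, can be sketched in plain Java. This is only an illustration of pattern-based discovery, not how Ant implements it: the class `TestDiscovery` is hypothetical, and the convention that test classes end in `Test` is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of Ant or JUnit: finds test classes by naming pattern. */
final class TestDiscovery {

    /** Returns the sorted, fully qualified names of classes under root whose file name ends in Test.class. */
    static List<String> findTestClasses(Path root) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(TestDiscovery::looksLikeTest)
                    .map(file -> toClassName(root.relativize(file)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Assumed convention: test classes end in "Test"; nested classes (with '$') are skipped. */
    private static boolean looksLikeTest(Path file) {
        String name = file.getFileName().toString();
        return name.endsWith("Test.class") && name.indexOf('$') < 0;
    }

    /** Turns a relative path such as com/example/OrderTest.class into com.example.OrderTest. */
    private static String toClassName(Path relative) {
        StringBuilder name = new StringBuilder();
        for (Path part : relative) {
            if (name.length() > 0) {
                name.append('.');
            }
            name.append(part.toString());
        }
        return name.substring(0, name.length() - ".class".length());
    }

    public static void main(String[] args) throws IOException {
        // Scans the directory given as the first argument, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        findTestClasses(root).forEach(System.out::println);
    }
}
```

The benefit is the same one I got from <batchtest>: a new test class is picked up as soon as it follows the naming convention, with no master list to maintain.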
Continuous Automation – Groovy!
In the section on continuous automation, Mike Clark introduces us to Groovy. I had attended a talk on Groovy that Rod Cope gave at my local Java User Group and found it intriguing. Groovy is a scripting language that includes some of the features of Tiger (1.5 / 5.0 JDK) and can compile into Java bytecode. It uses Java syntax and can be written with a 1.4 JDK. The book uses Groovy because it can call Ant tasks directly. As an aside, I’m really glad I have broadband at home to download all the tools and code. I’m thrilled to try out Groovy because the DenverJUG talk on it was really good, but I feel sorry for anybody who still has 56K – Groovy and CruiseControl are each a 10 MB download. Maybe they should provide a CD with the tool set for the book, something like the OpenLogic (formerly Out of the Box) CD. The Groovy install turned out to be a little more difficult than I thought. Groovy installed and ran (the console and shell) OK, but when I ran the Groovy Ant build script, it couldn’t find the Java compiler (javac) to compile my code. I asked the author for advice, and the problem was that I didn’t have tools.jar explicitly specified on my classpath. Once I added tools.jar to my classpath, my Groovy Ant build worked.
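A related check can be made from Java itself. The sketch below asks the running JVM whether a compiler is visible to it, which is the same kind of problem I hit with javac. It relies on `javax.tools.ToolProvider`, which was only added in Java 6, so it was not available for the setup described in this review; the class name `CompilerCheck` is hypothetical.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

/** Hypothetical diagnostic: reports whether this JVM can see a Java compiler. */
final class CompilerCheck {

    /** Returns a one-line description of compiler availability for the running JVM. */
    static String describeCompiler() {
        // getSystemJavaCompiler() returns null when no compiler is provided with this platform.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        String home = System.getProperty("java.home");
        if (compiler == null) {
            return "No Java compiler visible (java.home=" + home + ")";
        }
        return "Java compiler available (java.home=" + home + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeCompiler());
    }
}
```

Printing `java.home` alongside the result helps show which Java installation the build is actually running on.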
Groovy may be a good choice for complex builds because of the power of scripting and being able to easily get access to nested filesets using the <fileset> element. We haven’t got to the automation part yet, but I can see that for some tasks you could use a tool like Quartz to automate your build with Groovy. The book suggests that for simple builds there may not be a big advantage to using Groovy over Ant build.xml files. An alternative pure scripting approach I hadn’t heard of, called rake (http://rake.rubyforge.org), was mentioned too. More sage advice: set up your automation as the first step, and don’t wait until the build dies because somebody is missing a jar. It seems he suggests approaching the project build the way you would approach test-driven development.
Mike introduces us to CruiseControl by talking about the alternatives. Sure, you could write batch files or shell scripts to automate your builds with cron or at jobs, but then you end up spending time on maintenance that could be better spent elsewhere. CruiseControl is like cron for Ant. The book suggests putting CruiseControl on a dedicated box – it doesn’t have to be beefy or top of the line – and letting it run continuously. For those of you in the .NET community, the book mentions that there’s a CruiseControl for .NET (http://ccnet.thoughtworks.com) as well as NAnt, an Ant-like tool for .NET builds. I downloaded and walked through the demo. In the step where we automate things including check-in and check-out of project files, it would be nice if the book included the CVS steps necessary to create the project in CVS. I understand the book may not have done this because there is a whole book in the series dedicated to CVS, but it still would have been nice. I did refer to my Pragmatic CVS book to set up my project for the automation book. I’m not sure if I did it correctly, but what I did was download and set up CVS, then check my source tree into the C:\src\master directory (CVSROOT) by running the following command in my dms directory.
cvs -d C:\src\master import -m "" dms dms initial
This seemed to work because when I checked out the source tree into my ~\builds\checkout directory I got a DMS tree that looks OK. I ran into one more problem, and it’s related to my lack of CVS knowledge. You have to add JAR files using the binary switch (cvs add -kb some.jar); I hadn’t, and that was causing my JAR to be corrupt.
The book touches on an important point here. If your project is set up this way, it’s very easy to have a new developer on the project do a checkout and have everything ready to go. I’ve been on past projects where configuration for new developers wasn’t so simple. Suggestions are given on how often you should build, and they depend on the types of tests you’ll run for a particular build type. Each test type may be run with a different frequency. The book does mention Anthill here and what the differences are between Anthill and CruiseControl. Decide what works best for your environment, but pick something and start automating. The book also demonstrates how to publish the results of your automated build and e-mail the team when a build fails and again when the build has been fixed.
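The "e-mail on failure, and again on the fix" behavior comes down to comparing the previous build status with the current one. Here is a small sketch of that decision in Java. It is not CruiseControl's implementation; the class `BuildNotifier`, the `Status` enum, and the message texts are hypothetical, and the choice to notify on every failed build (not just the first) is an assumption.

```java
import java.util.Optional;

/** Hypothetical sketch of the notification decision; sending the e-mail itself is left out. */
final class BuildNotifier {

    enum Status { PASSED, FAILED }

    /** Returns the message to send after a build, or empty if the team need not be bothered. */
    static Optional<String> messageFor(Status previous, Status current) {
        if (current == Status.FAILED) {
            // Assumption: every failed build is reported, not only the first one.
            return Optional.of("Build failed");
        }
        if (previous == Status.FAILED) {
            // The build passes now but failed last time, so it has just been fixed.
            return Optional.of("Build fixed");
        }
        // Passing build after a passing build: nothing to report.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Status[] history = { Status.PASSED, Status.FAILED, Status.FAILED, Status.PASSED, Status.PASSED };
        for (int i = 1; i < history.length; i++) {
            System.out.println(history[i] + " -> " + messageFor(history[i - 1], history[i]).orElse("(no mail)"));
        }
    }
}
```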
The book goes into detail describing what constitutes a ‘Release’ of your software and then shows the reader how to use automation to create a release. The release can be the current version, or an older version of the project. The description of what the release includes is more detailed than just the actual code required to run the release – it includes things like feature lists and instructions for installation. Mike Clark shows us how to create a release branch in CVS that will contain the snapshot of our code for the release we’re generating. We can go back to it if we need to and will be able to merge code back into our main repository.
The Next Step: Building a Package
The book has the reader create a distribution file – the final product to ship to the users. Creating this distribution package is best done through Ant, and we’re given a sample build called package.xml that accomplishes this task. We also learn how to make a professional-looking installation package. There are great tools out there to do everything we need to do. The book looks at NSIS, which creates an installation package similar to the Tomcat installation. The book covers both fat clients and J2EE deployments and explains how to automate them. There are success stories about deployments that aren’t so easy to automate, like WebSphere, and I hope the knowledge of how to do that finds its way to the Pragmatic Project Automation blog. The final portion of Releases covered in the book is how to automate updates of your deployment using Java Web Start. I know one of our speakers at the Denver Java User Group uses this at their company to deploy a fat Swing client to their user community and to update the jars when the application changes. Mike Clark does write about the limitations of Java Web Start, though, and suggests that better technology will probably come along.
Diagnostics / How to re-use those JUnit tests
The book also shows us how to create diagnostic tests using JUnit. These tests are for the user to run after end-user installation when they encounter errors. This is a great way to catch configuration errors. You can add to the test suite as you find new errors – the book suggests re-using a lot of your unit tests so you could get twice the bang for your buck. Use them on the development side, and when appropriate on the deployment side, as a diagnostic tool.
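The book builds its diagnostic tests on JUnit. To keep the example free of external libraries, the sketch below shows the same idea in plain Java: a list of named checks that an end user can run after installation, with a report of which ones failed. The class `InstallDiagnostics` and the two sample checks are hypothetical; real checks would look at whatever your application needs (config files, database connections, and so on).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical post-install diagnostics runner; the book uses JUnit tests for this purpose. */
final class InstallDiagnostics {

    // LinkedHashMap keeps the checks in the order they were registered.
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    /** Registers a named check and returns this runner so calls can be chained. */
    InstallDiagnostics check(String description, BooleanSupplier check) {
        checks.put(description, check);
        return this;
    }

    /** Runs every check and returns the descriptions of those that failed or threw. */
    List<String> run() {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
            boolean ok;
            try {
                ok = entry.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                ok = false; // a check that blows up counts as a failed check
            }
            if (!ok) {
                failures.add(entry.getKey());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        List<String> failures = new InstallDiagnostics()
                .check("java.home is an existing directory",
                        () -> Files.isDirectory(Paths.get(System.getProperty("java.home"))))
                .check("temp directory is writable",
                        () -> Files.isWritable(Paths.get(System.getProperty("java.io.tmpdir"))))
                .run();
        System.out.println(failures.isEmpty() ? "All checks passed" : "Failed checks: " + failures);
    }
}
```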
You need to know when builds fail, and since not everyone might be at their desk checking e-mail, the book presents a few different options. One option is to send a text message to a cell phone or pager. Another is to publish an RSS feed that you can view in your aggregator. Maybe you’ll just use a monitor to display status and metrics about the build. You could also set up lava lamps – one red to indicate a broken build and one green to indicate the build is clean. You can be creative in what you do, but the point the book makes is that you can automate these ways of monitoring your system through CruiseControl. The book also shows us how to monitor systems in production by screen scraping or by examining the log files. Monitoring can be especially important if your team is not co-located. If half your team is in India, you may be losing valuable time if a build breaks between the Eastern hemisphere shift and the time the group in the Western hemisphere clocks in.
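Examining log files for trouble can be as simple as filtering for error-level entries. The following Java sketch does that for a list of log lines. It is an illustration only: the class `LogMonitor` is hypothetical, and it assumes a log layout in which the level appears as the word `ERROR` surrounded by spaces, which depends entirely on how your logging is configured.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical log scanner; assumes the level appears as " ERROR " within each line. */
final class LogMonitor {

    /** Returns only the lines logged at ERROR level, in their original order. */
    static List<String> errorLines(List<String> logLines) {
        return logLines.stream()
                .filter(line -> line.contains(" ERROR "))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample lines invented for the demonstration.
        List<String> sample = Arrays.asList(
                "2004-10-01 09:00:00,001 INFO  [main] application started",
                "2004-10-01 09:00:01,002 ERROR [main] connection refused",
                "2004-10-01 09:00:02,003 WARN  [main] retrying");
        errorLines(sample).forEach(System.out::println);
    }
}
```

In practice the lines would be read from the production log file, and a non-empty result would feed whichever notification channel the team has chosen.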
This book is a great introduction to project automation for beginners, but it has lots of examples for intermediate and advanced users as well. It is a great book for showing the value of project automation on projects that are not yet taking advantage of it. My favorite feature of this book is the demos that allow you to ‘play along’ and learn by example. The demos are tremendous tutorials for those of us new to the topic. There are also lots of specific samples and code/XML snippets for setting up and running your CruiseControl build. Appendix A has a list of the projects used in the book and where to get them. The book takes us from push-button builds to build automation to everything we need to make a deployable application and monitor those applications or builds. Being able to take these code snippets and package creation instructions for your own use makes this book worth having in your reference library. I enjoyed Pragmatic Project Automation and its high level of detail, and I recommend purchasing this book.
Find out more at http://www.pragmaticprogrammer.com/starter_kit/auto/