July 2004
Overview
“Argh! I hate these dependencies.” We’ve all said it once,
or at least thought it. Every project has dependencies, and managing them
can be a nightmare; the larger the project, the more dependencies it has.
Savant is the latest sub-project of Verge, a collection of open source projects
from Inversoft, and it provides the means to handle these dependencies in a
simple and efficient manner. Verge does not have the most complex build ever
written, but it does cover most of the common problems inherent in builds.
We’ll demonstrate how Savant helped alleviate the build pains of all the
Verge projects.
What is Savant?
Savant is an extension to the Ant build system from Apache. Ant allows developers
to build new tasks and types that can be called from a build file in order to
handle tasks not covered by Ant itself. Savant is simply a collection of tasks
and types that allow projects to declare their dependencies, perform common
tasks on these dependencies (such as copy) and publish items that the project
produces.
Savant uses the concept of artifacts to represent a single dependency, usually
a library or executable. Artifacts are ordered into groups and projects. Groups
represent any entity that produces artifacts. The Apache foundation is a good
example of a group. A project belongs to a group and produces artifacts.
Any group can contain any number of projects and projects can produce any number
of artifacts.
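The group, project and artifact hierarchy can be pictured as a small data model. The following Java sketch is only an illustration and is not taken from Savant; the class name ArtifactId and its methods are hypothetical, while the five fields mirror the attributes that appear on the savant:artifact elements in the build files shown in this article.

```java
import java.util.Objects;

/** Hypothetical value type: one artifact identified by group, project, name, version and type. */
final class ArtifactId {
    final String group;    // entity that produces artifacts, e.g. "verge"
    final String project;  // project within the group, e.g. "Verge Commons"
    final String name;     // artifact name, e.g. "verge-common"
    final String version;  // e.g. "2.3"
    final String type;     // e.g. "jar"

    ArtifactId(String group, String project, String name, String version, String type) {
        this.group = Objects.requireNonNull(group);
        this.project = Objects.requireNonNull(project);
        this.name = Objects.requireNonNull(name);
        this.version = Objects.requireNonNull(version);
        this.type = Objects.requireNonNull(type);
    }

    /** File name in the form name-version.type, as in verge-common-2.3.jar. */
    String fileName() {
        return name + "-" + version + "." + type;
    }
}

class ArtifactIdDemo {
    public static void main(String[] args) {
        ArtifactId commons =
            new ArtifactId("verge", "Verge Commons", "verge-common", "2.3", "jar");
        System.out.println(commons.fileName());
    }
}
```

Keeping the group and the project as separate fields is what lets one group own many projects and one project own many artifacts.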
Savant uses a local cache to store the downloaded or otherwise resolved dependencies
on the local file system. This allows developers to work disconnected from the
network after the first build has downloaded or resolved the dependencies. It
can also greatly reduce the time for builds to run because resolving dependencies
can sometimes take a long time.
Verge before Savant
First let’s take a look at how the Verge projects were built before
Savant. There are currently three projects (excluding Savant) in the Verge family
of projects. These projects are Verge Commons, Verge JUnit, and Verge Web. The
current inter-project dependencies look like this:
[Figure: dependency diagram of the Verge projects]
The Verge Web project depends on both Verge Commons and Verge JUnit. All of
these projects also depend on various third party libraries. In the past, all
of these projects had a lib directory that contained all the
dependencies, both 3rd party and local. Here’s what the lib
directory of Verge Web looked like:
lib/
  commons-httpclient.jar
  log4j-1.2.7.jar
  standard.jar
  jdom-b8.jar
  jstl.jar
  junit.jar
  verge-common.jar
  verge-junit.jar
At first this worked well because the projects didn’t change much. The
verge-common.jar file would be copied from the Verge Commons project by hand
after it was built. The same was done for the Verge JUnit project’s verge-junit.jar.
Since both of these projects changed so rarely, it was not that difficult to
remember to do this. However, once work started on version 2.0, this became
a nightmare. We constantly had to update the verge-common.jar and verge-junit.jar
files to keep them up-to-date, and unsurprisingly the CVS revision numbers for
these JAR files became impossible to manage (1.208, for example). In addition,
we had very little idea which versions of the 3rd party libraries we were using,
having just dropped the JAR files into the lib directory without thinking of
putting a version in the name.
In order to remedy the inter-project dependencies, we did a bit of Ant hackery
so that the Verge Web project would build the Verge Commons and Verge JUnit projects
as part of its build. We also had to make certain that, during deployment of the
Verge Web project, the dependencies were copied into the correct location (i.e. into
our J2EE example applications). This solution worked fairly well but was a bit
messy and was nearly impossible for new developers to figure out without guidance
or a tutorial. It also had the drawback that all the projects had to be checked
out when doing development.
Verge with Savant
In order to solve all of our problems, we decided to design a tool that would
allow us to keep our existing build files but handle all the dependencies
for us. We wanted to remove all the 3rd party and inter-project libraries from
each of the projects and instead let each project declare its dependencies.
To accomplish this, the Savant dependency task was created. This task allowed
projects to specify what local projects and artifacts they depended on and how
to fetch the artifacts. We also built in the ability to execute builds for other
projects so that Verge Web would build Verge Commons and Verge JUnit
automatically.
One other requirement was that, since http://ibiblio.org
was already hosting a large repository of 3rd party libraries for Maven builds,
we wanted a way to leverage that repository for our builds. To satisfy this
requirement, we added support for Maven style repositories to Savant.
We also needed a way to publish the artifacts that Verge Commons and Verge JUnit
produced to the local cache. This was needed so that when the Verge Web build
called the build for Verge Commons, Verge Web would be able to retrieve
the artifacts produced by Verge Commons. Unless Verge Commons published its
artifacts during this process, Verge Web would not be able to find them easily.
By providing a means to publish artifacts we were also ensuring that developers
could work on projects without having to check out other local projects.
After the initial version of Savant was complete, we added the new definitions
to the build files for Verge Commons, Verge JUnit and Verge Web. Let’s
take a look at what was added to the build for Verge Web, since it depends on
the other projects and on many artifacts.
First we started by adding the dependency definitions. The first dependency
definition included all the artifacts from the other Verge projects. In order
to configure Savant to handle other local projects, we needed to add definitions
of those local projects. Here is our dependency definition with the project
definitions inside it:
<savant:dependency>
  <savant:project name="Verge Commons" group="verge" dir="../common" target="release"/>
  <savant:project name="Verge JUnit" group="verge" dir="../junit" target="release"/>
  ...
</savant:dependency>
Once we added the project definitions we needed to add the artifacts from those
projects that Verge Web depended on. When Savant encounters an artifact from
a local project, it attempts to build that local project out of the directory
specified using the target specified (both on the project definition). Each
local project is only built once; therefore multiple artifacts from a single
project can be included without slowing down the build. Here is our dependency
definition with the projects and an artifact group that contains the artifacts
from those projects:
<savant:dependency>
  <savant:project name="Verge Commons" group="verge" dir="../common" target="release"/>
  <savant:project name="Verge JUnit" group="verge" dir="../junit" target="release"/>
  <savant:artifactgroup id="group.deps" classpathid="classpath.deps">
    <savant:artifact group="verge" projectname="Verge Commons" name="verge-common" version="2.3" type="jar"/>
    <savant:artifact group="verge" projectname="Verge JUnit" name="verge-junit" version="2.3" type="jar"/>
  </savant:artifactgroup>
  ...
</savant:dependency>
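The rule that each local project is built only once, no matter how many of its artifacts are listed, can be sketched as a small memoizing helper. This Java sketch is an illustration under stated assumptions and not Savant’s implementation: the class LocalProjectBuilder, its method names, and the use of a Runnable to stand in for the nested Ant build are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that runs the build of each local project at most once. */
final class LocalProjectBuilder {
    private final Set<String> alreadyBuilt = new HashSet<>();
    private int buildCount = 0;

    /**
     * Runs the given build unless the project was built earlier in this run.
     * The Runnable stands in for invoking the project's Ant target.
     * Returns true if the build ran and false if it was skipped.
     */
    boolean buildIfNeeded(String group, String projectName, Runnable build) {
        String key = group + "/" + projectName;
        if (!alreadyBuilt.add(key)) {
            return false; // an earlier artifact already triggered this build
        }
        build.run();
        buildCount++;
        return true;
    }

    int buildCount() {
        return buildCount;
    }
}

class LocalProjectBuilderDemo {
    public static void main(String[] args) {
        LocalProjectBuilder builder = new LocalProjectBuilder();
        Runnable fakeBuild = () -> System.out.println("building...");
        // Two artifacts from Verge Commons, one from Verge JUnit.
        builder.buildIfNeeded("verge", "Verge Commons", fakeBuild);
        builder.buildIfNeeded("verge", "Verge Commons", fakeBuild);
        builder.buildIfNeeded("verge", "Verge JUnit", fakeBuild);
        System.out.println("builds executed: " + builder.buildCount());
    }
}
```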
One thing to notice about our dependency definition is that we have specified
a classpathid on the artifact groups. This classpathid is used by Savant to
add all the artifacts inside that group to a path (see the Ant manual for more
information on paths and classpaths). If the path does not exist, Savant creates
it and then adds the artifacts. If it does exist, the artifacts are added to
the existing path. By using the classpathid, the path that is created can be
used in our javac call for the classpath used during compilation. Here is what
our javac call looks like:
<javac srcdir="${dir.src}" destdir="${dir.compile}" deprecation="${build.deprecation}"
       source="${build.source}" debug="${build.debug}">
  <classpath>
    <path refid="classpath.deps"/>
  </classpath>
</javac>
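The create-or-append behaviour of classpathid can be pictured with a small registry keyed by path id. This is a minimal Java sketch and not how Savant or Ant store paths internally; PathRegistry and its methods are hypothetical names, and plain file names stand in for resolved artifact locations.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry mapping a path id to the artifacts added to it so far. */
final class PathRegistry {
    private final Map<String, List<String>> paths = new HashMap<>();

    /** Creates the path on first use, then appends; later calls extend the same path. */
    void addToPath(String pathId, String artifactFile) {
        paths.computeIfAbsent(pathId, id -> new ArrayList<>()).add(artifactFile);
    }

    /** Returns the entries of a path in insertion order, or an empty list if it is unknown. */
    List<String> path(String pathId) {
        List<String> entries = paths.get(pathId);
        return entries == null
            ? Collections.<String>emptyList()
            : Collections.unmodifiableList(entries);
    }
}

class PathRegistryDemo {
    public static void main(String[] args) {
        PathRegistry registry = new PathRegistry();
        // First artifact group creates the path, later groups extend it.
        registry.addToPath("classpath.deps", "verge-common-2.3.jar");
        registry.addToPath("classpath.deps", "verge-junit-2.3.jar");
        registry.addToPath("classpath.deps", "j2ee-1.3.1.jar");
        System.out.println(registry.path("classpath.deps"));
    }
}
```

Because an existing path is extended and not replaced, several artifact groups can feed the same classpath.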
Next, we needed to include all the artifacts available from http://ibiblio.org
that Verge Web depends on. To accomplish this, we added the definitions for
those artifacts to the artifact group we just created. We could have also created
another artifact group, but the existing one would suffice. Here is our definition
with the extra artifacts:
<savant:dependency>
  <savant:project name="Verge Commons" group="verge" dir="../common" target="release"/>
  <savant:project name="Verge JUnit" group="verge" dir="../junit" target="release"/>
  <savant:artifactgroup id="group.deps" classpathid="classpath.deps">
    <savant:artifact group="apache" projectname="commons-httpclient" name="commons-httpclient" version="2.0" type="jar"/>
    <savant:artifact group="apache" projectname="log4j" name="log4j" version="1.2.7" type="jar"/>
    <savant:artifact group="apache" projectname="taglibs" name="standard" version="1.0.2" type="jar"/>
    <savant:artifact group="jdom" projectname="jdom" name="jdom" version="b8" type="jar"/>
    <savant:artifact group="java" projectname="jstl" name="jstl" version="1.0.2" type="jar"/>
    <savant:artifact group="junit" projectname="junit" name="junit" version="3.8.1" type="jar"/>
    <savant:artifact group="verge" projectname="Verge Commons" name="verge-common" version="2.3" type="jar"/>
    <savant:artifact group="verge" projectname="Verge JUnit" name="verge-junit" version="2.3" type="jar"/>
  </savant:artifactgroup>
  ...
</savant:dependency>
There was one last piece of the puzzle to complete so that Savant would download
from http://ibiblio.org all the artifacts that did not come from Verge Commons
and Verge JUnit: we needed to define a workflow. A workflow is a series
of steps that Savant takes in order to resolve artifacts. Each step in the workflow
is called a process. To meet our original requirement with respect
to http://ibiblio.org, we created
the Maven process, which contacts a remote repository and downloads artifacts
from Maven style directories (see the Maven documentation for more information).
Here is our dependency definition with the workflow:
<savant:dependency>
  <savant:project name="Verge Commons" group="verge" dir="../common" target="release"/>
  <savant:project name="Verge JUnit" group="verge" dir="../junit" target="release"/>
  <savant:artifactgroup id="group.deps" classpathid="classpath.deps">
    <savant:artifact group="apache" projectname="commons-httpclient" name="commons-httpclient" version="2.0" type="jar"/>
    <savant:artifact group="apache" projectname="log4j" name="log4j" version="1.2.7" type="jar"/>
    <savant:artifact group="apache" projectname="taglibs" name="standard" version="1.0.2" type="jar"/>
    <savant:artifact group="jdom" projectname="jdom" name="jdom" version="b8" type="jar"/>
    <savant:artifact group="java" projectname="jstl" name="jstl" version="1.0.2" type="jar"/>
    <savant:artifact group="junit" projectname="junit" name="junit" version="3.8.1" type="jar"/>
    <savant:artifact group="verge" projectname="Verge Commons" name="verge-common" version="2.3" type="jar"/>
    <savant:artifact group="verge" projectname="Verge JUnit" name="verge-junit" version="2.3" type="jar"/>
  </savant:artifactgroup>
  <savant:workflow>
    <savant:mavenprocess url="http://ibiblio.org"/>
  </savant:workflow>
</savant:dependency>
Savant also supports other processes including a savantprocess, which can also
be used to download artifacts from remote locations, but is more robust with
respect to URL structures; and the cvsprocess, which can be used to fetch artifacts
from a CVS module. Savant allows for custom processes to be created and added
to the workflow if needed.
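A workflow of processes can be thought of as an ordered list of strategies that are tried one after another. The Java sketch below illustrates that idea only. The interface ResolveProcess, the class Workflow, and the choice to consult the local cache before any process and to stop at the first process that succeeds are assumptions made for this example; an in-memory map stands in for the cache directory.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical process: one workflow step that may be able to supply an artifact. */
interface ResolveProcess {
    /** Returns the artifact content, or null if this process cannot supply it. */
    byte[] fetch(String artifactFile);
}

/** Hypothetical workflow: checks the cache, then tries each process in order. */
final class Workflow {
    private final Map<String, byte[]> localCache = new HashMap<>(); // stand-in for the cache directory
    private final List<ResolveProcess> processes;

    Workflow(List<ResolveProcess> processes) {
        this.processes = new ArrayList<>(processes);
    }

    byte[] resolve(String artifactFile) {
        byte[] cached = localCache.get(artifactFile);
        if (cached != null) {
            return cached; // already resolved by an earlier build
        }
        for (ResolveProcess process : processes) {
            byte[] found = process.fetch(artifactFile);
            if (found != null) {
                localCache.put(artifactFile, found); // remember for later builds
                return found;
            }
        }
        throw new IllegalStateException("Unable to resolve " + artifactFile);
    }
}

class WorkflowDemo {
    public static void main(String[] args) {
        ResolveProcess emptyRepository = file -> null;
        ResolveProcess fakeRemote = file -> file.getBytes(StandardCharsets.UTF_8);
        Workflow workflow = new Workflow(Arrays.asList(emptyRepository, fakeRemote));
        System.out.println(workflow.resolve("log4j-1.2.7.jar").length + " bytes resolved");
    }
}
```

A custom process would simply be another implementation of the same interface placed at the desired position in the list.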
We realized that we still required the J2EE 1.3 artifact so that Verge Web
would compile correctly and http://ibiblio.org
did not have that artifact. In order to fix this problem, we downloaded the
artifact from Sun and created a new CVS directory called extralibs.
We then added this artifact to that directory under a sub-directory named java/j2ee.
In order to allow Savant to find this artifact, all we needed to do was to point
Savant’s local cache to the extralibs directory and Savant
would do the rest. To accomplish this, we added another dependency definition
with a custom local cache location and a single artifact (inside an artifact
group). Here is that dependency definition:
<savant:dependency localcache="../extralibs">
  <savant:artifactgroup classpathid="classpath.deps">
    <savant:artifact group="java" projectname="j2ee" name="j2ee" version="1.3.1" type="jar"/>
  </savant:artifactgroup>
</savant:dependency>
As you can see, this dependency definition did not require a workflow because
we knew that Savant would always find the artifact in the local cache. That
was all that was needed in order for the Verge Web project to begin building
the Verge Commons and Verge JUnit projects automatically as well as downloading
and fetching all its dependencies.
Before the build for Verge Web would work correctly, we still needed to update
the other project build files so that they would publish their artifacts. This
is something important to remember: when a local project is built via a dependency
definition, that project must publish its artifacts to the local cache so that
they can be used by the current project. In order for Verge Web to find the
verge-common-2.3.jar artifact, we needed to add a publish definition to the
Verge Commons build file. The Verge Commons publish definition looks like this:
<savant:publish from="${dir.dist}/verge-common.jar">
  <savant:artifact id="verge-common" group="verge" projectname="Verge Commons" name="verge-common" version="2.3" type="jar"/>
</savant:publish>
This definition publishes the verge-common-2.3.jar artifact under the group
verge and project name of Verge Commons. Publishing
is the act of creating the artifact by copying a file into the local cache.
One important point about this is that the local file can have any name and
Savant will rename it during publishing to the correct artifact name. Additionally,
Savant supports the publishing of an unlimited number of artifacts from a single
project.
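Publishing amounts to a copy plus a rename. The Java sketch below shows that idea with the standard java.nio.file API. It is not Savant’s code: the class PublishSketch is hypothetical, and the group/project directory layout inside the cache is an assumption made for illustration. Only the resulting file name in the form name-version.type comes from the description above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class PublishSketch {
    /**
     * Copies the built file into the cache and renames it to name-version.type.
     * The group/project sub-directories are an assumed layout, not Savant's documented one.
     */
    static Path publish(Path source, Path cacheDir, String group, String project,
                        String name, String version, String type) throws IOException {
        Path targetDir = cacheDir.resolve(group).resolve(project);
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(name + "-" + version + "." + type);
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Publishes a dummy verge-common.jar into a temporary cache and returns the new file name. */
    static String demo() {
        try {
            Path work = Files.createTempDirectory("savant-sketch");
            Path built = Files.write(work.resolve("verge-common.jar"),
                    "demo".getBytes(StandardCharsets.UTF_8));
            Path published = publish(built, work.resolve("cache"),
                    "verge", "Verge Commons", "verge-common", "2.3", "jar");
            return published.getFileName().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```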
Now that all of our build files have been updated, here is what happens when
a build is executed for the Verge Web project. On the first build of Verge
Web, Savant uses the workflow defined to attempt to resolve the artifacts. Each
process of the workflow is executed in order, and in the Verge Web build this
downloads most of the artifacts from http://ibiblio.org.
Once Savant hits the verge-common-2.3.jar artifact, it goes out and builds the
Verge Commons project. The Verge Commons build publishes the verge-common-2.3.jar
artifact to the local cache. Once the Verge Commons build completes, the Verge Web
build resumes and can now fetch the verge-common-2.3.jar artifact from the local
cache. This process is then repeated for Verge JUnit. Lastly, Savant changes
the location of the local cache and resolves the j2ee-1.3.1.jar artifact from
our extralibs directory.
More complex issues
Verge is a decent example of how most projects depend on local
and 3rd party libraries. A few more complex issues come up in projects, and
Savant handles them by resolving and publishing artifacts. The most
prevalent issue is serialized classes. If classes are serialized and passed
across the wire, it is important to ensure that both sides have up-to-date
versions of these classes. In most cases developers will create a number of
POJOs (plain old Java objects) that help to transmit data back and forth from
remote services. All of these classes implement the java.io.Serializable
interface. Java associates each serializable class with a serial version unique
identifier (serialVersionUID), which is either declared explicitly in the class
or computed by default from the details of the class. This serialVersionUID is
checked when a serialized object is received across the wire to ensure that the
local version of that class is compatible with the sender’s; if it is not,
deserialization fails. If these POJOs are deployed as a JAR file, it is vital that
the JAR file be up-to-date. Savant makes this easy to accomplish, if the JAR
file is an artifact, by handling the resolution of the artifact during the build
and ensuring that the JAR file is the correct version.
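To make the role of the serialVersionUID concrete, the following Java sketch reads the identifier of two serializable classes through the standard java.io.ObjectStreamClass API. The classes CustomerDto and OrderDto are hypothetical POJOs invented for this example. The first declares its identifier explicitly; the second leaves it to the serialization runtime, which derives a default value from the structure of the class.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

/** Hypothetical POJO with an explicitly declared serialVersionUID. */
class CustomerDto implements Serializable {
    // Both sides agree on this value as long as they ship compatible versions of the class.
    private static final long serialVersionUID = 1L;
    String name;
}

/** Hypothetical POJO without a declared serialVersionUID; a default one is computed. */
class OrderDto implements Serializable {
    String id;
}

class SerialVersionDemo {
    /** Returns the serialVersionUID the serialization runtime uses for the given class. */
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("CustomerDto: " + uidOf(CustomerDto.class));
        // The computed default depends on the class structure, so changing the
        // fields or methods of OrderDto can change this value.
        System.out.println("OrderDto:    " + uidOf(OrderDto.class));
    }
}
```

If the sender and receiver load the class from different versions of the JAR and the identifiers differ, deserialization fails with java.io.InvalidClassException, which is why resolving the same artifact version on both sides matters.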
Why not Maven?
People have asked, “why not use Maven?” This is an excellent question
considering that Maven already supports the concept of artifacts and dependencies.
One of the major concerns when considering Verge was that each Verge project
produces multiple artifacts (excluding the verge-junit project). Maven has a
fundamental concept that each project should produce one artifact. Of course,
Maven does allow a project to depend on multiple artifacts from a 3rd party
project, but it makes it very difficult for projects built with Maven to publish
multiple artifacts in a clean manner.
The philosophical debate will rage on, with or without Maven and Savant, as
to whether or not projects should produce single or multiple artifacts; but
for Verge this was a requirement. The Verge Web project produces multiple J2EE
tag libraries and ships the TLD files (tag library descriptor) inside the JAR
for each tag library. Verge Web also contains multiple J2EE applications both
for testing and as examples. Maven would have required that each enterprise
and web application be a separate project, in addition to each tag library.
In order to switch to a Maven build, the Verge Web codebase would need a major
refactoring, and the number of build files to maintain would increase
drastically. We believed that a build system should not try to force any of these
decisions on its users and instead leave it up to the developers to decide if
their project should publish single or multiple artifacts and how it should
be laid out and deployed.
Another major concern regarding Maven was that in order to support building
of local projects, either some fancy scripting needed to be done, or a separate
master build file would need to be created and maintained. This master build
would use the Maven reactor to handle other projects. This solution seemed to
be a bit cumbersome and was poorly documented. We believed that the build system
should be smart enough to build a local project if it existed or skip it if
it didn’t and accomplish this in a simple and clean manner.
We also felt that Maven’s repository structure was too flat to be highly
scalable. Maven collapses the concepts of groups and projects. Therefore, the
Jakarta Taglibs project from Apache becomes simply taglibs. This makes name
collisions a much more common situation because now any group publishing artifacts
using the name taglibs will end up putting their artifacts in with the Jakarta
artifacts. Savant separates out these concepts so that things more closely model
real world organization (where there are groups/companies that have multiple
projects which produce multiple artifacts).
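The collision argument can be shown with two ways of deriving a repository directory. This Java sketch is a simplified illustration and does not reproduce the actual directory layout of either Maven or Savant; the class RepositoryLayouts, both methods, and the second group name "acme" are hypothetical.

```java
/** Hypothetical comparison of a flat and a nested repository layout. */
final class RepositoryLayouts {
    /** Flat layout: the directory is derived from the project name alone. */
    static String flatDir(String group, String project) {
        return project;
    }

    /** Nested layout: group and project are kept as separate path segments. */
    static String nestedDir(String group, String project) {
        return group + "/" + project;
    }

    public static void main(String[] args) {
        // Two unrelated groups that both publish a project called "taglibs".
        System.out.println(flatDir("apache", "taglibs") + " vs " + flatDir("acme", "taglibs"));
        System.out.println(nestedDir("apache", "taglibs") + " vs " + nestedDir("acme", "taglibs"));
    }
}
```

In the flat case both projects land in the same directory, while the nested case keeps them apart.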
The rest of our concerns with Maven were based on the state of the documentation
and plug-ins at the time we were first experimenting with it. We felt that if
we were to put a major investment into a new system, it should be stable
and well documented. Many of the plug-ins we attempted to use had bugs, and the
documentation proved difficult to follow at best. Our attempts to accomplish tasks outside
of the scope of Maven’s plug-ins also proved difficult. After a few design
sessions, discussions and prototypes, we felt confident that the Ant build system
could be easily extended to support all of our dependency needs and we could
therefore retain the majority of our existing build. We would only need a few
minor additions in order to specify what artifacts each project depended on
and published.
Other Solutions
There are a few other solutions that people have already built; however, most
of them are either abandoned projects or only partially solve the problem. Howard
Lewis Ship, the founder of the HiveMind project, built one such extension. He created
a new task similar to the Ant get task. This task downloads dependencies from
the Maven repository over at http://ibiblio.org
and verifies their integrity using MD5. Another similar download solution is a toolkit
that is part of the Synthesis project over at Java.net. This tool is called
Jarstore and essentially performs the same type of download, except that it provides
a bit more robustness with a local cache of dependencies.
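Verifying a download with MD5 means computing the digest of the received bytes and comparing it with a published checksum. The Java sketch below shows that check with the standard java.security.MessageDigest API. It is not the code of the task mentioned above; the class Md5Check and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that checks downloaded bytes against an expected MD5 checksum. */
final class Md5Check {
    /** Returns the MD5 digest of the data as a lowercase hexadecimal string. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Compares the digest of the downloaded bytes with the published checksum. */
    static boolean matches(byte[] downloaded, String expectedHex) {
        return md5Hex(downloaded).equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args) {
        byte[] downloaded = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(matches(downloaded, "900150983cd24fb0d6963f7d28e17f72"));
    }
}
```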
These solutions are ideal for single codebase projects because they are mainly
used for the dependency resolution (downloads) portion of the build. They download
dependencies from a repository on the web and store them in a specified location.
The build script author can then set up the classpath using the dependencies
that were downloaded.
These solutions still do not solve the problems of multi-project builds, nor
do they provide the classpath and fileset construction or workflow logic that
Savant does. If a “download only” type of solution is required,
Savant can easily be configured so that it only downloads artifacts and stores
them anywhere on the file system (by configuring the local cache location).
The classpath can then be constructed manually later on in the build script.
Future of Savant
Savant is in the first phases of development. There are other complex issues
that need to be addressed when discussing dependencies. One of these issues is
code churn. Code churn occurs in larger companies when many projects are consistently
and frequently changing. If artifacts are pulled from a remote repository, they
may contain changes that break the current work of developers. This is a bit
of a religious debate because eventually the developer needs to fix his code
to work properly with the latest version. But he may be deep into some work
and not wish to switch gears to fix broken builds or worse, broken deployments.
He may want to delay this work until he is ready to integrate prior to committing
his work.
Another problem that arises when switching a build over to Savant is that, by
moving 3rd party libraries out of the source tree, it becomes difficult to correctly
set up IDEs to find all of a project’s dependencies. Currently, either the dependencies
can be used from the local cache, or an Ant target can be written to copy the
dependencies into the source tree for use by the IDE. Neither of these solutions
is ideal. In order to solve this problem, plugins for popular IDEs could
be written to parse the Ant build files and set up project dependencies from
the Savant definitions.
Summary
Overall, Savant is a step in the right direction because it removes the management
of dependencies from developers and puts it where it belongs, in the build.
Savant provides the means for projects to declare in one place their dependencies
and how they wish to find them. It then handles all the plumbing automatically.
For more information including complete documentation, visit the Inversoft website
at http://www.inversoft.com
or the Verge project home at http://verge.dev.java.net.