Opinion: Complex build systems need a real language

  1. Jon Tirsen talks about a Java project that he was working on that had a very complicated build system. He started off with Ant, even writing a tool that took his project information and generated the Ant build, and then tried Maven.

    He ended up realising that for this kind of build system he really wanted a real language. He happened to choose Ruby as the implementation language, but others could have been chosen.

    Using Ruby to build Java systems

  2. Would you say that part of the problem is not the complexity of the build script produced (after all, only Ant has to read it), but the complexity of the XSLT, which does have to be maintained? I read an interesting article on Martin Fowler's site recently where he claimed to be using Ruby in place of XSLT too.
  3. Ant == poop

    I find it baffling that people use the weakest scripting language ever conceived to build highly complex systems written in an advanced programming language.

    If anyone is unhappy with Ant's complete inflexibility and backwards way of approaching everything (not to mention its complete misuse of XML), check out AAP ( http://www.a-a-p.org ), which is a Python-based build system. The nice part is that you can use its built-in facilities (like make or Ant), but you can easily embed real Python code wherever you need it for things that are complex.
  4. Ant scripting

    You can also use scripting in Ant, you know.

    Our release script has a chunk of Jython in it to sort out the build label (either supplied, to enable builds of earlier releases, or generated as a new one), which works just fine. You have complete access to the Ant API, which means that you can do anything from calling Ant methods to generating new targets on the fly.

    I'm not saying that Ant is the be-all and end-all, but it has served me fine on two pretty large and complex projects (including the latest one, which includes InstallShield (*spit*) and .NET components as well).

    On the last project I autogenerated most of the script using Perl :-)
  5. scons: a Software Construction tool

    You can try scons: "Configuration files are Python scripts--use the power of a real programming language to solve build problems."

    http://www.scons.org/
  6. Groovy Builds

    Or how about using Groovy? Using an AntBuilder, you get the best of both worlds: Ant builds within the context of a real programming language.

    Check it out at: http://groovy.codehaus.org/ant.html

    What the example doesn't show you is that you could just as easily throw a loop or an if around pieces of the Ant markup if you need to.

    Disclaimer: I personally haven't tried this method, but I've been aware of it and it does intrigue me.
  7. Complexity needs removing not managing

    As a member of the project Jon refers to, I'd like to point out that the complexity of the build script is due to the way dependencies between source packages are managed. The script compiles packages at the bottom of the dependency tree like util and remote first, then adds them to the classpath and compiles the code that depends on them, and so on. That way, if a lower level class has a dependency on a higher level class, it will be indicated by a compilation failure.

    Further, we have two dimensions of dependencies - 'tier' dependencies, such as client, server and remote, and 'module' dependencies, which are logical separations based on business domains, e.g. sales, stock and admin. XSL is used to generate the build script, generating something like 7 tiers x 20 modules = 140 compilation targets, 140 test targets, 140 little jar files on the classpath, etc.
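
    To give a feel for the scale, here is a rough Python sketch of the tier x module cross product. It is only an illustration, not the project's XSL generator: the `compile-<tier>-<module>` naming scheme is an assumption, and every tier and module name beyond the ones mentioned above is a placeholder.

```python
from itertools import product

# Only client/server/remote and sales/stock/admin come from the post;
# the remaining names are placeholders to reach 7 tiers and 20 modules.
TIERS = ["client", "server", "remote"] + [f"tier{i}" for i in range(4, 8)]
MODULES = ["sales", "stock", "admin"] + [f"module{i}" for i in range(4, 21)]


def compilation_targets(tiers, modules):
    """Return one hypothetical target name per (tier, module) pair."""
    return [f"compile-{tier}-{module}" for tier, module in product(tiers, modules)]


targets = compilation_targets(TIERS, MODULES)
print(len(targets))  # 140
```

    The same loop would be repeated for the test targets and the jar files, which is how the generated script grows so quickly.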

    I think the problem here is not one of managing all that complexity, but trying to remove it by coming up with a solution that's better aligned with the problem. The problems of package dependency checking and compilation are separate concerns and the script mixes them into one, leading to all that complexity.

    So, rather than using a language like Ruby to manage all the complexity, I've been working on an alternative. Wait for the plug... Frustrated with the build, I've developed a Java Package Analyser (I've called it Japan and it's about to go up onto sourceforge), which can be used to check dependencies between packages according to the rules you define. This means dependency checking can be done separately from compilation, which means compilation can be done with one javac, not 140 little javacs.

    The end result is a much simpler Ant build script (down from 45,000 lines to 700), which runs much faster (10 mins down to 3 mins) and is easier to maintain. I admit I haven't finished yet - I like the way Jon tracked which test packages have already run and don't need running again, which I haven't solved yet. But still, I hope I've shown that sometimes complexity can be down to the solution not being aligned to the problem, in which case the complexity doesn't need managing, it needs removing.
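
    The idea of checking package dependencies separately from compilation can be sketched in a few lines of Python. This is not how Japan works (it has not been released at the time of writing); the rule format, the function names and the `app.*` packages are all assumptions made up for the illustration. It also only looks at import statements, so fully qualified references in the code would slip through.

```python
import re

# Matches 'import a.b.C;', 'import static a.b.C.x;' and 'import a.b.*;'
IMPORT_RE = re.compile(r"^\s*import\s+(?:static\s+)?([\w.]+?)(?:\.\*)?\s*;", re.MULTILINE)
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)


def in_package(name, prefix):
    """True if 'name' is the package 'prefix' or lives somewhere below it."""
    return name == prefix or name.startswith(prefix + ".")


def find_violations(sources, forbidden):
    """Check Java sources against hypothetical 'must not import' rules.

    sources:   mapping of file name -> Java source text
    forbidden: mapping of package prefix -> prefixes it must not import
    Returns a sorted list of (file name, offending import).
    """
    violations = []
    for name, text in sources.items():
        match = PACKAGE_RE.search(text)
        package = match.group(1) if match else ""
        for prefix, banned in forbidden.items():
            if not in_package(package, prefix):
                continue
            for imported in IMPORT_RE.findall(text):
                if any(in_package(imported, b) for b in banned):
                    violations.append((name, imported))
    return sorted(violations)


sources = {
    "Util.java": "package app.util;\nimport app.server.Session;\nclass Util {}\n",
    "Server.java": "package app.server;\nimport app.util.Util;\nclass Server {}\n",
}
rules = {"app.util": ["app.server", "app.client"]}
print(find_violations(sources, rules))  # [('Util.java', 'app.server.Session')]
```

    With a check like this run as its own step, a single javac over the whole source tree is enough for compilation.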
  8. Complexity needs removing not managing

    I've been working on a build process using Ant lately that also performed module dependency checks. But rather than performing dependency checks using some sort of package analyzer, it used a different approach:

    First, you could configure dependencies between modules within a separate dependency file. The following types of dependencies were distinguished: compile-time, runtime, and "general" (i.e. compile-time and runtime) dependencies. For each module to be built, you could specify which other modules (or external libraries) it depends upon, and what the type of each dependency is.

    Further, for each module you could configure its content (Java source files) using include and exclude expressions. Each module was compiled separately. The classpath used to compile a module was generated from the dependency file mentioned above. The modules that the module being built depended upon were referenced from a directory into which already built modules were placed. After a module had been compiled, it was checked that only those classes were compiled that were specified as the module's content. If more classes were compiled, this meant that there was a dependency not yet considered, and the build failed.

    I chose this approach, since it is very reliable, and it forces you to explicitly manage the dependencies between the modules.
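
    The final check described above could look roughly like this in Python. It is a simplified sketch, not the actual Ant implementation: the function name and paths are invented, and the patterns are plain shell-style globs (where `*` also matches `/`) rather than Ant's `**` patterns.

```python
from fnmatch import fnmatchcase


def undeclared_classes(compiled, includes, excludes=()):
    """Return compiled class files that are not part of the module's declared content.

    compiled: class file paths produced by compiling the module
    includes: glob patterns describing the module's content
    excludes: glob patterns removed from that content
    """
    def declared(path):
        return (any(fnmatchcase(path, p) for p in includes)
                and not any(fnmatchcase(path, p) for p in excludes))

    return sorted(path for path in compiled if not declared(path))


compiled = ["com/acme/sales/Order.class", "com/acme/stock/Item.class"]
includes = ["com/acme/sales/*"]
extra = undeclared_classes(compiled, includes)
print(extra)  # ['com/acme/stock/Item.class']
```

    A non-empty result means the compiler pulled in sources from outside the module, i.e. an undeclared dependency, and the build should fail.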

    Regards,
       Dirk
  9. I totally agree with you; this is really not Ant's job, IMHO. Dependency checking should really be done with something like JDepend (http://www.clarkware.com/software/JDepend.html), which can be plugged nicely into a unit test or an Ant build, for example.
  10. Wow!

    Wow Chris, your post is like a ray of sunshine in a dark and dismal world.

    I've managed to resist posting snide comments about people inventing ludicrous solutions that are nothing more than glorified band-aids for no better reason than to feel clever, but it's really nice to see that there are some people out there who actually stop to think about the underlying problem and what you're trying to achieve, vs. finding yet another awkward, obscure way of doing something (I mean honestly, RUBY?!)

    Your solution addressed the problem in a far more elegant and correct manner. The reason for the obscenely complex buildfile (it sounds like) is that it's trying to do too much, and to do things it never was supposed to do.
  11. Wow!

    [[[
    Your solution addressed the problem in a far more elegant and correct manner. The reason for the obscenely complex buildfile (it sounds like) is that it's trying to do too much, and to do things it never was supposed to do.
    ]]]

    I see 2 specific reasons why Ant files spin out of control.

    1. "My (sub)project is special". Maven lurches too far here, but do we really need any more compile/test/jar/doc/clean tasks? Special subprojects are awkward to fit into a generic structure. Let's face it, builds are a graph/tree walking exercise; if you intend to visit the nodes, the nodes must be visitable and follow a common contract. [All your locally vital goop can be called out using an etc task, or even better, ignored.]

    2. Ant dependency declarations are backwards. I need to register subprojects with the master. What's that about? JUnit makes the same mistake...
  12. Wow!

    > 2. Ant dependency declarations are backwards. I need to register subprojects with the master. What's that about? JUnit makes the same mistake...


    I may be wrong on what that's about, but the way we're doing it is a subproject is self-contained, and only needs to know about itself. Encapsulation of sorts.

    Maybe we're talking about different terminology....
    Steve
  13. Wow!

    I think so. Dependency != jar.
    If we are speaking here about J2EE, you can think in terms of dependencies/artifacts such as
    - wars
    - ejbs
    - ears
    - configurations

    There is quite a lot of "theoretical" knowledge related to constructing a decent build system for the enterprise, and to how dependencies should be handled and declared.
    "Self-contained" projects are by far the worst solution (e.g. there is no chance that plugging such a project into a continuous integration system will be easy).
    Inter-project communication via advanced SCM systems tailored to your needs is the way to go. By that I mean that projects should exchange information via something which is often called an "artifact repository".
    Peer-to-peer communication between projects in a large-scale system creates horrible, unmaintainable graphs of dependencies.
    Current SCM systems like CVS, SVN, etc. are not at all usable for that, so this concept (artifact repository) needs to be implemented as a parallel aspect.

    A quite good overview of how it should look can be found, for example, here:

    http://www.amazon.com/exec/obidos/tg/detail/-/0471327603/002-2652926-6192019?v=glance

    You can also take a look at the "Maven way" of doing these things:
    http://www.pivolis.com/pdf/J2EE_projects_Maven_V1.1.pdf

    Maven not only implements the "artifact repository" concept but also enables "synchronization" between different levels of repositories (local, remote).
    This synchronization is possible, but that doesn't mean it has to be used. It is just an extra feature.


    Michal
  14. Wow!

    > I think so. Dependency != jar.

    > If we are speaking here about j2ee you can think in term of dependencies/artifacts about
    > - wars
    > - ejbs
    > - ears
    > - configurations
    >
    >

    Agreed, we have a concept of a "component" which is a jar, war, ear, or really anything that can be a unit.

    Steve
  15. [[[
    The script compiles packages at the bottom of the dependency tree like util and remote first, then adds them to the classpath and compiles the code that depends on them, and so on. That way, if a lower level class has a dependency on a higher level class, it will be indicated by a compilation failure.
    ]]]

    Running the build from the top of your package graph is a job for make; this is why Ant is not "make without make's wrinkles" ;)

    In a previous life/job we had a Java project made from subprojects (arbitrary depth, multiple jar targets if wanted in each subproject). Each project declared its dependencies (you were expected to know what you were linking to!). The make would simply figure out the dependency graph and run the builds in order. If the build became cyclic, we'd find out early. All target jars were sent to a dir outside the project (e.g. /usr/java/myproj/lib) via a "make install", not into the dependent project's lib/ dir as is idiomatic with Ant. It could also compile native code per platform and across multiple JDKs. I used to think the makefile was complex, but I look at some Ant files now and wonder...
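
    For anyone who hasn't seen it done, "figure out the dependency graph and run the builds in order" is a topological sort. Here is a small Python sketch using the standard library's graphlib (Python 3.9+). It is not the makefile from that project; the project names are invented for the example.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies):
    """Return project names ordered so that dependencies come first.

    dependencies: mapping of project -> set of projects it depends on.
    Raises graphlib.CycleError if the declared dependencies form a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


deps = {
    "app": {"server", "client"},
    "server": {"util"},
    "client": {"util"},
    "util": set(),
}
order = build_order(deps)
print(order)  # util first, app last; server and client in between

try:
    build_order({"a": {"b"}, "b": {"a"}})
except CycleError as err:
    print("cyclic build:", err.args[1])
```

    Because the order is computed before anything is compiled, a cycle shows up immediately rather than halfway through the build.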

    I suspect J2EE deployment greatly complicates the build process. Other than dependency checking and normalization, all the ugly hacks in my antfiles are clustered around war and ear file generation.
  16. Complexity needs removing not managing

    Interesting post. We're developing our build system, copied from another group in the same company.

    One thing they've done (I haven't seen the Ant code for this) is make each Ant script self-contained when it has no dependencies. Dependent Ant scripts call the first one and add its jar to their own classpath. So basically our system will be able to recursively construct all its dependencies from the ground up.

    We're definitely going to add jdepend to our stuff, after your post, because things really shouldn't have to be so complicated. But you gotta start somewhere, and a system needs to be able to contain the mess, and help you clean up your code as time goes. We're going from a small disorganized random manual build, to hopefully a more organized systematic sort of thing.

    Steve
  17. Complexity needs removing not managing

    I agree. The dependency checking was one part of the system that should've been simplified. I just wish we had figured that one out much earlier. ;-)

    On the other hand, that was only one part of the build system and I do not believe we could have simplified away all the complexity. You mention incremental execution of test-suites and there's also building of deployable war-files, running acceptance tests, generating startup jars, deploying them to production servers and so on.

    There are two ways of handling complexity: removing it or managing it. If you can remove it, you should; if you can't, you should manage it. With Ruby (or any other 'real' programming language) I can manage complexity, but I can't remove it. With Ant I can't even manage complexity.
  18. Complexity needs removing not managing

    > With Ant I can't even manage complexity.

    I disagree with that. You _can_ manage complexity with Ant using the basic strategy of "divide and conquer". You can encapsulate common parts of a build process in separate build scripts so that they can be reused. You can parameterize targets. Further, Ant is quite extensible. You can write your own tasks if you encounter common problems that cannot be handled by core tasks.

    Of course Ant is not a programming language (although it provides basic capabilities for conditionally executing tasks and targets). Therefore, if your build process requires a lot of "programming" logic, you usually have a problem with Ant. But that's not what Ant was designed for. The strengths of Ant are the strengths of every build language (like make): managing transformations.

    Regards,
        Dirk
  19. Can I see???

    Can I get a copy of your Ant scripts to see exactly what you are doing, especially the dependency checking that you do? You can reach me at (813) 348-7313. George
  20. There are two separate aspects of a complex build system:

    a) It is a complex computer program.
    And it has been pretty well proven that "scripting languages" are not well suited for creating anything large. They can be used, but it will always be hard to maintain and improve the scripts over time. Ruby is better than poor XML scripting,
    but I am all for strongly typed languages in programs that have more than 1000 lines of code.

    b) Like anything complex, it needs some design, and proven patterns should be applied. I hope that soon build systems will be not only written in high-level languages (like Java) but also commercialized/standardized.
    I think that there is also no future for a build system which cannot be customized via a GUI and integrated with a continuous integration system.
    It should be possible to create/customise a ready-to-use, headless build system with the help of tools like IDEA/Eclipse etc. In the end, time spent on writing yet another build is time lost.

    Michal
    P.S.
    Just wait for the new next generation of Maven! It will be much more Java in Maven.
  21. I tend to sympathize with the assertion since certainly as my build scripts have grown in complexity, I have found that I would greatly benefit from the same structuring functionality commonly found in general purpose scripting languages that support modularity, reuse and extension, but that are missing or lacking in Ant.

    To be sure, <import>, <macrodef>, <scriptdef>, Antlib, namespaces, and others address many of these issues, and are welcome, powerful and useful additions to Ant, making it more manageable than ever before. However, they have also come at the cost of unresolved edge cases and ungainly syntax.

    I offer as an example what can be done when the basis of a build script is not a new language, but an existing, general purpose scripting language. As an experiment, I attempted to mimic Ant's style and organization as a language extension of Python, i.e. packaged as a Python module such that build scripts are pure Python. Thus, build authors benefit from Python's cross-platform scripting capabilities and mature structuring functionality.

    The project's admittedly unimaginative working title is 'pant'.

    In pant, I ported the main concepts of target, task, project, fileset, javac, copy, and delete, among others. Here is what a simple build file looks like. My hope is that it would be readily comprehensible by Ant users even if they don't know much Python:

        # build.py - example pant build script
        import os

        import pant
        from pant import *

        project = Project(name = "test", default = "build")

        # Shared settings, exposed as class attributes of a target
        class properties(Target):
            classesdir = "build/classes"
            srcdir = "src"
            jarname = "test.jar"

        class compile(Target):
            depends = properties
            def run(self):
                javac(src = properties.srcdir,
                    destdir = properties.classesdir)

        class clean(Target):
            depends = properties
            def run(self):
                delete(dir = properties.classesdir)

        class jar(Target):
            depends = properties
            def run(self):
                # This class shadows the jar task pulled in by the star import,
                # so call the task through the module instead
                pant.jar(
                    destfile = properties.jarname,
                    basedir = properties.srcdir,
                    destdir = properties.classesdir)

        class build(Target):
            def run(self):
                clean()
                compile()
                jar()

    Notes:
    0. Everything in this example is pure python. There are no modifications to the language itself.
    1. Targets are defined by subclassing pant.Target.
    2. The depends class variable of a subclass of Target is a list of other Target subclasses and is analogous to the depends attribute of Ant's target element.
    3. The body of a Target subclass's run method is analogous to the body of an Ant target's element content. Within the run method's body, any legal python code may be executed and normal scoping rules apply. After all dependent targets are run, if any, the run method of a Target is called, if defined.
    4. Targets may be run by simply calling them, i.e. clean().
    5. Targets may be defined in another python script and imported using python's import keyword.
    6. Parameters may be passed to Targets when calling them, e.g. clean(dir="some/dir")
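
    For the curious, here is a minimal sketch of how notes 2 to 6 could be made to work in plain Python. It is an assumption for illustration only, not pant's actual implementation: this `Target` class, its `params` attribute and the `log` list are hypothetical, and the targets only record what they would do.

```python
class Target:
    """Hypothetical base class: instantiating a subclass runs that target."""

    depends = ()  # a single Target subclass, or a list/tuple of them

    def __init__(self, **params):
        deps = self.depends
        if isinstance(deps, type):   # allow 'depends = other' as a shorthand
            deps = (deps,)
        for dep in deps:
            dep()                    # run dependent targets first
        self.params = params         # parameters passed when calling the target
        run = getattr(self, "run", None)
        if run is not None:          # run() is optional, as in note 3
            run()


log = []  # stands in for real tasks such as delete or javac


class properties(Target):
    classesdir = "build/classes"


class clean(Target):
    depends = properties

    def run(self):
        log.append(f"delete {self.params.get('dir', properties.classesdir)}")


class build(Target):
    depends = [clean]

    def run(self):
        log.append("build")


build()
print(log)  # ['delete build/classes', 'build']
```

    Note that in this sketch a target shared by several dependents runs once per dependent; a real tool would presumably remember which targets have already run.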

    didge
  22. I agree

    Ant works fine when the projects are small.
    One way to keep a project small is to split it into subprojects.

    Why hasn't anyone tried to create an Ant that is expressed in Java rather than XML?
    All developers know Java.
    You could create Java classes that match the tasks in Ant.
    Or maybe use the Ant classes directly.

    This Java Ant could handle both "Ant" Java and traditional Java.

    XML is nice and good in many ways, but that does not make it
    a good programming language.