TheServerSide Interviews Aslak Hellesoy on XDoclet

Aslak Hellesoy is a software developer and technology manager at a Norwegian consulting company called BEKK. He is one of the project administrators on the popular XDoclet project, and also the creator of Middlegen, a database-driven code generator that complements XDoclet nicely. He answers several questions on open source.

Welcome Aslak, can you give us a quick introduction on yourself, and your role in open source?

I am a software developer and technology manager at a Norwegian consulting company called BEKK. I have a master's degree in computer science from INSA Toulouse in France. I'm one of the project administrators on the popular XDoclet project, and also the creator of Middlegen, a database-driven code generator that complements XDoclet nicely.

How did you get involved in XDoclet?

I was one of the early adopters of XDoclet back in 2001. I contributed some functionality for BEA WebLogic Server, and after a while I was invited by the team to become a committer.

What is it like to be part of a project as popular as XDoclet? What are the challenges?

It's challenging. The XDoclet project isn't agile enough.

XDoclet has been downloaded 80,000 times, and the latest version 15,000 times. There are 250 subscribed users on the mailing list, so I think it's fair to assume that there are some 5,000 daily users out there: users who discover bugs and have questions, and even some who contribute patches.

The hardest part is keeping up with all the communication this generates. The xdoclet-user and xdoclet-devel lists plus our JIRA bug tracker receive somewhere around 40 messages per day, and keeping up to date with all of these demands a lot of time.

The XDoclet project has reached a critical mass (both in terms of communication load and amount of code), and this is something we have to deal with. We want to lower the amount of communication. We're taking several measures to make this happen:

  1. Mathias has recently put up a Wiki. Using a Wiki encourages a more agile and accessible documentation process where the whole world can contribute (and not only the core developers).

  2. We're rewriting XDoclet from scratch and doing it right. This time the design is a lot better, and the code is properly documented and easier to understand. Users are therefore more likely to be able to fix bugs themselves and submit patches.

  3. Splitting up the project. The XDoclet project will only support the core engine, and smaller teams will maintain plugins for various platforms such as JBoss, Struts etc.

All of these things are quite new to both the development team and the users, so it will take some time before we see the effects of them. But once we get into it we as a development team will be able to focus more on getting things done and less on answering (or ignoring) emails and bugs.

I hope that over the summer we'll be on track with a leaner and more agile project. Being agile is crucial when things move as fast as they do in the XDoclet world.

What are the plans for XDoclet 2?

Lots of things! It's a complete rewrite. In short:

  • Get rid of our home-grown template engine and use Velocity and Jelly as template engines.

  • Ant decoupling (but still Ant support of course).

  • IDE integration. We'll add a metadata API that can be used by IDE plugin writers to discover what tags/tag attributes are available, and in what context. The metadata API will also provide information of what configuration options are available, so that information that is traditionally specified in an Ant script can be specified in the IDE plugin, typically in a configuration dialog.

  • Better diagnostics. XDoclet will emit warnings or errors when unknown tags are encountered, or when tags are badly expressed or in the wrong place.

  • An SDK. This will be a set of Ant tasks to aid plugin development, such as packaging an XDoclet plugin, generating metadata (that the metadata API will use), and generating documentation from special @tags in the plugin classes themselves.

  • A query language. The plugin writers will be able to query the xJavaDoc API directly from the Jelly or Velocity template. We'll look at JXPath for this.

  • We're taking advantage of Jakarta Commons Collections' Predicate interface, which makes it possible to perform selections on Collections.

  • A decent test suite.

  • Built with Maven.

  • Understandable source code ;-)
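As an illustration of the Predicate bullet above, here is a minimal sketch of predicate-based selection over a collection. Commons Collections offers this through its `Predicate` interface and `CollectionUtils.select`; to stay self-contained, the sketch uses the JDK's `java.util.function.Predicate` as a stand-in. `ClassInfo` and `Selections` are hypothetical names invented for this example and are not part of XDoclet or xJavaDoc.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a parsed class; not an xJavaDoc type.
final class ClassInfo {
    final String name;
    final boolean isAbstract;

    ClassInfo(String name, boolean isAbstract) {
        this.name = name;
        this.isAbstract = isAbstract;
    }
}

final class Selections {
    // Returns the elements of input accepted by predicate, in iteration order.
    static <T> List<T> select(Collection<T> input, Predicate<? super T> predicate) {
        List<T> result = new ArrayList<>();
        for (T item : input) {
            if (predicate.test(item)) {
                result.add(item);
            }
        }
        return result;
    }
}
```

A plugin could then ask for, say, only the concrete classes with `Selections.select(classes, c -> !c.isAbstract)` and hand the result to a template.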

Why did you decide to switch from XDoclet's home-grown templates, to Apache Velocity and Jelly?

Because it makes it easier to write new plugins for XDoclet. Our home-grown template engine uses a tag library mechanism similar to JSP, and this requires a lot of code to be implemented in java classes (tag handlers). Further, Velocity is a very popular template engine. The Velocity Template Language (VTL) provides all we need, and has built-in support for looping and conditionals. We believe a switch to Velocity will attract more contributors and lower the entry barrier for newbies. The query feature will provide a lot of power in a simple and convenient way.

Also, XDoclet 2 has an abstraction layer for the underlying generation mechanism. So, if someone wants to plug in a different code generation engine (such as perhaps FreeMarker or WebMacro), this is done by writing a simple adapter class.

Will XDoclet be backwards compatible? i.e. will the source tags be the same, just with new backend templates?

Before I answer that question, let me explain one of our most important goals for XDoclet 2.

XDoclet 2 will consist of a core and a built-in SDK. We're not planning to maintain the plugins like we do now, e.g. for various EJB containers, JDO, Struts, Hibernate etc. That's something we'll encourage "vendors" to maintain. For example, we'll expect the JBoss team to maintain all the plugins that produce JBoss-specific files, and the Struts team likewise. We'll even encourage commercial vendors to do that. If some vendor doesn't maintain their own XDoclet plugins, but there is still demand for them, we'll form a separate sub-project, or encourage someone else to do it.

This has several benefits. First, it keeps the plugin code with the same developers as the platform, and therefore makes it easier to keep the XDoclet plugin in sync with the platform it's written for. Further, it will increase the overall quality. XDoclet 1.2.x keeps plugins for some twenty different platforms in the same code base. We don't have enough people to handle all the bug reports, feature requests and patches that are submitted.

XDoclet 1.2 has become a dinosaur with a little brain and an enormous body. We all know what happened to the dinos, and we don't want XDoclet to suffer the same destiny. So, we're putting XDoclet on the surgery table to do some serious fat sucking and face lifting. To get back to your question about backwards compatibility: It depends on what the plugin developers will do. It's entirely up to them to keep things backwards compatible. However, we're building support for deprecation and alternative/synonymous tags into the core, so plugin developers can deprecate old tags (but still support them), and invent new tags.

Will XDoclet 2 only be released when all current templates are 'ported' to Velocity?

No, XDoclet 2 will be released before that. In fact, when XDoclet 2 is released, there will probably not be more than a couple of usable plugins. These will be the plugins that we choose to port during the development of XDoclet 2. They will serve as validation, but also as inspiration for plugin developers who want to port old XDoclet 1.2 modules to new XDoclet 2.0 plugins.

Do you see the metadata JSR impacting XDoclet?

Most likely. Unfortunately none of the XDoclet committers are part of the JSR team, but I have heard rumours that the JSR will go for a JavaDoc @tag based metadata syntax. I assume that the JDK that supports it will also have built-in support for accessing this metadata. However, the goal of the JSR is more analogous to .NET attributes. It's metadata that is accessed at run-time, as opposed to build/compile-time, as it is with XDoclet.

When the JSR hits the streets, we'll probably see a lesser need for XML deployment descriptors, and XDoclet's raison d'etre might be diminished. However, this shift will not happen overnight, and there will always be a need for code generation, even with the JSR implemented. We'll read the JSR closely when it comes out.

Do you see yourself using that API in the future?

I'm pretty sure I will. The JSR is not a competitor to XDoclet, it's a complement. And if I can get away with generating less code to do something, then that's a good thing. Code generation is one way to deal with a problem. If there is an alternative way, like runtime attributes, I'd say that's a better way than code generation.

Why was there a need to write your own JavaDoc, xJavaDoc?

I doubt that the JavaDoc authors had ever imagined that JavaDoc would be used at the core of a code generator. There was so much information we wanted to get out of the source code and the @tags, and JavaDoc couldn't provide all we needed, such as getting the @tag of a certain method and having it look in the superclass if the tag wasn't on the first class. Sun's JavaDoc also emitted too many warnings about non-existent classes (referenced classes that weren't yet generated), and it was slow. Cedric Beust (our EJBGen friend) has written about it in his blog.

So I sat down and wrote xJavaDoc from scratch, using JavaCC. It provides what we need, and if it doesn't, we change it. It's faster, and we have full control over it. It solves every single problem that Cedric aptly points out.
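The superclass fallback mentioned above (look for a method's @tag on the class itself, then on its superclasses) can be sketched as follows. This is only an illustration under stated assumptions: `ParsedClass`, `tag` and `findMethodTag` are hypothetical names, and the real xJavaDoc API is not shown in this interview.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of a parsed class; not an xJavaDoc type.
final class ParsedClass {
    final String name;
    final ParsedClass superclass; // null at the root of the hierarchy

    // method signature -> (tag name -> tag value)
    private final Map<String, Map<String, String>> methodTags = new HashMap<>();

    ParsedClass(String name, ParsedClass superclass) {
        this.name = name;
        this.superclass = superclass;
    }

    // Records a tag on one of this class's methods; returns this for chaining.
    ParsedClass tag(String method, String tagName, String value) {
        methodTags.computeIfAbsent(method, k -> new HashMap<>()).put(tagName, value);
        return this;
    }

    // Checks this class first, then walks up the superclass chain.
    Optional<String> findMethodTag(String method, String tagName) {
        for (ParsedClass c = this; c != null; c = c.superclass) {
            Map<String, String> tags = c.methodTags.get(method);
            if (tags != null && tags.containsKey(tagName)) {
                return Optional.of(tags.get(tagName));
            }
        }
        return Optional.empty();
    }
}
```

With this shape, a tag declared once on a base class is visible from every subclass, and a subclass can still override it by declaring the same tag on its own method.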

How is xJavaDoc different to JavaDoc?

Sun's JavaDoc core consists of a java parser and an API that represents classes, fields, methods, comments, tags etc. Sun also has something called doclets, which is a class you can plug into the JavaDoc core, and let it do something with the object model created by the core. The default JavaDoc doclet takes that object model and generates API documentation in HTML. XDoclet 1.1.2 and earlier was a doclet for Sun's JavaDoc core, and it generated various XML files and java files.

xJavaDoc is quite similar to JavaDoc on the surface. It parses java source files and builds objects to represent them. The xJavaDoc API and the JavaDoc API are very similar. The xJavaDoc API is a bit richer, and uses java.util.Collection in the method signatures instead of arrays. And xJavaDoc has native Ant support.

Can it be used as a replacement for Sun's tool?

No, it can't. Well, in theory it could, if someone went through the pain to reimplement the standard JavaDoc HTML API doclet to fit with xJavaDoc. Or write an adapter for it. But I don't see why anybody would do that. Sun's JavaDoc is OK for what it has been designed for: API doc generation.

Do you think XDoclet does a good job in keeping "up to date" with the products it generates XML descriptors for?

In some areas, yes, in others no. EJB, JBoss and Struts are quite up to snuff. Some modules are virtually dead, while some are OK. The most popular modules are also the most up to date. It's the user community that drives this.

Are there any long term plans in your head for XDoclet? Where do you see XDoclet in a couple of years?

I wouldn't dare speculate on that. It depends on what the JSR brings us. It depends on where Java goes, and whether there will still be a need for code generation. And finally it depends on the new plugin communities.

One of my biggest concerns is IDE integration, so I'll probably take on that as soon as I can.

Some people argue that some descriptor elements should be placed in the source code, whilst others should be more loosely coupled. For example, TX attributes should be in the source (as getting that wrong can even "break" an app), but env properties should be external to allow for different settings. Do you see any differences? How does XDoclet allow you to decouple values from the source code?

[e.g. having value expansion: name=${whee} ... whee being in a properties file somewhere, and XML linkage points where XDoclet can suck in XML files]

I absolutely agree with you. Sun also wanted it that way when they defined the J2EE roles. The data that goes into the assembly descriptor part (or similar for non-EJB technologies) should ideally not be part of the source code. One example is JNDI names.

When prototyping it can be handy to keep everything hardcoded in the tags. But if you want to make your code truly portable between different environments, you should use properties. Then you can provide values for these properties in a different place, like in the Ant script, or in a properties file.
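To make the idea of property-based values concrete, here is a minimal sketch of `${name}` expansion against a `java.util.Properties` object. It is not XDoclet's actual implementation; `PropertyExpander` and the property name used in the usage note are assumptions made for this example.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PropertyExpander {
    // Matches ${name}; the name may contain any character except '}'.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each ${name} with its value from props.
    // Unknown names are left untouched so the gap stays visible.
    static String expand(String text, Properties props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.getProperty(m.group(1), m.group(0));
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A tag attribute written as `jndi-name="${datasource.jndi}"` could then resolve to one value in development and another in production, depending on which properties file the build loads.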

There has been a lot of talk about code generation. Is it possible to get generation-happy?

Most definitely. I have been generation-happy myself. But now I'm actually trying to avoid code generation if I can. It really depends on the technology you're using. EJB is a very complex technology with a lot of weird files involved. I think it's perfectly legitimate to be generation-happy if you're developing EJBs.

On the other hand, just because XDoclet can generate files for you, it doesn't mean you should. I know that many people working with WebWork prefer to write their configuration files by hand, although XDoclet can generate them. At the end of the day, you just want to get your work done, and there are scenarios where generation doesn't improve the way you work. After all, it is an extra step in your build process.

How do you know when you're using XDoclet too much?

Many people ask us questions about errors they get from JBoss when they deploy an XDoclet-generated EJB. These people don't understand that they should start at the other end: find out why JBoss complains.

This is a sign that people are using XDoclet too much, and relying too much on it. It's important to understand what's going on behind the scenes.

How would you compare the XDoclet approach to Model Driven Development? When is each appropriate?

I believe they are complementary technologies. There is a new cool tool out there called AndroMDA. This tool generates EJB Java sources with XDoclet tags from UML/XMI models. These sources can then be passed on to XDoclet to generate a complete component. It works along the same principles as Middlegen, except that the metadata comes from an XML model instead of a database.

I think Model Driven Development tools are good for generating the high level business logic code. These tools can take advantage of XDoclet to generate the low-level code (such as deployment descriptors etc).

From the user's perspective, it depends on where in the loop she wants to be. If she's in a hurry and not comfortable with low-level coding and XDoclet tags, a Model Driven approach could be nice. On the other hand, if the developer wants full control over the code, XDoclet seems more appropriate. And then if you want complete control, you can write assembly code.

So it's all about abstraction levels, and what level you're most comfortable with.

Where does XDoclet fit into the world of AOP? Can it be used to generate cross-cutting code?

XDoclet is not the right tool for AOP. In fact, I think it would be completely contradictory to use XDoclet to generate cross-cutting code. One of the purposes with AOP is to define behaviour in one single place. Say, I want logging on all set* methods. If you were to specify that with XDoclet tags, you'd have to put these tags in every single method, defeating the purpose of AOP.
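The "logging on all set* methods" example can be sketched with a JDK dynamic proxy, used here as a simple stand-in for an AOP framework (the interview names no particular one). The rule is written once, in `SetterLogging`, and no setter carries any tag. `Customer`, `SimpleCustomer` and `SetterLogging` are hypothetical names for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

// Hypothetical bean interface used only for this illustration.
interface Customer {
    void setName(String name);
    void setEmail(String email);
    String getName();
}

final class SimpleCustomer implements Customer {
    private String name;
    private String email;

    public void setName(String name) { this.name = name; }
    public void setEmail(String email) { this.email = email; }
    public String getName() { return name; }
}

final class SetterLogging {
    // Wraps target so that every call to a set* method is recorded in log.
    // The cross-cutting rule lives here, in one place.
    static <T> T wrap(T target, Class<T> iface, List<String> log) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().startsWith("set")) {
                log.add(method.getName());
            }
            return method.invoke(target, args);
        };
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }
}
```

Adding a new setter to `Customer` requires no further change for it to be logged, which is exactly what per-method tags could not offer.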


What is Middlegen and why did you create it?

Middlegen is a code generator that takes metadata from a database instead of from other java sources. It connects to a preexisting database, reads metadata (what tables, columns and relationships are available) and generates code from that. The code it generates is meant to access the database.

I created it because I'm lazy. I had a big database that I wanted to access via CMP using XDoclet. I realised that if I mixed the knowledge about what a CMP should look like and the knowledge about what the database looks like, I could generate the Beans with XDoclet tags.
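The idea of combining database knowledge with CMP knowledge can be sketched roughly as below. This is not Middlegen's code: `BeanSketchGenerator` is a hypothetical name, the column metadata is supplied in memory rather than read over JDBC (where `java.sql.DatabaseMetaData` would be the usual source), and the emitted tags are written in the style of XDoclet 1.2 EJB tags and should be checked against the tag reference before use.

```java
import java.util.Map;

final class BeanSketchGenerator {
    // Turns a name such as "first_name" into "FirstName".
    static String toCamel(String name) {
        StringBuilder sb = new StringBuilder();
        for (String part : name.toLowerCase().split("_")) {
            if (part.isEmpty()) {
                continue;
            }
            sb.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        return sb.toString();
    }

    // Emits an abstract bean skeleton with one tagged getter per column.
    // columns maps column name -> Java type name, in table order.
    static String generate(String table, Map<String, String> columns) {
        String beanName = toCamel(table);
        StringBuilder src = new StringBuilder();
        src.append("/**\n")
           .append(" * @ejb.bean name=\"").append(beanName).append("\" type=\"CMP\"\n")
           .append(" * @ejb.persistence table-name=\"").append(table).append("\"\n")
           .append(" */\n")
           .append("public abstract class ").append(beanName).append("Bean {\n");
        for (Map.Entry<String, String> col : columns.entrySet()) {
            src.append("    /**\n")
               .append("     * @ejb.persistent-field\n")
               .append("     * @ejb.persistence column-name=\"").append(col.getKey()).append("\"\n")
               .append("     */\n")
               .append("    public abstract ").append(col.getValue())
               .append(" get").append(toCamel(col.getKey())).append("();\n");
        }
        return src.append("}\n").toString();
    }
}
```

The generated source would then be passed to XDoclet, which produces the deployment descriptors and interfaces from the tags.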

Like many open source projects, it started out as a little thingy, and grew to be quite a popular tool.

What 'niche' did it fill?

XDoclet made it easy to write EJBs. Middlegen took it a step further, and made the writing of CMP EJBs completely automatic. It's the perfect tool for people who want to generate a java persistence layer from a database, using a bottom-up approach.

What generation targets are supported within Middlegen? Are there any others coming out soon?

Currently, CMP EJB for JBoss and WebLogic, MVCSoft, JDO, Hibernate and a simple Struts/JSP CRUD web interface. The Hibernate support is experimental, but it will get better. The Hibernate folks have chosen Middlegen as an "official" tool.

Do you have any plans for IDE plugin support, so Middlegen could be plugged into Eclipse, TogetherSoft and the like?

Yes, that's planned. I'm currently refactoring Middlegen so it uses the new XDoclet 2 core. That makes the codebase more compact, and it also makes it easier to write IDE plugins, since that is accounted for in the XDoclet 2 core. I'm a passionate IDEA user, so I might write that plugin myself. And I hope the community will follow and write plugins for other IDEs.

What do you see for the future of Middlegen?

  1. Tighter integration with the XDoclet 2 core. Not the xJavaDoc part, but the configuration and Velocity part of it.

  2. Maybe move some of the code to Jakarta Commons SQL. Parts of it are very similar to that API, and a merger here would be very nice.

  3. IDE integration.

  4. Support for more platforms.
