TSS Relaunches on Tapestry

By Howard Lewis Ship

01 Jan 2005 | TheServerSide.com

Introduction

From its inception in May 2000, TheServerSide.com has been an exceptionally successful information portal site, receiving upwards of seven million hits per month. By 2004, TheServerSide was faced with a dilemma – they wanted to expand and improve the functionality of the site in a number of ways, and they wanted to reuse their existing code base to support launching additional online communities … but their existing code base simply wasn't up to the challenge.

TheServerSide.com is a typical high-volume web site. Incoming requests are dispatched to a front controller servlet for processing (this is very similar to how a Struts application is organized). JSPs are used for rendering response pages. A stateless session bean provides access to business logic, acting as a façade around a number of entity EJBs (using bean-managed persistence). In many cases, data transfer objects move between the different layers.

All in all, this was a common architecture considering the state of J2EE when the portal was first created. Over time, projects tend to grow, and functionality ever expands. Past a point, the architecture began to get in the way, preventing TSS from rolling out new functionality. The chief issues were:

  • The home-grown front controller servlet was primitive. It was fine for simple pages, but as soon as your use case expanded beyond simple CRUD (Create, Read, Update, Delete) style pages to something with a more complex workflow (such as a wizard), you were left to your own devices: often a confusing tangle of HTTP query parameters and HttpServletRequest attributes, with code spread across servlets and JSPs.
  • The overall architecture had many layers. A change to the object model would require a raft of changes: to the database schema, to the entity beans, and to the DTOs (data transfer objects) that moved between the application layers. This highly coupled state led to a brittle application, where even simple object model changes entailed a high degree of risk.
  • Testing was hard. You had to jump through hoops to test the entity beans and the web tier.

It was obvious that the implementation needed to be refactored to allow for a more agile environment for the development team. The next question was: what solution would be used?

After surveying the many options available, the open-source Tapestry project was selected for the presentation tier. Two developers were allocated: Eric Preston and Howard Lewis Ship (an independent consultant, and leader of the Tapestry project).

In parallel, a second two-member team performed a conversion from entity EJBs to JDO. Because the original site funneled all backend access through a set of stateless session beans, it was tractable to rewrite the presentation layer front end at the same time as the database access back end, simply by keeping the interfaces of the stateless session beans stable during the process.

Goals

The existing (1.0) version of TheServerSide.com was entirely stateless – no HttpSession was allowed – and the new version, 2.0, inherited this same requirement. Performance could not be sacrificed either: TheServerSide.com receives upwards of seven million hits per month.

As a content-focused site, it was extremely important that the untold number of external links pointing to articles and discussions on TheServerSide.com remain valid even after the transition.

This was a very forward-thinking approach: re-architect the existing system in such a way that only the most observant would even detect a change but, in doing so, position it for the rapid rollout of visible features.

Converting JSPs to Tapestry pages

The majority of the work was the straightforward transformation of the JSP pages into Tapestry pages. In many ways, this was a purely mechanical effort, converting JSP tags and scriptlets (Java code embedded within a JSP) into proper Tapestry pages and templates. The 1.0 site included a large number of "components", in the form of JSP snippets that were included by top-level JSPs to form complete pages. Tracking down these individual files and replicating their behavior was a painstaking, but manageable, chore.

One of the joys of Tapestry is handling the little ugly issues that occur when formatting complex layouts for the web. For example, consider the title bars used throughout the application, such as the "New content around the community" title bar on the home page:

If you look very carefully at the text, you'll see that there is a subtle shadow, and equally subtle shading of the edges of the bar. The HTML developer accomplished this using a mix of cascading style sheets and background images, but the HTML required to achieve the desired result is verbose and awkward:

 <table class="box" cellspacing="0">
   <thead>
     <tr>
       <th>
         <img src="http://media.techtarget.com/tss/static/skin/images/bar_begin.gif" height="18" width="8"/>
         <span class="container">
           <span class="text">New content around the community</span>
           <span class="shadow">New content around the community</span>
         </span>
         <span class="fill">New content around the community</span>
       </th>
       <td>
         <img src="http://media.techtarget.com/tss/static/skin/images/bar_end.gif" height="18" width="7"/>
       </td>
     </tr>
     <tr>
       <td colspan="2" class="barbottom">
         <img src="/images/spacer.gif" height="10" width="10"/>
       </td>
     </tr>
   </thead>
   <tbody>
     <tr>
       <td colspan="2">
         <h2><a href="http://media.techtarget.com/tss/static/articles/article.tss?l=AspectWerkzP1">
           AspectWerkz 2: An Extensible Aspect Container </a>
         </h2>
         . . .
       </td>
     </tr>
   </tbody>
 </table>

Not only is this a lot of typing, but the same pattern is repeated, with subtle variations, throughout dozens of JSPs … a painful violation of the Don’t Repeat Yourself principle on many levels (including the way "New content around the community" is repeated three times!). When using Tapestry, one of your chief problem solving tools is to create new components. Tapestry makes creating custom components natural and easy, and in fact, TheServerSide.com contains dozens of such components.

For this situation -- unwanted duplication of HTML and logic -- a Box component, used in many HTML templates, replaces all the gory detail above with the much more streamlined:

 <span jwcid="@Box" title="New content around the community">
   <h2><a href="http://media.techtarget.com/tss/static/articles/articles.tss?l=AspectWerkzP1">
     AspectWerkz 2: An Extensible Aspect Container </a>
   </h2>
   . . .
 </span>

The Box component is responsible for outputting the <table> and other tags, integrating in the title provided as a parameter, and the content in its body. Because Tapestry components can have their own templates, it is simply a matter of refactoring the earlier example into its own HTML template, Box.html:

 <table cellspacing="0" class="box">
   <thead>
     <tr>
       <th><img src="http://media.techtarget.com/tss/static/skin/images/bar_begin.gif" height="18" width="8"/>
         <span jwcid="@HeaderBarLabel" value="ognl:title"/></th>
       <td>
         <img src="http://media.techtarget.com/tss/static/skin/images/bar_end.gif" height="18" width="7"/></td>
     </tr>
     <tr>
       <td colspan="2" class="barbottom">
         <img src="/images/spacer.gif" width="10" height="10"/></td>
     </tr>
   </thead>
   <tbody>
     <tr>
       <td colspan="2"><span jwcid="@RenderBody"></span></td>
     </tr>
   </tbody>
 </table>

Most of this template is just static HTML; the tags with a jwcid attribute are Tapestry components. HeaderBarLabel is another custom component, which outputs the set of three <span> tags used to format the text in the title bar (and the shadow on that text). The RenderBody component renders the body of the component, that is, the portion of the page template enclosed by its tags (the <h2> tag and other content from the earlier example).

All that's needed to finish the Box component is a component specification, a file used to identify the details of the component (including information about parameters). This file is called Box.jwc:

 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE component-specification PUBLIC
   "-//Apache Software Foundation//Tapestry Specification 3.0//EN"
   "http://jakarta.apache.org/tapestry/dtd/Tapestry_3_0.dtd">
 <component-specification allow-body="yes" allow-informal-parameters="no">
   <parameter name="title" type="java.lang.String" direction="in" required="yes"/>
 </component-specification>

These two files, Box.html and Box.jwc, are simply stored in the application's WEB-INF folder and are automatically available for use within any page or any other component within the application. Many useful components, such as this one, don't require even a single line of Java code. Because creating components is such a common tool when building Tapestry applications, the framework makes it exceptionally easy to do so.

This same approach was taken throughout the application – identifying common behaviors and encapsulating them as reusable components. This applies to much more than simple "macro-like" components such as Box … much more sophisticated components are also common.

For example, each of the major categories of the application (News, Discussions, Patterns, etc.) includes pages for listing forums, listing threads within a single forum, and listing messages within a single thread, as well as pages for posting a new thread and posting a reply to an existing thread.

Each of these pages is a simple "shell" around a more involved component that provides the desired functionality. Once again, the Don’t Repeat Yourself principle applies: the code and templates related to (for example) posting a new thread into a forum, including the code that interacts with the backend logic in the stateless session bean, are isolated in a single place, the PostForm component.

Maintaining URLs

One of the great challenges in this project was to make the transformation as invisible as possible, by keeping all the URLs the same, before and after deployment of the 2.0 code base. This is somewhat against the grain for Tapestry, which normally controls URLs entirely.

Fortunately, the existing URL format universally used a ".tss" extension (the ".tss" extension was mapped to JSP files, for obscure historical reasons). For example, a link to a thread within the news category was of the format /news/thread.tss?thread_id=30111. Of course, in the new application, there are no JSPs at all … just Tapestry pages with equivalent functionality.

The solution was to create a new servlet, mapped to the ".tss" extension, that would identify the old style links and perform a server-side redirect to the Tapestry application (mapped to /tss), passing the correct parameters, in the format expected by Tapestry.

With over seventy pages, and (in some cases) multiple operations per page, that servlet would become very complex! In fact, a more appropriate approach to mapping from old to new is a data driven solution, where the servlet uses configuration data to decide how to handle each request.

Data driven solutions are what the HiveMind services and configuration microkernel is all about (HiveMind is another open source project, a kind of sister project to Tapestry); using HiveMind it was possible to create a configuration point that contains several dozen entries, each defining how to map from an external URL (in the old format) to an internal URL (in the Tapestry format). For example:

 <redirect match-path="/news/thread.tss"
           query-string-pattern="thread_id=(\d+)"
           redirect-path="/tss?service=external/NewsThread&amp;sp=l{1}"/>

These entries combine path matching with regular expression parsing of the query string (the list of query parameters). From this, a new internal URL is generated, in the format that Tapestry expects. Because the redirect path is a server-side forward (not a client side redirect), the user will still see the external, "old style", URL in their browser's address field.
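
The matching step described above can be sketched in plain Java. This is only an illustration of the idea, not the site's actual servlet or HiveMind contribution handling, neither of which is shown in this article. The class names RedirectRule, RedirectTable and RedirectDemo are hypothetical, and the sketch assumes the pattern must match the whole query string and that {n} in the redirect path stands for capture group n.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical mapping entry: a servlet path, a query-string regex, and a target template. */
final class RedirectRule {
    private final String matchPath;
    private final Pattern queryPattern;
    private final String redirectTemplate;

    RedirectRule(String matchPath, String queryRegex, String redirectTemplate) {
        this.matchPath = matchPath;
        this.queryPattern = Pattern.compile(queryRegex);
        this.redirectTemplate = redirectTemplate;
    }

    /** Returns the internal URL if this rule applies to the request, otherwise empty. */
    Optional<String> apply(String path, String queryString) {
        if (!matchPath.equals(path) || queryString == null) {
            return Optional.empty();
        }
        Matcher m = queryPattern.matcher(queryString);
        if (!m.matches()) { // assumption: the whole query string must match
            return Optional.empty();
        }
        // Substitute each {n} placeholder with capture group n.
        String result = redirectTemplate;
        for (int i = 1; i <= m.groupCount(); i++) {
            String value = m.group(i) == null ? "" : m.group(i);
            result = result.replace("{" + i + "}", value);
        }
        return Optional.of(result);
    }
}

/** Hypothetical rule table: the first matching rule wins. */
final class RedirectTable {
    private final List<RedirectRule> rules;

    RedirectTable(List<RedirectRule> rules) {
        this.rules = rules;
    }

    Optional<String> resolve(String path, String queryString) {
        for (RedirectRule rule : rules) {
            Optional<String> target = rule.apply(path, queryString);
            if (target.isPresent()) {
                return target;
            }
        }
        return Optional.empty();
    }
}

final class RedirectDemo {
    public static void main(String[] args) {
        RedirectTable table = new RedirectTable(Arrays.asList(
            new RedirectRule("/news/thread.tss", "thread_id=(\\d+)",
                             "/tss?service=external/NewsThread&sp=l{1}")));
        // prints: /tss?service=external/NewsThread&sp=l30111
        System.out.println(table.resolve("/news/thread.tss", "thread_id=30111").orElse("no match"));
    }
}
```

In a servlet, the resolved string would presumably be handed to a request dispatcher for the forward; that part is omitted here because it depends on the container.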

This kind of transformation can also be accomplished using the Apache server's mod_rewrite module … however, that means an Apache reverse proxy must be set up for any testing. Much of TheServerSide.com was tested from within the IDE using Jetty, a lightweight servlet container. Because the rewrite rules were handled in-container, using a servlet and driven by HiveMind, the configuration was constantly tested during normal development. In addition, not having rules inside httpd.conf means that much less configuration needs to be managed during software installation and updates.

It was not enough to recognize the old URLs; it was also necessary to generate URLs in the external format. In a typical Tapestry application, existing generic components (such as DirectLink or ExternalLink) are used. For this application, new domain-specific components were created, such as ThreadLink and UserLink. These components would assemble a URL with the expectation that the previously described mapping would occur. Again, the end result was that the links users see, in the live Tapestry application, match exactly the URLs created by the 1.0 version.
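
As a rough sketch of the URL-building half of such a component, the helper below produces a link in the old external format. It is not the ThreadLink component itself, whose source is not shown here. The class name ExternalLinks and the method threadUrl are hypothetical, and the sketch assumes every category follows the /news/thread.tss?thread_id=30111 pattern quoted earlier.

```java
/** Hypothetical helper that builds old-style external URLs for thread links. */
final class ExternalLinks {
    private ExternalLinks() {
        // static helper, not meant to be instantiated
    }

    /** Builds a URL such as /news/thread.tss?thread_id=30111 (assumed format). */
    static String threadUrl(String category, long threadId) {
        if (category == null || category.isEmpty()) {
            throw new IllegalArgumentException("category is required");
        }
        return "/" + category + "/thread.tss?thread_id=" + threadId;
    }
}
```

A link built this way only works because the redirect configuration maps it back to a Tapestry page, so the two pieces have to be kept in step.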

Dependency Injection

The HiveMind microkernel was not relegated to just transforming URLs; it was a central feature in the implementation of the application. Moving business logic out of the presentation tier is always a good idea – the presentation tier is notoriously hard to test because of all the cruft (HTML, query parameters, Servlet API) that gets in the way of simply invoking methods and checking results. In sensible Tapestry applications, the page and components classes should be concerned with just moving and formatting data … any more involved behaviors should be moved elsewhere, where they can be tested outside of the constraints of the web application environment.

Many Tapestry applications make use of Spring beans to store such logic, but TheServerSide.com uses HiveMind, which offers a similar services model (based on singletons), and also provides the configuration mechanism (used for URL rewriting as just discussed, and many other things throughout the application).

In any case, it is one thing to have your business logic coded and tested … it is another to access it from the web tier. For TheServerSide.com, we leveraged Tapestry's property specification feature, which allows new properties to be added to existing page and component classes. We would then inject HiveMind services into those properties.

For example, on the Search page, we needed access to the Searcher service, a kind of wrapper around the Lucene text search engine. Inside the Search.page specification file, we defined a property to hold the service, and accessed the HiveMind Registry to set the default value for the property:

 <property-specification name="searcher" type="portal.services.Searcher">
   registry['portal.Searcher']
 </property-specification>

The portal.Searcher string is the fully qualified service id of the Searcher service, which implements the portal.services.Searcher interface. Inside our code, the Searcher service appears as an abstract property:

 public abstract class Search extends . . .

   public abstract Searcher getSearcher();

   public void pageBeginRender(. . .)
   {
     . . .
     SearchResults results = getSearcher().search(sc, getReturnCount(), getStartIndex());
     . . .

Again, this pattern was followed on many pages in the application; business logic was refactored out of servlets and into HiveMind services, and those services were injected into Tapestry pages (or components). This resulted in a much cleaner separation of the presentation logic (in the pages and components) from the business logic (in the HiveMind services), which made it much easier (or even just possible) to unit test the business logic code.
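
The testing benefit can be illustrated with a simplified stand-in. The Searcher interface below is an assumption made for this sketch: the real portal.services.Searcher takes search criteria and returns a SearchResults object, and neither type is shown in this article. SearchPageSketch and its renderSummary method are likewise hypothetical. Because the page only asks for its service through an abstract accessor, a test can supply a stub by subclassing, with no servlet container involved.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for the service interface (assumed signature). */
interface Searcher {
    List<String> search(String query, int returnCount, int startIndex);
}

/** Page class in the style described above: the service arrives through an abstract accessor. */
abstract class SearchPageSketch {
    abstract Searcher getSearcher();

    /** Presentation logic only: delegate to the service and format the result. */
    String renderSummary(String query) {
        List<String> hits = getSearcher().search(query, 10, 0);
        return hits.size() + " result(s) for \"" + query + "\"";
    }
}

final class SearchPageDemo {
    public static void main(String[] args) {
        // A stub service takes the place of the injected one.
        Searcher stub = (query, returnCount, startIndex) -> Arrays.asList("hit one", "hit two");
        SearchPageSketch page = new SearchPageSketch() {
            @Override
            Searcher getSearcher() {
                return stub;
            }
        };
        // prints: 2 result(s) for "tapestry"
        System.out.println(page.renderSummary("tapestry"));
    }
}
```

In the running application the framework supplies the accessor's implementation from the property specification; here the anonymous subclass plays that role by hand.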

Sneaking In Some Improvements

The goal of the project was to reproduce all the existing functionality and set the stage for adding new functionality in the future. However, in many cases, it was easier for the developers, and better for the end users, to bend the rules a little and improve things.

For example, in the 1.0 version of the application, you could log in from almost any page … but after logging in, you would be returned to the home page, not the page you were looking at prior to entering your email address and password. For the 2.0 version, after logging in, you are returned to the same page, and the same page state. That is, if you have navigated to the third page of news items on the "More News" page and log in, you will still see the third page of news items. That's a tall order for a supposedly stateless application. It's made possible by a number of specific concepts in Tapestry:

  • Each page has a unique name, and every page knows its own name.
  • Page state, even transient page state, is stored in JavaBeans properties of the page object.
  • Tapestry's Hidden component can store an object (using serialization) into an HTML form, and recreate the object when the form is submitted.

Combining these, the login form stores a memento of the page, containing the name of the page and copies of the page's key properties (such as the news item index on the More News page). When the login form is submitted, the memento is reconstituted, and then used to restore the state of the page after logging in. The application acts as the user expects, in a stateful manner, even though there is no HttpSession to store that state.
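
The memento idea can be sketched outside of Tapestry as follows. This is a minimal illustration using plain Java serialization and Base64. It is not the encoding that Tapestry's Hidden component uses internally, and the class PageMemento, its fields, and the page name "MoreNews" are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

/** Hypothetical memento: the page name plus the properties needed to restore the page. */
final class PageMemento implements Serializable {
    private static final long serialVersionUID = 1L;

    final String pageName;
    final int startIndex;

    PageMemento(String pageName, int startIndex) {
        this.pageName = pageName;
        this.startIndex = startIndex;
    }

    /** Encodes the memento as text that could be placed in a hidden form field. */
    String encode() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        }
        return Base64.getUrlEncoder().encodeToString(bytes.toByteArray());
    }

    /** Recreates the memento from the submitted field value. */
    static PageMemento decode(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getUrlDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (PageMemento) in.readObject();
        }
    }
}

final class MementoDemo {
    public static void main(String[] args) throws Exception {
        String hiddenValue = new PageMemento("MoreNews", 20).encode();
        PageMemento restored = PageMemento.decode(hiddenValue);
        // prints: MoreNews @ 20
        System.out.println(restored.pageName + " @ " + restored.startIndex);
    }
}
```

Because the value travels through the browser, a real implementation would also need to guard against tampering; deserializing untrusted input is a known risk with Java serialization.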

Another improvement is the announcement zone, an area at the top right of the page reserved for messages and errors. For example, after logging in to the application, you'll see a confirmation message:

In fact, this zone isn't limited to just text; using Tapestry's powerful Block component, it's possible to "link" in a portion of another page or another component. This shows up most powerfully in the administrative functions (that most users don't see), where entire forms show up in this space:

This kind of behavior would be very difficult to graft into a pure servlet application; because Tapestry is built on a true component object model, these types of manipulations become natural, and the user benefits from a more consistent, more thoughtful layout.

Pain Points

Very few of the pain points in the conversion were caused by Tapestry. The most troublesome parts of the application were areas where there was a lack of consistency … for example, tech talks vary considerably in layout and format, so building pages to display any single tech talk, or building components to link to a tech talk, proved tedious.

The other pain point was the necessity of doing a big bang conversion, rather than some form of incremental conversion. In retrospect, an incremental conversion could have been accomplished, but the idea was dismissed early on. The big bang approach necessitated the occasional use of the live production application as a reference for functionality when converting a JSP to a Tapestry page.

Some best practices evolved over the course of the project, perhaps the most important one being the need to relate old content files to new template files. TheServerSide.com is a content-driven application, and that content is scattered through dozens of different HTML and JSP files. As part of the conversion, files were often renamed, split apart, or moved to entirely new locations. For the content maintainers, it was a necessity to generate and maintain a comprehensive cross-reference of old and new files.

Conclusion

The conversion of TheServerSide.com to Tapestry has been an unqualified success. Performance and stability are at least as good as the 1.0 version of the application (largely thanks to improved caching provided by the combination of Solarmetric Kodo JDO for database access, and Tangosol Coherence for cluster-wide caching). End users have barely noticed the changes, except in the form of user interface improvements.

Several developers have already been able to make bug fixes and improvements to the code with little or no guidance from the primary development team. The HTML page templates are decidedly easier to read and maintain.

With the 2.0 code in place, work is already going forward with a 2.1 release which will actually add new features, features that would have been impractical to implement on the original code base. Tapestry has delivered on all fronts: simpler Java code, simpler HTML templates and generally faster and easier development.

References

Jakarta Tapestry – http://jakarta.apache.org/tapestry/

Jakarta HiveMind – http://jakarta.apache.org/hivemind/

Object Graph Navigation Language – http://www.ognl.org/

Spring Framework – http://www.springframework.org

Lucene – http://jakarta.apache.org/lucene/

Solarmetric Kodo – http://solarmetric.com/

Tangosol Coherence – http://tangosol.com/index.jsp

Jetty – http://jetty.mortbay.com/jetty/index.html

Author

Howard Lewis Ship is the creator and lead developer for the Jakarta Tapestry and Jakarta HiveMind projects. He has over fifteen years of full-time software development under his belt, with over seven years of Java. He cut his teeth writing customer support software for Stratus Computer, but eventually traded PL/1 for Objective-C and NeXTSTEP before settling into Java. Howard is the author of Tapestry in Action for Manning Publications, and is currently an independent open-source and J2EE consultant, specializing in customized Tapestry training. He lives in Quincy, Massachusetts with his wife Suzanne, a novelist.