Many web development professionals assume that site performance optimization is a back-end concern: database access, page and data caching, clustering, and so on. Web browser and other front-end activity is often profiled as an afterthought, yet that is where all the relevant end-user activity takes place. A slow web site will lose viewers regardless of how useful it is, and many of the performance offenses occur between the front-end servers and the web browser. The book High Performance Web Sites by Steve Souders, Chief Performance Yahoo, gives a 14-step guide for improving the interactions between browsers and servers, and explains how design and implementation decisions affect them.
The book is written in a dynamic, no-nonsense style that makes it a breeze to go through. At 156 pages, including the table of contents and foreword, it's got to be one of the slimmest yet most useful books published this year.
The book consists of 17 chapters. The author made the decision to expand the common "what you should already know" snippet in the preface to a full two chapters, A and B, so that all the information needed for applying his 14 rules is in one place. The book's table of contents reflects this minimalist and efficient style:
Chapter A: The Importance of Frontend Performance
Chapter B: HTTP Overview
Rule 1: Make Fewer HTTP Requests
Rule 2: Use a Content Delivery Network
Rule 3: Add an Expires Header
Rule 4: Gzip Components
Rule 5: Put Stylesheets at the Top
Rule 6: Put Scripts at the Bottom
Rule 7: Avoid CSS Expressions
Rule 8: Make JavaScript and CSS External
Rule 9: Reduce DNS Lookups
Rule 10: Minify JavaScript
Rule 11: Avoid Redirects
Rule 12: Remove Duplicate Scripts
Rule 13: Configure ETags
Rule 14: Make Ajax Cacheable
Chapter 15: Deconstructing 10 Top Sites
Chapter A is the "tie-in" between back end and front end: it explains what happens during a web request and why it's important to take browser interaction into consideration, and it introduces Steve's Performance Golden Rule:
Only 10-20% of the end user response time is spent downloading the HTML document. The other 80-90% is spent downloading all the components of the page.
How those other components are served is key for optimal performance. The recommendations in the rest of the book require equal participation from application, front-end, user interaction, and back-end developers.
Each chapter begins with a short description of the problems addressed by the rule and puts the recommendations that follow in context. One or two examples follow the description, then the book dives into the solutions for each problem. In Rule 1: Make Fewer HTTP Requests, for example, the main problem is latency in HTTP connections, exacerbated by multiple page components fetched from the web. The solutions proposed include the use of image maps, CSS sprites, inline images, and combining JavaScript and CSS into one or two larger files of each. The book even mentions caveats in some approaches (e.g. "Although this approach is not currently supported in Internet Explorer, the savings it can bring to other browsers makes it worth mentioning," about inline images). The content reflects hands-on experience that readers can leverage immediately.
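To make the inline image technique concrete, here is a minimal sketch in Python, not taken from the book. It assumes the technique means embedding the image bytes in the page as a base64 `data:` URI so that no separate HTTP request is needed; the function name `to_data_uri` and the placeholder bytes are hypothetical.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data: URI for use in <img src> or CSS url()."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    # Placeholder bytes standing in for a small icon file (hypothetical data).
    sample = b"GIF89a"
    uri = to_data_uri(sample, "image/gif")
    print(f'<img src="{uri}" alt="icon">')
```

The trade-off is that base64 text is larger than the binary it encodes and the image is no longer cached separately from the page, so this is usually reserved for small images.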
Rule 6: Put Scripts at the Bottom is one of the best chapters in the book, perhaps one that should be required reading for front-end developers. It explains how JavaScript placed at the beginning of a page stops the web browser from downloading other page components such as HTML, CSS, or images until the script is complete. As a result, it doesn't matter that Firefox, Safari, and IE can fetch two or more components in parallel from a host by default, because the concurrent downloads stop until a JavaScript file has been fetched. Moving the scripts to the bottom of the page will boost the speed at which other components load and create a better end-user experience. There is a golden nugget featured in this discussion (page 47) that describes how using multiple CNAMEs for a component server can boost performance: because browsers apply the per-host connection limit suggested by the HTTP/1.1 specification to each hostname, two aliases double the number of concurrent requests the browser will make to the same server.
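The multiple-hostname idea can be sketched as follows. This is an illustration rather than the book's code: the hostnames in `ASSET_HOSTS` and the function `asset_url` are hypothetical, and the sketch assumes each hostname is a CNAME for the same component server. A checksum of the path picks the host, so a given file always maps to the same hostname and stays cacheable under one URL.

```python
import zlib

# Hypothetical aliases; each is assumed to be a CNAME for the same server.
ASSET_HOSTS = ["static1.example.com", "static2.example.com"]


def asset_url(path: str, hosts=ASSET_HOSTS) -> str:
    """Map a component path to one of several hostnames, deterministically."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return f"http://{hosts[index]}/{path.lstrip('/')}"


if __name__ == "__main__":
    for component in ["img/logo.png", "img/banner.png", "css/site.css"]:
        print(asset_url(component))
```

Each extra hostname also costs a DNS lookup, which is the concern of Rule 9, so a small number of aliases is the sensible assumption here.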
Chapter 15: Deconstructing 10 Top Sites is a good presentation of how different companies have applied the high performance rules. The analysis of Yahoo! could be a bit more biting but otherwise fits the mold of the rest of the chapter. Amazon, AOL, CNN, eBay, Google, MSN, MySpace, Wikipedia, and YouTube are the others. It's not clear if the discussion of CNN's site took place before their page re-design in the middle of this year; the analysis may be outdated now, and re-evaluating the new site against the 14 rules would be an interesting exercise.
The biggest shortcoming of the book comes from its biggest strength: brevity. The author often describes what needs to be done, but spends little or no time discussing how. For example, coverage of Gzip compression talks briefly about Apache 1.3 and 2.x configuration for Gzip and deflate, but dwells little on how to implement it. It also offers little exposition of how various versions of Internet Explorer choke on some compressed content types, or how to get around this.
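As an illustration of the kind of how-to detail that would help, the following Python sketch shows the decision a server makes before compressing a response. It is not from the book and is not Apache configuration; `maybe_gzip` and the `COMPRESSIBLE_TYPES` list are assumptions made for the example, and the sketch does not address the Internet Explorer problems mentioned above.

```python
import gzip

# Assumed list of text-like types worth compressing; images are already compressed.
COMPRESSIBLE_TYPES = ("text/html", "text/css", "application/javascript")


def maybe_gzip(body: bytes, content_type: str, accept_encoding: str):
    """Return (body, headers), gzip-compressing only when it is safe and useful."""
    headers = {"Content-Type": content_type, "Vary": "Accept-Encoding"}
    # Simplification: q-values such as "gzip;q=0" are ignored in this sketch.
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    media_type = content_type.split(";")[0].strip().lower()
    if "gzip" in offered and media_type in COMPRESSIBLE_TYPES:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers


if __name__ == "__main__":
    css = b"body { margin: 0; padding: 0; }\n" * 200
    compressed, response_headers = maybe_gzip(css, "text/css", "gzip, deflate")
    print(len(css), "bytes before,", len(compressed), "bytes after")
```

The `Vary: Accept-Encoding` header is included so that intermediate caches keep compressed and uncompressed copies apart.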
Another how-to that the book leaves out is combining multiple CSS files into one. Development and user experience groups tend to work on different parts of a page, each polishing a different CSS file for a component. The result could be a huge CSS file with lots of URL references. A good how-to recommendation would be to build a single file from the multiple CSS snippets before serving it, either by having the server assemble a combined CSS file through JSP or PHP directives, or by creating the production CSS file during the continuous integration process (e.g. CruiseControl plus Ant/make/Maven tasks).
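A build-time version of that recommendation might look like the Python sketch below. It is only one possible approach, and the function name `combine_css` and the file names in the usage note are hypothetical. It concatenates the files in the order given, since CSS rule order matters, and labels each chunk with its source file name.

```python
from pathlib import Path


def combine_css(sources, output):
    """Concatenate the CSS files in `sources`, in order, into `output`."""
    parts = []
    for source in sources:
        path = Path(source)
        css = path.read_text(encoding="utf-8").strip()
        # Label each chunk so the combined file can be traced back to its origin.
        parts.append(f"/* {path.name} */\n{css}\n")
    combined = "\n".join(parts)
    Path(output).write_text(combined, encoding="utf-8")
    return combined
```

A build task could call, for instance, `combine_css(["layout.css", "nav.css"], "site.css")`. The sketch does not rewrite relative `url()` references, so it assumes all source files live in the same directory as the output.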
The burden of implementation falls on the reader, who needs to apply her experience and ability to search to find out how to apply a given rule. This is where mileage will vary, and a short resource guide or a few configuration file examples in the text would be welcome and wouldn't bloat the book.
If you're looking for interactive web optimization tools, you may want to install the YSlow plug-in for Firefox, also developed by Steve Souders. The plug-in works with Firebug, and it generates a score card graded A-F for each rule described in the book. Clicking on the score will send you to the Yahoo! Developer Network section where the rule is summarized: shorthand help that can be applied immediately from your own browser if the book itself isn't at hand.
High Performance Web Sites is one of the best books available on the subject. Its content and style make it a must-have in your collection, whether you're a front-end, back-end, or user interface developer.
High Performance Web Sites (first edition)
By Steve Souders
O'Reilly Media, Inc.
ISBN: 0-596-52930-9
Eugene Ciurana is the West Coast contributing editor for TheServerSide.com and the director of systems infrastructure at LeapFrog Enterprises. He can be found on IRC (##java, #esb, #security, #awk) under the /nick pr3d4t0r.