
News: Book Review: High Performance Web Sites

  1. Book Review: High Performance Web Sites (3 messages)

    Many web development professionals assume that site performance optimization is a back-end concern: database access, page and data caching, clustering, etc. Web browser and other front-end activities are often profiled as an afterthought, yet that's where all the relevant end-user activity takes place. A slow web site will lose viewers regardless of how useful it is, and many of the performance offenses occur between the front-end servers and the web browser.

    The book High Performance Web Sites by Steve Souders, Chief Performance Yahoo, gives a 14-step guide for improving the interactions between browsers and servers, and explains how design and implementation decisions affect them. The book is written in a dynamic, no-nonsense style that makes it a breeze to go through. At 156 pages, including the table of contents and foreword, it's got to be one of the slimmest yet most useful books published this year.

    The book consists of 17 chapters. The author made the decision to expand the common "what you should already know" snippet in the preface to a full two chapters, A and B, so that all the information needed for applying his 14 rules is in one place.
    The book's table of contents reflects this minimalist and efficient style:

        Chapter A: The Importance of Frontend Performance
        Chapter B: HTTP Overview
        Rule 1: Make Fewer HTTP Requests
        Rule 2: Use a Content Delivery Network
        Rule 3: Add an Expires Header
        Rule 4: Gzip Components
        Rule 5: Put Stylesheets at the Top
        Rule 6: Put Scripts at the Bottom
        Rule 7: Avoid CSS Expressions
        Rule 8: Make JavaScript and CSS External
        Rule 9: Reduce DNS Lookups
        Rule 10: Minify JavaScript
        Rule 11: Avoid Redirects
        Rule 12: Remove Duplicate Scripts
        Rule 13: Configure ETags
        Rule 14: Make Ajax Cacheable
        Chapter 15: Deconstructing 10 Top Sites

    Chapter A is the "tie-in" between back- and front-end that explains what happens with a web request, why it's important to take browser interaction into consideration, and introduces Steve's Performance Golden Rule:
    Only 10-20% of the end user response time is spent downloading the HTML document. The other 80-90% is spent downloading all the components of the page.
    How those other components are served is key for optimal performance. The recommendations in the rest of the book require equal participation from application, front-end, user interaction, and back-end developers.

    Each chapter begins with a short description of the problems addressed by the rule and puts the recommendations that follow in context. One or two examples follow the description, then the book dives into the solutions for each problem. In Rule 1: Make Fewer HTTP Requests, for example, the main problem is latency in HTTP connections, exacerbated by multiple page components fetched from the web. The solutions proposed include the use of image maps, CSS sprites, inline images, and combining JavaScript and CSS into one or two larger files of each. The book even mentions caveats in some approaches (e.g. "Although this approach is not currently supported in Internet Explorer, the savings it can bring to other browsers makes it worth mentioning," about inline images). The content reflects hands-on experience that readers can leverage immediately.

    Rule 6: Put Scripts at the Bottom is one of the best chapters in the book, perhaps one that should be required reading for front-end developers. It explains how JavaScript placed at the beginning of a page stops the web browser from downloading other page components such as HTML, CSS, or images until the script is complete. As a result, it doesn't matter that Firefox, Safari, and IE can fetch two or more components in parallel from a host by default, because the concurrent downloads stop while a JavaScript file is being fetched. Moving the scripts to the bottom of the page will boost the speed at which other components load and create a better end-user experience.
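    As a rough illustration of Rule 6 (not taken from the book), the sketch below post-processes an HTML string and moves external script tags to just before the closing body tag. The function name move_scripts_to_bottom and both regular expressions are assumptions made for this example; a regex is a crude stand-in for a real HTML parser, and scripts that call document.write, or that inline code depends on, cannot safely be moved this way.

```python
import re

# Matches external scripts only, e.g. <script src="a.js"></script>.
# Illustrative and brittle: it will not cope with ">" inside attribute values.
SCRIPT_TAG = re.compile(
    r"<script\b[^>]*\bsrc\s*=[^>]*>\s*</script\s*>\s*", re.IGNORECASE
)
BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def move_scripts_to_bottom(html: str) -> str:
    """Return html with external <script src=...> tags moved before </body>.

    The relative order of the scripts is preserved. If the page has no
    external scripts or no closing body tag, it is returned unchanged.
    """
    scripts = [m.group(0).strip() for m in SCRIPT_TAG.finditer(html)]
    if not scripts:
        return html

    stripped = SCRIPT_TAG.sub("", html)
    closers = list(BODY_CLOSE.finditer(stripped))
    if not closers:
        return html  # nothing to anchor on; leave the page alone

    cut = closers[-1].start()
    block = "\n".join(scripts) + "\n"
    return stripped[:cut] + block + stripped[cut:]


if __name__ == "__main__":
    page = (
        '<html><head><script src="app.js"></script>'
        '<link rel="stylesheet" href="site.css"></head>'
        "<body><p>Hello</p></body></html>"
    )
    print(move_scripts_to_bottom(page))
```

    With the script out of the head, the stylesheet and the page content are requested first and the script is fetched last, which is the ordering the rule recommends.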
    There is a golden nugget featured in this discussion (page 47) that describes how using multiple CNAMEs for a component server can help to boost performance. The HTTP/1.1 specification suggests a limit on parallel connections that browsers apply per hostname, so serving components from two hostnames doubles the number of concurrent requests that the browser will make to the same server.

    Chapter 15: Deconstructing 10 Top Sites is a good presentation of how different companies have applied the high performance rules. The analysis of Yahoo! could be a bit more biting but otherwise fits the mold of the rest of the chapter. Amazon, AOL, CNN, eBay, Google, MSN, MySpace, Wikipedia, and YouTube are the others. It's not clear if the discussion of CNN's site took place before their page re-design in the middle of this year. If it did, the analysis may be outdated now, and re-evaluating the new site based on the 14 rules is an interesting exercise.

    The biggest shortcoming of the book comes from its biggest strength: brevity. The author often describes what needs to be done, but spends little or no time discussing how. For example, coverage of Gzip compression talks briefly about Apache 1.3 and 2.x configuration for Gzip and deflate, but dwells little on how to implement it. It also offers little exposition of how various versions of Internet Explorer choke on some compressed content types, or how to get around this.

    Another how-to that the book leaves out is combining multiple CSS files into one. Development and user experience groups tend to work on different parts of a page, each polishing a different CSS file for a component. The result could be a huge CSS file with lots of URL references. A couple of good how-to recommendations could be to build a single file from the multiple CSS snippets before serving it, by either having the server assemble a combined CSS file through JSP or PHP directives, or by creating the production CSS file during the continuous integration process (e.g. CruiseControl plus Ant/make/Maven tasks).
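    The build-time approach to combining CSS could look something like the following minimal sketch. It uses a plain Python script rather than the Ant/make/Maven tasks named above, and the function names combine_css and gzipped_size, as well as the file names in the usage example, are hypothetical. It assumes the combined file is written to the same directory as its sources, so that relative url() references keep resolving.

```python
import gzip
from pathlib import Path
from typing import Iterable


def combine_css(sources: Iterable[Path], target: Path) -> int:
    """Concatenate CSS files, in the order given, into one target file.

    Order matters because later rules override earlier ones in the cascade.
    Returns the size of the combined stylesheet in bytes.
    """
    parts = []
    for src in sources:
        css = src.read_text(encoding="utf-8").strip()
        # A comment marks where each fragment came from, to ease debugging.
        parts.append(f"/* {src.name} */\n{css}\n")
    data = "\n".join(parts).encode("utf-8")
    target.write_bytes(data)
    return len(data)


def gzipped_size(path: Path) -> int:
    """Size in bytes of the file after gzip compression (see Rule 4)."""
    return len(gzip.compress(path.read_bytes()))


if __name__ == "__main__":
    # Hypothetical component stylesheets owned by different groups.
    css_dir = Path("css")
    fragments = [css_dir / "header.css", css_dir / "content.css"]
    if all(f.exists() for f in fragments):
        out = css_dir / "site.css"
        size = combine_css(fragments, out)
        print(f"{out}: {size} bytes, {gzipped_size(out)} bytes gzipped")
```

    A continuous integration job could run a script like this before deployment, so that each group keeps editing its own file while production pages reference a single stylesheet and make a single request for it.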
    The burden of implementation falls on the reader, who must rely on her own experience and search skills to work out how to apply a given rule. This is where mileage will vary, and a short resource guide or a few configuration file examples in the text would be welcome and wouldn't bloat the book.

    If you're looking for interactive web optimization tools, you may want to install the YSlow plug-in for Firefox, also developed by Steve Souders. The plug-in works with Firebug, and it generates a score card graded A - F for each rule described in the book. Clicking on the score will send you to the Yahoo! Developer Network section where the rule is summarized: shorthand help that can be applied immediately from your own browser if the book itself isn't at hand.

    High Performance Web Sites is one of the best books available on the subject. Its content and style make it a must-have in your collection, whether you're a front-end, back-end, or user interface developer.

    High Performance Web Sites (first edition)
    By Steve Souders
    O'Reilly Media, Inc.
    ISBN: 0-596-52930-9

    Eugene Ciurana is the West Coast contributing editor for TheServerSide.com and the director of systems infrastructure at LeapFrog Enterprises. He can be found on IRC (##java, #esb, #security, #awk) under the /nick pr3d4t0r.

    Threaded Messages (3)

  2. Excellent review

    Hi Eugene, just a quick thanks for the effort you put into this thorough review; most insightful. You can bet your berries that a copy of this book will find its way to my shelf.
  3. ..and found the experience very rewarding. Being a typical backend developer, I too was under the impression that all performance improvements happen at the backend; this book opened up another perspective. It also gives a very good perspective on the kind of issues that go into building a highly scalable commercial website. What makes the book useful is its brevity. It is possible to read it end to end in a couple of days. The principles are explained with enough clarity to allow the reader to seek out more details. I wish more technical books were written like this. Also use the following link for discussions on each of the rules explained in the book.
  4. These rules work

    I've seen these rules applied, along with other similar rules, to bring down the loading time from 14 seconds to less than 3 seconds for a complex page in our web app. One particular technique that helped tremendously was to gzip the data in Ajax calls, since we can move a lot of data in one Ajax call on that particular page. Cheers, David Flux - Java Job Scheduler. File Transfer. Workflow. BPM.