Scope:
This document covers architecture and design strategies for delivering consistent response times and handling load spikes in a global e-commerce application. It gives a high-level overview, followed by detailed design strategies for using a CDN to improve performance and scalability. Akamai is the CDN discussed, given its prevalence and the writer’s familiarity with it. The document leaves out streaming, security and other add-on services offered by CDN vendors. It also does not discuss solving inconsistent response times through distributed application deployments, and it does not compare CDN vendors.
Target audience:
The document is intended for architects and designers who want to design web sites for a CDN such as Akamai. It may also help readers who want to understand the impact of using a CDN.
Solution options:
Content delivery networks were established in the late 1990s, primarily to deal with internet congestion and to provide faster responses by caching content closer to users. They specialize in finding an optimal internet route (e.g. SureRoute), caching close to the user (e.g. edge server caching, which can offload up to 80% of traffic from the origin) and compression. These providers now offer further value-added services such as security, video streaming and failover, particularly for network failures.
As the diagram above illustrates, Akamai improves web site performance by optimizing the route to the destination, compressing content, providing a custom high-speed network and caching content near the user's location.
The CDN market is surprisingly unipolar: there is Akamai, and then the rest (Limelight, Amazon CloudFront, SimpleCDN, EdgeCast etc.). Akamai currently delivers close to 20% of total internet traffic and is the de facto standard for e-commerce application delivery. This document focuses on design strategy for Akamai, but most of it applies to any CDN that caches based on URL.
Solution details:
Typical web sites are composed of three kinds of sections/pages:
- Group 1: Common information: This information is the same for every user who visits the web site. An example is a public news broadcast about a volcanic eruption which needs to be shown to all users. This kind of content is ideal for caching with Akamai.
- Group 2: User group specific information: This information is shared within a user group, e.g. users from a specific country see that country's promotions page. It can also be cached by Akamai, but there must be a way to tell the versions for different user groups apart, e.g. by making the country part of the URL.
- Group 3: User specific information: This is the most private information and is specific to one user, e.g. the user's shopping details. Normally it does not make sense to cache these pages. It does make a lot of sense to keep this information in a silo and, if possible, load it asynchronously.
- Combination of the above: Most pages combine all three kinds of information, e.g. the home page for a logged-in user may show the user's name, country specific promotions and a global news alert.
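The three groups above map naturally onto a URL-based cache key. The following is a minimal sketch in Python of that idea, not Akamai configuration: the names `ContentGroup` and `cache_url` and the `/<country>/<path>` URL layout are assumptions made for illustration only.

```python
from enum import Enum
from typing import Optional


class ContentGroup(Enum):
    COMMON = 1      # Group 1: same for every visitor
    USER_GROUP = 2  # Group 2: shared by a user group, e.g. a country
    USER = 3        # Group 3: specific to one user


def cache_url(path: str, group: ContentGroup,
              country: Optional[str] = None) -> Optional[str]:
    """Return a URL that a URL-keyed CDN could cache, or None if the
    content should be served by the origin and not cached.

    The /<country>/<path> layout is a hypothetical convention.
    """
    clean = "/" + path.strip("/")
    if group is ContentGroup.COMMON:
        return clean
    if group is ContentGroup.USER_GROUP:
        if not country:
            raise ValueError("user group content needs a country for a unique URL")
        # The country becomes part of the URL, so each group gets its own cache entry.
        return "/" + country.lower() + clean
    # Group 3: user specific, not cached; load it asynchronously from the origin.
    return None
```

With this convention, `cache_url("promotions", ContentGroup.USER_GROUP, "DE")` and the same call with `"FR"` yield different URLs, so the CDN never serves one country's promotions to another.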
The following guidelines explain how to design for the content types above:
Design guidelines:
- It’s never too early to consider a CDN: Folk over coffee might have told you to treat a CDN as a pluggable component that can be thought about if the application faces performance issues in production. My suggestion would be to change coffee mates from tomorrow and start considering the CDN as early as the requirement elicitation phase, for both functional requirements (e.g. is it really necessary to have user specific information on the home page?) and non-functional requirements (e.g. how much extra load is expected during peak season?). The cost of using a CDN like Akamai can be significant, and the earlier everyone realizes it, the more time they have to discuss its value and be sure about it. Akamai also provides custom tags which help to cache fragments of pages; the decision to use them should also be taken in the architecture phase. A CDN may also affect your development estimate if a purge screen needs to be built for purging cached content.
- Use AJAX for personalized sections: This is possibly the most important solution design decision affecting caching by Akamai. Design the areas which carry user group specific or user specific information (groups 2 and 3) to be loaded through AJAX. This ensures that the rest of the page can still be cached and served by Akamai. Even group 2 information can be cached by Akamai if the URL has a parameter that maps the information to the user group.
- Have unique URLs: Akamai uses the URL as the cache key, so unique, consistent URLs are essential to a design that works well with Akamai. Solutions like webflow, where the same URL can return different content, are problem children and need many “workarounds” to become cacheable.
- Consider the CDN in capacity planning: Well thought through screen design combined with an intelligent implementation can move a significant share (up to 80%) of server load to the CDN provider. Typical promotional events (e.g. mail promotions) have a bounce rate of 40-80%, where users arrive at the landing page and leave after reading the offer. Most of this traffic can be served from Akamai by making sure that landing pages are generic or that customization is done through AJAX or cookies.
- Consider the use of cookies carefully: An Akamai implementation can depend on cookies in cases where URLs are not unique (i.e. the same URL may return different content). Akamai can even create a cookie for your domain if required. In my experience this is not a recommended practice: some clients, such as screen scraping software, do not support cookies, some users disable them, and cookies may not be available when a site on a different domain accesses your site as an affiliate. This can lead to inconsistent behavior across different sets of users.
- Consider the impact of HTTPS: Akamai can deliver pages over HTTPS. Because Akamai requires the site's domain to be CNAMEd to Akamai, the browser will complain if a user accesses a page over HTTPS and the certificate presented is registered to a different organization. Akamai also has a separate network for delivering HTTPS pages with extra security, which comes with a different cost structure and delivery SLA. It is good to be aware of this and to decide whether the site really needs its HTTPS pages delivered through Akamai.
- Be aware of global rollout: Akamai's edge servers are spread across many locations, so configuration changes can take time (sometimes hours) to take effect globally. Clearing objects from the cache may take a few minutes (around 10). TTLs may also differ across servers, so different edge servers can hold different versions of a file at the same time, which creates inconsistency between locations. I know this does not sound like the bug you want to hear about at 2 in the night. To avoid it, ask Akamai for a report of object sizes across geographies and proactively flag any inconsistent objects.
- Plan for Akamai refresh: If your application publishes files which business users need to refresh urgently and on demand, consider providing a user interface for purging cached content. Akamai provides a SOAP API for clearing content, but building a UI on top of this SOAP API takes extra effort.
- Consider modules which use the user's IP: Because Akamai intercepts all requests intended for the application, code which reads the user's IP will get the Akamai server's IP instead. This is easily solved by asking Akamai to pass the user's IP in a separate header which the application can read. The challenge comes from modules which work on the user's IP (e.g. mod_evasive, mod_security): if such a module expects the IP address in a specific field, it will misbehave. Either modify the code to read the custom header or choose a module which makes the source of the IP configurable.
- Use Akamai logging: Akamai logs a lot of detail about how users access your site, including critical information which helps in debugging tricky scenarios. Akamai removes logs every couple of days, so it is good to set up an area where Akamai can deliver the log files and where the maintenance team can use them for debugging.
- Be aware of the changes Akamai can make: Akamai can compress content and modify HTTP headers (e.g. setting Pragma headers). Configure compression deliberately and test it thoroughly together with web server compression.
- Test, and test specific cases: When Murphy said, “Anything that can go wrong, will go wrong”, he might very well have been talking about global production rollouts. Adding Akamai to the picture adds another dimension to rollout plans. Always leave enough time (around 1-2 weeks) to test the almost completed site (i.e. >80% UAT complete) over the Akamai network. Both business and technical users should do this testing, because issues can be functional (e.g. incorrect caching applied) or technical (e.g. the URL not changing in a specific scenario). Be aware that Akamai's test infrastructure is limited; users may experience delays which occur only in testing and should not create a wrong impression.
- Don’t ignore application level caching: Akamai caching should not be considered a replacement for application level caching, as the two serve different purposes. For pages which must be served from the origin server, it is still a good idea to cache data, queries etc.
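The user IP guideline above can be illustrated with a small sketch. It assumes the CDN has been configured to forward the original client address in a header; the header name `True-Client-IP` used here and the function name `resolve_client_ip` are assumptions for illustration, and the actual header name is whatever is agreed in the CDN configuration.

```python
import ipaddress
from typing import Mapping

# Assumed header name; use whatever name is agreed in the CDN configuration.
CLIENT_IP_HEADER = "True-Client-IP"


def resolve_client_ip(headers: Mapping[str, str], peer_ip: str,
                      header_name: str = CLIENT_IP_HEADER) -> str:
    """Prefer the client IP forwarded by the CDN; fall back to the
    address of the connecting peer (normally the edge server)."""
    wanted = header_name.lower()
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == wanted:
            candidate = value.strip()
            try:
                ipaddress.ip_address(candidate)  # reject malformed values
            except ValueError:
                break
            return candidate
    return peer_ip
```

Making the header name a parameter mirrors the advice to prefer modules where the source of the IP is configurable. A real deployment would also need to trust this header only on requests that actually arrive from the CDN, since any client can send it.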
Conclusion:
In conclusion, for a global web site it is almost mandatory to consider a CDN, both to give users consistent response times across the globe and to reduce load on the servers. To use a CDN well, keep the recommendations above in mind: they help to make the journey to go-live smooth and to make full use of the available tools.