Scope:

This document focuses on architecture and design strategies for providing consistent response times and handling spurts of load for a global e-commerce application. It provides a high-level overview and detailed design strategies for taking advantage of a CDN for performance and scalability. Akamai is the CDN discussed, given its prevalence and the writer’s familiarity with it. The document leaves out streaming solutions, security, and other services provided by CDN vendors. It also does not discuss resolving inconsistent response times through distributed application deployments, nor does it compare multiple CDN vendors.

Target audience:

The document is intended for architects and designers who want to design web sites for a CDN like Akamai. It can also be used by people interested in understanding the impact of CDN usage.

 

Solution options:

Content delivery networks were established in the mid-to-late 1990s, primarily to handle the problem of internet congestion and provide faster responses by caching closer to user locations. They specialize in providing an optimal internet route (e.g. SureRoute), caching closer to the user (e.g. edge server caching, which can offload up to 80% of traffic from the origin), and compression. These providers now offer more value-added services, e.g. security, video streaming solutions, and failover in case of network failure.

 

As shown in the diagram above, Akamai helps optimize a website’s performance by optimizing the route to the destination, compressing content, providing a custom high-speed network, and caching content near the user’s location.

The CDN market is surprisingly unipolar: there is Akamai, and then the rest (i.e. Limelight, Amazon CloudFront, SimpleCDN, EdgeCast, etc.). Akamai currently delivers close to 20% of total internet traffic and is the de facto standard for e-commerce application delivery. This document focuses on design strategy for Akamai, but in general it applies to any CDN that caches based on URL.

 

Solution details:

Typical web sites are composed of three kinds of sections/pages:

  • Group 1: Common information: This kind of information is common to all users who visit the web site. An example is a public news broadcast about a volcano eruption which needs to be shown to all users. This kind of information is ideal content to cache with Akamai.
  • Group 2: User group specific information: This kind of information is shared within a user group, e.g. users from a specific country see that country’s promotions page. This kind of information can also be cached by Akamai, but there needs to be a way to differentiate it for different user groups, e.g. having the country as part of the URL.
  • Group 3: User specific information: This is the most private information, specific to one user, e.g. a user’s shopping details. Normally it does not make sense to cache these pages. It does, however, make a lot of sense to keep this information in a silo and, if possible, load it asynchronously.
  • Combination of the above: Most pages are a combination of the three groups above, i.e. common information, user group specific information, and user specific information. For example, the home page for a logged-in user may show the user’s name, country specific promotions, and a global news alert.
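
One way to make the three groups concrete is to attach an explicit caching policy to each. The sketch below is only illustrative: the group names, the TTL values, and the idea of expressing the policy as standard HTTP Cache-Control headers sent by the origin are assumptions, and whether a CDN such as Akamai honors origin headers depends on how it is configured.

```python
from enum import Enum


class ContentGroup(Enum):
    COMMON = 1       # Group 1: same for every visitor
    USER_GROUP = 2   # Group 2: shared by a user group, e.g. a country
    USER = 3         # Group 3: specific to one user


# Hypothetical TTLs in seconds; real values depend on how often content changes.
TTL_SECONDS = {ContentGroup.COMMON: 3600, ContentGroup.USER_GROUP: 600}


def cache_control_for(group: ContentGroup) -> str:
    """Return a Cache-Control header value for the given content group."""
    if group is ContentGroup.USER:
        # Never let a shared cache store user-specific responses.
        return "private, no-store"
    return f"public, max-age={TTL_SECONDS[group]}"


for g in ContentGroup:
    print(g.name, "->", cache_control_for(g))
```

A combined page would take the strictest policy of its parts, which is why user specific sections are better loaded separately.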

 

The subsequent sections provide details on how to design for the above content types:

 

Design guidelines:

  1. It’s never too early to consider use of a CDN: Folks over coffee might have told you to treat a CDN as a pluggable component that can be considered later, if the application faces performance issues in production. My suggestion would be to change coffee mates from tomorrow and start considering the CDN as early as the requirement elicitation phase, for both functional requirements (e.g. is it really required to have user specific information on the home page?) and non-functional requirements (e.g. how much extra load is expected during peak season?). The cost of using a CDN like Akamai can be significant, and the earlier everyone realizes it, the more time they have to discuss its value and be sure about it. Akamai also provides custom tags which can help cache fragments of pages. The decision to use them should also be taken in the architecture phase. It may also impact your development estimate if a purge screen needs to be built for purging cached content.
  2. Use AJAX for personalized sections: This is possibly the most important decision one makes in solution design that impacts caching with Akamai. Design the areas which hold user group specific or user specific information (groups 2 and 3) to be loaded through AJAX. This ensures that the remaining page can still be cached and served by Akamai. Even group 2 information can be cached by Akamai if the URL has a parameter that maps the information to the user group.
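
To illustrate the split between a cacheable page shell and AJAX-loaded fragments, the sketch below builds the URLs a page might request. The paths under /fragments/, the country parameter name, and the helper names are all hypothetical. The only point carried over from the guideline is that group 2 content gets a URL that names its user group, while group 3 content is served by the origin and not cached.

```python
from urllib.parse import urlencode


def promotions_fragment_url(country: str) -> str:
    """Group 2: the country is part of the URL, so each country caches separately."""
    return "/fragments/promotions?" + urlencode({"country": country.lower()})


def account_fragment_url() -> str:
    """Group 3: one URL for everyone; the origin answers per user, uncached."""
    return "/fragments/account-summary"


def fragments_for_home_page(country: str) -> dict:
    """Map each AJAX-loaded fragment to its URL and whether the CDN may cache it."""
    return {
        "promotions": {"url": promotions_fragment_url(country), "cacheable": True},
        "account": {"url": account_fragment_url(), "cacheable": False},
    }


print(fragments_for_home_page("DE"))
```

Lowercasing the country keeps "DE" and "de" from producing two cache entries for the same content.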

 

  1. Have unique URLs: Akamai uses the URL as the cache key, so having a unique, consistent URL is key to an appropriate design for Akamai. Solutions like webflow, where the same URL can return different content, are problem children and will require lots of “workarounds” to make them cacheable.
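
The reverse problem also matters: different URLs that return the same content create duplicate cache entries. The sketch below shows one way an application could generate a single consistent URL for its links. It is not Akamai’s cache key logic; the function name and the list of ignored parameters are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change the content returned.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url: str) -> str:
    """Return one consistent URL for all variants that serve the same content."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k.lower() not in IGNORED_PARAMS
    ]
    params.sort()  # parameter order should not create distinct cache entries
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(params), "")
    )


a = canonical_url("http://Shop.example.com/promo?country=de&page=2&utm_source=mail")
b = canonical_url("http://shop.example.com/promo?page=2&country=de")
print(a)
print(a == b)
```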

 

  1. Consider the CDN during capacity planning: Well-thought-through screen design along with intelligent implementation can move a significant share (i.e. up to 80%) of server load to the CDN provider. Typical promotional events (e.g. mail promotions) have a bounce rate of 40-80%, where users come to the landing page and leave after reading the offer. Most of this traffic can be delivered from Akamai by making sure that landing pages are generic or that customization is achieved through AJAX/cookies.
  2. Consider usage of cookies carefully: An Akamai implementation can depend on cookies in cases where URLs are not unique (i.e. the same URL may give different content). In fact, Akamai can also create a cookie for your domain if required. In my experience this is not a recommended practice, because some clients do not support cookies (e.g. screen-scraping software), some users disable them, and cookies may not be available when a site on a different domain accesses your site as an affiliate. This can lead to inconsistent behavior across different sets of users.
  3. Consider the impact of HTTPS: Akamai can support delivery of pages over HTTPS. Because Akamai requires the origin site to CNAME its domain to Akamai, the browser will complain when a user accesses a page over HTTPS and the certificate presented is registered to a different organization. Akamai also has a separate network which can be used for delivery of HTTPS pages with extra security. This obviously comes with a different cost structure and delivery SLA. It is good to be aware of this and decide whether the site really requires HTTPS pages to be delivered over Akamai.
  4. Be aware of global implementation/rollout: Akamai has edge servers in many different locations. As a result, configuration changes need time (sometimes hours) to take effect globally. Clearing objects from the cache may take a few minutes (around 10). Also, the TTL may differ across servers, resulting in different versions of a file on different edge servers. This can create inconsistency across locations at specific times. I know this does not sound like the bug you want to hear about at 2 in the night. To avoid it, it is better to ask Akamai to provide a report of object sizes across different geographies and proactively flag any inconsistent objects.
  5. Plan for Akamai refresh: If your application publishes files which need to be refreshed on demand and urgently by business users, do consider providing a user interface for purging the cached content. Akamai does provide a SOAP API which can be used to clear content, but it requires extra effort to build a UI on top of these SOAP APIs.
  6. Consider usage of modules which use the user’s IP: As Akamai intercepts all requests intended for the application, code that tries to read the user’s IP will get the Akamai server’s IP instead. This is easily solved by telling Akamai to pass the user’s IP in a separate header, which the application can then use. The challenge comes from modules that work on the user’s IP (e.g. mod_evasive, mod_security): if such a module expects the IP address in a certain field, it will create issues. Either look for a way to modify the code to refer to the custom header, or choose a module which makes this configurable.
  7. Use Akamai logging: Akamai logs a lot of detail about the way users access your site, including critical information which can help in debugging tricky scenarios. Akamai removes logs every couple of days, so it is good to have an area where Akamai can deliver the log files for the sustenance team to use when debugging.
  8. Be aware of changes which Akamai can make: Akamai can compress content and modify HTTP headers (e.g. setting Pragma headers). It is better to configure the compression explicitly and test it thoroughly together with web server compression.
  9. Test, and test specific cases: When Murphy said, “Anything that can go wrong, will go wrong”, he might very well have been talking about global production rollouts. Adding Akamai to the picture adds another dimension to rollout plans. Always ensure that there is enough time (around 1-2 weeks) to test an almost completed site (i.e. >80% UAT complete) over the Akamai network. This testing should be done by both business and technical users, because the issues can be functional (e.g. incorrect caching applied) or more technical (e.g. the URL not changing in a specific scenario). Be aware that the Akamai test infrastructure is limited, so users may experience delays that occur only in testing; these should not create a wrong impression.
  10. Don’t ignore application-level caching: Akamai caching should not be considered a replacement for application-level caching, as the two serve different purposes. For pages which need to be served from the origin server, it is still a good idea to cache data, queries, etc.
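
The capacity planning guideline comes down to simple arithmetic. The sketch below estimates the request rate the origin still has to serve once a share of traffic is offloaded to the CDN. The peak rate of 1,000 requests per second is a made-up number, and the 80% figure is the upper bound quoted above, not a guaranteed offload.

```python
def origin_load(peak_rps: float, offload_ratio: float) -> float:
    """Requests per second that still reach the origin after CDN offload."""
    if not 0.0 <= offload_ratio <= 1.0:
        raise ValueError("offload_ratio must be between 0 and 1")
    return peak_rps * (1.0 - offload_ratio)


# Hypothetical peak of 1,000 requests/second at different offload levels.
for ratio in (0.0, 0.4, 0.8):
    print(f"offload {ratio:.0%}: origin serves {origin_load(1000, ratio):.0f} req/s")
```

Sizing the origin for the lowest offload you expect, rather than the best case, leaves headroom for pages that turn out not to be cacheable.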

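The report suggested in the rollout guideline can be checked mechanically. The sketch below assumes the report has already been parsed into a mapping from object URL to the size observed in each geography. That input shape, the region names, and the function name are hypothetical, since the actual report format is not described here.

```python
def inconsistent_objects(report: dict) -> dict:
    """Return the objects whose size differs between geographies.

    `report` maps object URL -> {region: size_in_bytes}.
    """
    flagged = {}
    for url, sizes_by_region in report.items():
        if len(set(sizes_by_region.values())) > 1:
            flagged[url] = sizes_by_region
    return flagged


sample_report = {
    "/css/site.css": {"us-east": 18234, "eu-west": 18234, "ap-south": 18234},
    "/js/cart.js": {"us-east": 40211, "eu-west": 39876, "ap-south": 40211},
}
for url, sizes in inconsistent_objects(sample_report).items():
    print("inconsistent:", url, sizes)
```

Equal sizes do not prove identical content, so this only catches the obvious cases; comparing checksums would be stricter if the report offers them.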
 
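For the user IP guideline, the application-side change is usually small. The sketch below assumes the CDN has been configured to forward the original client address in a request header. The header name True-Client-IP used here is an assumption and should be replaced by whatever name is configured for your site. Because any client can send that header directly, the sketch trusts it only when the connection comes from a known edge address range, and the range shown is a hypothetical placeholder.

```python
import ipaddress

CLIENT_IP_HEADER = "True-Client-IP"  # assumed name; use what your CDN config sends
# Hypothetical edge range (a documentation address block); get real ones from your CDN.
TRUSTED_EDGE_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def client_ip(headers: dict, remote_addr: str) -> str:
    """Return the end user's IP, falling back to the socket peer address."""
    peer = ipaddress.ip_address(remote_addr)
    from_edge = any(peer in net for net in TRUSTED_EDGE_NETWORKS)
    forwarded = headers.get(CLIENT_IP_HEADER, "").strip()
    if from_edge and forwarded:
        try:
            return str(ipaddress.ip_address(forwarded))
        except ValueError:
            pass  # malformed header value; ignore it
    return remote_addr


# Request arriving through a trusted edge server: the header is used.
print(client_ip({"True-Client-IP": "198.51.100.7"}, "203.0.113.10"))
# Request arriving directly from an unknown address: the header is ignored.
print(client_ip({"True-Client-IP": "198.51.100.7"}, "192.0.2.55"))
```

Keeping the header name in one constant mirrors the advice above: prefer code and modules where the source of the client IP is configurable.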

Conclusion: 

In conclusion, for global websites it is almost mandatory to consider a CDN in order to provide consistent response times across the globe and reduce load on servers. For websites to use a CDN appropriately, it is important to keep these recommendations in mind, as they help make the journey to go-live smooth and make full use of the available tools.

References

  1. http://en.wikipedia.org/wiki/ARPANET
  2. http://en.wikipedia.org/wiki/CNAME_record
  3. http://www.nytimes.com/2008/03/13/technology/13net.html?pagewanted=print
  4. http://www.nytimes.com/2011/02/22/technology/22iht-broadband22.html?_r=1
  5. http://en.wikipedia.org/wiki/Cyber_Monday

  6. http://www.searchenginejournal.com/walmartcom-crashes-on-black-friday/4041/