EJB programming & troubleshooting: Improving System Efficiency by Method-Based Caching

  1. Dear reader,

    this message intends to promote the idea of method-based caching
    as a means to elegantly and effectively improve the efficiency of
    EJB-systems. In order to explain what this is about I have
    implemented a related API that uses javassist. My suggested
    approach helps to easily and transparently separate the aspect of
    client side caching from other system aspects.

    If you are curious, download the API and check it out at
    http://www.ipd.uka.de/~pfeifer/dmcache.zip
    (Requires JDK 1.5 and ANT 1.5 or higher.)
    If you are undecided, go on and read the API's README file which I
    have attached below. (The general idea is based on a research
    paper of mine which is available at
    http://www.ipd.uka.de/~pfeifer/publications/doa03.pdf)

    I am looking forward to feedback and an interesting discussion. If
    there is any/enough support, I would be willing to contribute the
    API and the related concepts to the JBoss group.

    Greetings,

    Daniel Pfeifer

    ----------------- README (from dmcache.zip) ------------------------

    DMCache - A Dynamic Method-Based Caching API
    ============================================

    Daniel Pfeifer, $Date: 2004/12/09 10:16:40 $

    Download at: http://www.ipd.uka.de/~pfeifer/dmcache.zip


    PART 1: WHAT IS THIS THING GOOD FOR?
    ------------------------------------

    This API demonstrates a cache that dynamically, transparently and
    consistently caches results of method invocations. It intends to
    promote the idea of method-based caching as an excellent option
    for improving the efficiency of modern information systems.

    Method-based caching can be useful in the context of layered
    architectures where a layer is abstracted by a set of Java
    interfaces. A good example is a servlet-based web server whose
    servlets invoke EJB methods (all EJBs are abstracted by Java
    interfaces). Method results are cached on the client side (which
    is the web server in this example), so on a cache hit a costly
    and potentially remote EJB method call is avoided. In
    this context the approach is ALWAYS A BETTER OPTION THAN DYNAMIC
    WEB CACHING. The reasons for this exciting statement are explained
    below. An important example of a dynamic web caching approach is
    JESI (see http://www.esi.org/jesit_tag_lib_1-0.html).

    Note that method-based caching can be applied in such a way that
    it maintains 100% cache consistency (also called strong cache
    consistency). In order to do so, an application developer has to
    annotate the respective methods of a service interface. The
    annotations form a so-called "cache model". Apart from these
    annotations, the cache remains 100% transparent (invisible) to
    the client AND to the server code, so it can be introduced very
    late in a project without harming existing code or system
    functionality. In terms of aspect-oriented programming,
    method-based caching may be considered a way to separate a caching
    aspect from other system aspects.
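
    To make the idea of an annotation-based cache model more concrete,
    here is a minimal sketch. It is NOT DMCache's actual API: the
    annotation names "CachedResult" and "Invalidates", the interface
    "ProductService" and the helper methods are all invented for
    illustration. The sketch only shows how such annotations could be
    attached to a service interface and read back via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CacheModelSketch {

    /** Hypothetical annotation: results of this read-only method may be cached. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface CachedResult {}

    /** Hypothetical annotation: a call invalidates cached results of the named methods. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Invalidates {
        String[] value();
    }

    /** Invented service interface standing in for an EJB business interface. */
    interface ProductService {
        @CachedResult
        String getProductName(int productId);

        @Invalidates({"getProductName"})
        void renameProduct(int productId, String newName);
    }

    /** True if the named interface method is marked as cacheable. */
    static boolean isCached(Class<?> serviceInterface, String methodName) {
        for (Method m : serviceInterface.getMethods()) {
            if (m.getName().equals(methodName) && m.isAnnotationPresent(CachedResult.class)) {
                return true;
            }
        }
        return false;
    }

    /** Names of the methods whose cached results the named method invalidates. */
    static List<String> invalidatedBy(Class<?> serviceInterface, String methodName) {
        for (Method m : serviceInterface.getMethods()) {
            Invalidates inv = m.getAnnotation(Invalidates.class);
            if (m.getName().equals(methodName) && inv != null) {
                return Arrays.asList(inv.value());
            }
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        System.out.println(isCached(ProductService.class, "getProductName"));      // prints "true"
        System.out.println(invalidatedBy(ProductService.class, "renameProduct")); // prints "[getProductName]"
    }
}
```

    Note that this sketch invalidates at the granularity of whole
    methods. A real cache model would probably want to take argument
    values into account as well; that part is left out here.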

    For detailed information on the general idea of method-based
    caching, please read the paper available at
    http://www.ipd.uka.de/~pfeifer/publications/doa03.pdf
    By means of an experiment and a benchmark application, the paper
    shows that method-based caching can considerably increase the
    overall system efficiency of a real world EJB-based web
    application.

    As opposed to the implementation presented in the referenced
    paper, DMCache (this API) is fully dynamic. In particular, all
    cache classes that implement a layer's Java interfaces (e.g. a set
    of EJB interfaces) are generated at system RUNTIME using a dynamic
    proxy approach. Further, cache models are not specified via
    XML-files but by means of method annotations. (Annotations are a
    new feature of JDK 1.5.)
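
    The general technique of generating a caching implementation of
    an interface at runtime can be sketched as follows. This is only
    an illustration under simplifying assumptions: it uses the JDK's
    own java.lang.reflect.Proxy instead of javassist, it treats every
    interface method as read-only, and the names "QuoteService",
    "CachingHandler" and "withCache" are invented. It does not show
    how DMCache itself is implemented.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CachingProxySketch {

    /** Invented stand-in for a remote service interface; all methods assumed read-only. */
    interface QuoteService {
        String quote(String symbol);
    }

    /** Backend that counts how often it is really invoked. */
    static final class CountingQuoteService implements QuoteService {
        int calls = 0;

        @Override
        public String quote(String symbol) {
            calls++;
            return "quote for " + symbol;
        }
    }

    /** Returns a stored result when the same method is called again with equal arguments. */
    static final class CachingHandler implements InvocationHandler {
        private final Object target;
        private final Map<List<Object>, Object> results = new HashMap<>();

        CachingHandler(Object target) {
            this.target = target;
        }

        @Override
        public synchronized Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(target, args); // do not cache toString/hashCode/equals
            }
            // Cache key: the method plus its argument values (these need equals/hashCode).
            List<Object> key = new ArrayList<>();
            key.add(method);
            if (args != null) {
                key.addAll(Arrays.asList(args));
            }
            if (results.containsKey(key)) {
                return results.get(key); // cache hit: the backend is not called
            }
            Object result;
            try {
                result = method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the backend's own exception, uncached
            }
            results.put(key, result);
            return result;
        }
    }

    /** Wraps the target in a runtime-generated proxy that implements the same interface. */
    static <T> T withCache(Class<T> serviceInterface, T target) {
        Object proxy = Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface},
                new CachingHandler(target));
        return serviceInterface.cast(proxy);
    }

    public static void main(String[] args) {
        CountingQuoteService backend = new CountingQuoteService();
        QuoteService cached = withCache(QuoteService.class, backend);
        cached.quote("ABC");
        cached.quote("ABC"); // served from the cache
        cached.quote("XYZ");
        System.out.println(backend.calls); // prints "2"
    }
}
```

    The client only ever sees the QuoteService interface, so neither
    client nor backend code has to change when the cache is added.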

    The API is in an early state but functional and tested. The source
    code in the package "ord.ipd.dmcache.model.test" gives an idea of
    how to use it and what it can do.

    In practice, this kind of caching is meant to entirely REPLACE
    DYNAMIC WEB CACHING FOR ARCHITECTURES WITH AN EJB-LIKE APPLICATION
    LAYER. A good example is a servlet-based web server with servlets
    invoking EJBs. In such a system, EJB calls are usually expensive
    and form the bottleneck, whereas the servlet computations
    themselves are cheap (apart from the EJB calls they contain). It
    is therefore worthwhile to cache the results of read-only EJB
    calls.

    The following considerations reveal that under these
    circumstances, method-based caching is A LOT SMARTER than dynamic
    web caching, because the former
    1) leads to better hit rates than dynamic web caching (or at least
       the same hit rates),
    2) makes page fragmentation approaches such as ESI obsolete,
    3) is very likely to consume less memory than dynamic web caching,
    4) potentially provides strong cache consistency,
    5) does not pollute the server or client code with
       nasty cache-related code-snippets or tags.

    The following paragraphs explain why this is true:

    1) A servlet-based web page computation may be considered as a
    function page(m_1(a_1_1),...,m_k(a_1_k)) with m_1(a_1_1) to
    m_k(a_1_k) being respective EJB calls and a_1_1 to a_1_k being
    respective method arguments. (The argument values a_1_1,...,a_1_k
    are derived from the parameters of a corresponding HTTP page
    request.) If m_1(a_1_1) to m_k(a_1_k) are read-only method calls,
    then page(...) quite likely may be cached (as a dynamic web page).

    A dynamic web cache produces a hit for page(...) only if a page
    with the same request parameters, and therefore the same
    arguments a_1_1,...,a_1_k for the underlying EJB method calls,
    has been cached before. In that case every one of these method
    calls also produces a hit in a method-based cache. Thus, the hit
    rates of method-based caching are at least as good as those of
    dynamic web caching.

    Now consider a second page computation page_2(m_1(a_2_1),...,
    m_k(a_2_k)) which is based on other request parameters than
    page(...). If a_2_i = a_1_i holds for some i in {1,...,k} then one
    will get a hit for the method-based cache but not for a respective
    dynamic web cache (since the request parameters between page(...)
    and page_2(...) differ). Thus method-based caching can cause even
    better hit rates than dynamic web caching!
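
    The argument in 1) can be replayed with a small counting model.
    The sketch below is only an illustration under simplifying
    assumptions: each request is reduced to its argument list
    (a_1,...,a_k), a page miss is assumed to execute all k EJB calls,
    and caches never evict or invalidate anything. The class name,
    the method names and the sample requests are invented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class HitRateSketch {

    /** Backend calls needed when whole pages are cached by their request parameters. */
    static int backendCallsWithPageCache(String[][] requests) {
        Set<String> cachedPages = new HashSet<>();
        int calls = 0;
        for (String[] args : requests) {
            // add() returns true if this page was not cached yet, i.e. on a miss
            if (cachedPages.add(Arrays.toString(args))) {
                calls += args.length; // page miss: all k method calls are executed
            }
        }
        return calls;
    }

    /** Backend calls needed when each method result m_i(a) is cached on its own. */
    static int backendCallsWithMethodCache(String[][] requests) {
        Set<String> cachedResults = new HashSet<>();
        int calls = 0;
        for (String[] args : requests) {
            for (int i = 0; i < args.length; i++) {
                // one cache entry per pair of method index and argument value
                if (cachedResults.add("m_" + (i + 1) + "(" + args[i] + ")")) {
                    calls++; // method miss: only this single call is executed
                }
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        String[][] requests = {
            {"7", "en"}, // page(m_1(7), m_2(en))
            {"7", "de"}, // page_2: shares the argument of m_1 with the first request
            {"7", "en"}  // exact repetition of the first request
        };
        System.out.println(backendCallsWithPageCache(requests));   // prints "4"
        System.out.println(backendCallsWithMethodCache(requests)); // prints "3"
    }
}
```

    The repeated third request is a hit in both models. The second
    request is a complete miss for the page cache, but the
    method-based cache still saves the call m_1(7).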

    2) Method-based caching makes page fragmentation approaches
    obsolete. This follows straight from 1): At best, page
    fragmentation can produce fragments so small that each fragment
    computation contains zero or one EJB method call m_i(a). If the
    fragment computation contains zero EJB calls, then it is very
    efficient and caching the fragment is useless (remember that
    servlet executions are usually efficient apart from their
    embedded EJB calls). If the fragment computation contains one EJB
    method call, then the method-based cache has the same potential
    of producing a cache hit with respect to m_i(a).

    3) Web page code is usually highly redundant because it contains a
    lot of rendering information. Therefore cached web pages are
    usually stored on disk and must be read from disk at a cache hit.
    In contrast, method results are usually a lot less redundant and
    come in a compact binary format. Thus cached method results can be
    kept in memory - no disk access is necessary at a cache hit.

    4) Strong cache consistency can be reached via cache models (see
    above). Alternatively timeout-based approaches may be suitable
    too.
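
    For the timeout-based alternative, a minimal sketch could look as
    follows. It is an assumption-laden illustration, not part of
    DMCache: the class "TimeoutCacheSketch", its constructor
    parameters and the injectable clock are invented, and a timeout
    only bounds how stale a result can get. It does not give strong
    consistency.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

class TimeoutCacheSketch<K, V> {

    /** A cached value together with the time at which it stops being valid. */
    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that expiry can be tested without sleeping

    TimeoutCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value if it has not expired, otherwise reloads and stores it. */
    synchronized V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry != null && now < entry.expiresAt) {
            return entry.value; // still fresh: no backend call
        }
        V value = loader.apply(key); // missing or expired: call the backend
        entries.put(key, new Entry<>(value, now + ttlMillis));
        return value;
    }

    public static void main(String[] args) {
        long[] now = {0L};   // manual clock for the demonstration
        int[] loads = {0};   // counts simulated backend calls
        TimeoutCacheSketch<String, String> cache = new TimeoutCacheSketch<>(1000L, () -> now[0]);

        System.out.println(cache.get("a", k -> "v" + (++loads[0]))); // prints "v1"
        System.out.println(cache.get("a", k -> "v" + (++loads[0]))); // prints "v1" (cached)
        now[0] = 1000L;                                              // the entry has now expired
        System.out.println(cache.get("a", k -> "v" + (++loads[0]))); // prints "v2"
    }
}
```

    In production code the clock would simply be
    System::currentTimeMillis.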

    5) With dynamic web caching, JESI tags or other code for cache
    consistency usually must be embedded in servlets. This makes the
    servlet code error-prone and less readable. In contrast,
    annotations for method-based caching are compact and well
    separated from other system code. They are located in front of
    method declarations as JDK 1.5 annotations.


    PART 2: BUILDING
    ----------------

    The API requires JDK 1.5 and ANT. (It works with ANT 1.6 or
    higher; lower versions have not been tested.) In order to build it,
    please set the environment variables "JAVA_HOME" and "ANT_HOME"
    appropriately and go to the directory "dmcache/bin".

    Run "build.bat" on Windows or "build.sh" on Unix and find the
    results in "dmcache/build/jar":
    dmcache.jar - the API's library.
    dmcachetest.jar - the test code.
    "build.bat" also runs the JUnit test whose result can be found
    in "dmcache/log/DMCacheTestResult.txt".

    "build.bat javadoc" or respectively "build.sh javadoc" generates the
    Java documentation in "dmcache/build/javadoc".
    ___
  2. This looks pretty interesting.

    1) Where is this method-based caching going to run? Which tier, the web tier or the application tier? My concern is mostly about GC: will it have an impact on overall heap allocation as well as on GC patterns?