High availability client design - request for comments (long)

Discussions

EJB design: High availability client design - request for comments (long)

  1. Hi

    below I have some code for client access to an entity bean. The example is based on a Struts action class. The code effectively caches a home interface instance to avoid JNDI lookups. The motivation behind the code is for self-recovery in the case of an ejb container problem.

    Is this necessary given that in a clustered environment a 'smart' rmi stub may be used?

    Is this thread safe?

    Any comments much appreciated...

    regards

    Rob

    public class ShowEJBAction extends Action
    {
        private volatile EJBHome home = null; // a cached home instance
        private Properties env; // used to load environment configuration settings.

        //
        // default constructor
        //
        public ShowEJBAction()
        {
            super();

            //
            // look for a properties file which may contain configuration properties
            //
            InputStream is = this.getClass().getResourceAsStream("application.properties");
            env = new Properties();

            try
            {
                if (null != is)
                {
                    env.load(is);
                }
                else
                {
                    System.out.println("No properties provided. Using defaults.");
                }
            }
            catch (IOException e)
            {
                //
                // we don't have a servlet yet, so we cannot log a message
                //
                System.err.println("Warning - environment properties failed to load.");
            }

            //
            // lookup the home interface and cache it ready for use.
            //
            lookupHomeInterface();
        }

        /**
         * This method performs the main processing associated with the action. The method uses the cached home interface
         * to create a session bean instance to perform the work. This method attempts to recover from partial failures by recreating
         * the home interface. Hence the application should self-recover without a webserver restart.
         *
         * @return ActionForward - the strust response
         * @pre
         * @post
         * @param mapping
         * @param form
         * @param request
         * @param response
         * @throws IOException
         * @throws ServletException
         */

        public ActionForward perform( ActionMapping mapping, ActionForm form, HttpServletRequest request, HttpServletResponse response )
        throws IOException, ServletException
        {
            ActionForward retval = mapping.findForward( "Success" );

            //
            // get a local copy in case another thread resets the shared reference (atomic thread safe operation)
            //
            EJBHome localHome = home;
    EJB EJBean=null;
            try
            {
                //
                // if our local reference is null, then try to re-establish a new home instance
                //
                if (null == localHome)
                {
                    localHome = lookupHomeInterface();
                }

                if (null != localHome)
                {
                    //
                    // create an entity bean and call its business method.
                    //
                    EJBean = home.findByPrimaryKey(new EJBPK(1));
                    EJBean.getValueObject();
                }
                else
                {
                    //
                    // we have attempted to recover a new home interface but it was unsuccessful
                    //
                    retval = mapping.findForward( "Error" );
                    getServlet().log("Home interface is still null after re-attempt");
                }
            }
            catch(ObjectNotFoundException oe)
            {
                retval = mapping.findForward( "NotFound" );
            }
            catch(Exception e)
            {
    //
    // an exception here could indicate a problem with the home interface - so get another
    //
                resetHomeInterface(localHome);
                getServlet().log("Exception",e);
                retval = mapping.findForward( "Error" );
            }

            return retval;
        }

        /**
         * This method is used to lookup the home interface using jndi. This mehtod is
         * called during construction and when a new home interface instance is needed,
         * interface during recovery from a partial failure.
         * @return
         */
        protected EJBHome lookupHomeInterface()
        {
            EJBHome retval=null;

            try
            {
                //
                // if the home variable is null, we still need to look it up, otherwise
                // another thread has already beaten us, so just return the current instance.
                //
                synchronized(this)
                {
                    if (home == null)
                    {
                        Context initial = new InitialContext(env);
                        Object objref = initial.lookup(env.getProperty("EJBHome"));

                        retval = (EJBHome)PortableRemoteObject.narrow(objref,
                        EJBHome.class);

                    }
                }

                //
                // Note by doing this assignment outside of the sync block, we ensure that the object is complete,
                // but there may be more than one instance created in separate threads. This is acceptable as we are
                // not relying on a 'singleton' pattern.
                //
                home = retval;
            }
            catch(NamingException ne)
            {
                getServlet().log("Error whilst looking up EJB home interface : ",ne);
            }

            return retval;
        }

        /**
         * This method is used to dispose of the cached home interface. This method
         * may be called during a partial failure.
         * @param home
         */

        protected void resetHomeInterface(EJBHome home)
        {
            //
            // if the object's home instance is the same as the thread's local copy, then remove it. Otherwise
            // some other thread may have already reset and created a new instnce, so leave it alone.
            //
            if (this.home == home)
            {
                synchronized(this)
                {
                    if (this.home == home)
                    {
                        this.home = null;
                    }
                }
            }
        }
    }
  2. This code is not thread safe. There are plenty of subtle reasons, but I'm going to give you the simplest one, hoping you will be convinced to drop the whole thing rather than try and fix the problem.
    The problem lies in the lookupHomeInterface method. If two of these run concurrently, one thread may get back an uninitialized home object. Whoever wrote the code tried to avoid this by doing what would intuitively create a one-way memory barrier. That is, he first initialized a temporary variable inside a synced block, and then copied it outside the block (the comment states that's what he was trying to do). This intuition is wrong. The Java spec sais all actions performed inside a synced block must be flushed before monitorexit. It doesn't say actions that occur outside the synced block can't go inside it and occur before monitorexit. So the temporary variable is useless, and the code is still not thread safe. The VM could (and sometimes will) make a prescient write as an optimization and get rid of the temporary variable completely.
    Not only that, this code doesn't run any faster than if you simply synced the entire method - it allways enters a synced block anyway.
    BTW - you have another bug in lookupHomeInterface. If home wasn't null to begin with, retval won't get initialized and you'll return null. This has nothing to do with multithreading.

    To conclude: use normal synchronization. You don't get any optimizations with this code (and even if you did, it wouldn't be big enogth to make any difference). And you will most probably have a bug if you try to code lazy init in Java.

    Regards
    Gal
  3. I wrote:
    "And you will most probably have a bug if you try to code lazy init in Java".
    I actually ment:
    "And you will most probably have a bug if you try to code double-checked lazy init in Java".

    Obviously, you can easily write lazy-init using standard synchronization techniques.

    Gal
  4. Hi Gal,

    thanks for the comments - agree there is a bug in the lookup method...

    Appreciate your comments on DCL - and I'm aware of the subtle threading issues.

    The idea here is that the action object gets initialised with a home interface. All subsequent requests go through the perform method which is not synchronized. There is probably some synchronization inside the home interface object, but thats beyond my control. Hence in a perfect world - everything humms along...

    Now pretend that the EJB container dies, and that the naming service can failover to another container. Hence in the web container, I want to discard the cached home interface and lookup another. That way, I don't need to restart the web container.

    regards

    Rob
  5. The idea of caching the home interface and renewing it when it fails is perfectly reasonable, and even recommended. My comment wasn't about that.
    The problem is, for some reason you try to go around synchronizing the methods that access this cached home interface. That doesn't work, for the reason I've listed and other more subtle reasons. I didn't mean you should drop this design completely - only that you shouldn't try to avoid synchronization.
    The mere act of entering a monitor hardly has any implications on efficiency. What may hurt efficiency is the fact that entering a synchronized block forces the JIT compiler do drop it's internal cache (i.e, don't use prescient writes), and in a multi-processor environment forces the CPU to invalidate it's cache. Because you probably enter some synced block anyway (because there's probably some IO in the home interface impl), you don't gain considerable performance.

    My advise is, keep using this design but use "normal" synchronization. Maybe when the new memory model comes out it will be possible to implement double-checked locking, but currently it isn't worth your trouble (IMHO).

    Gal