  1. Opinion: Enemy of the State (19 messages)

    Gregor Hohpe enjoys arguing with people over a beer. His latest rant revolves around what it means to really be 'stateless' in your architecture. He discusses visible/invisible state, recoverable state, conversational state, and more.

    State is like the Matrix
    State is like the Matrix. It is everywhere around us, but most of the time you don't notice it. Every modern computer is stateful. A Turing machine is stateful: the tape holds state, and the head is at a certain position, which is also state. Every modern computer has an instruction counter that points to the next instruction to be executed. That is state. Almost every modern machine has a call stack that remembers which methods you executed last. That is state, too. I think the last truly "stateless" computers I can think of are analog computers, wired together from transistors and resistors. But even those have a stateful element, the capacitor. So chances are, whatever you build outside of an op-amp is stateful. Get over it.
        
    Once we get this part out of our system, the debate of statelessness becomes a lot more productive. So the question is not whether a system has state, but rather:

    • Does the state matter?
    • For the state that matters, where is it held?

    Enemy of the State


  2. More buzzwords

    If you want a truly buzzword-compliant architecture, you need to add a few more to Gregor's list of "loosely coupled" and "stateless":

    -service oriented
    -fully scalable (preferably prefixed with linear)
    -robust
    -performant
    -pattern-based

    (out of fashion buzzwords include object-oriented, XML-based, Web Services)


    And if you're describing a product, you need to add:

    -standards based
    -vendor independent
    -certified (doesn't seem to matter by whom)
    -low learning curve (every product ...)
    -agile (for development tools)



    PJ Murray
    CodeFutures - Java Code Generation
    http://www.codefutures.com
  3. Opinion: Enemy of the State

    Unfortunately, Sun produced a set of blueprints, articles and interviews that caused an entire generation of enterprise developers to follow the false prophet of stateless architecture. That's hardly surprising, considering:

    1) Applications have state.

    2) If you can produce a segment of the application that is stateless, that simply means that you have delegated the state management to a subsequent segment.

    3) For J2EE applications, that typically means that you have delegated the state management to a database.

    4) For high-scale applications, that typically means that the database needs to run on a big box.

    5) Sun sells big boxes.

    Don't get me wrong: Stateless applications are easier to build, and if databases scaled linearly and big boxes didn't cost any more than small boxes, then stateless would be the way to go. (With a typical ecommerce site transactional mix, a 64-CPU server running Oracle may provide only five or ten times the transactional throughput of a 2-CPU Dell box running Microsoft SQL Server, but it could easily cost a thousand times as much.)

    The problem is that stateless J2EE applications tend to create a SPOB (Single Point Of Bottleneck), they tend to scale poorly, they tend to perform poorly, and they tend to be terribly expensive to scale.

    Combine that with the fact that J2EE took a large amount of business logic from the database tier and moved it "up to" the J2EE tier, adding the cost and latency of at least one external request to a database for every "business logic" operation. (Compare the performance of a complex PL/SQL stored procedure with the same logic implemented in Java using JDBC to grab and update data from the database. For kicks, repeat the same exercise with entity beans.)

    Combine that with the use of the intranets and the web itself to provide many of these J2EE applications across an enterprise or even to the general public, and the user load on these systems can surge much higher than previous iterations of the same application.

    There are only two choices:

    1) Put the logic back into the database.

    2) Move at least some of the state up into the J2EE tier.

    We chose to support the second route by providing the means (Coherence) for J2EE applications to share and manage data in the J2EE tier, and since the scalability problem typically only shows up when you start to cluster the J2EE tier, we implemented the data sharing itself as a cluster.

    The result of having some state available for use in the J2EE tier can be amazing. In one of our customers' applications, the average page time dropped from over 15 seconds down to 18 milliseconds (three orders of magnitude improvement -- without even using content caching!) and the database utilization dropped by well over 90%. Another (an ASP) hosts several thousand fully dynamic (and very stateful) web sites from a single cluster, and their database utilization per customer dropped by well over 90%.

    Almost every J2EE application that I have seen that scales and/or performs poorly does so because it doesn't make efficient use of state in the J2EE tier.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  4. Sun is going to get mad at you

    If everyone manages state intelligently, then who is gonna buy huge E15K and E20K servers from Sun? Sun needs those millions for a 15K. Some of my friends joke that EJB was invented to sell big Sun servers. Having used a few E4500 and E6800 servers in the past, I can say they are nice systems.

    Joking aside, I'm definitely in favor of applying JavaSpaces, JCache and grid techniques to high-availability/high-performance design challenges. It's much easier to pick up a low/mid-end PC with 4 GB of RAM than to order an E6800 or E15K from Sun. The last time I ordered an E4500 it took 6 months for delivery. Kinda hard to provide extra capacity on demand if it takes 6 months to get the hardware. About the only way is to monitor the performance and generate logs nightly. That way, if you have to use big servers, you can at least order the hardware 8 months in advance.
  5. Opinion: Enemy of the State

    Unfortunately, Sun produced a set of blueprints, articles and interviews that caused an entire generation of enterprise developers to follow the false prophet of stateless architecture.

    Cameron, putting that blame only on Sun is not fair; it was touted by MS way before Sun. Forget about the blueprints or white papers and remember that Sun gave you stateful session beans vs. MS's stateless MTS components... The reason all the big companies now want you to think in terms of stateless components is just the Web services FUD, and again Sun is not alone there. As a matter of fact, Sun joined this bandwagon way after everyone else (IBM, MS, HP, etc.).

    "Stateless applications are easier to build"

    I always think it is the opposite: stateless components are harder to build and easier to scale, just because somewhere in the tier someone has to maintain the state, and in the case of a stateless component it is the developer's responsibility.
  6. Opinion: Enemy of the State

    Unfortunately, Sun produced a set of blueprints, articles and interviews that caused an entire generation of enterprise developers to follow the false prophet of stateless architecture.

    Cameron, putting that blame only on Sun is not fair; it was touted by MS way before Sun. Forget about the blueprints or white papers and remember that Sun gave you stateful session beans vs. MS's stateless MTS components...

    Stateful session beans are designed for "per client" state (or even finer grained). They aren't even re-entrant. Regarding "statefulness", they are basically unusable for application state, although they do have their uses (like the callback interface support).

    Regarding the Sun bashing ;-) .. I was referring to articles such as:

    http://www.artima.com/intv/distrib4.html

    Stateless applications are easier to build

    I always think it is the opposite: stateless components are harder to build and easier to scale, just because somewhere in the tier someone has to maintain the state..

    I was referring to the ability to leave the state in the database, while all of the logic is in the J2EE tier.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  7. So Sun advocates statelessness? That is something new. It must be very recent! (It is really difficult to keep up with what is in and what is out these days :) Perhaps someone can give me some links to those Sun pages?

    On the contrary, it has always been Microsoft that has preached statelessness on the server. Personally I have always developed this way, even before the Web, with old-fashioned client-server technology. Now with SOA and the new "Client-Server on Steroids", I find that my routines and practices still hold (as they did during the "browser period"), which is amazing for such a long period of time.

    Using any kind of container-managed session on the server just invites trouble. It is similar to laying your head on the block and waiting for the executioner. Server statelessness is a little more difficult to develop, but it saves you from a whole nest of problems further ahead. Above all, it gives you fully linear scalability.

    Or you could see it this way: using session variables on the server is like the small boys who pee in their pants in the winter. At first it is nice and warm, but after a while it gets unpleasant.

    Regards
    Rolf Tollerud
  8. Http is really stateless

    I hate to point out the obvious, but the world of state isn't really bi-polar as the vendors would like everyone to believe. HttpSession is a type of session that is meant to be lightweight, but I've seen people load a huge graph into HttpSession. By huge, I mean a user's entire address book. I've even seen people use simple mechanisms to persist changes to the cached data in HttpSession. In essence, it is stateful in that state is being actively cached and maintained.

    I've seen plenty of ASP and JSP developers do this, so it is probably more common than desirable. The whole bi-polar perspective on state is rather unproductive, because sometimes you really should actively manage state on the client or the server, and sometimes on both. Ruling out one approach in favor of the other is a good way to get blindsided.

    I think Cameron's advice of using state carefully is solid advice. If you need to manage state, test the options and approaches and then pick one that fits the requirements. Doing it the other way around creates huge headaches. I've worked for CTOs who make declarations like, "we must only use stateless", or "we should use stateful when we can."

    Then again, the majority of the CTOs out there are pretty worthless in my book. The good ones really shine and make it a joy to work. I've only come across 1 in the last 7 years, but there are good CTOs out there who can take a pragmatic approach.
  9. Http is really stateless

    I hate to point out the obvious ... Then again, the majority of the CTOs out there are pretty worthless in my book. The good ones really shine and make it a joy to work. I've only come across 1 in the last 7 years, but there are good CTOs out there who can take a pragmatic approach.

    :)

    Wish this was obvious to more. Unfortunately most are oblivious to it.
  10. Http is really stateless

    I think in the future there will be lots of software packages that will require continuous access to online resources, and those apps will present a PC client face to the user while the heart of the application logic uses the web, either via web services, ftp, or http clients, to get the job done. I see this as being an adjunct to websites, where a public user could see the website one way and a subscriber could see it a different way, and a subscriber with the rich client would see it a third way. This is an efficient way to develop money-making websites, in my opinion.

    These new web applications will be every bit as complex as EJB-based applications, but the server side will be either web services or servlets, and the client will be the above-mentioned rich client.

    HTTP is stateless; Servlets are bound to a session state; but there's really no need to maintain the client state on the web site. You can always maintain a record of a user's activities on the server side, presumably in your database, but the "workflow management" can be on the client side.

    Roger Hoppe
  11. I don't know why I keep coming back here

    This is the kind of dreck that makes this site so laughable. The linked article is barely a fluff piece. It's less interesting than many blog entries. The article has no real point, makes no claims one way or the other, and doesn't provide any real insight into anything.
  12. Opinion: Enemy of the State

    For any Object, the less state it contains, the more scalable it will be. It should always be able to get the state from other Objects, whether that state is kept in read-only Entity Beans, Singletons, or Objects maintained in some sort of directory.

    The client should send state information (or a representative key value so it can be plucked from a database or cache) to the server Objects.

    Server Objects directly responding to a client should not maintain a state of their own. They should delegate this function to one of the solutions above (Entity/directory/Singleton), but ONLY insofar as they have to for functionality or efficiency.

    Bottom line - entangling service Objects with state information limits scalability. Breaking this functionality out allows more efficient use of the system.
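
    A minimal sketch of the key-passing idea described in this post, written in present-day Java. All names here (CartStore, CartService, the "cart-42" key) are hypothetical, and the in-memory map only stands in for whatever database or cache a real system would use:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a database or shared cache that owns the client state.
final class CartStore {
    private final Map<String, Integer> itemCounts = new ConcurrentHashMap<>();

    // merge() updates the entry atomically per key, so concurrent requests do not lose updates.
    int addItems(String cartKey, int quantity) {
        return itemCounts.merge(cartKey, quantity, Integer::sum);
    }

    int count(String cartKey) {
        return itemCounts.getOrDefault(cartKey, 0);
    }
}

// The service has no per-client fields: the client sends its key with every request.
final class CartService {
    private final CartStore store;

    CartService(CartStore store) {
        this.store = store;
    }

    // Returns the new item count for the cart identified by cartKey.
    int addToCart(String cartKey, int quantity) {
        return store.addItems(cartKey, quantity);
    }
}
```

    Because the service instances hold nothing of their own, any instance that shares the store can handle any request for the same key, which is the property the post is arguing for.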
  13. Opinion: Enemy of the State

    [Greg, a pity that you did not post this message a week earlier. I was at JavaPolis in Antwerp and we could have talked about this over a beer. I wouldn’t even have minded paying for the beers (they are still relatively cheap in Belgium, so we could even have talked for several hours before you ruined me :-)]

    I took some time to look at this state issue from a slightly different perspective.

    When defining “state that matters” for a given application or component in an SOA, it is also interesting to consider the notion of a session – or how you define a session for a given component in your architecture. I believe the state that matters is related to that session.

    The smallest kind of session that is relevant in a service-based architecture is a session that lasts for the duration of a single invocation of a “self-contained” service (a service that does not need to call other services to do its job). These sessions are relatively short, which is the reason why these services are often called “stateless services”.

    Even so-called stateless services, though, have state in the call stack, and that state is certainly state that matters. This state matters as long as the session lasts – in this case the duration of the service execution. If we implemented that same service using an asynchronous event-driven architecture, that state would have to be kept somewhere else, but this does not make that implementation more or less stateful than the stack-based approach. What matters is how that state is managed. In the stack-based approach the state is automatically managed by the JVM, while in the other case you will have to implement some state management yourself. In any case, for these so-called stateless services, state can easily be managed because the end of the session is well defined: it is the end of the service execution.

    So, the fact that this kind of service implementation scales better is not due to the fact that it holds less state, but because the session is short and the state has to be kept for only a very short time.
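
    To illustrate the stack-versus-explicit-state comparison above, here is a small sketch in Java. The class names, the correlation id, and the pricing arithmetic are all invented for the example; the only point is that the same intermediate value lives on the call stack in one version and in a map you manage yourself in the other:

```java
import java.util.HashMap;
import java.util.Map;

// Stack-based version: the intermediate subtotal is a local variable,
// so the JVM discards it automatically when the method returns.
final class StackBasedQuote {
    static long quoteCents(long unitCents, int quantity, int taxPercent) {
        long subtotal = unitCents * quantity;
        long tax = subtotal * taxPercent / 100;
        return subtotal + tax;
    }
}

// Event-driven version: the same subtotal must be held explicitly between
// two events, keyed by a (hypothetical) correlation id.
final class EventDrivenQuote {
    private final Map<String, Long> pendingSubtotals = new HashMap<>();

    void onOrderReceived(String correlationId, long unitCents, int quantity) {
        pendingSubtotals.put(correlationId, unitCents * quantity);
    }

    long onTaxRateResolved(String correlationId, int taxPercent) {
        // The session ends here, so the state is removed as part of the lookup.
        Long subtotal = pendingSubtotals.remove(correlationId);
        if (subtotal == null) {
            throw new IllegalStateException("unknown correlation id: " + correlationId);
        }
        return subtotal + subtotal * taxPercent / 100;
    }

    int pendingCount() {
        return pendingSubtotals.size();
    }
}
```

    Both versions hold the same amount of state for the same logical duration; the second one simply makes the cleanup your job rather than the JVM's.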

    Another aspect to consider for this kind of service is data caching. State in the cache is not related to a session and thus is not state that matters. So, in my opinion, applications that use data caches are not more stateful than the same applications that do not use a cache. I believe that making your application stateful (= doing application-level caching) for performance reasons is not a good idea. Instead, use the intelligent caching strategies implemented in your persistency solution.

    Furthermore, I don’t think that better scalability and easier clustering are the only advantages of using so-called stateless services. Equally important is the fact that stateless services are a lot easier to design and implement. You don’t have to concentrate too heavily on concurrency issues, session management, etc. (this should be a non-issue when you use a J2EE application server). So if your requirements allow you to work in a “stateless” manner, please do so.

    Though, in many circumstances we simply need a session that lasts longer.

    A second kind of session is a session that lives as long as a conversation lasts. The conversational state related to those sessions is traditionally managed by a stateful session bean or in the session of a Web container. The problem with this kind of session is that it is much harder to manage the state related to the session, as the sessions last much longer and the end of the session is less well-defined (in many cases it is defined by a timeout).

    A third kind of session is a business process (workflow) session. These sessions are typically long-lived and in many cases the session state is persisted between the different steps in the workflow. This kind of state is traditionally kept in the DB or taken care of by some workflow engine.

    I think it is not a bad idea to avoid keeping conversational or business process state in the service implementations. Business process engines or workflow products are entering the market that have the ability to handle that state for us. These products should be able to handle both conversational state (micro-flows) and business process state (macro-flows) equally well. I still haven’t found the ultimate solution or product, but I am sure we will get there in a while ...

    So, for our future SOA-based applications we should try to keep conversational and business process state on a workflow engine, keep the service implementations stateless, and use advanced (but not application specific) data caching techniques in your service implementations.

    Cheers.
  14. Opinion: Enemy of the State

    Sven,
    So, in my opinion, applications that use data caches are not more stateful than the same applications that do not use a cache. I believe that making your application stateful (= doing application-level caching) for performance reasons is not a good idea. Instead, use the intelligent caching strategies implemented in your persistency solution.

    Can you clarify what you mean?

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  15. Opinion: Enemy of the State

    My second post. Perhaps my first was too ambiguous.

    My main point was this: In a client-server application, the server Objects should not contain client state. That state should be either passed to them by the client with each request or maintained in a server-side cache or database, with the client passing a key value with each request. Otherwise, the application server will have to jump through hoops (and very expensively, I might point out) to be scalable.

    As for the question of whether the persistence maintains the cache or an in-memory cache such as the Coherence product is used, there is no advantage of one over the other, so I would choose whichever one is cheapest or fastest in a particular implementation (tradeoffs abound).

    Session state includes client state for the duration of the session. It's a proven fact that stateful session beans do not scale as well as stateless session beans or servlets. In fact, servlets scaled best of all.

    Application state and session state are maintained with no muss, no fuss by servlet engines. What I really want to know is this: Why should anyone ever use EJBs?

    Roger Hoppe
  16. Opinion: Enemy of the State

    My main point was this: In a client-server application, the server Objects should not contain client state. That state should be either passed to them by the client with each request or maintained in a server-side cache or database, with the client passing a key value with each request. Otherwise, the application server will have to jump through hoops (and very expensively, I might point out) to be scalable.

    I fully agree!

    It's a proven fact that stateful session beans do not scale as well as stateless session beans or servlets. In fact, servlets scaled best of all.

    You could indeed manage the conversational state in the Web container. For complex applications with complex conversations, though, managing that state can become a nightmare (and I still haven’t found a user-friendly workflow management solution in the Web application framework). In these situations I think that the introduction of an additional “workflow engine” layer makes sense. This workflow engine should be able to manage the conversation state through micro-flows in a well-performing and scalable manner.

    Sven.
  17. Opinion: Enemy of the State

    Cameron,

    When I say that “state in the cache is not state that matters”, I look at the problem purely from an application developer’s perspective (but I can imagine that you see things a bit differently ;-). As a developer I see the cache simply as an in-memory extension of the DB (the DB cache is not related to the service session as I described it). I also believe that the cache should be hidden behind the interfaces of the persistency solution (e.g. the JDO PersistenceManager, the Hibernate Session, etc.) if possible, so that it becomes completely transparent.

    As a result I do not consider the state in the cache as service state, in the same way that I do not consider the data in the DB as service state (I realise that also the latter statement is disputable). I think that from an application developer’s point of view this makes a lot of sense. For him it’s only the state in the application that he is directly involved with - and that he has to manage - that matters. Besides, developers should care about state management as little as possible.

    This doesn’t mean, however, that I don’t realise that the cache needs memory – memory that cannot be occupied by service session state. The more memory we reserve for the cache, the fewer sessions can be supported simultaneously, but the faster the application will run. Since cache size and strategy should be configurable, though, it should be possible to find the right balance for your application.

    The point that I would like to make is that application developers should avoid implementing custom caching solutions directly in their service logic. I believe that cache product vendors (or open source alternatives) are – or at least should be ;-) – a lot more proficient at implementing robust caching mechanisms and strategies to manage the state in their caches than the casual application developer is. And I am not going into details, because I’m sure that you can tell me a lot more about caching strategies than I’m able to tell you.

    There are always exceptions of course, where a custom solution still makes sense. Though the more I think about it, the fewer examples I can think of where the state management could not be handled by an advanced general purpose caching product (coupled to a DB).
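
    As a rough illustration of keeping the cache behind the persistence interface, here is a sketch in Java. CustomerRepository and CachingCustomerRepository are hypothetical names made up for this example; they are not the JDO or Hibernate APIs, and a real product would also deal with invalidation, eviction and transactions, none of which is shown:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical persistence interface; service code depends only on this.
interface CustomerRepository {
    Optional<String> findName(long id);
}

// Read-through cache wrapped around another repository (for example one backed by a DB).
// Callers cannot tell whether a value came from the cache or from the delegate.
final class CachingCustomerRepository implements CustomerRepository {
    private final CustomerRepository delegate;
    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    CachingCustomerRepository(CustomerRepository delegate) {
        this.delegate = delegate;
    }

    @Override
    public Optional<String> findName(long id) {
        String cached = cache.get(id);
        if (cached != null) {
            return Optional.of(cached);
        }
        // Cache miss: load from the delegate and remember the result if one exists.
        Optional<String> loaded = delegate.findName(id);
        loaded.ifPresent(name -> cache.put(id, name));
        return loaded;
    }
}
```

    The service logic only ever sees CustomerRepository, so swapping the cache in or out is a wiring decision and not a change to the service code.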

    Cheers,
    Sven De Smit.
  18. Opinion: Enemy of the State

    Hi Sven,

    I wasn't disagreeing, I just wasn't sure what you were suggesting. Here's my own $.02 (€.01) on caching decisions:

    1. Applications that run fine (response time, user load, expense of the infrastructure, etc.) without caching shouldn't cache. While it seems obvious, many applications don't need to scale or run any faster than they already do, and any additional complexity (such as caching) is just entropy.

    2. As you describe, caching in the DB access layer is often a big win by itself. (Six different JDO vendors, Hibernate and JDX all support Coherence as a clustered caching plug-in for this reason.) Transaction management etc. makes DB access layer caches safe, assuming proper use.

    3. Some state is very naturally cacheable (clustered or otherwise) because it _never_ changes. A lot of config and meta-data falls into this category.

    4. Some application state is runtime only -- not persistent -- so it shouldn't be stored in a database. This data is easily managed in a single JVM, or using Coherence (or similar) when clustering. Usually you know this data because it's the code you have to re-work when you go from a single server to a clustered deployment, and you have to store things in the database that used to not be there.

    5. For performance- and scale-intensive applications, there are more advanced options that you'd never consider on simpler / lower-scale apps. For example, the biggest improvements in performance that I've seen (in practice) come from avoiding the DB access layer altogether in the "time critical" and "scale critical" areas of software. Applications that have to maintain throughput of over 100K transactions per second just can't do it in the database, and for that matter can't do it on a single JVM or even a single physical box.

    Obviously, knowing your requirements up front will help to pick the easiest possible solution. Most projects either over-require or under-require. The former is expensive (building a castle to store a lawn-mower) but it works (and makes for really humorous stories). The latter ends up being a complete failure, because the initial architecture can't adapt to the unexpected requirements, and the project becomes a sink-hole for budget and schedule over-runs.

    As always, the trick is to make it as simple as possible, but no simpler.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  19. I like Sven's points.

    IMO, matters relating to object state and their lifetimes have very little to do with matters relating to caching of data in the middle-tier. :-)

    As I understand it, "state" relates to object lifetimes. So, "stateless" could mean one of two things:

    1. Enforced Statelessness: a la MTS/COM+ or EJB (Just-in-time-activation);
    2. Voluntary/Pseudo Statelessness: in the sense of using best-practices to isolate publicly callable methods from one another (obj.method1() operates independently of obj.method2() - often these demarcations are based on transactional semantics);

    I'd call 1. "extreme statelessness", where each method call to the stateless object results in the object being "killed" (deactivated, recycled, etc.) by the underlying application (TPS) server.

    2. is a voluntary approach where you can make calls like this but shouldn't!:

    obj.setSomething(....);
    obj.setAnotherThing(...);
    obj.updateDatabase();

    You can't do that at all with 1.
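
    For contrast with the setter sequence above, one way to make the call self-contained is to send everything in a single request object. This is only a sketch: ProfileChange and ProfileService are hypothetical names, and the Consumer merely stands in for the real database update:

```java
import java.util.function.Consumer;

// Immutable request object that carries all the values in one go.
final class ProfileChange {
    final String something;
    final String anotherThing;

    ProfileChange(String something, String anotherThing) {
        this.something = something;
        this.anotherThing = anotherThing;
    }
}

final class ProfileService {
    // Hypothetical stand-in for the persistence call.
    private final Consumer<ProfileChange> database;

    ProfileService(Consumer<ProfileChange> database) {
        this.database = database;
    }

    // One call does the whole job; nothing is remembered on the object between
    // calls, so it would not matter if the server recycled it after each one.
    void update(ProfileChange change) {
        database.accept(change);
    }
}
```

    Written this way, the method works the same under approach 1 and approach 2, because it never relies on state left behind by an earlier call.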

    Kind regards

    Abdullah
  20. KISS

    You guys always make things so complicated.

    It is very simple.

    If you save the data then use stateless.

    If you don't save the data then use stateful.

    Both are needed. Almost everything in ecommerce is stateless since it is stored data.

    So the question is do you persist your basket or don't you?

    The answer depends on whether you want your customers to lose their basket info when your server goes down, when they change browsers, or when something else happens. Saving data, however, is more expensive (in performance and cost).