The Myth of Transparent Clustering

  1. The Myth of Transparent Clustering (41 messages)

    People seem to feel that clustering can be applied as an aspect, completely transparent to your application, the way JVMs handle garbage collection. In "The Myth of Transparent Clustering," JBoss Cache lead Manik Surtani discusses why this is just a myth and a pipe dream: applications need to be designed and written with clustering in mind if the goals of high availability and scalability are to be achieved.
    What is usually overlooked is the second half of my argument: performance, which usually suffers when you do not design and implement your application with clustering in mind. This is waved away as unimportant, on the assumption that modern computing resources mean that 'minor inefficiencies' will not be a problem. Such wave-awayers obviously haven't heard of Sun Fellow Peter Deutsch's Fallacies of Distributed Computing. As much as people would like to think that with modern techniques like AOP, bytecode injection and annotations, along with a healthy dose of ignorance of reality, wishing upon a star and belief in the tooth fairy, clustering can be a truly decoupled aspect that can be applied to anything, they are wrong. I've heard a comment that JVMs handle garbage collection transparently and that clustering should be similarly transparent. The bitter truth is that clustering can at best be thought of as half an aspect.
    What do you think? Should transparent clustering even be a goal?

    Threaded Messages (41)

  2. The Newport Test

    I think we've been over this before. Until the cost of remote communication approaches the cost of a local invocation, it's a myth. I propose a new test for this: someone should apply transparent clustering to Apache Derby and make an Oracle RAC killer... I think we should make this the new Turing test of transparent clustering, "The Newport test" :) It ain't happening anytime soon, although I'll never say it's impossible :)
  3. People seem to feel that clustering can be applied as an aspect, completely transparent to your application, the way JVMs handle garbage collection. [..] Should transparent clustering even be a goal?
    While the transparent approach does not excite me personally, it could have some use cases that make sense. I don't believe in counting out possibilities just because someone says that they can't be done. What we accomplish today with Coherence was considered an impossibility just a few years ago, and now it's commonly accepted and best practice, so I'm glad we didn't listen to "common sense" back then. Regarding the Fallacies of Distributed Computing, those date from the "ice age" of distributed systems, when the point of view was of a bunch of disparate systems working together using antiquated RPC and low-level messaging approaches. These days, the grid is the computer, so those fallacies conceptually apply to sub-systems within a working machine, as in: 1. The [CPUs are all] reliable. 2. [Processor-to-processor] latency is zero. 3. [Processor-to-processor] bandwidth is infinite. 4. The [system bus] is secure. 5. [CPU count] doesn't change. 6. There is one administrator. 7. [Processor-to-processor communication] cost is zero. 8. The [CPUs are] homogeneous. So you can start to see which fallacies still hold, and which simply reflect an out-of-date RPC-based mindset. When companies (such as Sun) build servers, they attempt to build them to survive component failure, such as CPU failure, memory failure, drive failure, etc. Modern grid architecture is no different. BTW - you should change "hreef" to "href" ;-) Peace, Cameron Purdy Tangosol Coherence: Clustered Shared Memory for Java
  4. People seem to feel that clustering can be applied as an aspect, completely transparent to your application, the way JVMs handle garbage collection. [..] Should transparent clustering even be a goal?


    While the transparent approach does not excite me personally, it could have some use cases that make sense. I don't believe in counting out possibilities just because someone says that they can't be done.
    Right, but the problem is that the constraints a transparent-clustering-friendly application should observe are rarely well explained. Just like servlet session clustering: it can be accomplished if the objects are serializable, if this and if that. Moreover, are we sure that an application implemented in this way is easy to implement and maintain? And, as noted by the author, what about the overhead of such a framework? Again, the problem is that transparent clustering becomes a synonym for simple, stupid, ready-to-use, "why does it take you so much time to design your application" in managers' minds, and many other such niceties. Guido
  5. Just like servlet session clustering. It can be accomplished if the objects are serializable, if this and if that.
    To that effect, a simple yet effective way of checking that all your session objects are serializable is MessAdmin. Those objects will show up with a nice, scary red background if not serializable... Of course, this is only a very small part of the information exposed by this tool... go check it out! :)
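    A minimal sketch of the same check done by hand (not MessAdmin's implementation, just one way to probe each attribute with the standard servlet and Java serialization APIs; the class name and message text are illustrative):

        import java.io.ByteArrayOutputStream;
        import java.io.ObjectOutputStream;
        import java.util.Enumeration;
        import javax.servlet.http.HttpSession;

        public class SessionSerializabilityCheck {

            // Try to serialize every session attribute and report the ones that fail;
            // these are the attributes that would break HttpSession replication.
            public static void check(HttpSession session) {
                for (Enumeration names = session.getAttributeNames(); names.hasMoreElements(); ) {
                    String name = (String) names.nextElement();
                    Object value = session.getAttribute(name);
                    try {
                        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(value);
                    } catch (Exception e) {
                        System.err.println("Attribute '" + name + "' is not serializable: " + e);
                    }
                }
            }
        }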
  6. Just like servlet session clustering. It can be accomplished if the objects are serializable, if this and if that.

    To that effect, a simple yet effective way of checking that all your session objects are serializable is MessAdmin. Those objects will show up with a nice, scary red background if not serializable...
    Of course, this is only a very small part of the information exposed by this tool... go check it out! :)
    I have been accused so many times of being a "dreaded vendor" so I am just going to go for it...I apologize to those I will annoy with this pitch. It is now possible to use a clustered session impl w/o having its attributes (objects) implement serializable. It is also possible to avoid the call to setAttribute() or put() when you change objects that are currently stored in a clustered session. With JVM-level clustering, the session API is no longer used to signal to the cluster that objects are changing...the very act of changing objects on heap will cause the deltas to be pushed where needed. Download Terracotta for Sessions for your app server if you want to give it a try. BTW, I don't feel so bad anymore about being a "dreaded vendor" because Terracotta is offering solutions to real problems and a production license is free.
  7. It is now possible to use a clustered session impl w/o having its attributes (objects) implement serializable. It is also possible to avoid the call to setAttribute() or put() when you change objects that are currently stored in a clustered session. With JVM-level clustering, the session API is no longer used to signal to the cluster that objects are changing...the very act of changing objects on heap will cause the deltas to be pushed where needed.
    True enough. I didn't have a chance to test your product yet, but, as far as I understand, your clustering solution is at the JVM level, which isn't the same as "traditional" HttpSession replication. Those of us not using your solution will still have to ensure that all session attributes are serializable, that calls to setAttribute() are made when required, and so on... PS: your flash demo is quite impressive, congratulations! ________ MessAdmin Notification system and Session administration for J2EE Web Applications
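    For readers following along, a small hypothetical sketch of the "traditional" discipline being referred to: attributes implement Serializable, and a mutated object is re-set so the container notices the change (class and attribute names are invented):

        import java.io.Serializable;
        import javax.servlet.http.HttpSession;

        public class CartUpdater {

            // Anything stored in a replicated session generally has to be Serializable.
            public static class Cart implements Serializable {
                private int itemCount;
                public void addItem() { itemCount++; }
            }

            public void addItemToCart(HttpSession session) {
                Cart cart = (Cart) session.getAttribute("cart");
                cart.addItem();                      // mutating the object on the heap is not enough...
                session.setAttribute("cart", cart);  // ...most containers only replicate on setAttribute()
            }
        }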
  8. I think your justification for "Grid is the Computer" is not quite right. In the 20 years I have been working in the IT industry, I have seen maybe one or two CPU/memory failures. I see network failures daily. For all practical purposes CPU/memory is infinitely more reliable than any kind of clustering which depends on the network. The latency for reading data from memory is usually around 10 clock cycles of the memory chip. This is around 50 nanoseconds. The latency for reading data from another machine is around 1 ms or more. That is a factor of roughly 20,000. As far as latency goes, without a grid you are better off by a factor of tens of thousands. Then comes the bandwidth. CPU-to-CPU bandwidth is many GB per second. Even on a gigabit network we never see this kind of bandwidth between computers. Transparency in the grid does not come free. One needs to understand that objects do not magically show up when they are needed. First you need to serialize them, transport them and then deserialize them where they are needed. This takes CPU, garbage collection, latency, and network overhead. Yes, it can be transparent if you can afford a degradation in throughput by a factor of a thousand to a million. There are other approaches which can achieve the same goal without "the Grid is the Computer". One has to make a choice after a careful analysis of the cost of distributing the application. We found solutions without any distributed or message-based architecture to be cheaper to build and deploy while providing far better throughput and reliability.
  9. I think your justification for "Grid is the Computer" is not quite right.
    OK.
    In the 20 years I have been working in the IT industry, I have seen maybe one or two CPU/memory failures. I see network failures daily. For all practical purposes CPU/memory is infinitely more reliable than any kind of clustering which depends on the network.
    Servers fail fairly regularly. Whether the failures are specifically caused by a CPU or by memory is a different question, but many applications cannot accept the risks of server failure. Any connection within a network is obviously less reliable than any single solid state device (e.g. the guts of a particular server). However, modern grid infrastructures do not inherently rely on any particular network connection being reliable, but rather that the network, in general, remains available.
    The latency for reading data from memory is usually around 10 clock cycles of the memory chip. This is around 50 nanoseconds. The latency for reading data from another machine is around 1 ms or more. That is a factor of roughly 20,000. As far as latency goes, without a grid you are better off by a factor of tens of thousands.
    Only if you need to access data remotely. Why would you want to move data around like that? Why not move the processing to where the data is already sitting? ;-) Operating systems already do this internally when they schedule threads; google for NUMA-aware thread scheduling. With Coherence, one line of code will allow you to issue the selective execution (in parallel, with once-and-only-once semantics) of a data-centric agent across any number of servers.
    Then comes the bandwidth. CPU-to-CPU bandwidth is many GB per second. Even on a gigabit network we never see this kind of bandwidth between computers.
    You're using the wrong software ;-) In our tests, we are pushing a sustained 120MB per port, so in a 100 server grid we can easily do 10GB/s. Several of our hardware partners are showering us with kit including InfiniBand and higher-speed Ethernet (since we're wire-constrained on gigabit Ethernet). However, to repeat my previous message, the goal of the design should be to avoid moving things around -- locality will almost always win.
    Transparency in the grid does not come free. One needs to understand that objects do not magically show up when they are needed. First you need to serialize them, transport them and then deserialize them where they are needed. This takes CPU, garbage collection, latency, and network overhead. Yes, it can be transparent if you can afford a degradation in throughput by a factor of a thousand to a million.
    I'm not suggesting that things be built assuming transparency, but that's likely because I'm involved in part of the industry that is not looking for transparent programming solutions, and has no problem architecting applications to run at their theoretical limits in a grid environment (i.e. Amdahl's Law).
    There are other approaches which can achieve the same goal without "the Grid is the Computer". One has to make a choice after a careful analysis of the cost of distributing the application. We found solutions without any distributed or message-based architecture to be cheaper to build and deploy while providing far better throughput and reliability.
    Yes, certainly. I find that compared to multi-threaded systems, single-threaded systems are less expensive, higher quality and provide better latency figures for STP work-loads. ;-) So each problem set should be analyzed for the solutions that best fit the business requirements, and one should not begin with an assumption that "grid is it". At the same time, I can tell you that the problems we are solving with large-scale grid deployments could not be solved any other way. Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  10. "You're using the wrong software " This is quite often the wrong answer. Customers come with their own network. They expect to work with their existing network and they are usually very hard to convince otherwise. Every software I did has significant network error detection and recovey built into it. "Only if you need to access data remotely. Why would you want to move data around like that? Why not move the processing to where the data is already sitting? ;-)" Actually I do not. I very much like all the processes and data sit on the same machine and process. My point, clustering/distributed caching does the opposite. By putting the data accross multiple machines and/or processes, it is creating an object transport bottleneck. I should clarify that I am referring to clustering (really I mean distributed caching) because of large amount of data, so large that presumably it does not fit into a single machine. Clustering to have identical copies of the same system and data is a trivial issue, which may/maynot need any distributed caching. The same goes for fault tolerance. One can possibly solve it by putting a load balancer and a server farm. I think the hard (perplexing) problem is that when we can buy machines with 64GB RAM and 16 CPUs for very reasonable prices, Why would/should I want to go for any kind of external/distributed caching? It is hard to imagine a system which cannot be partitioned into 64GB chunks so that each partition does have no dependency on other partitions. My point is that if you can do this, one can avoid the cost and complexity of middleware, its dependency on network, and degradation of performance. And this is very possible. "...Yes, certainly. I find that compared to multi-threaded systems, single-threaded systems are less expensive, higher quality and provide better latency figures for STP work-loads. ;-)..." I am not sure what makes you think that other options are limited to single threaded. As you are most definitely aware, you can have as many threads (and CPUs) as you like in a single machine. There are certain types of problems where distributed caching is the best solution. What prompted me for this post is the claims that distributed caching will solve/be justified by the issues you posted.
  11. Using one big box is OK provided you can buy one that is big enough and you do not need access to data on other boxes. Otherwise you have problems. Scaling up is inherently limited.
  12. In our tests, we are pushing a sustained 120MB per port, so in a 100 server grid we can easily do 10GB/s.
    What kind of latency and throughput do you get with Coherence's ReplicatedCache with pessimistic locking, on a cluster of 5 nodes (5 replicas), where all the nodes read and then update the same object? (i.e. 100% sharing.) The update size can be 1 byte if you like. I just want to see what kind of overhead you have in the worst case. Guglielmo
  13. What kind of latency and throughput do you get with Coherence's ReplicatedCache with pessimistic locking, on a cluster of 5 nodes (5 replicas), where all the nodes read and then update the same object? (i.e. 100% sharing.) The update size can be 1 byte if you like. I just want to see what kind of overhead you have in the worst case.
    I'd have to check with some engineers, but that would be a worst-case scenario, and could likely be as much as a few milliseconds, i.e. 4 cluster-durable operations: lock(id), value=get(id), some process against value, put(id,value), unlock(id). The best-case scenario is an agent-based approach, which would tend to be less than one millisecond, i.e. 1 cluster-durable operation (four hops including synchronous backup for resiliency): result=invoke(id, agent). Even better is that the critical section (the "serial" nature of the process) is very small, meaning that even if the latency from one thread's point of view is a large fraction of a millisecond, you could still have 25,000 of these occurring against the same ID inside of a second. Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
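    For illustration only, a rough sketch of the two access patterns being compared, assuming the Coherence NamedCache / InvocableMap APIs of that generation; the cache name, key, and counter logic are invented, and this shows the shape of the calls rather than a benchmark:

        import com.tangosol.net.CacheFactory;
        import com.tangosol.net.NamedCache;
        import com.tangosol.util.InvocableMap;
        import com.tangosol.util.processor.AbstractProcessor;

        public class WorstCaseVsAgent {

            // Worst case described above: several cluster-durable operations per update.
            static void explicitLocking(NamedCache cache, String id) {
                cache.lock(id, -1);                          // cluster-wide lock, wait indefinitely
                try {
                    Integer value = (Integer) cache.get(id);
                    cache.put(id, Integer.valueOf(value.intValue() + 1));
                } finally {
                    cache.unlock(id);
                }
            }

            // Agent-based approach: one cluster-durable operation; the agent runs where the data lives.
            static void agentBased(NamedCache cache, String id) {
                cache.invoke(id, new AbstractProcessor() {
                    public Object process(InvocableMap.Entry entry) {
                        Integer value = (Integer) entry.getValue();
                        entry.setValue(Integer.valueOf(value.intValue() + 1));
                        return value;
                    }
                });
            }

            public static void main(String[] args) {
                NamedCache cache = CacheFactory.getCache("counters");
                cache.put("hits", Integer.valueOf(0));
                explicitLocking(cache, "hits");
                agentBased(cache, "hits");
            }
        }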
  14. I'd have to check with some engineers, but that would be a worst-case scenario, and could likely be as much as a few milliseconds
    I do expect that the latency will be relatively high, but the really interesting result will be the throughput. Clearly, since all the nodes process exactly the same operations, it will be the same as the throughput of a single node, but I am interested to see if a product that works well in the case of large clusters also holds its own in this worst case. I have had some thoughts on this over the years and I am starting to think that protocol switching may be the answer here. Guglielmo Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering .. or the World's First Pure-Java Terminal Driver
  15. Re: Clustering may come at big cost

    In the 20 years I have been working in the IT industry, I have seen maybe one or two CPU/memory failures.
    20 years. Hmmmm. You know, age is a major contributing factor in memory failure ... . ;)
  16. Hi Cameron, Whilst I am happy to defer to your knowledge on grid computing, I do know a thing or two about physics and my degree is in electronics so I find your claims a bit incredible:
    1. The [CPUs are all] reliable. 2. [Processor-to-processor] latency is zero. 3. [Processor-to-processor] bandwidth is infinite. 4. The [system bus] is secure. 5. [CPU count] doesn't change. 6. There is one administrator. 7. [Processor-to-processor communication] cost is zero. 8. The [CPUs are] homogeneous.
    The problem I have with clustering is that it has always seemed to me to be a solution looking for a problem. There is (almost) always a simpler way to achieve the same thing if you really need to. It just breaks the KISS (Keep It Simple) principle for me. I'm sure that you can come up with use cases that show the opposite, but I'm guessing that they would represent a very small percentage of real world problems. I've said it on TSS before, but to me clustering (like that used in WebLogic) is another example of vendor-led technology looking for a ready market through FUD. Then again, I'm happy to be proved wrong :^). Paul.
  17. I've said it on TSS before, but to me clustering (like that used in WebLogic) is another example of vendor-led technology looking for a ready market through FUD. Then again, I'm happy to be proved wrong :^).

    Paul.
    Alright, how about this, to make it simple: a system needs to have 100% uptime, therefore it is not OK to use an HA solution whereby one node comes up as another goes down. The node may come down due to failure, but more often it will simply be due to system maintenance or similar tasks. If you don't have a clustered and/or load-balanced system, how will you deal with that? I am not seeing clustering being used very often for performance reasons, but I am definitely seeing it used for uptime and fault-tolerance reasons. The coolest clustering scenario I've seen lately is the stuff that John/C24 have done for various financial institutions using Coherence/GigaSpaces, with thousands upon thousands of CPUs and response-time requirements of about 5ms per query, at 10k+ queries per second. Deal with that without clustering and I'll admit to being reasonably impressed.
  18. Hi,
    Alright, how about this, to make it simple: a system needs to have 100% uptime, therefore it is not OK to use an HA solution whereby one node comes up as another goes down.
    OK, what percentage of clustered solutions truly need to be available 100% of the time? More often than not this is a requirement in the heads of vendors. There are hundreds of computer systems out there where the users are happy to tolerate a small percentage (say 0.1%?) of downtime off peak, and that have been providing useful service for years. As for an "HA solution", this is one bit of jargon that I haven't come across, so you will have to explain.
    The coolest clustering scenario I've seen lately is the stuff that John/C24 have done for various financial institutions using Coherence/GigaSpaces, with thousands upon thousands of CPUs and response-time requirements of about 5ms per query, at 10k+ queries per second. Deal with that without clustering and I'll admit to being reasonably impressed.
    I didn't say not to use multi-processing. The issue is how? A hardware load balancer with sticky sessions in front of bog standard servers, each running the whole application and sharing a common database server works just fine for me. Not sure if such an architecture would meet the demands you describe, but I can't see why not. Not so cool, but a lot simpler IMO. I must admit ignorance of Coherence and GigaSpaces, which is why I specifically quoted WebLogic, which tries to provide fail-over protection by maintaining an in-memory copy of your session on a buddy machine in the cluster. A nice idea, but a configuration and administration nightmare, and I'm not that sure it actually works. My basic point is that such architectures tend to create a lot of pain for very little perceivable benefit. Paul.
  19. The coolest clustering scenario I've seen lately is the stuff that John/C24 have done for various financial institutions using Coherence/GigaSpaces, with thousands upon thousands of CPUs and response-time requirements of about 5ms per query, at 10k+ queries per second. Deal with that without clustering and I'll admit to being reasonably impressed.
    I think these guys could do it: "High Volume Transaction Processing Without Concurrency Control, Two Phase Commit, SQL or C" (1997), Arthur Whitney, Dennis Shasha, Stevan Apter, Int. Workshop on High Performance Transaction Systems. http://citeseer.ist.psu.edu/rd/16332271%2C6926%2C1%2C0.25%2CDownload/http://citeseer.ist.psu.edu/cache/papers/cs/5729/http:zSzzSzcs.nyu.eduzSzcszSzfacultyzSzshashazSzpaperszSzhpts.pdf/whitney97high.pdf http://www.kx.com
  20. The problem I have with clustering is that it has always seemed to me to be a solution looking for a problem. There is (almost) always a simpler way to achieve the same thing if you really need to. It just breaks the KISS (Keep It Simple) principle for me.

    I'm sure that you can come up with use cases that show the opposite, but I'm guessing that they would represent a very small percentage of real world problems.

    I've said it on TSS before, but to me clustering (like that used in WebLogic) is another example of vendor-led technology looking for a ready market through FUD. Then again, I'm happy to be proved wrong :^).

    Paul.
    It seems that everyone I speak to is clustering. The assertion that people should not cluster seems to be more accurately cast as "don't cluster in the app tier...cluster with the DB." All Tangosol and Terracotta are trying to do is provide a service that keeps more data as close to the processing context as possible. The more you continue to argue against this general concept, the more harm you do the community. Why? Simply because, in the hardware analogy where the DB is slow disk and the app server is CPU, you are asking people to use VERY LITTLE RAM, no L1, no L2 caching, and to keep EVERYTHING on disk. Does that make sense? Will it ever? I don't think so. To over-simplify, tuning always leads to "keep it in RAM." And availability leads to keeping it on disk. This set of vendors lets you avoid the trade-off...keep it in RAM on a couple of machines...that will be faster than disk, and still highly available. In the name of KISS, app server clustering is actually simpler than DB clustering. It's cheaper. It solves the problem where the problem is occurring. And it works. Is it really any easier to get Oracle or MySQL running, define a schema, grab your favorite OR-mapper, implement some optimistic concurrency or transactional interface, and then try to tune all of that? If you want to talk about speed for speed's sake, the single machine instance that persists nothing and communicates with no other server purely serves speed. No doubt it is faster than that same machine working as part of a cluster. I feel that such a discussion is purely academic.
  21. Hi, I was not advocating DB clustering. We need to be clear about the problem. If fault tolerance is the problem then we need to identify an acceptable mean time between failures. I would argue that hardware is so reliable nowadays that buying good-quality kit is often the simplest way to meet a required MTBF. Then of course you have approaches that tackle the specific bits of hardware that tend to fail, like hard disks; for this, how about RAID? So there are lots of things you can look at as well as clustering.
    Why? Simply because, in the hardware analogy where the DB is slow disk and the app server is CPU, you are asking people to use VERY LITTLE RAM, no L1, no L2 caching, and to keep EVERYTHING on disk. Does that make sense? Will it ever? I don't think so.
    I'm all for caching, but caching is not the same as clustering.
    To over-simplify, tuning always leads to "keep it in RAM." And availability leads to keeping it on disk. This set of vendors let's you avoid the trade off...keep it in RAM on a couple of machines...that will be faster than disk, and still highly available.
    I am ignorant of Coherence as a product, but network latency tends to be greater and more problematic than disk access latency, so a RAM cache over a network will tend to be slower than local disk access, especially when that local disk is accessed through a local RAM cache. I am not an expert on grid technology, my main point is that it introduces a whole layer of complexity. My simple question is why? Paul.
  22. I am not an expert on grid technology, my main point is that it introduces a whole layer of complexity. My simple question is why?
    Re-read the last two paragraphs of Cameron's post (#213022). And then visit Tangosol's website and check out some of the white papers. And there are a few Tech Talks here you can view too.
  23. I am not an expert on grid technology, my main point is that it introduces a whole layer of complexity. My simple question is why?


    Paul.
    I really like this question. I am not opposed to new things, but I really hate hype that becomes a buzzword good for any use. Again, what are the boundary conditions that make a grid/clustered/distributed caching solution suitable for your job? Maybe this is left as an exercise for the willing student. Guido
  24. in the hardware analogy where the DB is slow disk and the app server is CPU, you are asking people to use VERY LITTLE RAM, no L1, no L2 caching, and to keep EVERYTHING on disk. Does that make sense? Will it ever? I don't think so.
    This is a common misunderstanding of database technology; no database keeps "everything on disk" and disk only. Any stand-alone database (i.e. not SQLite or Derby) uses an administrator-tuned block buffer that is kept in memory. More sophisticated databases like Oracle provide tunable multiple buffers for things like query blocks, transaction-related write blocks, shared blocks for things like locks and query parses, etc. To a point (depending on things like architecture and memory available), this can scale to the limits of your machine, and it's how lots of the larger Oracle instances I've worked with grew to meet load. This is also why I generally tend to avoid ORM solutions like Hibernate, as they tend to crush buffer efficiency in the database (rarely do I access all the attributes of a domain object instance, and hydrating the entire object wastes database cache space and causes heap fragmentation and thrash in the JVM). True, there are limits to this, but you're talking many terabytes of data, at which point you are better off looking to products like Cameron's to alleviate the pain. Search engines work the same way: indexes and data are kept on disk for recoverability purposes but as much as possible is kept in memory for fast access. My point here is that when app developers start trying to second-guess mature tools that have been "caching" their data for decades and have invested millions to solve the problem, they usually end up getting it wrong. As to your assertion that disks are slow and optimization means working around them, granted they are slower than memory access; all operations outside the primary memory bus are. However, this can also be mitigated with proper database planning using fiber storage and smart partitioning (by spreading physical volumes across spindles and by partitioning database tables depending on usage patterns). In my experience, 90% of application performance issues are constrained by the application, usually because it is not making educated use of the tools around it under stress, including, as the OP contends, because clustering isn't a forethought in the design.
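    As a trivial illustration of the point about hydrating whole objects, a hypothetical plain-JDBC sketch (table and column names invented): fetch only the column you need instead of materializing the entire row as a domain object.

        import java.sql.Connection;
        import java.sql.PreparedStatement;
        import java.sql.ResultSet;
        import java.sql.SQLException;

        public class CustomerNameLookup {

            // Pull one column through the database's buffers instead of the whole row,
            // which an ORM would typically hydrate into a full domain object.
            static String findName(Connection conn, long customerId) throws SQLException {
                PreparedStatement ps =
                        conn.prepareStatement("SELECT name FROM customer WHERE id = ?");
                try {
                    ps.setLong(1, customerId);
                    ResultSet rs = ps.executeQuery();
                    try {
                        return rs.next() ? rs.getString(1) : null;
                    } finally {
                        rs.close();
                    }
                } finally {
                    ps.close();
                }
            }
        }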
    To over-simplify, tuning always leads to "keep it in RAM."
    In the name of KISS, app server clustering is actually simpler than DB clustering.
    These seem inconsistent to me, maybe I'm missing your point. In-memory caches come with their own suite of problems that are equally difficult to troubleshoot, for me anyway far more difficult than troubleshooting a RAC cluster. It's gotten better recently with generational GC, improvements in the memory model and tools like Tangosol but you still have issues like accessing the shared memory map, worrying about double-checked locking, etc. I say, better to start out making good use of your back-end systems, then move to in-memory optimizations as a last resort. I tend to think of tools like Tangosol and GigaSpaces more in massively parallel grid-type applications (I'm using Blitz for one now), not for app clustering. To me anyway, they are orthogonal problem domains.
  25. Hi Cameron, Whilst I am happy to defer to your knowledge on grid computing, I do know a thing or two about physics and my degree is in electronics so I find your claims a bit incredible ..
    I'm glad that you find them incredible, otherwise I'd wonder why I poured the last six years of my life into this area of technology ;-) The purpose of my analogy was to show that a modern grid still suffers from some of the same fallacies as a traditional distributed system, but at the same time it does not suffer from the other fallacies any more than an SMP system does. I attempt to make this case by way of analogy (logic).
    The problem I have with clustering is that it has always seemed to me to be a solution looking for a problem.
    I cannot counter how things "seem" to you. Your perceptions are completely valid from whatever vantage point you occupy, and I have no desire to tell you otherwise. However, I can provide you with a glimpse from my vantage point: starting in 1999 / 2000, we were seeing Java and J2EE server applications being deployed in a "scale out" manner (your server farm example with a load balancer) that were fundamentally bottlenecked on the data tier, e.g. on a mainframe service or a relational database. That was a problem looking for a solution, and we were being paid good money to find a solution, which we realized did not exist in the marketplace, and so we built Coherence. Its purpose in that timeframe (2001 / 2002) was to help applications running in a scale-out environment by providing the dynamic data that they were using, without repeatedly having to go to a mainframe or a database (the "system of record") to get it. I cannot speak to other forms of "clustering". Our use of the term (like its use in other products such as BEA WebLogic) is not technically precise, but unfortunately no better term existed at the time.
    There is (almost) always a simpler way to achieve the same thing if you really need to. It just breaks the KISS (Keep It Simple) principle for me.
    Our customers have applications that perform millions of data access operations per second. These applications would not be possible without having the necessary data in memory, and having it somehow kept in sync across any number of boxes that happen to be accessing and modifying that same data. For these applications, the "before" approach was the one that you described:
    A hardware load balancer with sticky sessions in front of bog standard servers, each running the whole application and sharing a common database server works just fine for me.
    In many applications that I have seen, this "before" approach results in unacceptable response times with even a single user (due to the number of queries required to assemble the desired data), and saturates even a large database server (e.g. 100% CPU utilization) with a relatively small number of users. In other words, for these applications, the "before" approach is neither performant nor scalable.
    I'm sure that you can come up with use cases that show the opposite, but I'm guessing that they would represent a very small percentage of real world problems.
    As you can imagine, I see a lot of use cases that have these attributes. Based on our customer list and my own experience, I'm guessing that they represent a significant percentage of enterprise applications.
    I am ignorant of Coherence as a product, but network latency tends to be greater and more problematic than disk access latency, so a RAM cache over a network will tend to be slower than local disk access, especially when that local disk is accessed through a local RAM cache.
    Obviously, accessing a local RAM cache is fast, which is how most applications using Coherence would operate. For example, we support local caching, full replication, near caching and continuous query caching, each of which is designed to have the data already in local memory when it is needed. However, your other assumption is completely incorrect: Round-trip network performance is significantly faster (lower latency) than local disk access. Also, with more than two machines, network throughput scales linearly on a switched backplane, whereas disk throughput is limited by the local I/O bus. That means (for example) that parallelization of work can be very effective for I/O-intensive work, since the aggregate I/O scales linearly. Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  26. Hi Cameron, I asked a straight question:
    I am not an expert on grid technology, my main point is that it introduces a whole layer of complexity. My simple question is why?
    And I got a straight answer:
    starting in 1999 / 2000, we were seeing Java and J2EE server applications being deployed in a "scale out" manner (your server farm example with a load balancer) that were fundamentally bottlenecked on the data tier, e.g. on a mainframe service or a relational database. That was a problem looking for a solution
    Thanks. BTW, as I asked the question, it occurred to me that the database tier does represent a bottleneck. We suffered such a problem on my last project, but luckily with some database tuning and application-specific caching we were able to get around it (reducing a long query down to a few seconds); if our tweaking hadn't been adequate, though, we would have been stuck. My general view is to get the application working first and then deal with performance bottlenecks later. If you do hit a hard performance problem, though, then from your description Coherence may be the best remedy. Your explanation has encouraged me to take a look at your website. Thanks again, Paul.
  27. I asked a straight question [..] And I got a straight answer [..] Thanks
    You are welcome.
    My general view is to get the application working first and then deal with performance bottlenecks later. If you do hit a hard performance problem, though, then from your description Coherence may be the best remedy. Your explanation has encouraged me to take a look at your website.
    For most applications, you are probably correct. There are also some applications, however, for which it makes sense to design around a cache-based architecture, knowing that the application will only be possible if it can obtain data instantly from memory, and knowing it can only maintain its required transaction rate if the transactions are themselves in memory. I spend a lot of time now with this type of application, and it is a very interesting field. Peace, Cameron Purdy Tangosol Coherence: The Java Data Grid
  28. It would be nice if the article had a link to another article that describes how to "Keep clustering in mind when designing". Do I just have to click my heels together and say "I wish I was clusterable, I wish I was clusterable"? :)
  29. Coming up soon mate! :) It is one of the things I want to write about.
    Please do so, because without it your article does not bring any added value. I would also like to see an alternative, not just a statement of the problem.
  31. May Have to Think

    It's not hard to design applications to be clusterable. But it adds cost and complexity. The most important thing you can do about clustering is figure out whether you can get out of it. If you don't have to do it, don't. Specifically, if your application needs to scale up and you can afford to just double the number of CPUs in your machine, then just do that. If that number of CPUs is not cost-effective, you need to think about which parts of your workload might be parallelizable. One easy thing to try is the old fragment-and-replicate strategy. In many OLTP systems the workload consists of a very large volume of transactions per day, and a comparatively small amount of slow-changing "reference" data. In many of these systems you can partition the transactions by cluster node and replicate the reference data. Where clustering can get unavoidable is where you have these requirements: 1. High throughput 2. Low latency 3. High availability. But in general the rule is to think about your application's workload. And every application is different. In the end the only person who can really figure out which architecture is right for your system is you. Incidentally, if you have very specialized needs with almost 100% data sharing between nodes, and you only think your cluster needs to use 3-4 nodes, consider using the Totem protocol as an internal system bus (see the link below for my open-source implementation of this old protocol). Guglielmo Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering .. or the World's First Pure-Java Terminal Driver
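    A toy sketch of the fragment-and-replicate idea (all names invented): route each transaction to a node by hashing its partition key, and simply copy the slow-changing reference data to every node.

        import java.util.Arrays;
        import java.util.List;

        public class FragmentAndReplicate {

            // Transactions are partitioned: each key is owned by exactly one node.
            static int nodeFor(String partitionKey, int nodeCount) {
                return (partitionKey.hashCode() & 0x7fffffff) % nodeCount;
            }

            public static void main(String[] args) {
                int nodes = 4;
                List<String> accounts = Arrays.asList("ACC-1001", "ACC-1002", "ACC-1003");
                for (String account : accounts) {
                    System.out.println("transactions for " + account
                            + " go to node " + nodeFor(account, nodes));
                }
                // Reference data (e.g. the product catalog) is not partitioned at all:
                // every node holds a full replica, so lookups never cross the network.
            }
        }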
  32. Re: May Have to Think

    It's not hard to design applications to be clusterable. But it adds cost and complexity.
    Precisely. It is just that it has to be *considered* at some level. One cannot just assume that an application written *without* clustering in mind can be efficiently clustered. Poor/unnecessary use of http session attributes is one example, sub-optimal calls between application tiers is another.
  33. It would be nice if the article had a link to another article that describes how to "Keep clustering in mind when designing". Do I just have to click my heels together and say "I wish I was clusterable, I wish I was clusterable"? :)
    Nice comment. What exactly should we take care of when we design? If we assume that the application will be clustered but we aren't sure what the environment is, will we make wrong assumptions anyway? So maybe the concern is how to simulate a clustered deployment environment during development, so that developers can build the application correctly?
  34. It would be nice if the article had a link to another article that describes how to "Keep clustering in mind when designing". Do I just have to click my heels together and say "I wish I was clusterable, I wish I was clusterable"? :)


    Nice comment. What exactly should we take care of when we design? If we assume that the application will be clustered but we aren't sure what the environment is, will we make wrong assumptions anyway?

    So maybe the concern is how to simulate a clustered deployment environment during development, so that developers can build the application correctly?
    Honestly, I feel like I am doing the right thing - but since I have not had to cluster, I don't know.
  35. A few points of clarification: Terracotta never tells folks that they are guaranteed not to have to rewrite apps. At the webinar you attended and the TSS conference the JBoss team attended, we talked about clustering frameworks, not end-user apps. Our clustered framework modules can scale, and we have third-party research showing that our solution is one of the fastest available in these framework use cases. Second, to Cameron's point, your PoV is based on mis-cited references. Sun's research was focused on RPC-style clustering. Furthermore, you can be certain that Google, EBay, Amazon, Akamai, and Walmart.com have abstracted the problem of clustering away at this point (for example, Walmart.com's persistent session layer or Google's FileSystem are leveraged day-in and day-out to build clustered apps without custom tuning or clustered development patterns). I would be hard-pressed to be convinced that they do clustering work in all layers of the business logic. The very notion of an ABSTRACTION LAYER exists for a reason. At the end of the day, most folks who do clustering for a living will tell you that what is more important than the cluster API or the implementation is where you call out to your clustering service. Put another way, it's what you cluster, not how you cluster it. The task of tuning is always done, at least partly, after the code is written, when runtime data can be gathered, so the more flexible the clustering engine is to being applied and reapplied in different ways without a large impact on the source code, the more efficient the developer becomes. I would very much prefer to focus on use cases and programming guides rather than FUD and innovator's-dilemma types of articles. Such guides will be of high value to the community. BTW, here's a simple code example of something that cannot be transparently clustered (or gains no value from having been clustered): public static void main(String[] args) { int j = 0; for (int i = 0; i < 100; i++) { j += i; System.out.printf("here is the value of J: %d%n", j); } } Not much value in clustering that. Have I proven the notion of transparent clustering wrong with this example? Surely not. (I am sorry...for those who don't know me, I love to quote movies: "and don't call me Shirley" would be the proper retort.) --Ari
  36. You can be certain that Google, EBay, Amazon, Akamai, and Walmart.com have abstracted the problem of clustering away at this point
    All interesting examples:
  37. Google is an effectively read-only system, except for updating the index on-line, which means lock latency is not a problem, if they even have locks (Akamai should be similar). Still, they did write a completely custom solution, because they like to save money and probably just to do a good job. There is a big blurb about it somewhere on the web. I believe they use inhomogeneous clustering. Definitely not a transparent solution.
  38. EBay I think uses WebSphere now. They can partition the workload by auction. User profiles can be replicated. I saw a big powerpoint presentation about this system a while back - it's out there but I can't find it right now.
  39. Amazon and Walmart can be partitioned by user session. The reference data (an enormous product catalog, I imagine) can be replicated. Didn't you work on walmart.com? These examples all have workloads which are amenable to clustering. It reminds me of an article by Tom DeMarco where he talks about the benefits of ad-hoc solutions. Guglielmo Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering .. or the World's First Pure-Java Terminal Driver
  • Second, to Cameron's point, your PoV is based on mis-cited references. Sun's research was focused on RPC-style clustering.
    Not true. Even with data-oriented clustering, network latency, bandwidth and reliability will always be an issue. Cameron has some interesting points and statistics with high-speed ethernet and infiniband, but this is still specific to grid computing. In a world where servers are clustered using gigabit ethernet with commodity switches, the network layer is still very much prone to fault and failure.
  • Aspect and obliviousness

    I was listening to some academic talks about aspects and their potential obliviousness this week at the ECOOP conference. The discussion actually showed that obliviousness is a desirable property but not a mandatory one. The speaker gave two very good examples where obliviousness is a bad property: transactions and persistence. In both cases, a piece of code not knowing it has been aspectized leads to really bad things. Code should always know whether or not it runs in a transaction, and a piece of code should know whether or not it is going to trigger a persistence operation. Persistence and clustering are two related concepts. So annotations describing such semantics are actually a good thing, since they make the behavior explicit.
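    As one concrete (and hypothetical) illustration of that last point, an annotation such as Spring's @Transactional puts the semantics in the code where the reader can see them, instead of leaving the code oblivious to the aspect woven around it:

        import org.springframework.transaction.annotation.Transactional;

        public class AccountService {

            public static class Account {
                private long balance;
                void withdraw(long amount) { balance -= amount; }
                void deposit(long amount)  { balance += amount; }
            }

            // The annotation makes it explicit that this method is meant to run in a
            // transaction; the code is no longer oblivious to how it will be aspectized.
            @Transactional
            public void transfer(Account from, Account to, long amount) {
                from.withdraw(amount);
                to.deposit(amount);
            }
        }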
  • Interesting rant, but a rant nevertheless. I did not understand why Joseph calls this post a "discussion". Manik throws out a more or less trivial statement but neither justifies his stand nor suggests what the right way would be. What's the value of Manik's post?
  • It's not a trivial statement, nor is it a rant. It is a clarification of a point of view that people uphold, and my opinion on why this point of view is unattainable at the moment. I agree with Cameron's earlier statement that we may be able to attain completely transparent clustering at some point, and that it is a worthwhile goal to pursue. But we aren't there yet, as some believe we are.
  • The network is reliable

    1. The network is reliable.
    A fallacy that immediately follows this one is: data doesn't get inconsistent. This is more trouble in practice than performance. Getting a slow system running again isn't rocket science; moreover, you can measure and optimize performance before the system goes into production. Load testing and testing of the error-compensation mechanisms (failover) of all components is essential for successfully running clustered applications, no matter what technology you use. Data inconsistency is more trouble since it can have a severe impact on business continuity (see Prioritizing Bugs). The same holds for security. I think you are perfectly right that clustered environments will remain highly challenging and need software that has been designed and developed with clustering in mind.
  • I would like to make a distinction here about transparency and distributed systems. I believe that we can all agree that distributed systems do not behave like local systems, i.e. in a distributed system there are issues such as partial failure, network overhead, marshaling/unmarshaling etc. that are not likely to occur with collocated code. I believe that we can also agree that if you take code that was not designed to be distributed and magically make it distributed without changing the code, there is a very high likelihood that such an attempt will fail. This is true even if the code maintains the same semantic behavior from a correctness perspective. The fact that we run in a distributed system will change the way we write our code and implement our business logic, regardless of whether we use native Java programming models or more distributed programming models (JDBC and JMS, for example). While I would agree that using native Java programming models (Hashtable, aspects) has an inherent simplicity benefit, I don't think that it solves the above issue, i.e. you still need to write your code differently to take into account the fact that it is now running in a distributed environment. The main advantage, in my view, of the native APIs is the reduction in the barrier to entry, which is indeed an important value. It is therefore not surprising that those who start with a "transparent" distributed clustering solution end up writing lots of proprietary code around it to address transaction semantics, data affinity, failover scenarios etc. In other cases you may end up with code that is very hard to maintain, since it is hard to know by looking at the code where the "costly" operations take place. I would also argue that in the majority of cases it is even less natural to write your distributed system in such a way. An approach I like, which is used to address the often contradictory requirements of simplicity vs. explicitness, is to use an abstraction layer that provides the simplicity of both standard Java APIs and distributed APIs without hiding the distributed nature of the solution from the developer. Spring does a very good job, IMO, of demonstrating how POJO-based abstraction can be used to explicitly plug in different capabilities. For example, you can very easily plug in new remoting, transactions, and data sources without changing your Spring-based code. Mule is another good reference for enabling such a level of transparency in a message-oriented environment. GigaSpaces has taken an interesting approach to "transparent clustering" by implementing standard Java APIs and distributed APIs in such a way that the implementations leverage our product's clustering capabilities. We refer to that model as middleware virtualization, which basically stands for providing multiple APIs on top of the same clustering model. This removes the need for the different server, clustering, and data models that are so commonly used to implement each middleware stack. By breaking that dependency we also open up a new opportunity for applications to access the same object instance through different APIs, i.e. you can write messages using the JMS API and retrieve them via JDBC queries. You can find more information on that approach in the following references: Architect blog, Open source Spring/Space module, Mule space abstraction. Nati S. CTO GigaSpaces "Write Once Scale Anywhere"