Werner Vogels with "A Word on Scalability"

  1. Werner Vogels with "A Word on Scalability" (36 messages)

    Werner Vogels, in "A Word on Scalability," summarizes 'scalability' as "A service is said to be scalable if when we increase the resources in a system, it results in increased performance in a manner proportional to resources added," along with some other choice bits of advice about scalability.

    Luckily, there's more than just one word.

    Here are some more words from his contribution:
    ...An always-on service is said to be scalable if adding resources to facilitate redundancy does not result in a loss of performance...

    Why is scalability so hard? Because scalability cannot be an after-thought. It requires applications and platforms to be designed with scaling in mind, such that adding resources actually results in improving the performance or that if redundancy is introduced the system performance is not adversely affected. Many algorithms that perform reasonably well under low load and small datasets can explode in cost if either request rates increase, the dataset grows or the number of nodes in the distributed system increases.

    A second problem area is that growing a system through scale-out generally results in a system that has to come to terms with heterogeneity. Resources in the system increase in diversity as next generations of hardware come on line, as bigger or more powerful resources become more cost-effective or when some resources are placed further apart. Heterogeneity means that some nodes will be able to process faster or store more data than other nodes in a system and algorithms that rely on uniformity either break down under these conditions or underutilize the newer resources.

    Threaded Messages (36)

  2. ...An always-on service is said to be scalable if adding resources to facilitate redundancy does not result in a loss of performance...Why is scalability so hard? Because scalability cannot be an after-thought. It requires applications and platforms to be designed with scaling in mind, ...

    This blog is so short and does not tell much except that "scalability is hard". Data storage/retrieval scalability is indeed hard to achieve. App server scalability is easy (IMHO) nowadays. Especially for the retail style transactional web based system.

    A scalable implementation really has two parts: division of work into discrete and relatively independent sub-processes, and the level of indirection built into the dispatching mechanism. Providing the additional level of indirection for process dispatching is an easy thing to do (web load balancer or messaging system), but division of work sometimes takes a bit more thought and analysis.
  3. The definition of "scalable"

    I think there's a bigger problem with the word "scalable" than the common misunderstanding of its definition. The problem is that when someone describes a system as scalable, there's this implication that it's performant as well which may, in fact, not be the case at all. For example, if I say that a system is scalable and it has the following characteristics: can perform 1 transaction per day per machine. It's technically accurate because you can add another box and then get 2 transactions per day. However, the fact that it's "scalable" is insufficient to describe the performance of the system (which in my example is obviously horrid). Perhaps I'm a victim of social programming here but when I hear that word, I by default assume the system performs well with a reasonable amount of hardware. I know a lot of people do. But what does "reasonable amount of hardware" mean? It's entirely subjective. In general, I think it means "whatever the guy buying the system thinks."

    Finally, who gets to describe what scalable means? If you google "define:scalable" you will get numerous results, most of which are different than the one above.
  4. The definition of "scalable"

    Right!

    Actually I am not clear about the definition of "performance" in his definition of "scalability".

    Does "performance" mean throughput, response time, or both, or something else?

    Q
  5. The definition of "scalable"

    Right! Actually I am not clear about the definition of "performance" in his definition of "scalability". Does "performance" mean throughput, response time, or both, or something else? Q

    I would suggest 100% scalability means the response time remains constant while throughput improves in relation to the resources added when scaling horizontally. In the case of vertical scaling the picture may change, because improving the ability of a node by adding memory, CPUs or disks may result in an improvement in response time as well, with total throughput increasing in relation to the resources being added.
    The challenge is to determine which resources need to be increased to attain the expected scalability.

    Corneil
  6. The definition of "scalable"

    I would suggest 100% scalability means the response time remains constant while throughput improves in relation to the resources added when scaling horizontally. In the case of vertical scaling the picture may change, because improving the ability of a node by adding memory, CPUs or disks may result in an improvement in response time as well, with total throughput increasing in relation to the resources being added. The challenge is to determine which resources need to be increased to attain the expected scalability. Corneil

    I believe what you suggested regarding "performance" is true in most (but not all) cases. Given that, it seems the original definition of "scalability" wouldn't make concrete sense unless a couple of things are defined beforehand:

    1. What type of scalability: horizontal, vertical, ...
    2. What it means by performance: defined the way you suggested, or some other way.
  7. The definition of "scalable"

    I think it only makes sense to talk about scalability after the system performs to your goals. Though your point about cost is lost on me. Even in the 1 transaction per day situation you describe, the system is still scalable. It may be cost prohibitive, but it is - by definition - scalable. Thinking otherwise is just the kind of all-in-one thinking I was happy this definition avoided; that is, being scalable is just one attribute of a well designed system.
  8. The definition of "scalable"

    I think it only makes sense to talk about scalability after the system performs to your goals. Though your point about cost is lost on me. Even in the 1 transaction per day situation you describe, the system is still scalable. It may be cost prohibitive, but it is - by definition - scalable. Thinking otherwise is just the kind of all-in-one thinking I was happy this definition avoided; that is, being scalable is just one attribute of a well designed system.

    I've seen systems that did "one transaction per day", and that was all that they were supposed to do. End-of-day trading processes often run all night, for example, and often barely finish before the start of the next trading day .. but they don't have to be particularly scalable because there's (at most) one trading day per calendar day ;-)

    OTOH, at a certain large bank, they wanted to turn that processing into a real-time process, so all of a sudden they did have a scalability problem, i.e. how to add resources to get the processing down to a point where humans didn't have to wait on it and the automated systems weren't backing up on the queue. Now _that's_ an interesting problem (9 hours down to 9 seconds ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  9. The definition of "scalable"

    For example, if I say that a system is scalable and it has the following characteristics: can perform 1 transaction per day per machine. It's technically accurate because you can add another box and then get 2 transactions per day. However, the fact that it's "scalable" is insufficient to describe the performance of the system (which in my example is obviously horrid).

    But performance != scalability.

    Back to your example, if a single system can do one transaction per day, then two transactions take two days. Add another system, and I get two transactions in one day. However, I can't get a single transaction in 1/2 a day.

    When most folks talk about scaling, they're already past getting the transaction time as fast as practical, and are now working to do as many transactions as practical.

    Typically, vertical scaling (i.e. bigger cpus, etc) can reduce transaction time, but is limited by the speed of a single machine.

    Horizontal scaling is where transaction time remains essentially consistent, but you have more agents able to perform the transaction.

    The classic example, of course, is a bank adding tellers at lunch time. This scales great until every agent needs to talk to the manager, then they block while waiting for the resource.
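
    The teller analogy can be put into rough numbers. What follows is only a minimal sketch, assuming a simple bottleneck model in which every customer needs a fixed amount of teller time plus a fixed amount of time from the one shared manager. The function name `max_throughput` and the timings are made up for illustration.

```python
def max_throughput(tellers: int, teller_minutes: float, manager_minutes: float) -> float:
    """Upper bound on customers served per minute under a simple bottleneck model.

    Assumes each customer needs `teller_minutes` from any one teller and
    `manager_minutes` from the single shared manager (both greater than zero).
    """
    teller_capacity = tellers / teller_minutes   # grows with every teller added
    manager_capacity = 1.0 / manager_minutes     # fixed: there is only one manager
    return min(teller_capacity, manager_capacity)


if __name__ == "__main__":
    # Illustrative numbers only: 5 minutes at the teller, 1 minute with the manager.
    for n in (1, 2, 4, 8, 16):
        rate = max_throughput(n, teller_minutes=5.0, manager_minutes=1.0)
        print(f"{n:>2} tellers -> at most {rate:.1f} customers/minute")
```

    With these assumed timings, throughput grows with each teller until the manager's capacity is reached, after which extra tellers add nothing.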

    So, I think in the common case, "scaling" today means horizontal scaling. Machines such as Sun's new T1000 and T2000 apply the horizontal scaling techniques within a single chassis.

    So a scalable solution does not necessarily need to actually be performant. It really depends on the situation.
  10. Yeah, no matter how often horizontally scaled, 9 women can't have a baby in a month... the mythical woman-month! :)
  11. Yeah, no matter how often horizontally scaled, 9 women can't have a baby in a month... the mythical woman-month! :)

    ROTFL. I showed my wife this comment and she really appreciated it Henrique.
  12. The definition of "scalable"

    Yeah, no matter how often horizontally scaled, 9 women can't have a baby in a month... the mythical woman-month! :)
    ROTFL. I showed my wife this comment and she really appreciated it Henrique.

    This is where benchmarks get deceiving...

    9 women can create 9 babies in 9 months, thus 1 baby/month...but as they say, lies, damn lies, and statistics.
  13. Latency vs Throughput

    Yeah, no matter how often horizontally scaled, 9 women can't have a baby in a month... the mythical woman-month! :)
    ROTFL. I showed my wife this comment and she really appreciated it Henrique.
    This is where benchmarks get deceiving... 9 women can create 9 babies in 9 months, thus 1 baby/month... but as they say, lies, damn lies, and statistics.

    In other words pregnant women scale in terms of throughput but not in terms of latency.
  14. But That IS possible!

    Arrange so that requests for babies are dispatched to a different one of the 9 women each month. By the 9th month, you will begin receiving a response of 1 baby per month, which will continue for 9 months.

    Problem: you didn't specify your initial conditions completely. But it _was_ funny.
  15. An operational definition of scalability was provided by Michael D. Kersey on an old newsgroup thread:
    http://groups.google.com/group/microsoft.public.inetserver.asp.components/msg/d9846b908f678f15?hl=en&
    To quote from that newsgroup post:

    “IMO a reasonable definition of scalability for a given platform P and application A is
    S(A,P) = R(A,P) / C(A,P)
    where
    R = Maximum number of requests processed per second by application A on platform P,
    C = Cost of hardware and software to develop and support application A on platform P.

    I’ve assumed 100% availability for the purposes of this discussion. Availability could be added as an input to the definition if desired. This term displays the expected behavior shown by common usage of the term “scalability”:
    1. As throughput R increases, scalability increases,
    2. As cost C increases, scalability decreases,
    3. Different platforms and different software may be compared using this definition,
    4. You can use this definition to estimate costs of a proposed system, given an anticipated user load.
    5. Both R and C can be estimated using known techniques.

    So using this definition, scalability’s dimensions would be “requests processed per second per dollar”. Given the following known values for a single application Z:

    running on platform X:
    R(Z) = 1000 requests/second,
    C(Z) = $40,000
    S(Z) = 1000 requests/second / $40,000 = 0.025

    running on not-so-fast but less expensive platform Y:
    R(Z) = 500 requests/second,
    C(Z) = $10,000
    S(Z) = 500 requests/second / $10,000 = 0.05

    While platform Y’s throughput (performance) is much less than that of platform X, Y is much more scalable than (in fact is twice as scalable as) platform X when running application Z.

    This definition can also be used to estimate the utility of using various software methodologies. For example, heavy use of components or object technology may or may not change each factor in the definition: the degree to which each is changed determines whether the resultant system is more or less scalable.
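
    The arithmetic in the quoted post is easy to check. The sketch below simply restates S = R / C in Python using the figures given for platforms X and Y; the function names are mine, and `estimated_cost` additionally assumes that cost grows linearly with load, which the quoted definition implies but real systems may not honor.

```python
def scalability(requests_per_second: float, cost_dollars: float) -> float:
    """S = R / C from the quoted post, in requests per second per dollar."""
    return requests_per_second / cost_dollars


def estimated_cost(target_requests_per_second: float, s: float) -> float:
    """Rough cost for an anticipated load, assuming cost scales linearly with load."""
    return target_requests_per_second / s


s_x = scalability(1000, 40_000)   # platform X figures from the post
s_y = scalability(500, 10_000)    # platform Y figures from the post

if __name__ == "__main__":
    print(f"S(X) = {s_x:.3f} requests/second per dollar")
    print(f"S(Y) = {s_y:.3f} requests/second per dollar")
    print(f"Y is {s_y / s_x:.1f}x as scalable as X by this measure")
    # Hypothetical target load of 2000 requests/second on platform Y.
    print(f"Estimated cost on Y for 2000 requests/second: ${estimated_cost(2000, s_y):,.0f}")
```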
  16. Right!

    It is nice to see this word defined and used correctly. I cringe when I hear managers and architects abuse the word scalable on a constant basis. Many seem to think if the hardware can handle a specific load it is "scalable".
  17. Right!

    It is nice to see this word defined and used correctly. I cringe when I hear managers and architects abuse the word scalable on a constant basis. Many seem to think if the hardware can handle a specific load it is "scalable".

    Indeed, Werner Vogels is an authority on the subject matter and is used to thinking and expressing himself rigorously. Refreshingly different from the Fleury fuzzy thinking of the previous thread, if you ask me!
  18. That's Not All Folks

    A service is said to be scalable if when we increase the resources in a system, it results in increased performance in a manner proportional to resources added

    I'd take this a step further and state that a system is robust if performance reduces proportional to the number of resources required without crashing or alienating your client. A system should also be robust first and scalable second (although both need to be designed into the system from inception). There's no point in trying to scale a system when it's dishing out HTTP 500 errors to your customers.
  19. You should also add that a scalable system should have a drop down somewhere in the UI that lets the user select how much scaling to apply; e.g. 1x, 2x, 4x, 8x, 16x, etc...

    The system should definitely scale by the amount the user has selected.

    You wascal you!
  20. Scaling

    Personally, I find a fish knife to be best for scaling, you use the ridges on the back of the knife to scrape them off. Now you have a scalable architecture!
  21. Speaking of scalability

    It's perhaps a little ironic but here's what I just got upon checking this thread:

    java.rmi.ServerException: RuntimeException; nested exception is: kodo.util.DataStoreException: java.util.NoSuchElementException: Timeout waiting for idle object
  22. Colbert would say

    I don't need facts and architecture. I have no proof, but I believe it "scales" and it does.

    On a less silly note: it's amazing how many times I've seen systems get re-designed because scaling out wasn't considered at the beginning. I'm sure others have similar experience.

    peter
  23. scaling out wasn't considered

    I have seen systems where scaling out wasn't considered not get redesigned. On the other hand, I've seen systems that scaled just fine get redesigned for no good reason at all - like jumping on the latest framework fad. Let's face it, we've all seen a lot of stuff.


    You wascal you!
  24. Scaling, be precise

    Perfect scaling simply means no increase in latency as you add load to the system. Nothing more, nothing less.
  25. scalability, definition

    Perfect scaling simply means no increase in latency as you add load to the system. Nothing more, nothing less.

    Every system has a load capacity. If that capacity is reached, increasing the load will degrade the system's performance. Therefore, perfect scaling means no performance degradation as you add load and more resources to the system.
  26. Is achieving good scalability possible? Absolutely, but only if we architect and engineer our systems to take scalability into account.

    It's only possible to the extent that the workload is inherently scalable, a concept closely related to Amdahl's Law: if the serial part of a workload is 20%, say, then the most you can ever speed up a single-threaded system by adding threads is a factor of 5.
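
    For anyone who wants to see where the factor of 5 comes from, here is a small sketch of Amdahl's Law, speedup = 1 / (s + (1 - s) / n), where s is the serial fraction and n the number of workers. The function name is just illustrative.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Speedup over a single worker when `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)


if __name__ == "__main__":
    # With a 20% serial part, the speedup approaches but never reaches 1 / 0.2 = 5.
    for n in (1, 2, 4, 16, 256, 65536):
        print(f"{n:>6} workers: {amdahl_speedup(0.2, n):.3f}x")
```

    As n grows, the (1 - s) / n term shrinks toward zero and the speedup is capped at 1 / s.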

    Enjoy the Fastest Known Reliable Multicast Protocol with Total Ordering

    .. or the World's First Pure-Java Terminal Driver
  27. Food for Thought

    Is it possible to make a hit counter for a single web page scalable, assuming it is required to be accurate?

    Ponder that for a while before asserting a complex system is truly scalable.
  28. Simple

    Use Google Analytics and it's done. Sorry, I couldn't resist making a joke. A single web page hardly seems worth tracking :)

    peter
  29. Food for Thought

    Is it possible to make a hit counter for a single web page scalable, assuming it is required to be accurate? Ponder that for a while before asserting a complex system is truly scalable.

    What do you mean by "accurate"? Can I queue up counter increments and update it asynchronously, so that lock contention doesn't block a worker thread?

    Does it have to be realtime accurate, or approximately accurate and not lose any data, so that the total count will eventually catch up (assuming the load is such that the queue isn't just getting bigger and bigger all the time)?

    It's often these kinds of considerations that define the scaling profile of your application. For instance, if it has to be realtime accurate, you're only going to scale to the point where lock contentions for updating the single counter start to block your worker threads and throttle throughput.
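
    To make the "queue up the increments" idea concrete, here is a minimal sketch, assuming eventual accuracy is acceptable. `AsyncHitCounter` and `simulate_hits` are hypothetical names invented for this example. Request threads only enqueue a hit; a single background thread applies the updates, so the count lags under load but catches up without losing hits.

```python
import queue
import threading


class AsyncHitCounter:
    """Hypothetical counter: request threads enqueue hits, one thread applies them."""

    def __init__(self) -> None:
        self._pending = queue.Queue()
        self._count = 0
        # Daemon thread so the sketch exits cleanly when the main thread ends.
        threading.Thread(target=self._drain, daemon=True).start()

    def record_hit(self) -> None:
        """Called by request threads; returns without waiting for the count update."""
        self._pending.put(1)

    def _drain(self) -> None:
        # Only this thread ever writes _count, so the counter itself needs no lock.
        while True:
            self._count += self._pending.get()
            self._pending.task_done()

    def settled_count(self) -> int:
        """Block until every queued hit has been applied, then return the total."""
        self._pending.join()
        return self._count


def simulate_hits(workers: int, hits_per_worker: int) -> int:
    """Run `workers` threads that each record `hits_per_worker` hits."""
    counter = AsyncHitCounter()

    def serve() -> None:
        for _ in range(hits_per_worker):
            counter.record_hit()

    threads = [threading.Thread(target=serve) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.settled_count()


if __name__ == "__main__":
    print(simulate_hits(workers=8, hits_per_worker=1000))
```

    Two caveats: `record_hit` does not hand a number back to the caller, so this does not cover the realtime-accurate case, and `queue.Queue` is itself synchronized internally, so the sketch shows the decoupling of request handling from the counter update rather than a contention-free design.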
  30. Food for Thought

    Is it possible to make a hit counter for a single web page scalable, assuming it is required to be accurate? Ponder that for a while before asserting a complex system is truly scalable.

    What do you mean by "accurate"? Can I queue up counter increments and update it asynchronously, so that lock contention doesn't block a worker thread? Does it have to be realtime accurate, or approximately accurate and not lose any data, so that the total count will eventually catch up (assuming the load is such that the queue isn't just getting bigger and bigger all the time)?

    It's even possible to do this in a relatively scalable way (e.g. tens or even hundreds of thousands of pages per second), such that each page gets a value in a FIFO manner, without duplicates, respectful of request ordering, and immune to server failure (i.e. clustered).

    Not bad for a global -- a single point of contention in a distributed environment ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  31. Food for Thought

    It's even possible to do this in a relatively scalable way (e.g. tens or even hundreds of thousands of pages per second), such that each page gets a value in a FIFO manner, without duplicates, respectful of request ordering, and immune to server failure (i.e. clustered).

    No, that's not "relatively" scaleable. You're saying that incrementing a counter and passing the result somewhere can be done so fast that for any reasonable requirement it won't be a bottleneck. Which is true. But that doesn't change the fact that ultimately you only have one counter. So you're using "relatively scaleable" to mean "we can make it blindingly fast in absolute terms" but not to the theoretical infinite number of concurrent requests going to an infinite number of servers.

    The point is (and I think those who have responded get it, BTW) that scaleability is about parallelising tasks (so you can add processors/memory/server at will) and task parallelization only works if there's no resource contention, which typically happens when a value must be written.

    I really like the bank tellers and bank manager example someone posted earlier.
  32. Food for Thought

    The point is (and I think those who have responded get it, BTW) that scaleability is about parallelising tasks (so you can add processors/memory/server at will) and task parallelization only works if there's no resource contention, which typically happens when a value must be written.

    What you say about parallelising as a key to scalability is true as far as it goes. Given that, my definition of what makes code scalable is how well it plays with others - in other words - how efficiently it uses shared resources. We all know that the bottlenecks are the resources (often data) that we must share. For example, only one thing can update a bit of data at a time. Often that update is only valid or is premised on other values holding a certain state when the update occurs. So how do we implement that? How well we answer that question is how well we scale.
  33. Food for Thought

    It's even possible to do this in a relatively scalable way (e.g. tens or even hundreds of thousands of pages per second), such that each page gets a value in a FIFO manner, without duplicates, respectful of request ordering, and immune to server failure (i.e. clustered).

    No, that's not "relatively" scaleable. You're saying that incrementing a counter and passing the result somewhere can be done so fast that for any reasonable requirement it won't be a bottleneck. Which is true. But that doesn't change the fact that ultimately you only have one counter. So you're using "relatively scaleable" to mean "we can make it blindingly fast in absolute terms" but not to the theoretical infinite number of concurrent requests going to an infinite number of servers.

    I claimed neither linear nor infinite scalability. I can show that an application with this requirement will scale (with a scaling factor relatively close to 1.0) up to a healthy number of servers.

    The point is (and I think those who have responded get it, BTW) that scaleability is about parallelising tasks (so you can add processors/memory/server at will) ..

    Trust me, I spend thousands of hours each week (thanks to scalability!) solving just this problem. Check out the current edition of SD Times:

    http://www.sdtimes.com/article/story-20060401-19.html

    .. and task parallelization only works if there's no resource contention, which typically happens when a value must be written.

    I was alluding to an interesting solution to the write contention that occurs in a distributed environment in which a synchronous transactional redundant copy must be maintained. Without that solution, the number is more like 1000 TPS, tops ;-)

    Peace,

    Cameron Purdy
    Tangosol Coherence: Clustered Shared Memory for Java
  34. I like your aggregate example

    I've seen the aggregate problem first hand and boy, lots of people are still having the problem. I know a few places that still do those kinds of aggregates in an overnight process.

    peter
  35. Food for Thought

    What do you mean by "accurate"? Can I queue up counter increments and update it asynchronously, so that lock contention doesn't block a worker thread?

    Every user must receive a unique number. No numbers may be skipped. Numbers must be assigned in accordance with when the "hit count" request is received by the "hit counter subsystem."

    Does it have to be realtime accurate, or approximately accurate and not lose any data, so that the total count will eventually catch up (assuming the load is such that the queue isn't just getting bigger and bigger all the time)?

    I think the requirement to transmit the number back to the client makes this moot - queueing up the increment also queues up sending the response back to the client, which lengthens the response time for client requests.

    It's often these kinds of considerations that define the scaling profile of your application. For instance, if it has to be realtime accurate, you're only going to scale to the point where lock contentions for updating the single counter start to block your worker threads and throttle throughput.

    Yes, it does. Because returning approximate numbers back to the client but ensuring that an exact count is available (or can be made available, say by just logging each hit, then counting the hits) would make it so you could parallelize the process, and therefore make it scaleable.
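
    The "just log each hit, then count the hits" variant might look roughly like the sketch below. It assumes each worker keeps its own private log and that an exact total is only needed after the fact; `handle_requests`, `count_hits` and `simulate` are made-up names, and in-memory lists stand in for real log files.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def handle_requests(worker_id: int, requests: int) -> List[str]:
    """Each worker appends to its own private log; no shared value is written."""
    return [f"worker-{worker_id} hit {i}" for i in range(requests)]


def count_hits(logs: List[List[str]]) -> int:
    """The exact total is computed afterwards by counting the log entries."""
    return sum(len(log) for log in logs)


def simulate(workers: int, requests_per_worker: int) -> int:
    """Serve requests on `workers` threads, then count all logged hits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        logs = list(pool.map(handle_requests, range(workers), [requests_per_worker] * workers))
    return count_hits(logs)


if __name__ == "__main__":
    print(simulate(workers=4, requests_per_worker=250))
```

    Because no worker writes to a shared counter while serving requests, the request path parallelizes; the price is that no request can be told its exact position in the global sequence at the time it is served.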
  36. And in the real world . . .

    We can parse the definitions of scalability in many ways, but the most fundamental one would be to say that "you can increase a system's potential load by adding additional hardware resources."

    In most cases, for scalability to be useful, it needs the following characteristics.

    - Close to linear--I get twice as many transactions or pages served for twice the total hardware cost.

    - Minimally impactful on latency--My transactions and pages served don't take longer than they did before.

    - Operationally efficient--It should take less human resources (relative to hardware) to manage the system as it scales up.

    - Resilient--While the overall chance of any given hardware failure event increases as you scale, the potential user impact from such an event should be reduced.
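
    The first two characteristics can be checked against measurements. The following is a rough sketch only: the function names are invented, and the node counts, throughput and latency figures in the demo are hypothetical, not taken from any real system.

```python
def scaling_efficiency(base_nodes: int, base_tps: float,
                       scaled_nodes: int, scaled_tps: float) -> float:
    """Throughput gain divided by resource gain; 1.0 means perfectly linear scaling."""
    return (scaled_tps / base_tps) / (scaled_nodes / base_nodes)


def latency_change(base_ms: float, scaled_ms: float) -> float:
    """Relative change in latency after scaling; 0.0 means no impact."""
    return (scaled_ms - base_ms) / base_ms


if __name__ == "__main__":
    # Hypothetical measurements: 4 nodes at 1000 TPS, then 8 nodes at 1800 TPS.
    eff = scaling_efficiency(base_nodes=4, base_tps=1000, scaled_nodes=8, scaled_tps=1800)
    lat = latency_change(base_ms=20.0, scaled_ms=21.0)
    print(f"scaling efficiency: {eff:.2f}")
    print(f"latency change: {lat:+.0%}")
```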

    To achieve this certainly requires proper design from the ground up, but there are some products out there that can help you fix scalability limitations with limited refactoring rather than system re-writes.

    Cheers,

    Gideon
    GemFire--The Enterprise Data Fabric
    http://www.gemstone.com
  37. I like the term scalability.

    I like the term scalability. A service is scalable when we increase the resources in a system. Increasing performance is serving more units of work. Good service is urgent for all industries. Without an excellent service how can companies survive in the long run? When picking a server hosting enterprise, the years of experience the hosting company has is important.

    Paul - http://www.connetu.com