News: 20 Real-Life Challenges in Cloud Computing

  1. 20 Real-Life Challenges in Cloud Computing (7 messages)

    Nikita Ivanov from GridGain Systems, maker of open source grid computing middleware, has shared in a recent blog post his experiences from extensive work on cloud computing at GridGain (i.e. deploying Java grid applications in a cloud environment). He describes a distinct set of problems and challenges associated with that work, whether they relate to cloud computing in general or to the grid middleware you are using on it. Here is a short list, in no particular order, that he has accumulated over the past year and that drives many of the improvements currently in the works at GridGain:

    1. Most likely you do NOT need cloud computing – but if you do you would know it for sure by now; people who have legitimate both technical and business-wise use cases for cloud computing have been trying to do it internally for many years
    2. The best way to think about cloud computing is “Data Center with API” – that should clarify most of the questions…
    3. Creating the image for something like Amazon EC2 takes about 45 minutes of your effort – but you will spend weeks and months after that fine-tuning your application and developing additional functionality; plan accordingly
    4. You are about to deal with 100s and 1000s of remote nodes… Things that worked on 10s of nodes often “mysteriously” don’t work at “cloud” scale. We were surprised by the number of configuration tweaks we had to make to run GridGain on 512 nodes on Amazon EC2 under load. Proven grid middleware is essential (quite obviously)
    5. You cannot rely on the environment being homogeneous – most likely it will not be: different CPUs, different amounts of memory, etc.
    6. Debugging problems at that scale requires a pretty deep understanding of distributed computing; the learning curve is very steep; trial and error is often the only solution; plan accordingly
    7. IP multicast will likely not work, or will work with significant networking limitations. For example, you may not get all the computers in your cloud in the same IP multicast group. QoS on IP multicast is unknown, at best
    8. Traffic inside the cloud is very cheap or free – but traffic to the outside is expensive and can “get you” very quickly
    9. If you have to use the cloud all the time, the economics go down, and in many cases it is cheaper to rent traditionally in a data center; that means that in many cases clouds are best used as an option to “outsource” peak loads – in such cases the economic effect can be dramatic
    10. Uptime and per-computer reliability are low – comprehensive failover support in the grid middleware is a must
    11. Static IPs are not guaranteed – this kills automatic deployment for 90% of the grid frameworks out there
    12. Almost always plan on having multiple clouds, at least one internal and one or many external; you are always going to have data and processing that cannot cross the boundaries of your internal data center; without comprehensive support from grid middleware for location transparency (a.k.a. virtual cloud) this is a show stopper
    13. External clouds (i.e. hosted NOT by you) present the problem of sharing data:
        - Do you copy data to the cloud?
        - Usually no local-DB access from the cloud
        - Can you legally copy the data?
        - Double storage of data locally and on the cloud?
        - Synchronization?
        - Security?
        - Data affinity?
        - Local data is removed once the image is undeployed…
        - Etc, etc.
    14. Carefully think through the dev/qa/prod layout and how it is all organized – things get way hairy with multiple clouds, etc.
    15. Clunky (re)deployment of your application onto the cloud can slow the development process to a halt – support from grid middleware is absolutely essential here
    16. Often connections are one-directional, i.e. you can connect to the cloud – but NOT from the cloud back to you – comprehensive communication capabilities in the grid middleware supporting one-directional connectivity and disjoint clouds are a must
    17. Clouds are implemented on top of hardware virtualization – make sure your grid middleware can dynamically provision such images on demand, i.e. basically start the image (start paying) when certain conditions are met and stop the image (stop paying) when other conditions in your system are met
    18. Stick with an open source stack (no, this is not a plug) – having the source code helps greatly during debugging in such unusual situations
    19. Linear scalability can only be achieved in a controlled test environment (like in our recent test) – real world applications will exhibit some sort of non-linear scalability; it is essential to have at least a ballpark number for the scalability and performance you expect when you run your application on the cloud – a battery of performance and scalability tests developed upfront is usually the best option
    20. Personal recommendation: use Amazon EC2/S3 services – the best offering at this point by a long, long mile
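Item 17 describes provisioning as a pair of conditions: one that starts an image and one that stops it. A minimal sketch of such a rule in Python follows. It is an illustration only, not GridGain's mechanism: the `ScalingPolicy` class, the `desired_node_count` function, and every threshold value are hypothetical and do not come from the post.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical thresholds; none of these values come from the post."""
    start_above: float = 0.80  # average load above which one more image is started
    stop_below: float = 0.30   # average load below which one image is stopped
    min_nodes: int = 1         # never scale below this many nodes
    max_nodes: int = 512       # never scale above this many nodes


def desired_node_count(current_nodes: int, avg_load: float,
                       policy: ScalingPolicy) -> int:
    """Return the node count the policy asks for, moving one node at a time."""
    if avg_load > policy.start_above and current_nodes < policy.max_nodes:
        return current_nodes + 1  # start an image (start paying)
    if avg_load < policy.stop_below and current_nodes > policy.min_nodes:
        return current_nodes - 1  # stop an image (stop paying)
    return current_nodes          # load is between the two thresholds: no change
```

Keeping a gap between the start and stop thresholds is a deliberate choice in this sketch: it avoids starting and stopping images repeatedly when the load hovers around a single cut-off value.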

    Threaded Messages (7)

  2. Most likely you do NOT need cloud computing – but if you do you would know it for sure by now; people who have legitimate both technical and business-wise use cases for cloud computing have been trying to do it internally for many years
    That's a ridiculous statement. I think almost anyone, other than someone who can run their dev, integration and prod environments on one box, can somehow benefit from cloud computing. When referring to the on-demand scalability feature, then maybe only apps that have variable scalability requirements can benefit from it.
    1. You can benefit from a cloud's automatic provisioning and management of nodes. Say you have a QA cycle that requires more instances than usual. Yes, you can buy the hardware, but why? Just provision a node within 2 minutes on EC2 and QA now has a completely isolated box to do their testing on. Ah, and you can shut this box down 10 hours later, owing only a few dollars, and never have to use it again.
    2. Yes, setting up a cloud is not just a matter of starting a node; it takes hours or days to then configure it, install all the necessary packages, create some automated scripts, etc... But this work pays off in the money saved by provisioning on-demand hardware that you now won't have to maintain in house.
    3. Running an application on a cloud is not as straightforward as just starting multiple nodes. The application or its middleware has to be cloud aware. For the most part this means that it has to be horizontally scalable without any reliance on a static node configuration. If your application is capable of doing that now, it can most likely run in a cloud environment.
    Nikita has a great point though: running an app on a few nodes is different from running it on 100 nodes. But most likely you won't need 100 nodes, and if you do, you probably don't want to write the supporting infrastructure yourself. There are many middleware products out there now that are cluster aware. GridGain, GigaSpaces (my favorite), and many more... Ilya
  3. Not that simple...
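The QA example in point 1 and item 9 of the original post are two sides of the same arithmetic: paying by the hour wins for short bursts and loses for constant use. A minimal sketch in Python, assuming made-up prices; the hourly rate and the monthly fixed cost used in the usage note are hypothetical and are not actual EC2 or data center prices.

```python
def on_demand_cost(hours_used: float, hourly_rate: float) -> float:
    """Cost of a node that is paid for only while it is running."""
    return hours_used * hourly_rate


def break_even_hours(monthly_fixed_cost: float, hourly_rate: float) -> float:
    """Hours per month at which on-demand cost equals the fixed monthly cost."""
    return monthly_fixed_cost / hourly_rate


def cheaper_option(hours_used: float, hourly_rate: float,
                   monthly_fixed_cost: float) -> str:
    """Return 'on-demand' or 'fixed', whichever costs less for this usage."""
    if on_demand_cost(hours_used, hourly_rate) < monthly_fixed_cost:
        return "on-demand"
    return "fixed"
```

With a hypothetical rate of 0.10 per hour and a hypothetical fixed cost of 50.0 per month, `cheaper_option(10, 0.10, 50.0)` picks on-demand for the 10-hour QA box, while `cheaper_option(720, 0.10, 50.0)` picks the fixed rental for a node that runs all month.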

    That's a ridiculous statement. I think almost anyone, other than someone who can run their dev, integration and prod environments on one box, can somehow benefit from cloud computing. When referring to the on-demand scalability feature, then maybe only apps that have variable scalability requirements can benefit from it. You can benefit from a cloud's automatic provisioning and management of nodes. Say you have a QA cycle that requires more instances than usual. Yes, you can buy the hardware, but why? Just provision a node within 2 minutes on EC2 and QA now has a completely isolated box to do their testing on. Ah, and you can shut this box down 10 hours later, owing only a few dollars, and never have to use it again.
    That is the usual over-simplification, in my opinion. If you worked on deploying to clouds you would see plenty of actual challenges that break this ideal picture (many of them mentioned in the original post, by the way). I'm very optimistic, however, about the overall promise of and technology behind cloud computing (i.e. deploying grid applications on cloud infrastructure) - but I think the middleware stack has to come a long way to make it a much simpler reality. We at GridGain (can't speak for others) are working very hard on pretty much a break-through technology & approach to this problem and we'll be releasing it in the next 3-5 months. In a nutshell, we will bring the same simplicity that GridGain users are enjoying right now to seamless, on-demand running (scaling out) of your grid applications on cloud infrastructure. What's even more unique about it is that you will be able to do all that without ever leaving your Java IDE or emacs or vi - the same straightforward approach and unique capabilities like grid P2P class loading, location transparency, adaptive load balancing, etc, etc. will still be available out-of-the-box. Stay tuned, Nikita Ivanov. GridGain - Grid Computing Made Simple
  4. Re: Not that simple...

    I do work and deploy on clouds and have faced many challenges in getting an application to be "cloud compliant", if there is such a thing. If nothing else, the lack of good SLAs in return for being able to dynamically bring up nodes without any static mappings is a challenge to any traditional application that relies on such facilities. Relational DBs, for example, are currently not cloud compliant. You can easily deploy one on a single node, but if you want to do any sort of replication, it requires a more static infrastructure with greater availability, and it can't reprovision itself if a failure or reconfiguration of the cluster occurs in a cloud at runtime. But what I see you constantly push is grids; of course, I understand your bias. Grids are great, and I recently wrote an app based on GigaSpaces (they work flawlessly on EC2). But not everyone needs grids, and there are many more use cases for clouds than grid computing. Ilya
  5. Most likely you do NOT need cloud computing – but if you do you would know it for sure by now; people who have legitimate both technical and business-wise use cases for cloud computing have been trying to do it internally for many years


    That's a ridiculous statement. I think almost anyone, other than someone who can run their dev, integration and prod environments on one box, can somehow benefit from cloud computing. When referring to the on-demand scalability feature, then maybe only apps that have variable scalability requirements can benefit from it.
    Hi Ilya, if you think about it, what is even more ridiculous is arguing over a term as ill-defined as "Cloud Computing". It seems to me that this term can mean almost anything you want it to mean :) It is this loose language that gets us into trouble and allows room for marketing speak void of real technical content, which is probably what this article is really about. Why not be more specific about what you mean? I like the term Software-as-a-Service, which doesn't need Amazon EC2 or GridGain, incidentally :) Or even better, code-on-demand from Roy Fielding's REST dissertation. Loose talk, loose minds... Food for thought? Paul.
  6. It is this loose language that gets us into trouble and allows room for marketing speak void of real technical content, which is probably what this article is really about.
    I am failing to see a marketing pitch here. The post clearly shares the experiences an individual had when deploying a large grid infrastructure onto the Amazon EC2 cloud. Would you prefer it if the post read as follows: "Nikita Ivanov from an unknown company with unknown experience in cloud computing has vented in his blog about cloud computing and is sharing with us experiences we are not sure he ever had"? Lighten up. This post is definitely not a PhD thesis on cloud computing, but it surely does provide useful pointers on the subject. Best, Dmitriy Setrakyan GridGain - Grid Computing Made Simple
  7. It is this loose language that gets us into trouble and allows room for marketing speak void of real technical content, which is probably what this article is really about.

    I am failing to see a marketing pitch here. The post clearly shares the experiences an individual had when deploying a large grid infrastructure onto the Amazon EC2 cloud.
    OK. So why not call it: "An Article on my Experiences Deploying a large Grid Infrastructure onto Amazon EC2"? This would be so much more useful, especially if he gave some more context about precisely the type of Application upon which his experience was based. Of course I have no means of knowing the motives of the author, and beyond my speculation they could be very noble. My real point is about language and the descriptive power of the term "Cloud Computing". If you speak to psychologists they will tell you that language is a very important part of the thought process. In fact without language thought itself may be impossible, yet in Software we use loose, imprecise language all the time. Is it any surprise that people end up talking (arguing) past each other? Paul.
  8. Paul, completely agree. Cloud computing, which Amazon is trying to make synonymous with EC2, is a loose term. Maybe I'm really speaking of EC2, which is much more than just resources in a virtual cloud. I think their resource provisioning and management facilities can benefit any datacenter, whether in production or just in the QA/dev phase. So I agree, I should change everything in my reply to say "EC2" instead.