SwarmStream: Grid Computing Without the Grid

  1. SwarmStream: Grid Computing Without the Grid (17 messages)

    In a new Javalobby Expert Presentation, Justin Chapweske of Onion Networks describes the fascinating "swarm streaming" technique, which accelerates and improves the reliability of existing applications without requiring an actual grid network.

    Check out the presentation and see how you can speed up your network-based application's I/O by 400%. The presentation is full of code, and you can even try out some Web Start demos that show just how much you can increase performance for your specific network.

    Threaded Messages (17)

  2. SwarmStream Links

    SwarmStream Home Page
    WAN Acceleration Benchmark (Webstart)
    Graphical Swarming Simulation (Webstart)
    SwarmStream Forums
  3. Yes, TCP has limits

    Of course the main things with TCP are:
    1. Window sizes, which control the maximum amount of data that can be in flight at any given time. As latency increases, window size needs to go up to keep the same bandwidth.
    2. Congestion control; TCP implementations tend to assume that packet losses are indicative of congestion and back off the speed when there are lost frames. If you have a particularly lossy network, this can get you.

    Of course there are many cases where your network application will be limited by round trip time instead of bandwidth. This is common with app-server or web-server style small-request-followed-by-small-response communication.

    Also, there are lots of grad students (they invented TCP, if you'll recall) working on alternative protocols and implementations, such as B-TCP, but most are geared toward Internet2/IPv6.

    So what are my recommendations?
    1. If your application can tolerate losses, try UDP, which is not subject to TCP's windowing and congestion-control limits. Of course, UDP is not a streaming protocol, so it requires a lot of work to use.
    2. Tweak the TCP settings of your application or server. Some are global, some are per-socket, and the details are platform dependent, but changing the TCP timeout values, window sizes, and so on can have a huge impact on performance (a short sketch of per-socket tuning follows this list). In the early days of our company, we successfully covered up for our product's high packet losses (now fixed) by adjusting TCP parameters.
    3. If you're doing request/response stuff, go asynchronous. Send a bunch of requests before taking any responses. Some web services libraries make this simple. Most RMI implementations do not.
    4. If you've got large blocks of data (big files, high-bandwidth streams, or whatever), lossy networks, large latencies, and TCP isn't delivering the bandwidth you need (even after tweaking), take a look at other protocols. If you need to stick with the here and now (read: IPv4), why not check this technology out?
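    To make recommendation 2 a bit more concrete, here is a minimal per-socket tuning sketch in plain java.net, assuming an illustrative 45 Mbps link with 100 ms round-trip time; the host name, port, and timeout values are placeholders, not anything taken from the presentation.

    import java.net.InetSocketAddress;
    import java.net.Socket;

    public class TcpTuning {
        public static void main(String[] args) throws Exception {
            long bitsPerSecond = 45000000L; // assumed link speed (T3-class), purely illustrative
            double rttSeconds = 0.100;      // assumed round-trip time

            // Bandwidth-delay product: the number of bytes that must be "in flight"
            // to keep a pipe of this speed and latency full.
            int bdpBytes = (int) (bitsPerSecond / 8 * rttSeconds); // ~560 KB here

            Socket socket = new Socket();
            // Set buffers before connect() so large windows can be negotiated;
            // the OS may still clamp these to its own global limits.
            socket.setReceiveBufferSize(bdpBytes);
            socket.setSendBufferSize(bdpBytes);
            socket.setSoTimeout(30000); // fail fast instead of hanging on a dead peer
            socket.connect(new InetSocketAddress("example.com", 80), 10000);

            System.out.println("Requested buffer: " + bdpBytes
                    + " bytes, granted: " + socket.getReceiveBufferSize());
            socket.close();
        }
    }

    The point of the calculation is simply that the buffer request scales with both bandwidth and latency; whether the operating system honours it is a separate, platform-specific question.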
  4. SwarmStream is not just TCP; it is HTTP over TCP, which means that it is firewall-friendly and can leverage existing HTTP infrastructure such as load balancers, caching proxies, CDNs, standard web servers, and servlet containers. These practical considerations become extremely important when looking for a solution to real-world problems.

    However, if you want a UDP-based solution, we also have a transport for Multicast and unicast delivery of data using UDP and Forward Error Correction. This product, WAN Transport XNE, is still mostly under wraps but is being deployed for some very large scale satellite delivery applications. If you want more information about this, please feel free to contact us.
  5. Cameron

    It'd be interesting to hear from cameron on this topic, any thoughts?
  6. Cameron

    It'd be interesting to hear from cameron on this topic, any thoughts?

    I thought it was pretty cool .. it's client access against the server(s) plus any clients that have already made the same request. I haven't used it, or tried it, etc. so I don't know how good it is, but it seems like a pretty scalable model that could solve the "overwhelming streaming access" problem, kind of like the Slashdot effect or perhaps MP3 sharing. It does require code on the client, but if the client _is_ the code (e.g. a file sharing client, or a custom browser) then it could be pretty neat.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  7. Cameron

    Hi Justin,

    Congratulations! Quite an impressive piece of engineering; glad to read this kind of news here. I do not wish you success, as you obviously already have some, I just wish you more!
    It does require code on the client, but if the client _is_ the code (e.g. a file sharing client, or a custom browser) then it could be pretty neat. -- Cameron Purdy

    Any proxy could host this code, I believe, so there is nothing to deploy on the client side; just set up the proxy. Access providers will be very interested in these libraries, I suppose.

    Christian
  8. client libraries

    Access providers will be very interested in these libraries, I suppose. -- Christian

    Kind of Akamai-ish. Good point .. I hadn't thought of that.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  9. client libraries

    Access providers will be very interested in these libraries, I suppose. -- Christian
    Kind of Akamai-ish. Good point .. I hadn't thought of that. -- Cameron Purdy

    I thought you would have thought of that. Bad point! ;o)

    Do you have an office in France? Would you be willing to host incubating projects that might be related to caching?

    Christian
  10. client libraries

    Do you have an office in France? Would you be willing to host incubating projects that might be related to caching?

    We don't have an office in France, but we are in the very early stages of planning for one.

    On the other topic, contact me by email (you can just use "cameron" at the obvious "tangosol.com").

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  11. Proxying

    The cool thing about the proxy integration with our software is that the entire proxy environment is embeddable, so you can deploy it with your application and run it in the same VM instance. Otherwise, as you said, you can run it on a separate machine or as a separate process on the same machine.

    The lower-level APIs can be used effectively anywhere you'd use URLConnection, and they also give you much more control: you can lazily add sources, lazily specify file destinations, get progress events, use custom protocols, apply traffic shaping, and so on.
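    For reference, this is roughly the stock java.net.URLConnection download loop that such an API would slot in for; the URL and output file name below are placeholders, and nothing here is SwarmStream-specific.

    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.URL;
    import java.net.URLConnection;

    public class PlainDownload {
        public static void main(String[] args) throws Exception {
            // Ordinary single-source, single-connection HTTP download.
            URLConnection conn = new URL("http://example.com/big-file.bin").openConnection();
            conn.setConnectTimeout(10000);
            conn.setReadTimeout(30000);

            InputStream in = conn.getInputStream();
            OutputStream out = new FileOutputStream("big-file.bin");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.close();
            in.close();
        }
    }

    Anything written against this style of code is a candidate for swapping in a multi-source transport underneath it.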

    --
    Justin Chapweske, Founder and CEO - Onion Networks, Inc.
    http://onionnetworks.com/
  12. Swarmstreaming for Servers

    Thanks Cameron.

    One thing to point out is that SwarmStream is just as useful for server-side applications as for client applications.

    For example, if you need to replicate some data to a cluster, or even to a geographically dispersed set of servers throughout the world, SwarmStream can accelerate it for you.

    Even if you just need point-to-point/server-to-server communication, SwarmStream can increase the reliability and speed of that as well. Basically, anywhere you use a URLConnection (or a Socket), SwarmStream will probably be applicable.
  13. If I get it right, SwarmStream works by breaking the content down into pieces and transporting them over multiple HTTP/TCP/IP connections. Some download managers, like FlashGet, work pretty much the same way. Maybe you throw in RAID-style error correction to squeeze out more bandwidth.

    But this is just one client grabbing a larger share of the network bandwidth from others. According to game theory, if all your clients start opening up tons of connections, they just kill the server faster.

    And according to information theory, you can't go faster than the physical limitation of your network. Before that matters, your ISP will certainly implement rate-limiting policies to prevent some clients from going too fast and opening too many connections. Packet loss is usually the result of congestion/metering in the network itself. (Packet dropping is a common way to limit the traffic rate.) Opening up more connections should not reduce the overall packet loss rate. So after squeezing out other clients, your data rate is still confined to the maximum network bandwidth allowed.

    Can you explain the above?
  14. So after squeezing out other clients, your data rate is still confined to the maximum network bandwidth allowed. Can you explain the above?
    Hey, cool! Justin just disproved information theory! We are getting closer to travelling faster than light, going back to the future, and living our lives again and again forever!

    Hey Justin, if you have the secret, please keep it to yourself, as the world is already enough of a mess without time-paradox ghosts... :o)))

    Chris
  15. TCP Doesn't Scale Well

    The problem that we're focusing on for point-to-point is long fat networks. The truth is that TCP simply doesn't scale well in the face of latency and a small amount of packet loss. For instance, if you have a T3 with 100 ms of latency and 0.5% packet loss, TCP can't scale to more than 5 Mbps. If you want more info on this, contact us and we'll send you a spreadsheet to calculate max bandwidth for a given packet loss and latency (and no, this isn't a buffer size thing).
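    The spreadsheet itself isn't shown here, but the widely cited Mathis et al. approximation gives the same back-of-the-envelope flavour; the MSS value is an assumed Ethernet default, the RTT and loss rate are the figures from the paragraph above, and this is the simplified published model, not the Padhye-Firoiu formula or the vendor's own numbers.

    public class TcpCeiling {
        public static void main(String[] args) {
            // Mathis et al. steady-state TCP throughput estimate:
            //   rate <= (MSS * C) / (RTT * sqrt(p)), with C roughly 1.22
            double mssBytes = 1460;    // assumed Ethernet MSS
            double rttSeconds = 0.100; // 100 ms, as in the T3 example above
            double lossRate = 0.005;   // 0.5% packet loss

            double bytesPerSecond = (mssBytes * 1.22) / (rttSeconds * Math.sqrt(lossRate));
            System.out.printf("Rough TCP ceiling: %.1f Mbps%n", bytesPerSecond * 8 / 1e6);
            // Prints roughly 2 Mbps -- nowhere near the 45 Mbps the T3 itself could carry,
            // which is the point being made about loss plus latency.
        }
    }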

    SwarmStream is not designed to prioritize its flows over other flows; it's designed to allow your application to utilize its fair share of bandwidth regardless of latency.

    As far as multi-source goes, it's designed for one of two scenarios:

    1) You want automatic failover if one source fails or if its performance degrades.

    2) You have more bandwidth at the receiving location than an individual server can provide. This is very common for broadband where the servers are capped at 1 Mbps while the receivers have 3+ Mbps down.

    We work hard to keep people educated about these issues and by default the software is limited to four (4) total connections unless you explicitly override a property.

    I hope this answers your questions/concerns. Let me know if I can clear anything else up.
  16. If I get it right, SwarmStream works by breaking the content down into pieces and transporting them over multiple HTTP/TCP/IP connections. Some download managers, like FlashGet, work pretty much the same way. Maybe you throw in RAID-style error correction to squeeze out more bandwidth. But this is just one client grabbing a larger share of the network bandwidth from others. According to game theory, if all your clients start opening up tons of connections, they just kill the server faster.

    There appear to be two parts of this Smarmy(tm) Swarming(tm) technology ;-)

    1) what you described -- basically issuing 10 requests in parallel instead of one (primarily b/c the TCP properties are not configured with an adequately sized window?)

    2) something you missed -- it can issue 1 of those to the orig server, 1 of those to another server that already got the file from the orig server, and 8 of those to various client machines that already got the file from any of the above (including each other) -- a rough sketch of this multi-source idea appears below.

    Whether or not it works efficiently is a different topic .. one cannot simply guess at such a thing without actually testing it. However, the primary benefit is spreading the load, not attacking the same single point in parallel.
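    To illustrate the general idea in point 2 (not the actual SwarmStream implementation), here is a hedged sketch using nothing but standard HTTP Range requests: each byte range is fetched from a different candidate source in parallel. The host names, file name, and chunk size are made up, and a real swarming client would add integrity checking, retries, and smarter source selection on top.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class RangeSwarmSketch {
        // Hypothetical origin, mirror, and peers that already hold a copy of the file.
        static final String[] SOURCES = {
            "http://origin.example.com/file.iso",
            "http://mirror.example.net/file.iso",
            "http://peer1.example.org/file.iso",
            "http://peer2.example.org/file.iso",
        };
        static final long CHUNK = 1024 * 1024; // 1 MB byte ranges

        public static void main(String[] args) {
            ExecutorService pool = Executors.newFixedThreadPool(SOURCES.length);
            for (int i = 0; i < SOURCES.length; i++) {
                final int chunkIndex = i;
                pool.execute(new Runnable() {
                    public void run() {
                        try {
                            long start = chunkIndex * CHUNK;
                            long end = start + CHUNK - 1;
                            HttpURLConnection conn = (HttpURLConnection)
                                    new URL(SOURCES[chunkIndex]).openConnection();
                            // Ask this particular source for just one slice of the file.
                            conn.setRequestProperty("Range", "bytes=" + start + "-" + end);
                            InputStream in = conn.getInputStream();
                            byte[] buf = new byte[8192];
                            long got = 0;
                            int n;
                            while ((n = in.read(buf)) != -1) {
                                got += n; // a real client would write to the right file offset
                            }
                            in.close();
                            System.out.println("chunk " + chunkIndex + ": " + got + " bytes");
                        } catch (Exception e) {
                            e.printStackTrace(); // a real client would retry from another source
                        }
                    }
                });
            }
            pool.shutdown();
        }
    }

    Whether splitting the work this way actually helps depends, as noted above, on where the bottleneck is.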

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Shared Memories for J2EE Clusters
  17. 2) something you missed -- it can issue 1 of those to the orig server, 1 of those to another server that already got the file from the orig server, and 8 of those to various client machines that already got the file from any of the above (including each other). Whether or not it works efficiently is a different topic .. one cannot simply guess at such a thing without actually testing it.

    I knew about the distributed part. In fact, FlashGet allows you to download the same content from other known servers (though of course I wouldn't use that in the insecure Internet environment).

    However, the bandwidth limitation problem applies to both the server end and the client end. You need to squeeze out others' bandwidth at both ends.

    I'm sure it will work very well once you try opening up 10 connections to download a huge file, say JDK 1.5, using FlashGet.

    I'm just pointing out how this technology works and what the issues are.

    By the way, the world actually operates like this technology: 10% of the people own 90% of the economic resources. ;-)
  18. 1) what you described -- basically issuing 10 requests in parallel instead of one (primarily b/c the TCP properties are not configured with an adequately sized window?)

    Typically, buffer sizes are only the bottleneck on networks with zero packet loss, high bandwidth, and high latency. More commonly, the latency is so long that TCP simply doesn't have enough time to grow its rate back before it experiences another packet loss and has to back off again.

    For more information on this and a formula you can use to estimate throughput with a given packet loss and latency, see:

    ftp://gaia.cs.umass.edu/pub/Padhye-Firoiu98:TCP-throughput.ps.Z
    2) something you missed -- it can issue 1 of those to the orig server, 1 of those to another server that already got the file from the orig server, and 8 of those to various client machines that already got the file from any of the above (including each other).

    The software is capped at four total connections by default, though it monitors the latency and throughput it sees on those connections in real time and automatically uses the closest (latency-wise) locations. So you can give it a list of 10 locations, and it will only open connections to the best four.
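    A trivial sketch of that selection rule, independent of any SwarmStream API: measure the round-trip time to each candidate and keep the lowest-latency four. The Source class and its RTT field are hypothetical stand-ins for whatever the real software tracks internally.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    public class SourceSelection {
        // Hypothetical holder for a candidate location and its measured round-trip time.
        static class Source {
            final String url;
            final long rttMillis;
            Source(String url, long rttMillis) { this.url = url; this.rttMillis = rttMillis; }
        }

        static final int MAX_CONNECTIONS = 4; // mirrors the default cap described above

        // Given measured candidates, keep only the lowest-latency few.
        static List<Source> pickBest(List<Source> candidates) {
            List<Source> sorted = new ArrayList<Source>(candidates);
            Collections.sort(sorted, new Comparator<Source>() {
                public int compare(Source a, Source b) {
                    return a.rttMillis < b.rttMillis ? -1 : (a.rttMillis > b.rttMillis ? 1 : 0);
                }
            });
            return sorted.subList(0, Math.min(MAX_CONNECTIONS, sorted.size()));
        }
    }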