Variable replication in the cloud

  1. Variable replication in the cloud (4 messages)

    I've written a prototype of a new tool called ReplCache, which provides a hashmap whose elements can be replicated to other nodes in the cloud a predefined number of times. For example:

    - put(key, value, -1) replicates key and value to all nodes in the cloud, i.e. every node stores key/value.
    - put(key, value, 3) uses consistent hashing and picks 3 nodes in the cloud on which to store key and value.
    - put(key, value, 1) uses consistent hashing and stores key/value only once in the cloud. This is distribution, not replication, and is the same as what memcached provides.

    The advantages of variable replication are that

    - replication can be turned on or off per data element
    - if replication is used, the user can define what kind of reliability is desired for that data
    - we can put the aggregated memory to better use

    The blog entry, article, and demo can be found at http://belaban.blogspot.com/2009/01/replcache-storing-your-data-in-cloud.html
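To make the three put() cases above concrete, here is a minimal sketch of how a replication count could be turned into a list of target nodes on a consistent-hashing ring. It is not ReplCache's actual code: the class name ReplicaPicker, the use of plain strings as node names, and CRC32 as the hash are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Illustrative only (hypothetical class): picks the nodes that should store a key. */
final class ReplicaPicker {
    // Ring position -> node name, kept sorted by position.
    // Two node names with the same CRC32 would collide; ignored in this sketch.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ReplicaPicker(List<String> nodes) {
        for (String node : nodes) {
            ring.put(position(node), node);
        }
    }

    // CRC32 is only a stand-in for whatever hash a real implementation uses.
    private static long position(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** count == -1 means "all nodes"; otherwise at most count distinct nodes. */
    List<String> pick(String key, int count) {
        int wanted = (count < 0) ? ring.size() : Math.min(count, ring.size());
        long pos = position(key);
        List<String> picked = new ArrayList<>(wanted);
        // Walk clockwise from the key's position to the end of the ring...
        for (String node : ring.tailMap(pos, true).values()) {
            if (picked.size() == wanted) break;
            picked.add(node);
        }
        // ...then wrap around and continue from the start of the ring.
        for (String node : ring.headMap(pos, false).values()) {
            if (picked.size() == wanted) break;
            picked.add(node);
        }
        return picked;
    }
}
```

With five nodes, pick(key, -1) returns all five, pick(key, 3) returns three distinct nodes, and pick(key, 1) returns a single node, which is the memcached-style distribution case. Because the walk starts at the key's ring position, adding or removing a node only moves the keys adjacent to it.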

    Threaded Messages (4)

  2. which nodes?

    Great work, Bela. I'm glad to see we can choose how important our data are. Does jgroups do any magic to ensure the replicant nodes are on different machines? For example, if I specify a replication count of three, will jgroups try to spread the copies across multiple machines?
  3. Re: which nodes?

    Great work, Bela. I'm glad to see we can choose how important our data are.

    Does jgroups do any magic to ensure the replicant nodes are on different machines? For example, if I specify a replication count of three, will jgroups try to spread the copies across multiple machines?
    Yes. The hash function can be provided by the user; the interface is:

        public interface HashFunction<K> {
            /**
             * Function that, given a key and a replication count, returns replication_count number of different
             * addresses of nodes.
             * @param key
             * @param replication_count
             * @return
             */
            List<Address> hash(K key, short replication_count);

            /**
             * When the topology changes, this method will be called. Implementations will typically cache the node list
             * @param nodes
             */
            void installNodes(List<Address> nodes);
        }

    The current implementation I provided is quite simplistic; I guess there could be better implementations.
    For example, if you have racks with 5 blades each, you might want to place your replicas across different racks if possible, in case a rack loses power (no UPS).
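As a rough illustration of the rack idea above, a user-supplied hash function could prefer nodes on racks it has not used yet. The sketch below is hypothetical: it declares its own local copy of the interface with plain strings standing in for node addresses, the node-to-rack map is something the user would have to supply, and the start index uses a simple modulo rather than consistent hashing to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Local stand-in for the interface quoted above; addresses are plain strings here. */
interface HashFunction<K> {
    List<String> hash(K key, short replication_count);
    void installNodes(List<String> nodes);
}

/** Hypothetical rack-aware placement: spread replicas across racks where possible. */
final class RackAwareHash<K> implements HashFunction<K> {
    private final Map<String, String> rackOf; // node -> rack id, supplied by the user (assumption)
    private volatile List<String> nodes = new ArrayList<>();

    RackAwareHash(Map<String, String> rackOf) {
        this.rackOf = rackOf;
    }

    @Override
    public void installNodes(List<String> newNodes) {
        nodes = new ArrayList<>(newNodes); // cache a copy of the current topology
    }

    @Override
    public List<String> hash(K key, short replication_count) {
        List<String> current = nodes;
        int n = current.size();
        List<String> picked = new ArrayList<>();
        if (n == 0) return picked;
        // -1 means "all nodes"; never ask for more nodes than exist.
        int wanted = replication_count < 0 ? n : Math.min(replication_count, n);
        int start = Math.floorMod(key.hashCode(), n);
        Set<String> usedRacks = new HashSet<>();
        // Pass 1: walk from the key's start index, taking at most one node per rack.
        // A node with no known rack is treated as its own rack.
        for (int i = 0; i < n && picked.size() < wanted; i++) {
            String node = current.get((start + i) % n);
            if (usedRacks.add(rackOf.getOrDefault(node, node))) picked.add(node);
        }
        // Pass 2: fewer racks than replicas, so fill up with the remaining nodes.
        for (int i = 0; i < n && picked.size() < wanted; i++) {
            String node = current.get((start + i) % n);
            if (!picked.contains(node)) picked.add(node);
        }
        return picked;
    }
}
```

With three racks and a replication count of 3, every replica lands on a different rack, so losing power on one rack still leaves two copies. If there are fewer racks than replicas, the second pass falls back to any remaining node.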
  4. no ups

    cool, so I don't need a UPS anymore! jk. nice. thanks for clearing that up.
  5. Re: no ups

    cool, so I don't need a UPS anymore!

    jk. nice. thanks for clearing that up.
    Glad you like it. Note, however, that this is a prototype. I don't expect it to be very buggy as it is only 900 lines of code, and runs on JGroups which is mature, but some things could be improved (e.g. replication with K=1)... :-) Cheers,