Accessing exclusive resources in a cluster

Performance and scalability: Accessing exclusive resources in a cluster

  1. We are in the process of clustering our communication management solution, and there is one particular problem for which I am having a hard time finding existing solutions:

    Each node in the cluster is identical with respect to the installed software.
    We are polling POP3 mailboxes to receive e-mails and distribute them.

    Now, you can't have more than one process connect to a given POP3 account without causing major headaches.
    So access to this resource is limited to one process/node.

    Now imagine we have several of those exclusive resources.
    We would like to
    - configure how many nodes in the cluster work on those kinds of resources
    - ensure that each exclusive resource is being accessed
    - ensure that in case of a node failure the resource is being picked up by one of the remaining nodes
    - have different types of resources and different distribution rules for each type
    - have static rules for nodes accessing an exclusive resource like a scanner, which is connected to only one machine

    And of course: There can be no delegating service on only one machine, which would be a single point of failure.

    Also I'm still looking for a proper name or definition for this kind of problem.

    Any insights?
    Thank you,
      Jochen Bedersdorfer

    Threaded Messages (3)

  2. Partition by POP3 account

    Hi Jochen,

    This is a common problem that we see; our users generally use Coherence's partitioning services to solve it.

    You can partition "ownership" of the POP3 accounts across multiple servers (we generally use hash-based partitioning). This ensures that only one server will access a given POP3 account at any point in time.
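
    As a rough illustration of hash-based ownership (this is not Coherence's API; the function names `owner_of` and `accounts_for`, the node names, and the choice of SHA-256 are assumptions made for the sketch), every node can compute the owner of an account locally, provided all nodes agree on the member list:

```python
import hashlib
from typing import List


def owner_of(account: str, nodes: List[str]) -> str:
    """Return the node that owns `account` under simple hash-mod partitioning."""
    if not nodes:
        raise ValueError("cluster has no members")
    # Sort so that every node derives the same ordering from the same member list.
    ordered = sorted(nodes)
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


def accounts_for(node: str, accounts: List[str], nodes: List[str]) -> List[str]:
    """Return the accounts that `node` should poll; all others are left alone."""
    return [a for a in accounts if owner_of(a, nodes) == node]
```

    Each node would call `accounts_for` with its own name and poll only the result. Because the mapping is a pure function of the account and the member list, no two nodes that share the same view of the cluster claim the same account.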

    Failover/Failback is the tricky part; with Coherence, this is taken care of for you by automatic re-partitioning. In the absence of automatic failover/failback facilities, you'll need to find a reasonable approach. If you have a concept of cluster membership, you can simply redistribute the POP3 accounts evenly across the current cluster membership. In the worst case you can maintain this membership information in a shared database (though this will have a noticeable efficiency impact during failover/failback).
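
    A minimal sketch of the "redistribute evenly across the current membership" idea, assuming every node sees the same membership list (how that list is obtained is out of scope here; `distribute` and the node names are hypothetical):

```python
from typing import Dict, Iterable, List


def distribute(accounts: Iterable[str], members: Iterable[str]) -> Dict[str, List[str]]:
    """Assign accounts round-robin over the sorted list of live members.

    The result is deterministic, so each node can run this independently
    and arrive at the same assignment for the same inputs.
    """
    ordered_members = sorted(members)
    if not ordered_members:
        raise ValueError("no live members to assign accounts to")
    assignment: Dict[str, List[str]] = {m: [] for m in ordered_members}
    for i, account in enumerate(sorted(accounts)):
        assignment[ordered_members[i % len(ordered_members)]].append(account)
    return assignment


accounts = ["a", "b", "c", "d", "e"]
before = distribute(accounts, ["n1", "n2", "n3"])
# Simulate failure of n2: rerun with the surviving members only.
after = distribute(accounts, ["n1", "n3"])
```

    Round-robin keeps the load even, but it can move many accounts to new owners when membership changes. If limiting that movement matters, consistent hashing or rendezvous hashing are the usual alternatives.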

    To address the single point of failure, you can either design your account access to be stateless (or at least fully recoverable), or you can maintain backup state on a separate server. The latter is obviously much more challenging to do reliably without clustering services.

    The other advantage to partitioning is that you can coordinate all access to a given POP3 account without the need for cluster communication.

    Jon Purdy
    Tangosol, Inc.
  3. Dynamic partitioning?

    Jon, thanks for the reply.
    Now I don't feel that lonely and lost anymore :)

    I'll take a good look at Coherence in the near future.

    We would need some kind of dynamic partitioning: New POP3 accounts (or other types of exclusive resources) can be added or removed at any time and need to be serviced by a group of nodes.
    Is this something Coherence is capable of?

    Thanks again for the answer. Highly appreciated,
      Jochen
  4. Dynamic partitioning?

    Hi Jochen,

    Yes, that's not a problem. Usually, we recommend that your architecture be as homogeneous as possible. In other words, any server should be capable of handling any incoming request. Obviously, you can restrict the ability to manage a given resource to a subset of the cluster, but usually this isn't required.
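
    Restricting a resource type to a subset of the cluster could be sketched as follows. This is an illustration only, not how Coherence does it: the rule table, `pick_owner`, and the node names are hypothetical, and rendezvous (highest-random-weight) hashing is used here as one possible way to choose among the eligible nodes.

```python
import hashlib
from typing import Dict, Optional, Set


def _score(resource: str, node: str) -> int:
    """Stable pseudo-random weight for a (resource, node) pair."""
    digest = hashlib.sha256(("%s|%s" % (resource, node)).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_owner(resource: str, resource_type: str, alive: Set[str],
               eligible_by_type: Dict[str, Set[str]]) -> Optional[str]:
    """Choose the owner among live nodes that are allowed to handle this type.

    Types without a rule may run anywhere. Returns None when no eligible
    node is alive (e.g. the only machine with the scanner attached is down).
    """
    allowed = eligible_by_type.get(resource_type)
    candidates = set(alive) if allowed is None else set(alive) & set(allowed)
    if not candidates:
        return None
    # Rendezvous hashing: the candidate with the highest weight wins.
    return max(sorted(candidates), key=lambda n: _score(resource, n))


rules = {
    "pop3": {"node-a", "node-b"},  # only two nodes poll mailboxes
    "scanner": {"node-c"},         # static rule: device attached to one machine
}
owner = pick_owner("sales@example.com", "pop3", {"node-a", "node-b", "node-c"}, rules)
```

    With a rule set containing a single node, the same function covers the statically bound case: the resource is either served by that node or by nobody.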

    With Coherence, dynamic addition/removal of entities (like POP3 accounts) is easily handled since each server knows which entities it is responsible for managing, whether or not the entity actually exists yet.

    During a rebalancing (resulting from server failover/failback), ownership of a POP3 resource should (ideally) be capable of migrating from one primary server to another. As long as you have a means of dropping one connection and then establishing another, this shouldn't be an issue.
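
    The "drop one connection, then establish another" step might look like the sketch below on each node. The class `AccountWorker` and its `_connect`/`_disconnect` methods are placeholders invented for this example; they only record what a real POP3 client would do.

```python
from typing import Iterable, List, Set, Tuple


def plan_migration(currently_open: Iterable[str],
                   new_owned: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (accounts to release, accounts to acquire) after a rebalance."""
    old, new = set(currently_open), set(new_owned)
    return sorted(old - new), sorted(new - old)


class AccountWorker:
    """Tracks which accounts this node currently holds connections to."""

    def __init__(self, node: str) -> None:
        self.node = node
        self.open: Set[str] = set()
        self.log: List[Tuple[str, str]] = []

    def _connect(self, account: str) -> None:
        # Placeholder: a real worker would open the POP3 session here.
        self.open.add(account)
        self.log.append(("connect", account))

    def _disconnect(self, account: str) -> None:
        # Placeholder: a real worker would close the POP3 session here.
        self.open.discard(account)
        self.log.append(("disconnect", account))

    def rebalance(self, new_owned: Iterable[str]) -> None:
        """Release accounts we no longer own before picking up new ones."""
        to_release, to_acquire = plan_migration(self.open, new_owned)
        for account in to_release:
            self._disconnect(account)
        for account in to_acquire:
            self._connect(account)
```

    Note that this only orders the steps within one node. Making sure the previous owner has actually released an account before the new owner connects still needs coordination across nodes, which the sketch does not attempt.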

    The pattern is that one server manages the entity at any point in time, but that all servers can access the entity (through the primary owner).

    If you do end up looking at Coherence, I'd suggest sending our support an email (support at tangosol dot com) with some more detailed information on your requirements. This is the easiest way of getting some detailed feedback on whether Coherence is a good fit for what you're doing.

    Jon Purdy
    Tangosol, Inc.