
News: JGroups 2.2.9 released

  1. JGroups 2.2.9 released (7 messages)

    JGroups 2.2.9 was released last week. Contrary to what the minor release number suggests, this release contains a large number of fixes and new functionality.

    The main features added are summarized below.

    I also posted performance results on my (new) blog at http://jboss.org/jbossBlog/blog/bela/ for a small test I ran with 2 nodes, each sending 500 million 1K messages to the cluster.

    JMX support
    -----------
    The channel and most protocols can now be accessed via JMX. This can be used in any environment that provides an MBeanServer, e.g. JBoss or JDK 5. With JDK 5's jconsole, for example, retransmission counters can be viewed in realtime, or operations can be invoked that dump the retransmission windows for NAKACK etc. More information is available at http://wiki.jboss.org/wiki/Wiki.jsp?page=JMX.
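
    As a minimal sketch of how a channel might be registered with the platform MBeanServer: the JmxConfigurator helper and the exact signature shown below are assumptions based on the wiki page above, so check that page for the actual API.

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import org.jgroups.JChannel;
    import org.jgroups.jmx.JmxConfigurator;

    public class JmxDemo {
        public static void main(String[] args) throws Exception {
            JChannel ch = new JChannel();   // default protocol stack
            ch.connect("demo");

            // JDK 5's platform MBeanServer, so jconsole can attach to the process
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();

            // Assumed helper: registers the channel and its protocols as MBeans
            // under the given JMX domain (signature may differ in your version)
            JmxConfigurator.registerChannel(ch, server, "JGroups", "demo", true);
        }
    }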


    Push model available in JChannel
    --------------------------------
    Instead of having to pull (receive()) a message out of a channel, a Receiver listener can be registered with a channel. When a message, view change, state request etc. is available, the listener is called immediately. This avoids a context switch: with the pull approach, all messages are first placed in a queue inside the channel and then pulled out by the application thread. Here's an example of how this can be used:

    JChannel ch=new JChannel();
    ch.setReceiver(new ReceiverAdapter() {
        public void receive(Message msg) {
            System.out.println("-- received " + msg);
        }

        public void viewAccepted(View new_view) {
            System.out.println("-- new view: " + new_view);
        }
    });
    ch.connect("demo");
    ch.send(new Message());

    Note that ReceiverAdapter is a class which implements the Receiver interface, so that only the methods one is interested in have to be overridden.


    Fine-grained interface binding
    ------------------------------
    The attributes receive_on_all_interfaces and receive_interfaces enable receiving multicast packets on all interfaces or on a specified list of interfaces, e.g. receive_interfaces="hme0,hme1,192.168.5.3".


    Retransmission from random member
    ---------------------------------
    [NAKACK] This is helpful if we have a large group, and want to avoid having to ask the original sender of a message for retransmission. By asking a random member, we take some potential load off of the original sender.
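
    As a toy illustration of the idea only (this is not NAKACK's actual code), the choice of retransmission target boils down to picking a random member from the current view's member list:

    import java.util.List;
    import java.util.Random;
    import org.jgroups.Address;

    // Toy sketch: ask a randomly chosen member for retransmission instead of
    // always going back to the original sender, spreading the retransmission load.
    Address pickRetransmitTarget(List<Address> members, Random rnd) {
        return members.get(rnd.nextInt(members.size()));
    }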


    Added payload to MethodCall
    ---------------------------
    A payload was added to MethodCall to pass additional information along with a method call; this is required by JBossCache.


    Common transport protocol TP
    ----------------------------
    UDP and TCP now derive from a common transport superclass TP, so shared functionality only has to be implemented and tested once. TCP now supports many more of the properties provided by TP.


    Bounded buffer in message bundling
    ----------------------------------
    Message bundling has by default always used unbounded buffers. For very fast senders, this could result in more and more messages being stored in the queue before the bundling thread could dequeue and send them, resulting in out-of-memory issues.

    The bundling buffer is now bounded; by default it is configured to hold 2000 elements. This can be set via outgoing_queue_max_size="<num_elements>".
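
    The effect is the classic bounded producer/consumer pattern; a generic sketch (not JGroups' actual implementation) is shown below.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Generic sketch of a bounded bundling buffer: once 2000 messages are
    // queued, put() blocks the sender instead of letting the queue grow until
    // the JVM runs out of memory.
    class BoundedBundler {
        private final BlockingQueue<byte[]> outgoing =
                new ArrayBlockingQueue<byte[]>(2000);

        void queueForSending(byte[] payload) throws InterruptedException {
            outgoing.put(payload);            // blocks while the buffer is full
        }

        void bundlerLoop() throws InterruptedException {
            while (true) {
                byte[] msg = outgoing.take(); // bundling thread drains and sends
                // ... bundle msg with others and write to the network ...
            }
        }
    }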


    Performance improvements
    ------------------------
    50% speed improvement for RpcDispatcher/MessageDispatcher/RequestCorrelator/MethodCall.
    Most headers now support size() and Streamable, making marshalling and unmarshalling faster.
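
    For illustration, here is a sketch of the Streamable idea, i.e. headers that marshal their own fields to a data stream instead of going through full Java serialization; the actual org.jgroups.util.Streamable interface may differ slightly in its method signatures.

    import java.io.DataInputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;

    // Sketch of a header that writes and reads its fields directly,
    // which is much cheaper than default Java serialization.
    class MyHeader {
        long seqno;
        boolean retransmit;

        void writeTo(DataOutputStream out) throws IOException {
            out.writeLong(seqno);
            out.writeBoolean(retransmit);
        }

        void readFrom(DataInputStream in) throws IOException {
            seqno = in.readLong();
            retransmit = in.readBoolean();
        }
    }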


    Discovery of all clusters in a network
    --------------------------------------
    With JGroups/bin/probe.sh or probe.bat, it is now possible to discover *all* clusters running in a network. This is useful for
    - management tools that need to discover the running clusters and then drill down into each individual cluster
    - diagnostics and troubleshooting
    Details at http://wiki.jboss.org/wiki/Wiki.jsp?page=Probe


    View reconciliation (VIEW_SYNC protocol)
    ----------------------------------------
    When a coordinator sends out a new view V2 and then leaves (or crashes), it is possible that not all members receive that view. So we could end up with some members still having V1, and others having V2. The members having V2 will discard all messages from members with V1.

    Note that this is a very rare case, but when it happens, the cluster is screwed up.

    VIEW_SYNC solves this by having each member periodically broadcast its view. When a member receives a view that is greater than its own view, it installs it. Thus, all members will eventually end up with the same view should the above problem occur. Note that the view sending is done by default every 60 seconds, but it can also be triggered through JMX by calling the sendView() method directly.
    See JGroups/doc/ReliableViewInstallation for details.
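
    As a toy illustration of the reconciliation rule only (not the actual VIEW_SYNC implementation, and assuming View.getVid() returns a comparable view ID as in the 2.x API), the receiving side amounts to comparing view IDs and keeping the newer view:

    import org.jgroups.View;

    // Toy sketch: on receiving a periodically broadcast view, install it only
    // if it is newer than the view we currently have.
    class ViewReconciler {
        private View current_view;

        void onBroadcastView(View received) {
            if (current_view == null
                    || received.getVid().compareTo(current_view.getVid()) > 0) {
                current_view = received;   // "install" the newer view
            }
        }
    }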


    Address canonicalization
    ------------------------
    To avoid many instances of Address, 2.2.9 now uses canonicalization, which maps equal addresses to a single shared instance. Duplicate instances can therefore be garbage collected as soon as they have been created, resulting in reduced memory overhead. Canonicalization can be turned off with -Ddisable_canonicalization=true.
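
    Canonicalization is essentially interning; a generic sketch of the pattern (not JGroups' actual code) follows.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Generic interning sketch: equal addresses all resolve to one shared
    // instance, so freshly created duplicates become garbage immediately.
    class AddressCache<A> {
        private final ConcurrentMap<A, A> cache = new ConcurrentHashMap<A, A>();

        A canonicalize(A addr) {
            A existing = cache.putIfAbsent(addr, addr);
            return existing != null ? existing : addr;
        }
    }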


    TCP_NIO support
    ---------------
    A new NIO-based transport implementation has been added; see the sample configuration tcp_nio.xml. You can configure the number of "reader_threads", "writer_threads" and "processor_threads". This should allow for larger-scale TCP-based deployments than plain TCP.

    Bug fixes
    ---------
    Critical: in rare cases, the digests could be computed incorrectly, leading to higher message buffering than necessary

    Critical: message bundling (in TP) changed the destination address, so when unicast messages had to be retransmitted the receiver would drop them because dest was null. This would cause UNICAST to stop delivering messages, which would then accumulate forever! This happened only in very rare cases, when a high sustained throughput was encountered (e.g. 20 million messages sent at the highest possible speed). Workaround: set enable_bundling="false" in UDP.

    Minor: Apply recv_buf_size/send_buf_size to all TCP socket connections.
    Many smaller bug fixes.

    Threaded Messages (7)

  2. JGroups 2.2.9 released

    Very interesting... It looks indeed like a major release. I would be interested in better understanding how far JGroups can scale in real-world scenarios. Is there some reasonably up-to-date information available about that? Do you have the resources (hardware and time) to do some relatively large-scale testing, or are members of the JGroups community doing that and publishing their results? The performance page mentions some joint work with HP Labs but it's obviously out-of-date.
  3. Scalability

    Very interesting... It looks indeed like a major release. I would be interested in better understanding how far JGroups can scale in real-world scenarios. Is there some reasonably up-to-date information available about that? Do you have the resources (hardware and time) to do some relatively large-scale testing, or are members of the JGroups community doing that and publishing their results? The performance page mentions some joint work with HP Labs but it's obviously out-of-date.

    I'm interested in looking at scalability too, but I don't have access to large clusters to do this. We're expecting 8 boxes to be added to our JBoss Atlanta office in January, so I can run my tests on a cluster of 8. I have done some preliminary testing on 8 nodes (8 members, 8 senders), and the numbers look excellent, but I want to do this more methodically, e.g. varying
    - number of members
    - number of messages sent
    - message size (100b, 1K, 5K, 10K)
    - JDK 5 versus JRockit
    - 100Mbps vs gige switch

    Regarding large clusters: I'd be happy to talk to anyone who can provide me access (for testing) to large clusters. I know CERN (www.cern.ch) are doing some grid stuff with JGroups, which they want to run on a cluster of 2000 machines.
  4. Performance Test

    Bela,

    1) Do I understand correctly that you did test with two JVMs running on the same box and forming a view of two nodes?
    If not, please expand.

    2) Also, I know your stack is pretty flexible so please list the protocols you used. The relevant ones here are mnak and flow control I imagine.

    3) Latency. By allowing unbounded latency you can achieve enormous throughput. Please state the latency. Even better would be a graph of latency vs. throughput, but that's probably too much work.

    Guglielmo
  5. Performance Test

    Bela, 1) Do I understand correctly that you did test with two JVMs running on the same box and forming a view of two nodes?

    Correct
    2) Also, I know your stack is pretty flexible so please list the protocols you used. The relevant ones here are mnak and flow control I imagine.

    I used fc-fast-minimalthreads.xml, shipped with JGroups (JGroups/conf). Again, this is a very preliminary test; that's why I posted it on my blog... I started some serious testing some months ago, but haven't been able to make progress because of the lack of a homogeneous cluster. Those tests will include everything needed to reproduce them, down to the switch type etc.
    I'll resume those tests in January. BTW: Novell will soon publish some numbers on a 4-node cluster running TCP_NIO.
    3) Latency. By allowing unbounded latency you can achieve enormous throughput. Please state the latency. Even better would be a graph of latency vs. throughput, but that's probably too much work.
    Guglielmo

    I don't see why latency should matter? I'm sending the data asynchronously, and take the difference between receiving the first and the last message. My test is part of JGroups (org.jgroups.tests.perf.Test), so you can look at the code. Its design is discussed at http://wiki.jboss.org/wiki/Wiki.jsp?page=PerfTests
    Cheers,
  6. Performance Test

    I don't see why latency should matter? I'm sending the data asynchronously, and take the difference between receiving the first and the last message.

    Latency is a very interesting number to look at.

    Based on this comment I would guess that since you know your design you concluded that the latency is already as low as it can ever be - it cannot be optimized further.

    Did you get a lot of retransmissions?
  7. Performance Test

    Latency is a very interesting number to look at. Based on this comment I would guess that since you know your design you concluded that the latency is already as low as it can ever be - it cannot be optimized further.

    In this particular test, I wasn't interested in latency or RTT at all. I have other tests that measure this, e.g. synchronous method invocations on a cluster.
    Did you get a lot of retransmissions?

    I didn't look at the stats (JMX stats, e.g. XMIT counters and such), which the test can spit out after the run. I assume I didn't get too many XMIT requests, because I kept the loss rate to a minimum by throttling the sender with the (conservative) flow control protocol.
    I'm looking forward to posting more detailed performance numbers from runs across 8 nodes, some time in March...
  8. We compared the performance of TCP-NIO against TCP in a four-machine cluster.

    TCP-NIO Number of messages read per second:
       13402
       13155
       13344
       13107

    TCP Number of messages read per second:
       13490
       13462
       13476
       13440

    CPU utilization was around 64%.

    Read the details at http://www.jgroups.org/javagroupsnew/docs/Perftest.html