Tech Talk with Larry Jacobs on Transactional Messaging, Caching

  1. In this interview, Larry Jacobs, Director of Development, Oracle9i Application Server Web Cache, looks at the Two-Phase Commit protocol and transactional messaging, and how they help interoperability between heterogeneous transactional systems. He also discusses caching as a means of improving performance and scalability rather than simply 'throwing more hardware at the problem'.

    Watch Larry Jacobs' Interview Here

  2. 2PC Overheads

    Overall a very good (if too short) interview. Most of the observations about messaging vs. database transactions are very relevant as enterprise apps increasingly integrate with one another. As he says, it would be nice to use XA/2PC to update every component of a system transactionally, but the cost of this is very high in terms of performance as well as the availability implications.

    However, I think he fell down a bit on one point (and perhaps this has to do with how Oracle's own products work). He said:

    \Quote\
    So in the case where we're doing the funds transfer example from New York to San Francisco, you could remove the hundred dollars from the New York account and put it on a queue as one transaction. And a separate transaction, a transaction messaging system will deliver that message from New York to San Francisco. And then in a third and independent transaction, you take the message off of the queue and add it to the account in San Francisco
    \Quote\

    In a world where you're using JMS for messaging, and that JMS provider isn't unified with your database, the "remove the hundred dollars...and put it on a queue" and "take the message off the queue and add it to the account" bits generally both end up being XA transactions. This is because you want your "remove 100 bucks/enqueue" piece to be atomic, and since the "remove 100 bucks" piece is an RDBMS operation and the enqueue is a JMS queue (or publish), you've got two resources and need to do XA. The same is true on the receiver end.
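
    To make the sender half concrete, here's a minimal sketch assuming a J2EE container, an XA-capable JMS provider, and an XA JDBC driver; the JNDI names, queue, and table are all hypothetical. The container enlists both XAResources in the same global transaction, so the debit and the enqueue commit or roll back together under 2PC:

    import javax.jms.ConnectionFactory;
    import javax.jms.MapMessage;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;
    import javax.transaction.UserTransaction;
    import java.sql.PreparedStatement;

    public class TransferSender {
        // Debit the New York account and enqueue the transfer as ONE global transaction.
        public void debitAndEnqueue(String account, double amount) throws Exception {
            InitialContext ctx = new InitialContext();
            UserTransaction utx = (UserTransaction) ctx.lookup("java:comp/UserTransaction");
            DataSource ds = (DataSource) ctx.lookup("jdbc/AccountsXA");             // hypothetical XA datasource
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/XAFactory"); // hypothetical XA factory
            Queue queue = (Queue) ctx.lookup("jms/TransferQueue");                  // hypothetical queue

            utx.begin();
            java.sql.Connection db = null;
            javax.jms.Connection jms = null;
            try {
                db = ds.getConnection();
                PreparedStatement ps = db.prepareStatement(
                    "UPDATE accounts SET balance = balance - ? WHERE id = ?");
                ps.setDouble(1, amount);
                ps.setString(2, account);
                ps.executeUpdate();

                jms = cf.createConnection();
                // Inside a JTA transaction the container ignores these arguments
                // and enlists the session's XAResource in the global transaction.
                Session session = jms.createSession(true, Session.SESSION_TRANSACTED);
                MessageProducer producer = session.createProducer(queue);
                MapMessage msg = session.createMapMessage();
                msg.setString("account", account);
                msg.setDouble("amount", amount);
                producer.send(msg);

                utx.commit();   // triggers 2PC across the RDBMS and the JMS provider
            } catch (Exception e) {
                utx.rollback(); // neither the debit nor the message survives
                throw e;
            } finally {
                if (jms != null) jms.close();
                if (db != null) db.close();
            }
        }
    }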

    Personally I prefer the messaging solution, since it decouples the senders and receivers, but he's overstating the performance and availability benefits. He's just moved the problem from databases to the messaging layer. Using messaging hand in hand with database work is going to incur the XA performance hit, and if the messaging server goes down, your system is just as hosed as if you were going directly database to database under XA.

        -Mike
  3. 2PC all over the place

    Exactly! You don't get rid of the 2PC in transactional messaging. And availability! The story was about a TX that should span 10 DB servers, and the fact that if one server is down the TX will abort. What is the asynchronous architecture that can model a TX that has (as a business requirement) to span 10 DB servers? What about parallel processing? (All the examples were about one operator that moves money from New York to San Francisco.) You have to do business separation if you want to maintain performance and use messaging to "distribute" transactions. (How can you use messaging when 1,000 operators are moving money independently?)

    I think that the 2 processing models have their own benefits and should be used with care.

    Regards,
    Horia
  4. Who is Accomi (Text version)?

    Maybe Akamai ... :)
  5. 2PC all over the place

    \Muntean\
    The story was about a TX that should span 10 DB servers, and the fact that if one server is down the TX will abort. What is the asynchronous architecture that can model a TX that has (as a business requirement) to span 10 DB servers?
    \Muntean\

    Well, if you use a messaging model that features guaranteed messaging in some form, then you can do it. At the message injection and reception ends you may still need XA, but for the distribution of messages (i.e. getting them to all the reception ends) you rely on the guaranteed messaging aspect. BUT - and this is a big but - you have to have an application that can live with data sources being slightly out of sync. And you have to live with an application that can either:

       - Get good performance by dealing with out-of-order messages
       - Force message order at a significant hit to performance

    \Horia\
    What about parallel processing? (All the examples were about one operator that moves money from New York to San Francisco.) You have to do business separation if you want to maintain performance and use messaging to "distribute" transactions. (How can you use messaging when 1,000 operators are moving money independently?)
    \Horia\

    This is an excellent point, and I'm sorry I didn't point it out myself :-)

    Assuming you're using a system like JMS queues to get messages point-to-point to the right place, you can get parallelism/load balancing by having lots of listeners on the queue. The problem, of course, is the ordering problem I mentioned previously. In order to get good performance, a queue-based system is going to spread the queue messages over all the listeners - and this means you lose message ordering. Even if you have only one listener, in the case of a failure it's common for messages to arrive out of order. To do otherwise, once again, entails a big performance hit in normal-path message delivery.

    Why does this matter? Well, imagine an equity trading system that's sending out these messages:

        - Create trade
        - Cancel trade (trader screwed up!)

    If you have N queue listeners, it's entirely possible that the cancel will get processed first, followed by the create. So you get an error on the cancel, the create goes through, and you've got bad data in your database.

    There are, of course, ways around this (one is sketched below). But they entail either a lot of application complexity, or taking a big performance hit to guarantee message ordering.
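
    One common application-level workaround, sketched here with hypothetical class and field names: stamp each message with a per-trade sequence number (starting at 0) and buffer anything that arrives early. This restores per-trade ordering at the cost of extra state and latency - exactly the complexity/performance trade-off described above:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical resequencer: processes messages for a given trade strictly in
    // sequence-number order, buffering early arrivals until their predecessors show up.
    public class TradeResequencer {
        private final Map<String, Integer> nextSeq = new HashMap<>();            // tradeId -> next expected seq
        private final Map<String, Map<Integer, String>> pending = new HashMap<>(); // buffered early arrivals

        public synchronized void onMessage(String tradeId, int seq, String payload) {
            int expected = nextSeq.getOrDefault(tradeId, 0);
            if (seq != expected) {
                // Early arrival (e.g. the cancel overtook the create): buffer it.
                pending.computeIfAbsent(tradeId, k -> new HashMap<>()).put(seq, payload);
                return;
            }
            process(tradeId, payload);
            // Drain any buffered successors that are now in order.
            Map<Integer, String> buf = pending.getOrDefault(tradeId, new HashMap<>());
            int next = expected + 1;
            while (buf.containsKey(next)) {
                process(tradeId, buf.remove(next));
                next++;
            }
            nextSeq.put(tradeId, next);
        }

        private void process(String tradeId, String payload) {
            System.out.println(tradeId + ": " + payload); // stand-in for real work
        }
    }

    So if the cancel (seq 1) arrives before the create (seq 0), it is simply held back until the create has been processed.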


    \Muntean\
    I think that the 2 processing models have their own benefits and should be used with care.
    \Muntean\

    I agree. And I'll go further and say that the two models are fundamentally different in enough ways that the choice is going to have a big impact on your architecture. 2PC does imply a performance hit (I don't quite buy the HA hit in the article), but async messaging is _not_ a drop-in replacement that fixes all of 2PC's ills without any downside. They're two very distinct ways of passing data around in a distributed system.

    As a complete side note - I wish the keepers of the J2EE transactional spec would go back and reconsider asynchronous XA resources. The biggest hit in J2EE XA processing is the synchronous, serial nature of the spec. In a typical system with 2 XA resources, 6 very expensive forces-to-disk happen serially - 1 for the tran manager's 2PC start, 1 prepare per resource, 1 commit per resource, and 1 "done" for the tran manager. If JTA allowed asynchronous resource management, then the XAResource prepare() and commit() calls could be done in parallel (e.g. issue prepare() to all XAResources in a transaction without waiting for a result, then wait for the replies to come back, and do the same on commit()). This can result in a 20%-40% performance improvement, and it scales much better to many XAResources.
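
    For illustration only, here is a minimal sketch of what a parallelized prepare phase could look like inside a hypothetical transaction manager, fanning the prepare() calls out to a thread pool and then collecting the votes (error handling is simplified to "any failure means roll back"). Note that, as comes up later in this thread, some XAResource implementations expect to be called from a consistent thread, which is part of why a spec change would be needed:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import javax.transaction.xa.XAResource;
    import javax.transaction.xa.Xid;

    // Sketch: fan prepare() out to all resources at once, so the prepare phase
    // costs roughly max(per-resource time) instead of sum(per-resource times).
    public class ParallelPrepare {
        private final ExecutorService pool = Executors.newFixedThreadPool(8);

        public boolean prepareAll(List<XAResource> resources, Xid xid)
                throws InterruptedException {
            List<Future<Integer>> votes = new ArrayList<>();
            for (XAResource res : resources) {
                votes.add(pool.submit(() -> res.prepare(xid))); // issued without waiting
            }
            for (Future<Integer> vote : votes) {
                try {
                    int v = vote.get(); // XA_OK or XA_RDONLY is a "yes" vote
                    if (v != XAResource.XA_OK && v != XAResource.XA_RDONLY) {
                        return false;   // anything else: abort the transaction
                    }
                } catch (ExecutionException e) {
                    return false;       // a prepare threw XAException: abort
                }
            }
            return true; // all voted yes; commit() can be fanned out the same way
        }
    }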

    For those interested, look at the original C-based X/Open spec and its async support. J2EE could benefit from such a model, particularly as 2PC transaction processing becomes more common while higher levels of integration are sought on many projects.

        -Mike
  6. You may be interested to read about some research work on "Conditional Messaging" and "Dependency-Spheres" which makes it possible to combine standard 2PC-transactions with transactional messaging in a single atomic unit-of-work.
    http://www.research.ibm.com/AEM/d-spheres.html
    Stefan
  7. \Tai\
    You may be interested to read about some research work on "Conditional Messaging" and "Dependency-Spheres" which makes it possible to combine standard 2PC-transactions with transactional messaging in a single atomic unit-of-work.
    \Tai\

    In the interest of objectivity, perhaps you should point out that you are one of the primary creators of the D-Spheres research :-)

    I've read about this work several times over the past few months, but unfortunately I haven't had the time to study it in-depth. From what I've read, though, I have a few issues...

    - Loss of isolation (the I in ACID). I understand D-Spheres needs to pump messages around, and therefore a global transaction can't really be isolated. As a result, pieces of a global transaction are visible before it's fully committed. This can be problematic for some applications.

    - Atomicity. There are repeated claims in the D-Spheres papers as to it achieving atomicity, but I think it's overtaxing the word. The overall result of transactions in this model appears to ultimately achieve consistency, but it's stretching things to claim that each global transaction is atomic in any meaningful way.

    - Compensating actions. This is related to the above - the middleware at times needs to invoke application-defined compensating actions in the event of certain failures. This really doesn't meet any useful definition of the term "atomic". On top of that nitpick, it puts a heavy burden on application developers. Compared to relying on XA/2PC automatic rollback by the XA Resources, in this model the app effectively has to direct a sort-of manual rollback operation.

    Overall I think there are some good ideas in this research, but I'm doubtful of this or anything like it becoming a widely-implemented standard. Beyond my questions above, the sheer complexity of the model would militate against it becoming widely used. E.g., people think EJBs are complex - what are they going to think about D-Spheres and all of its subtle implications?

         -Mike
  8. \Mike\
    Loss of isolation (the I in ACID). I understand D-Spheres needs to pump messages around, and therefore a global transaction can't really be isolated. As a result, pieces of a global transaction are visible before it's fully committed. This can be problematic for some applications.
    \Mike\

    That's true. D-Spheres may not be appropriate for some applications. However, there are many applications for which the D-Spheres model and its "relaxed isolation" is applicable, if not even desirable. Many business transactions, for example, need to be built using object middleware and messaging middleware in combination (many legacy apps can only be accessed using messaging); a pure synchronous, tightly-coupled JTS transaction simply would not do. Also, the isolation provided by the messaging ops in a D-Sphere is the same as with conventional transactional messaging.

    \Mike\
    Atomicity. There are repeated claims in the D-Spheres papers as to it achieving atomicity, but I think it's overtaxing the word. The overall result of transactions in this model appears to ultimately achieve consistency, but it's stretching things to claim that each global transaction is atomic in any meaningful way.
    \Mike\

    I do not agree. Atomicity is clearly achieved; a D-Sphere may be longer-running than a simple JTS transaction, but that is only a reflection of many business transaction scenarios and needs.

    \Mike\
    Compensating actions. This is related to the above - the middleware at times needs to invoke application-defined compensating actions in the event of certain failures. This really doesn't meet any useful definition of the term "atomic". On top of that nitpick, it puts a heavy burden on application developers. Compared to relying on XA/2PC automatic rollback by the XA Resources, in this model the app effectively has to direct a sort-of manual rollback operation.
    \Mike\

    D-Spheres combine rollback and compensation techniques; either one can be applied where applicable. If rollback (and corresponding imaging and locks) are possible, fine; if not (and that is also common practice), then compensation is the choice. D-Spheres do not promote either one technique, but support both. And while compensation today is mostly hand-coded and a responsibility of the app developer, a middleware like the D-Spheres system reduces the burden of the app developer by supporting "guaranteed compensation".

    \Mike\
    Overall I think there are some good ideas in this research, but I'm doubtful of this or anything like it becoming a widely-implemented standard. Beyond my questions above, the sheer complexity of the model would militate against it becoming widely used. E.g., people think EJBs are complex - what are they going to think about D-Spheres and all of its subtle implications?
    \Mike\

    Thanks very much for your feedback and opinion; this is very valuable and interesting.

    Of course, as one of the authors of the D-Spheres technology (as you rightly point out), I am biased. I tend to think that the problem that D-Spheres is trying to solve is inherently complex, and that D-Spheres only makes solving the problem easier. I have yet to see an alternative solution that has comparable features and that would be even easier to use.

    Stefan
  9. \Tai\
    However, there are many applications for which the D-Spheres model and its "relaxed isolation" is applicable, if not even desirable.

    [...]

    Also, the isolation provided by the messaging ops in a D-Sphere is the same as with conventional transactional messaging.
    \Tai\

    Of course you're right about that. Part of my personal difficulty here is the common use of the word "transaction" for both traditional transactional systems (including 2PC) and systems like the one you describe. Superficially they both appear transactional, but the differences down deep are so significant that it doesn't feel right to use the same terminology to describe them. A transaction involving an RDBMS is largely transparent and its mechanisms automatic from a developer's point of view - in particular, rollbacks "just happen". And this model holds for 2PC, albeit with some wrinkles like in-doubt transactions and heuristic decisions. But when you move to a fully async messaging system as you describe in D-Spheres, you lose that transparency/automation. I see there are attempts to automate, and that the middleware takes on a much bigger role than today's messaging middleware does, but still the dirty details of commit/rollback are much more exposed.

    \Tai\
    I do not agree. Atomicity is clearly achieved; a D-Sphere may be longer-running than a simple JTS transaction, but that is only a reflection of many business transaction scenarios and needs.
    \Tai\

    I think I have a tighter definition of atomic than you do. I see atomicity and isolation going largely hand in hand, so that in a perfect "ACID" world the system smoothly flips states before a transaction and after its commit - even if that transaction is monstrously complex. Of course you can ease some of the ACID restrictions, almost always in the name of performance, but perfect atomicity is the ideal.

    In contrast, I don't see a D-Spheres transaction as truly atomic. Rather, there are many individual local transactions that are loosely coordinated and are allowed to be out of step at various times by design. That doesn't feel atomic to me.

    \Tai\
    D-Spheres combine rollback and compensation techniques; either one can be applied where applicable. If rollback (and corresponding imaging and locks) are possible, fine; if not (and that is also common practice), then compensation is the choice. D-Spheres do not promote either one technique, but support both. And while compensation today is mostly hand-coded and a responsibility of the app developer, a middleware like the D-Spheres system reduces the burden of the app developer by supporting "guaranteed compensation".
    \Tai\

    When dealing with a mix of systems, especially where some are transactional and some are not, compensating transactions/messages are of course the only way to go. But I haven't read enough of your research to tell how much "true" rollback is allowed, or whether compensators are needed even if an underlying resource is transactional.

    \Tai\
    Thanks very much for your feedback and opinion; this is very valuable and interesting.

    Of course, as one of the authors of the D-Spheres technology (as you rightly point out), I am biased. I tend to think that the problem that D-Spheres is trying to solve is inherently complex, and that D-Spheres only makes solving the problem easier. I have yet to see an alternative solution that has comparable features and that would be even easier to use.
    \Tai\

    You've got an excellent point there. I certainly haven't seen anything better, given what you're trying to achieve. But I still have hopes for an asynchronous XA model to arise which can be closer to traditional transactions in terms of atomicity/isolation but not have such severe performance problems. I haven't seen such a beast, but my gut tells me there's a way to do it...

        -Mike
  10. I'm in principle not against (transactional) messaging, but what Larry said about two-phase commit (2PC) and transactional messaging is not correct. When talking about 2PC, he mentions in passing storing logs to persistent store, a RAID system to make it more fail-safe, sending prepare and commit requests and their responses, etc. On the other side, when speaking about messaging, he just speaks about sending a message. I am sure Larry knows that it is not so easy.

    First, to have messaging with the same level of data stability as distributed atomic transactions, you of course have to use the same means: logging every sent message, having persistent message queues, etc. There is a difference in the high-level abstractions, but the low-level tools for ensuring reliability remain the same. In the example Larry gives, if the bank in San Francisco doesn't know the requested account number, a message has to be sent back to New York to cancel the banking transaction. In other words, you have to provide the same "two-phase commit dance" as in true 2PC, but in terms of "messages".
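
    To illustrate this point with the receiving side (a sketch in the same hypothetical setup as the sender example earlier in the thread; the abstract helper methods are stand-ins for real JDBC/JMS work), the "cancel" leg of the dance becomes an explicit compensation message:

    import javax.jms.MapMessage;
    import javax.transaction.UserTransaction;

    // Receiver-side sketch of the "2PC dance in terms of messages": the credit is
    // applied in a local transaction; an unknown account triggers a compensation
    // message back to New York so the sender can undo its debit.
    public abstract class TransferReceiver {
        protected UserTransaction utx; // looked up from JNDI in real code

        public void onTransferMessage(MapMessage msg) throws Exception {
            String account = msg.getString("account");
            double amount = msg.getDouble("amount");
            utx.begin();
            try {
                if (!accountExists(account)) {
                    sendCompensation(account, amount); // enqueue "re-credit NY" message
                } else {
                    credit(account, amount);           // local UPDATE on the SF database
                }
                utx.commit();    // dequeue + DB work (or compensation send) commit together
            } catch (Exception e) {
                utx.rollback();  // message stays on the queue and is redelivered
                throw e;
            }
        }

        protected abstract boolean accountExists(String account) throws Exception;
        protected abstract void credit(String account, double amount) throws Exception;
        protected abstract void sendCompensation(String account, double amount) throws Exception;
    }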

    Second, with messaging as Larry describes it, you don't have the same level of data stability at all. If you commit your local transaction in New York before the message is received in San Francisco, you run the risk of having to compensate your transaction in New York, and compensation sometimes cannot be completed (e.g., there is no longer enough money in the bank account). If you have multiple banking transactions at the same time, it's also not as simple as sending a message from one bank to another; there is, e.g., a risk of a negative account balance when executing multiple withdraw operations against the same account.

    So, please, do not say A without saying B, don't provide us with "simple solutions" by hiding lots of details, don't say transaction monitors need RAID systems and reliable logging while transactional messaging systems do not, etc. Otherwise your interview looks like a paid advertisement for Oracle products.

    Marek Prochazka
    ObjectWeb/JOTM
    INRIA Rhone-Alpes
  11. Tech Talk with Larry Jacobs

    Good interview, great food for thought!

    On the messaging side, the reason that so many messaging implementations are slow is that they use a database. Even MQSeries uses DB2 inside it.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  12. \Purdy\
    On the messaging side, the reason that so many messaging implementations are slow is that they use a database. Even MQSeries uses DB2 inside it.
    \Purdy\

    While this is true, it's not the only slow aspect when you consider JMS implementations. For example, if you're publishing under XA then the JMS implementation must log its prepare and "finished" actions to disk in a disk-forcing manner. By disk-forcing I mean that you're not just doing I/O, but you must force the data all the way out to disk. This is typically done with a rolling-forward transaction log.

    On a plain old disk this forcing can take 10-20 milliseconds. Adding in miscellaneous overhead, each XAResource in the transaction is going to add 80-100 milliseconds to the transaction length at a minimum (large objects on the JMS side, and complex transactions on the RDBMS side can obviously increase this).

    In the end, a naive JMS implementation with a simple transaction log is going to max out around 30 transactions/second on a regular disk. You can perhaps double this with a fast RAID array. Techniques like batching disk-forces across multiple transactions can be used to amortize the disk forcing cost over multiple transactions, and boost you up to the 150-200 TPS range, at the expense of slightly lengthening the average transaction time. Even so, these numbers aren't great, and the JMS server spends most of its time waiting for disk I/Os to complete. The problem is exacerbated on the transaction manager side - it spends most of its time both waiting for disk I/O and for the XAResources to return from their prepare or commit calls.
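
    For what it's worth, here is a bare-bones sketch of that batching technique ("group commit"), assuming a single log file; the class and method names are made up for illustration. Many transaction threads append records, but a single flusher thread writes the accumulated batch and pays for one force() that covers all of them:

    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;

    // Group commit: the 10-20ms disk-force cost is amortized over every
    // transaction whose record landed in the current batch.
    public class GroupCommitLog {
        private final FileChannel channel;
        private List<byte[]> batch = new ArrayList<>();
        private CountDownLatch batchForced = new CountDownLatch(1);

        public GroupCommitLog(String path) throws Exception {
            channel = new RandomAccessFile(path, "rw").getChannel();
            Thread flusher = new Thread(this::flushLoop);
            flusher.setDaemon(true);
            flusher.start();
        }

        // Called by transaction threads; returns only once the record is on disk.
        public void append(byte[] record) throws InterruptedException {
            CountDownLatch myBatch;
            synchronized (this) {
                batch.add(record);
                myBatch = batchForced;
                notifyAll(); // wake the flusher
            }
            myBatch.await(); // released when the flusher has forced this batch
        }

        private void flushLoop() {
            try {
                while (true) {
                    List<byte[]> toWrite;
                    CountDownLatch toRelease;
                    synchronized (this) {
                        while (batch.isEmpty()) wait();
                        toWrite = batch;
                        toRelease = batchForced;
                        batch = new ArrayList<>();           // start collecting the next batch
                        batchForced = new CountDownLatch(1);
                    }
                    for (byte[] rec : toWrite) {
                        ByteBuffer buf = ByteBuffer.wrap(rec);
                        while (buf.hasRemaining()) channel.write(buf);
                    }
                    channel.force(false);  // ONE disk force covers the whole batch
                    toRelease.countDown(); // release every waiting transaction at once
                }
            } catch (Exception e) {
                // a real log would fail the affected transactions here, not swallow this
            }
        }
    }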

    About the only thing that can be done in these cases to speed things up is to get very expensive disk arrays that force to memory cache.

    As more and more applications start using JMS, their JMS calls are invariably going to get tied into RDBMS work - and hence we get stuck in the 2 phase commit cycle. This normally wouldn't be so terrible, but the JTA spec is forcing us into serial, synchronous access to the XA resources by the transaction manager.

    The JMS acking model also slows things down significantly - it's a shame the spec doesn't allow message gap-detection as an alternative model, which is much faster (but harder to implement).
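
    To sketch what gap detection means (purely illustrative; this is not part of JMS): senders stamp each message with a sequence number, receivers stay silent on the happy path, and only a hole in the sequence triggers a retransmission request. Per-message ack traffic disappears:

    import java.util.HashMap;
    import java.util.Map;

    // Toy gap detector: tracks the highest sequence number seen per sender.
    // On the happy path nothing is sent back (unlike per-message acks); only a
    // detected gap produces a negative ack (NAK) asking for retransmission.
    public class GapDetector {
        private final Map<String, Long> highestSeen = new HashMap<>();

        /** Returns the [first, last] range to re-request, or null if contiguous. */
        public long[] onMessage(String senderId, long seq) {
            long prev = highestSeen.getOrDefault(senderId, -1L);
            highestSeen.put(senderId, Math.max(prev, seq));
            if (seq > prev + 1) {
                return new long[] { prev + 1, seq - 1 }; // NAK the missing messages
            }
            return null; // in order (or an old duplicate): no ack traffic at all
        }
    }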

    To give you an example - in the work I've done on a JMS product, I can pump out 600 messages/second per JMS server on a 4-way HP-UX machine. If I eliminate the acking model, that tips over 1,000 messages/second. But involve the same JMS server in XA transactions, and the messaging rate drops to about 160 messages/second. And that rate was only achieved through very aggressive optimization of the transaction logs. Meanwhile, a single app server/tran manager physically cannot drive that rate by itself - because it's spending the bulk of its time waiting on its own tran log forces or on XA resource calls to return. I need 3 app server processes to drive the JMS provider adequately.

    So - while using an RDBMS under the covers will certainly contribute to performance problems in JMS implementations, the JMS & JTA specs force even more problems. If the specs were changed to allow alternate implementations of guaranteed messaging, and to allow for asynchronous dispatch to XAResources, then existing JMS implementations combined with a global transaction manager could easily get 3x-5x gains in their publishing rates.

         -Mike
  13. messaging performance problems

    Mike: "To give you an example - in the work I've done on a JMS product, I can pump out 600 messages/second per JMS server on a 4-way HP-UX machine. If I eliminate the acking model, that tips over 1,000 messages/second. But involve the same JMS server in XA transactions, and the messaging rate drops to about 160 messages/second."

    Those numbers are in the same ballpark that I've witnessed with some of the "leading" integrated J2EE messaging implementations. They are atrocious compared to the 30,000+/sec that you would expect to see from a messaging system.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  14. Overly Negative?

    I'm a definite novice when it comes to transactions and message queues, and maybe that's the reason why I don't understand many of the points here.

    First there is the complaint about message queue performance for transactions because the message queue has to flush to disk.
    1.) It seems like you can get some pretty high numbers, much higher than quoted here. Here is one benchmark (http://www.swiftmq.com/developers/performance/) that shows performance of around 10,000 messages per second for non-durable, non-transactional messages and around 1,000 msgs/s for durable transactional messages. That seems pretty good to me. Assuming a message size of around 1k, you are more than saturating a T1 line at that rate. Anyone working with direct connection private leased lines of greater than T1 from New York to San Fran all to yourself? ... I didn't think so.
    2.) There seems to be little point in being irritated at something that is inherently disk-I/O bound. The bottleneck is the hard drive and there doesn't seem to be a lot you can really do about that. If you need faster, you need a fault-tolerant RAM hard-drive. Surely someone has made one by now (don't they call this idea virtualization or something like that?)
    3.) Aren't we comparing a message queue solution with the alternative here, i.e. talking directly to the DB? Everything performance-wise that one would criticize a message queue for will also be suffered by a DB, only worse. In the example given, where one is doing a transaction across the country, I can't imagine the comparable performance of doing 2PC directly with the DBs being anything but horrendous. I 'think' that compared to talking with the DBs directly a message queue implementation would scream (wouldn't it?)

    Second (and this is really where my lack of knowledge of transactions shows), I think that when you are working with more than one DB, there is always going to be a time (however short) where your data is inconsistent. Message queues don't solve this, but I don't think they make it any worse either. Even with a single DB you will get some level of inconsistency (dirty reads, phantom reads, etc.) unless you are willing to lock everything concerned, in which case you are going to suffer some serious performance issues due to contention (and you can dream about the days when you worried about message queues 'only' handling 100s of transactions/second ;-).
  15. save rate

    Curious about a system in which you can get 1,000 writes a second to the disk. How is that done?
  16. Overly Negative?

    \Thistle\
    1.) It seems like you can get some pretty high numbers, much higher than quoted here. Here is one benchmark (http://www.swiftmq.com/developers/performance/) that shows performance of around 10,000 messages per second for non-durable, non-transactional messages
    \Thistle\

    I'll have to check into that. 10,000 messages per sec for a single server is pretty tough given the JMS acking model. I've seen it in messaging solutions with alternate guaranteed delivery models, but not under JMS. I'll have to check that out...

    \Thistle\
    and around 1,000 msgs/s for durable transactional messages.
    \Thistle\

    I'd love to know how they perform 1,000 disk forces a second - unless you're talking about using something like an EMC array which forces to internal cache memory instead of all the way to disk. For a normal disk system, you're lucky to get 100-200 a second. And of course, for XA you have to double the disk forces. As I said in previous posts, you can play tricks like combining forces for multiple transactions, but still 1,000 seems absurdly high without some really high power disk arrays.

    \Thistle\
    That seems pretty good to me. Assuming a message size of around 1k, you are more than saturating a T1 line at that rate. Anyone working with direct connection private leased lines of greater than T1 from New York to San Fran all to yourself? ... I didn't think so.
    \Thistle\

    Well, that goes back to the original example, which was a bit contrived. Most messaging in real systems happens on a LAN, not a WAN, where 100Mbits is common, and gigabit ethernet is starting to catch on.

    \Thistle\
    2.) There seems to be little point in being irritated at something that is inherently disk-I/O bound. The bottleneck is the hard drive and there doesn't seem to be a lot you can really do about that. If you need faster, you need a fault-tolerant RAM hard-drive. Surely someone has made one by now (don't they call this idea virtualization or something like that?)
    \Thistle\

    You can't get around disk forces, that's for sure. The problem is hitting XAResources serially from the transaction manager. Let's assume a typical force for everybody is 20 milliseconds. For 2 XAResources in a transaction where resources are hit serially, that's 120 milliseconds (the six serial forces described earlier: 6 x 20 ms). For 3 resources it goes up to 160 milliseconds (8 forces), and for 4 resources to 200 milliseconds (10 forces). In contrast, a naive async implementation could easily make those numbers 80 millis, 80 millis, 80 millis - because the resource forces are interleaved with one another.

    On fault tolerant RAM hard-drives - things like EMC arrays will get you this sort of thing. And it will likely cost more than your high-end server does.

    \Thistle\
    Second (and this is really where my lack of knowledge of transactions shows), I think that when you are working with more than one DB, there is always going to be a time (however short) where your data is inconsistent. Message queues don't solve this, but I don't think they make it any worse either. Even with a single DB you will get some level of inconsistency (dirty reads, phantom reads, etc.) unless you are willing to lock everything concerned, in which case you are going to suffer some serious performance issues due to contention (and you can dream about the days when you worried about message queues 'only' handling 100s of transactions/second ;-).
    \Thistle\

    Yes, there are consistency gaps. On the systems I work on, in the absence of faults, two resources can be out of sync for up to 200 milliseconds, typically less than 100 milliseconds, which is acceptably short :-) If a failure happens, a small number of transactions will be in-doubt - typically fewer than 20, and they're easily identified. Most other transactions will be pre-prepare(), and will auto rollback. In a little less than half the in-doubt cases, all resources are in doubt, and there's no consistency problem. For the very small number of remaining cases, resources can be out of sync until the fault is fixed. But - and this is important - the systems will not get further out of sync, and the transaction manager can automatically resolve the inconsistency.

    As you say - you're always balancing performance with consistency. But I keep trying to strike that balance. When you move to a messaging solution as described in the interview/article, you're largely abandoning consistency. Sometimes this is fine - but many times it's not.

         -Mike
  17. Here's the answer to Swift's blazing transactional message rate (http://www.swiftmq.com/products/router/swiftlets/store/tuning/index.html):

    \Swift\
    XA LOG DISK SYNC
    Here, the same applies according to the sync of the transaction log. The setting "true" to attribute "force-sync" stands for a disk sync after every write of a prepare log record. We recommend again the default setting "false".
    \Swift\

    Their entire log system is configurable, and their recommended setting for disk forcing is to turn it off!!!

    They go on to say:

    \Swift\
    A disk sync of the transaction log may be forced over the attribute "force-sync". The default setting is "false". In the case of "true" a disk sync is executed with every interval of the log manager. During one call, the log manager may write several log records into the transaction log (group commit) as it works asynchron. However, this is only the case when having a high parallelism of transactions. One disk sync takes between 20 and 50 milliseconds, concerning to the disk speed. Thus, your performance with disk sync will go down to approx. 20 messages per second. By the way, this value is due to all JMS providers. If a provider will sell higher values to you, he surely does not implement a disk sync on every transaction. So, look closely. We recommend to disable the disk sync of the transaction logs (default).
    \Swift\

    They're right on disk sync timings (for PCs at least - high-end RAID arrays on Unix will get you down to 5-10 millis per sync). But wrong on the max rate if you're always forcing. With one client, yes, you're going to max out around 20-30 TPS. But with multiple clients driving you (e.g. a cluster of app servers acting as transaction managers), you can batch commits over multiple transactions (this is what I do).

    The crowning point of their argument comes down to this:

    \Swift\
    But if the computer fails, e.g. because of a power failure, this data is lost and an inconsistent state may emerge. Thus, if a maximum fail-safety is to be guarantied you should also make sure that the disk sync of the transaction log is enabled. But on the other hand you also need to pay for a disk sync in the form of a considerable performance loss. So, you should eventually consider to invest in an UPS instead of accepting a permanent performance loss as this may be more expensive.
    \Swift\

    I guess Swift considers a UPS to be adequate protection against a computer going down, and assuming you've got a UPS you can just blithely turn off disk forcing and live a happy and productive life :-)

    Unfortunately for myself and others like me, servers go down for all sorts of reasons, and "hardening to disk" means really hardening to disk, not pretending to.

         -Mike
  18. SwiftMQ's blazing performance

    First of all, the 10-20'000 msgs/s is non-persistent! Check the Performance Profile to see the results for persistent messages. Some tests just hit the limits of that machine.

    \Mike\
    Their entire log system is configurable, and their recommended setting for disk forcing is to turn it off!!!
    \Mike\

    That's true, and it's explained in the document. The default fits most requirements. In case of a power fail you can use a UPS to ensure an orderly shutdown of that machine (and the last checkpoint). Of course, if that is not enough, then just enable "force-sync" and you're on the safe side. But then you pay in performance and you are disk-bound. Period.

    Everything is explained.

    Btw, I don't know a single JMS vendor (maybe except MQ) that forces disk syncs by default (the reason is speed competition). I know vendors that don't even have a way to enable it. If you look at some of their file-based persistent stores, they don't have things like a transaction log at all. They don't even know what a write-ahead log is. I really don't know how they recover at all.

    SwiftMQ can also use JDBC to persist messages. So that's an option too. We provide connection pooling, prepared statement caches and all that.

    FYI, I have tested it with Oracle, DB2, and TimesTen. One of them has a throughput of 300 msgs/s, one of 30, one of 1,000 (I won't say which one has which result). But as you see, only one of them uses disk syncs by default (the one with the 30 msgs/s). And these are enterprise DBMSs!

    \Mike\
    They're right on disk sync timings (for PCs at least - high-end RAID arrays on Unix will get you down to 5-10 millis per sync). But wrong on the max rate if you're always forcing. With one client, yes, you're going to max out around 20-30 TPS. But with multiple clients driving you (e.g. a cluster of app servers acting as transaction managers), you can batch commits over multiple transactions (this is what I do).
    \Mike\

    Yes, and this is what we do as well. Our Store of course has group commit. However, you need a high parallelism of txns. It doesn't speed up a single txn, only the overall throughput.

    \Mike\
    I guess Swift considers a UPS to be adequate protection against a computer going down, and assuming you've got a UPS you can just blithely turn off disk forcing and live a happy and productive life :-)
    \Mike\

    We think a UPS is a solution to get a last checkpoint on power fails. For everything else you'd certainly need either to force disk syncs or to use an Oracle cluster and access it via JDBC.

    \Mike\
    Unfortunately for myself and others like me, servers go down for all sorts of reasons, and "hardening to disk" means really hardening to disk, not pretending to.
    \Mike\

    Yeah, might be so in your and the others' case. Then pay for performance. However, it's not as black/white as you state here. There are trillions of use cases.

    And, btw (a last plug), SwiftMQ also has online backup which you can schedule as a job, maybe every 5 minutes if you want. Backup is performed by intercepting a checkpoint. The generated save set is synced with the disk. No option to disable it.

    So, there are many options you have. Pick one.

    For the JTA stuff: I don't think it's forbidden to drive the 2PC async from a TX manager as long as you access each XAResource from a single thread - that is, dispatch all prepares on different threads and wait until all have been completed, then do the same with commit. I can't imagine that the JTA spec forces you to drive the 2PC synchronously. That would be too slow. If you have other info, please post it.

    -- Andreas
  19. SwiftMQ's blazing performance

    \Mueller\
    First of all, the 10-20'000 msgs/s is non-persistent! Check the Performance Profile to see the results for persistent messages. Some tests just hit the limits of that machine.
    \Mueller\

    Understood, and I think that was already pointed out. It's also consistent given that your testbed was using gigabit ethernet. It would, however, be interesting to know what the results are with the far more common 100 Mbit connections.

    \Mueller\
    That's true, and it's explained in the document. The default fits most requirements. In case of a power fail you can use a UPS to ensure an orderly shutdown of that machine (and the last checkpoint). Of course, if that is not enough, then just enable "force-sync" and you're on the safe side. But then you pay in performance and you are disk-bound. Period.
    \Mueller\

    You're not even in the ballpark for failure scenarios here. Some examples (not exhaustive in the least):

       - Disk controller failure
       - Kill the JMS server process with extreme prejudice (kill -9)
       - JMS Server process runs out of memory/core dumps/otherwise dies abnormally
       - Bad memory chip panics the kernel
       - Rarely hit kernel bug panics the kernel
       - Flaky video driver panics the kernel once in a while
       - Water drips on the server and fries it (more common than you might think - some datacenters have badly placed air conditioners!)

    Etc. etc.

    Also - your "you are disk-bound. Period." isn't even close to being true. From your data on the performance site alone and your notes on tuning the logs, it's abundantly clear that you've never tested with a RAID array on a Unix host. You quote 20-50 millisecond times for a disk force. On my HP-UX system with a standard drive, the average is 10-20 millis. Add in a nicely striped RAID array to that same machine and this drops to 5-10 millis.

    Add in intelligent batching/grouping of commits, and you can easily hit transaction rates of 150 tran/sec on ordinary hardware with only 20 client threads driving the transactions.

    Add in some expensive disk array hardware like EMCs or competing stuff from IBM et al, and you bump that 150 tran/sec to around 500 tran/sec.

    And all of this can be done without sacrificing one iota of data consistency/correctness.

    \Mueller\
    Btw, I don't know a single JMS vendor (maybe except MQ) that forces disk syncs by default (the reason is speed competition). I know vendors that don't even have a way to enable it. If you look at some of their file-based persistent stores, they don't have things like a transaction log at all. They don't even know what a write-ahead log is. I really don't know how they recover at all.
    \Mueller\

    This is all true. However, for people who care about true data reliability and consistency, it's important to know a) that you can turn disk forcing on, and b) how the product performs. It would also be nice to have numbers on a variety of hardware platforms - I'm rather surprised you're basing all of the performance tuning of a supposedly enterprise product on a couple of stock PCs.

    \Mueller\
    Yes, and this is what we do as well. Our Store of course has group commit. However, you need a high parallelism of txns. It doesn't speed up a single txn, only the overall throughput.
    \Mueller\

    Understood - and this is what most people care about. A single thread driving transactions isn't a very interesting test of performance or scalability. Drive with 20 or 30 users, and you'll find that you get sufficient parallelism in the transaction flow to boost your overall transaction rate to the 150+ TPS realm.

    \Mueller\
    Yeah, might be so in your and the others' case. Then pay for performance. However, it's not as black/white as you state here. There are trillions of use cases.

    And, btw (a last plug), SwiftMQ also has online backup which you can schedule as a job, maybe every 5 minutes if you want. Backup is performed by intercepting a checkpoint. The generated save set is synced with the disk. No option to disable it.
    \Mueller\

    I hear what you're saying, but I think Swift has given up rather early in trying to balance performance vs. correctness. You are basically saying "good perf or correctness - pick one" - and this is _not_ the only option.

    In short, your company is doing itself a disservice by repeatedly recommending that customers turn off disk forcing, and by implying a UPS is enough to save their data. If you try your ideas out on an expanded range of platforms, and if you try a bit harder to optimize your persistent messaging & XA transaction logs, you might be surprised how fast you can go and still keep your data hardened. And you'll impress a much larger clientele who want both worlds. As it is, by saying you can only go fast by losing your data safety net, you're losing a whole class of intelligent customers who need data safety, but need it at better than 20 TPS, and are willing to spend $$$ on hardware and clever software that can achieve it.

    \Mueller\
    For the JTA stuff: I don't think it's forbidden to drive the 2PC async from a TX manager as long as you access each XAResource from a single thread - that is, dispatch all prepares on different threads and wait until all have been completed, then do the same with commit. I can't imagine that the JTA spec forces you to drive the 2PC synchronously. That would be too slow. If you have other info, please post it.
    \Mueller\

    Do you know any application servers that drive 2PC asynchronously right now? I've only seen synchronous 2PC to date. The problem here is that with the current spec, an app server would need an awful lot of threads to handle multiple simultaneous transactions asynchronously. If someone's done it without creating a gazillion threads, I'd love to take a look at it.

         -Mike
  20. All this fun talk about the maximum throughput for persistent messages reminds me that I once wrote a symmetrically replicated message broker. The basic idea is, if the disk is not fast enough for you, then use a lot of memory and do symmetric replication using multicasting on a few machines - usually three.

    I did this in Java on two or three PCs once, and hit about 5000 1Kb-messages per second. If you use big machines and gigabit ethernet you can probably hit much larger throughputs.

    Of course, you still have to save the long-term data to a database.
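
    To sketch the idea (purely illustrative: the group address, port, and ack protocol are invented, and a real broker would need sequence numbers, retransmission, and replica failure detection on top):

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;
    import java.util.Arrays;

    // Toy symmetric replication: instead of forcing each message to disk, the
    // broker multicasts it to a small group of replicas (typically three) and
    // would consider it "stable" once enough replicas confirm receipt.
    public class ReplicatingBroker {
        private static final String GROUP = "230.0.0.1"; // hypothetical multicast group
        private static final int PORT = 4446;            // hypothetical port

        private final MulticastSocket socket;
        private final InetAddress group;

        public ReplicatingBroker() throws Exception {
            socket = new MulticastSocket(PORT);
            group = InetAddress.getByName(GROUP);
            socket.joinGroup(group);
        }

        // Publish: one multicast send replicates the message into the memory of
        // every replica - no disk force on the critical path.
        public void replicate(byte[] message) throws Exception {
            socket.send(new DatagramPacket(message, message.length, group, PORT));
            // real code: wait for acks from a quorum of replicas before confirming
        }

        // Each replica runs a loop like this, keeping its copy in memory.
        public byte[] receive() throws Exception {
            byte[] buf = new byte[1500];
            DatagramPacket packet = new DatagramPacket(buf, buf.length);
            socket.receive(packet);
            return Arrays.copyOf(packet.getData(), packet.getLength());
        }
    }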

    If anyone wants details, email me at guglielmo.lichtner@gs.com
    \Lichtner\
    All this fun talk about the maximum throughput for persistent messages reminds me that I once wrote a symmetrically replicated message broker. The basic idea is, if the disk is not fast enough for you, then use a lot of memory and do symmetric replication using multicasting on a few machines - usually three.
    \Lichtner\

    What? 3 resources? XA!

    (a joke)

    -- Andreas
  22. SwiftMQ's blazing performance

    \Mike\
    You're not even in the ballpark for failure scenarios here. Some examples (not exhaustive in the least):

       - Disk controller failure
       - Kill the JMS server process with extreme prejudice (kill -9)
       - JMS Server process runs out of memory/core dumps/otherwise dies abnormally
       - Bad memory chip panics the kernel
       - Rarely hit kernel bug panics the kernel
       - Flaky video driver panics the kernel once in a while
       - Water drips on the server and fries it (more common than you might think - some datacenters have badly placed air conditioners!)

    Etc. etc.
    \Mike\

    force-sync="true" or use the JDBC Store Swiftlet. Period. ;-)

    \Mike\
    Also - your "you are disk-bound. Period." isn't even close to being true. From your data on the performance site alone and your notes on tuning the logs, it's abundantly clear that you've never tested with a RAID array on a Unix host. You quote 20-50 millisecond times for a disk force. On my HP-UX system with a standard drive, the average is 10-20 millis. Add in a nicely striped RAID array to that same machine and this drops to 5-10 millis.
    \Mike\

    The 20-50 millis is from a test with standard SCSI drives at 15,000 RPM and an Adaptec Ultra-SCSI controller. It's just a value. If you have a faster disk/RAID, man, be happy! However, it's still disk-bound. So my statement above is just true.

    Don't know what your point is, except that you like to present your knowledge. Yes, you have deep knowledge in this area. Is that enough? Is that what you wanted?

    \Mike\
    Add in intelligent batching/grouping of commits, and you can easily hit transaction rates of 150 tran/sec on ordinary hardware with only 20 client threads driving the transactions.

    Add in some expensive disk array hardware like EMCs or competing stuff from IBM et al, and you bump that 150 tran/sec to around 500 tran/sec.

    And all of this can be done without sacrificing one iota of data consistency/correctness.
    \Mike\

    Did you miss that I told you we DO provide group commit? So what is your point?

    \Mike\
    However, for people who care about true data reliability and consistency, it's important to know a) that you can turn disk forcing on
    \Mike\

    http://www.swiftmq.com/products/router/swiftlets/store/tuning/index.html

    \Mike\
    and b) how the product performs.
    \Mike\

    It's also mentioned in the above document. I admit that we should add some stuff here. However, look below at how JMS evaluation works.

    \Mike\
    It would also be nice to have numbers on a variety of hardware platforms
    \Mike\

    Sure. Go tell me the URL where I can see such a detailed performance profile from another JMS vendor. I would be happy with a single platform. You'll not find anything, not even from the big ones like IBM or Tibco or BEA. So what are you talking about?

    \Mike\
    I'm rather surprised you're basing all of the performance tuning of a supposedly enterprise product on a couple of stock PCs.
    \Mike\

    That's not true, and the only point of that statement is to spread FUD about SwiftMQ. We have tested SwiftMQ, including performance, on many platforms, including xx-CPU big irons from Sun and IBM. We are glad to have some big customers providing us with these testing capabilities.

    \Mike\
    Understood - and this is what most people care about. A single thread driving transactions isn't a very interesting test of performance or scalability.
    \Mike\

    Usually, people expect performance from particular clients, e.g. a publisher should have a minimum publishing rate or a consumer should have a minimum consuming rate. So single-tx performance matters. Scalability is another issue, which can eventually be solved by using multiple routers.

    \Mike\
    Drive with 20 or 30 users, and you'll find that you get sufficient parallelism in the transaction flow to boost your overall transaction rate to the 150+ TPS realm.
    \Mike\

    See above. If there is a minimum performance requirement per tx, the overall performance doesn't matter. In that case one would certainly live with a non-synced disk or bump up the hardware.

    \Mike\
    I hear what you're saying, but I think Swift has given up rather early in trying to balance performance vs. correctness. You are basically saying "good perf or correctness - pick one" - and this is _not_ the only option.
    \Mike\

    Sure, it is - if you are not the owner of expensive EMC stuff.

    \Mike\
    In short, your company is doing itself a disservice by repeatedly recommending that customers turn off disk forcing, and by implying a UPS is enough to save their data.
    \Mike\

    The point with the UPS refers to a power fail scenario. Here is what we state:

    "Thus, if a maximum fail-safety is to be guarantied you should also make sure that the disk sync of the transaction log is enabled."

    \Mike\
    If you try your ideas out on an expanded range of platforms, and if you try a bit harder to optimize your persistent messaging & XA transaction logs, you might be surprised how fast you can go and still keep your data hardened.
    \Mike\

    I don't think you can even imagine how far our Store is optimized. Tsss, tsss...

    And if you use expensive RAIDs to increase your TPS rate, then, of course, you get that rate with our Store! So what? What are you talking about? I don't understand your point. All you propose is to bump up hardware.

    \Mike\
    And you'll impress a much larger clientele who want both worlds. As it is, by saying you can only go fast by losing your data safety net, you're losing a whole class of intelligent customers who need data safety, but need it at better than 20 TPS, and are willing to spend $$$ on hardware and clever software that can achieve it.
    \Mike\

    Oh, thank you very much for your tips! Are you applying for a job? No, thanks.

    I guess you'd be impressed if you knew how many really large clients are already using SwiftMQ in really, really mission-critical deployments. ;-)

    Nobody buys a JMS server for mission-critical apps just from the web, paying by credit card. We spend most of our time on presales support and helping such companies during their evaluation process. They usually have a loooong list of requirements, incl. disk syncs and all that. They have dedicated teams, dedicated "JMS Evaluation" projects for that. Thanks to the many options SwiftMQ provides, they can choose what they want. They also do their very own performance and scalability tests in their very own environments. They don't trust vendor benchmarks. We win most of the deals in this kind of evaluation scenario. We've sold > 3000 licenses in 2002 alone.

    Don't think we're stupid or at the same level as OpenJMS...

    \Mike\
    Do you know any application servers that drive 2PC asynchronously right now? I've only seen synchronous 2PC to date. The problem here is that with the current spec, an app server would need an awful lot of threads to handle multiple simultaneous transactions asynchronously. If someone's done it without creating a gazillion threads, I'd love to take a look at it.
    \Mike\

    You ranted about almost everything here. Provide some clear spec details and we'll talk further. Everything else is speculation.

    -- Andreas
  23. SwiftMQ's blazing performance

    \Mueller\
    The 20-50 millis is from a test with standard SCSI drives at 15,000 RPM and an Adaptec Ultra-SCSI controller. It's just a value. If you have a faster disk/RAID, man, be happy! However, it's still disk-bound. So my statement above is just true.
    \Mueller\

    I think you missed my point - you're presenting benchmarks, and saying that most customers should turn off disk forcing to get any kind of decent performance. As I stated, that's not true - because you're basing your advice on one very limited platform.

    If you want to dispense advice to your customers, I would think you'd base it on a wide range of performance tests to make sure that your underlying assumptions are true.

    On disk-bound - how do you know? What's the CPU load on the machine? What's the disk load? If you use EMC, are you really, really sure that you're still disk-bound?

    That's the point of measuring performance and resource utilization - to know for a fact what the system is doing. Not inferring it, not guessing, not making leaps from minimal data, but definitively measuring it.

    \Mueller\
    Don't know what your point is, except that you like to present your knowledge. Yes, you have deep knowledge in this area. Is that enough? Is that what you wanted?
    \Mueller\

    No, not at all. You keep making false statements about achievable performance, and Swift is advising clients to turn off disk forcing (and be magically protected from data loss by a UPS). I'm trying to point out the flaws in your argument and your advice to customers.

    \Mueller\
    Did you miss that I told you we DO provide group commit? So what is your point?
    \Mueller\

    From the Swift site on tuning logs, it says:

    \Swift\
    During one call, the log manager may write several log records into the transaction log (group commit) as it works asynchron. However, this is only the case when having a high parallelism of transactions. One disk sync takes between 20 and 50 milliseconds, concerning to the disk speed. Thus, your performance with disk sync will go down to approx. 20 messages per second. By the way, this value is due to all JMS providers. If a provider will sell higher values to you, he surely does not implement a disk sync on every transaction. So, look closely. We recommend to disable the disk sync of the transaction logs (default).
    \Swift\

    The above is a bit conflicted - it says you can't get more than 20 messages a second (and goes on to say that anyone who says different is a liar or not disk forcing!). At the same time, it mentions group commits. So my question is - with disk forcing on and group commits, assuming you do have multiple threads hitting the server with transactions, how high does the server throughput go? Your baseline is 20 for one thread; how does it do with 10/15/20/30 users?

    \Mike\
    I hear what you're saying, but I think Swift has given up rather early in trying to balance performance vs. correctness. You are basically saying "good perf or correctness - pick one" - and this is _not_ the only option.
    \Mike\

    \Mueller\
    Sure, it is - if you are not the owner of expensive EMC stuff.
    \Mueller\

    No, it's not, as I have repeatedly tried to point out. Swift says the disk forces incur the following: "One disk sync takes between 20 and 50 milliseconds, concerning to the disk speed". This is patently, absolutely false. Even without EMC, with just a cheap RAID array you can easily do a lot better than 20-50 millis per force. Even without an array, on my HP-UX setup without RAID I get 10-20 millis per force. With an array, it's 3-10 millis per force.

    In short, you're wrong.

    \Mueller\
    Oh, thank you very much for your tips! Are you applying for a job? No, thanks.

    I guess you'd be impressed if you knew how many really large clients are already using SwiftMQ in really, really mission-critical deployments. ;-)
    \Mueller\

    I'm gainfully employed, thank you.

    \Mueller\
    Nobody buys a JMS server for mission-critical apps just from the web, paying by credit card. We spend most of our time on presales support and helping such companies during their evaluation process. They usually have a loooong list of requirements, incl. disk syncs and all that. They have dedicated teams, dedicated "JMS Evaluation" projects for that. Thanks to the many options SwiftMQ provides, they can choose what they want. They also do their very own performance and scalability tests in their very own environments. They don't trust vendor benchmarks. We win most of the deals in this kind of evaluation scenario. We've sold > 3000 licenses in 2002 alone.
    \Mueller\

    Look - docs all over your site say "we recommend you turn off disk forcing". All that I'm saying is that this is bad advice, and the performance implications of doing disk forcing are not as dire as you or your company say. And in trying to convince people that all disk forces everywhere take 20-50 millis, in trying to say a UPS will solve your data consistency problems, you're losing credibility. That's all.

    \Mike\
    Do you know any application servers that drive 2PC asynchronously right now? I've only seen synchronous 2PC to date. The problem here is that with the current spec, an app server would need an awful lot of threads to handle multiple simultaneous transactions asynchronously. If someone's done it without creating a gazillion threads, I'd love to take a look at it.
    \Mike\

    \Mueller\
    You ranted about almost everything here. Provide some clear spec details and we talk further. Everything else is speculation.
    \Mueller\

    Huh? I asked a very clear question, and I'll repeat it in short form "Do you know any application servers that drive 2PC asynchronously right now?". It's a simple yes/no question, no ranting required :-) .
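
    (To make the question concrete: today's TMs loop over their XAResources one at a time. "Asynchronous" would mean farming the prepares out and collecting the votes, along these lines - my own sketch of the idea, with decision logging and error handling elided. Note the cost I mentioned: done naively, it's one thread per resource per in-flight transaction.)

    import javax.transaction.xa.XAException;
    import javax.transaction.xa.XAResource;
    import javax.transaction.xa.Xid;

    // Sketch: fire the prepare phase of 2PC against all resources in
    // parallel instead of serially, then gather the votes.
    class ParallelPrepare {
        static boolean prepareAll(final XAResource[] rs, final Xid[] xids)
                throws InterruptedException {
            final boolean[] votes = new boolean[rs.length];
            Thread[] workers = new Thread[rs.length];
            for (int i = 0; i < rs.length; i++) {
                final int n = i;
                workers[i] = new Thread(new Runnable() {
                    public void run() {
                        try {
                            int vote = rs[n].prepare(xids[n]);
                            votes[n] = (vote == XAResource.XA_OK
                                     || vote == XAResource.XA_RDONLY);
                        } catch (XAException e) {
                            votes[n] = false;   // any failure is a "no" vote
                        }
                    }
                });
                workers[i].start();
            }
            for (int i = 0; i < workers.length; i++) {
                workers[i].join();              // gather all the votes
            }
            for (int i = 0; i < votes.length; i++) {
                if (!votes[i]) return false;    // at least one rollback vote
            }
            return true;                        // unanimous: log and commit
        }
    }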

        -Mike
  24. SwiftMQ's blazing performance[ Go to top ]

    I think you missed my point - you're presenting benchmarks, and saying that for most customers they should turn off disk forcing to get any kind of decent performance. As I stated, that's not true - because you're basing your advice on one very limited platform.

    That's not true. As I stated already, we have done our benchmark on several platforms where we had the full machine only for SwiftMQ.

    And for the performance: You can do what you want, your single tx will be disk bound (bound to the speed of your disk/array/whatever)! You can only increase the overall throughput - scalability - with group commits - and that's what we provide.

    In the end you have to decide: low TPS for a single client, or maximum performance and in the worst case a loss of transactions. It's a matter of cost.

    That's the point of measuring performance and resource utilization - to know for a fact what the system is doing. Not inferring it, not guessing, not making leaps from minimal data, but definitively measuring it.

    Boy, I told you that several times: WE HAVE DONE BENCHMARKS ON SEVERAL PLATFORMS!!!

    You keep making false statements about achievable performance, and Swift is advising clients to turn off disk forcing (and be magically protected from data loss by a UPS).

    Wrong! We don't say to turn off disk sync and be completely safe with a UPS. We say to turn on disk sync if maximum reliability is required. However, we also state that the price of the performance loss is HIGH and a UPS for the case of a power failure is LOW.

    The above is a bit conflicted - it says you can't get more than 20 messages a second (and goes on to say that anyone who says different is a liar or not disk forcing!).

    Sure. But it doesn't mean that if one says it reaches 25 msgs/s that he is a liar. It means if one says it reaches 700 msgs/s AND states it syncs the disk, he is a liar, because that's physically not possible.

    At the same time, it mentions group commits. So my question is - with disk forcing on and group commits, assuming you do have multiple threads hitting the server with transactions, how high up does the server throughput go? Your baseline is 20 for one thread, how does it do with 10/15/20/30 users?

    It depends on how many log records the Log Manager can process within one interval.

    No, it's not, as I have repeatedly tried to point out. Swift says the disk forces incur the following: "One disk sync takes between 20 and 50 milliseconds, depending on the disk speed". This is patently, absolutely false. Even without EMC, with just a cheap RAID array you can easily do a lot better than 20-50 millis per force. Even without an array, on my HP-UX setup without RAID I get 10-20 millis per force. With an array, it's 3-10 millis per force.

    That's fine. But you are bound to the speed of your disk, aren't you? That's the point. The 20-50 millis above is only an example.

    Look - docs all over your site say "we recommend you turn off disk forcing". All that I'm saying is that this is bad advice, and the performance implications of doing disk forcing are not as dire as you or your company say. And in trying to convince people that all disk forces everywhere take 20-50 millis, in trying to say a UPS will solve your data consistency problems, you're losing credibility. That's all.

    Again: 20-50 millis is an example (there's an "approx." in front). I'm also not trying to say that a UPS solves all consistency problems. The docs (the doc of a Swiftlet, btw!) only state that a UPS is a way to get over power failures. Not more, not less.

    Huh? I asked a very clear question, and I'll repeat it in short form "Do you know any application servers that drive 2PC asynchronously right now?". It's a simple yes/no question, no ranting required :-)

    It wasn't a follow-up on that question but on your initial rant on JTA - that the spec requires 2PC to be done synchronously. Some posts later you state that you don't know that. Same with the d-spheres, btw. You rant about them but later state that you haven't read their work.

    Same with SwiftMQ. You rant and rant but don't read what I write. Almost every JMS vendor disables disk force; even enterprise DBMS vendors do that. The reason is to get an initial high throughput during the first evaluation. That's why all of us, incl. DBMS vendors, disable disk forces. We are in a speed competition. If we published a benchmark with disk forces, we would end up in comparisons like this one, where Fiorano compared SonicMQ with disk forces against FioranoMQ without forces (they even have no documented switch to enable it). That's lying!

    But we don't lie, as you like to demonstrate here. We don't do our business that way. Our docs state the same (if not more) as e.g. database vendors state on this issue. Some of those docs only describe a flag and a period.

    Meditate over it. Currently you seem to me like a fault-finder.

    -- Andreas
  25. SwiftMQ's blazing performance[ Go to top ]

    \Mueller\
    Almost every JMS vendor disables disk force; even enterprise DBMS vendors do that. The reason is to get an initial high throughput during the first evaluation. That's why all of us, incl. DBMS vendors, disable disk forces. We are in a speed competition. If we published a benchmark with disk forces, we would end up in comparisons like this one, where Fiorano compared SonicMQ with disk forces against FioranoMQ without forces (they even have no documented switch to enable it). That's lying!
    \Mueller\



    Both WebLogic JMS and IBM MQSeries JMS sync to disk by default, and MQSeries is arguably the largest JMS vendor out there. WebLogic JMS does provide a setting to disable synchronous writes, but specifically documents the dangers of doing so. Also, I cannot think of an enterprise-level transaction monitor that allows disabling synchronous writes to its transaction log.

    It still seems to be a bit misleading to disable synchronous writes by default, even if there is "speed competition". Let the competition "lie"; savvy customers can sort out the difference. Disabling synchronous writes violates even the non-transactional guarantees required by the JMS 1.0.2 spec, as a crash could result in duplicates of persistent messages that have already been acknowledged - and the spec prohibits duplicates of persistent messages. A UPS is not sufficient, as it does not help in the case of an operating system crash, where in-memory buffered writes can get wiped out.
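
    (To make the duplicate scenario concrete - a sketch, with an illustrative queue name:)

    import javax.jms.Message;
    import javax.jms.Queue;
    import javax.jms.QueueConnection;
    import javax.jms.QueueConnectionFactory;
    import javax.jms.QueueReceiver;
    import javax.jms.QueueSession;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    // Sketch of the duplicate-on-crash scenario with buffered (non-forced)
    // broker writes.
    public class DuplicateScenario {
        public static void main(String[] args) throws Exception {
            InitialContext ctx = new InitialContext();
            QueueConnectionFactory qcf =
                (QueueConnectionFactory) ctx.lookup("ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("queue/payments");
            QueueConnection conn = qcf.createQueueConnection();
            QueueSession session =
                conn.createQueueSession(false, Session.CLIENT_ACKNOWLEDGE);
            QueueReceiver receiver = session.createReceiver(queue);
            conn.start();

            Message m = receiver.receive();
            // ... apply the payment ...
            m.acknowledge();
            // The broker now owes us "this persistent message is gone forever".
            // If it only buffered that fact in an OS write cache and the machine
            // crashes here, recovery replays the message: a duplicate the spec
            // forbids for PERSISTENT delivery. A UPS doesn't help - an OS panic
            // loses the cache with the power still on.
            conn.close();
        }
    }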

    These opinions are my own, and do not reflect the opinions of my employer.

    Tom, BEA
  26. SwiftMQ's blazing performance[ Go to top ]

    /Tom/
    These opinions are my own, and do not reflect the opinions of my employer.
    /Tom/

    Correction. Make that "and *may* not reflect the opinions of my employer".

    Tom, BEA
  27. SwiftMQ's blazing performance[ Go to top ]

    Both WebLogic JMS and IBM MQSeries JMS sync to disk by default, and MQSeries is arguably the largest JMS vendor out there.

    I've stated "except MQ" somewhere above. That BEA syncs by default is news to me.

    We will rethink our policy for our next release.

    -- Andreas
  28. SwiftMQ's blazing performance[ Go to top ]

    Both WebLogic JMS and IBM MQSeries JMS sync to disk by default, and MQSeries is arguably the largest JMS vendor out there. WebLogic JMS does provide a setting to disable synchronous writes, but specifically documents the dangers of doing so.

    You're right about your 7.0 release - there you sync by default. However, I couldn't find it in e.g. your 6.0 release. There's nothing about disk sync; your JMS File Store doesn't have this attribute.

    So, is this new in 7.0?

    To make nit-pickers happy, the Performance Profile now contains a note that disk syncs for persistent messages were disabled for the test, and our Store docs were updated as well.

    -- Andreas
  29. SwiftMQ's blazing performance[ Go to top ]

    /Paraphrase Andreas/

    [in 7.0] you sync by default. However, I couldn't find it in 6.0.
    So, is this new in 7.0?

    /End Paraphrase Andreas/

    All versions of BEA WebLogic JMS, as well as BEA's other two queuing products, use synchronous writes by default. We see no need to document this, as synchronous writes are the only way to achieve safe transactional behavior - something our customers assume they are getting when they buy our products. In JMS 6.0SP? and in 6.1, we added a command-line -D switch to optionally disable sync writes. In 7.0 and 8.1, you can configure via the console. The -D switch is mentioned in the JMS performance guide white-paper on dev2dev.bea.com, not sure where else. Wherever we document the switch, we also document the QOS trade-offs of using it.

    Tom, BEA
  30. SwiftMQ's blazing performance[ Go to top ]

    \Barnes\
    All versions of BEA WebLogic JMS, as well as BEA's other two queuing products, use synchronous writes by default. We see no need to document this, as synchronous writes are the only way to achieve safe transactional behavior - something our customers assume they are getting when they buy our products.
    \Barnes\

    Well, given BEA's expertise in this area, particularly as it's embodied in Tuxedo, this is hardly surprising.

    I'll also point out that WebSphere 4.x and up does synchronous writes to its XA logs.

        -Mike
  31. SwiftMQ's blazing performance[ Go to top ]

    \Mueller\
    To make nit-pickers happy, the Performance Profile now contains a note that disk syncs for persistent messages were disabled for the test, and our Store docs were updated as well.
    \Mueller\

    Nit-pickers? I hardly think that pointing out that your performance tests got high numbers because they ran in an environment where data integrity was not guaranteed is a nit-pick. It's a fundamental aspect of both persistent messages _and_ XA transactions. It's not a minor point, it's not a trivial detail, it's right at the heart of why persistent messages and hardening points in the XA protocol exist!

    This reminds me of a bug I had about a month ago. I went to the project manager and said "I have good news and bad news. The good news is that we're pumping out 5x more messages in the latest preliminary release; the bad news is subscribers never get the messages". His response: "Well, if you're not really publishing I'm surprised you're not getting 100x more messages - you're slipping on the job, Mike!".

        -Mike
  32. SwiftMQ's blazing performance[ Go to top ]

    \Mike\
    I think you missed my point - you're presenting benchmarks, and saying that for most customers they should turn off disk forcing to get any kind of decent performance. As I stated, that's not true - because you're basing your advice on one very limited platform.
    \Mike\

    \Mueller\
    That's not true. As I stated already, we have done our benchmark on several platforms where we had the full machine only for SwiftMQ.
    \Mueller\

    I can only go on the information I've seen. The only benchmarks I've seen for SwiftMQ were on one platform. Your tuning pages on your site seem to echo the numbers from that performance test. If you have benchmark numbers for different platforms, then you should publish them. I'd think this would only help your company.

    \Mueller\
    And for the performance: You can do what you want, your single tx will be disk bound (bound to the speed of your disk/array/whatever)! You can only increase the overall throughput - scalability - with group commits - and that's what we provide.

    In the end you have to decide: low TPS for a single client, or maximum performance and in the worst case a loss of transactions. It's a matter of cost.
    \Mueller\

    <sigh> I think you mean low TPS on low-end hardware for a single client (in fact I'll be nitpicky and say "for a single client thread").

    \Mueller\
    Wrong! We don't say to turn off disk sync and be completely safe with a UPS. We say to turn on disk sync if maximum reliability is required. However, we also state that the price of the performance loss is HIGH and a UPS for the case of a power failure is LOW.
    \Mueller\

    Well, one way or another I seem to have made my point. Your tuning site has changed since I last read it a couple of days ago. Here's what it originally said:

    \Swift\
    But if the computer fails, e.g. because of a power failure, this data is lost and an inconsistent state may emerge. Thus, if maximum fail-safety is to be guaranteed, you should also make sure that the disk sync of the transaction log is enabled. But on the other hand, you also have to pay for a disk sync in the form of a considerable performance loss. So, you should eventually consider investing in a UPS instead of accepting a permanent performance loss, as this may be more expensive.
    \Swift\

    What you say now is:

    \Swift\
    The default mode of "false" is less reliable and you may lose data if the computer crashes. However, the throughput is many times higher than with disk sync. Forcing disk syncs is reliable, but the performance is then bound to the speed of the disk, which varies.

    Which mode you use is up to you and usually a matter of costs.
    \Swift\

    Please note the sentence from the original: "So, you should eventually consider investing in a UPS instead of accepting a permanent performance loss, as this may be more expensive".

    In any case, your site docs around disk forcing and performance seem to have changed drastically in the past few days, and IMHO for the better. So I guess I'm a ranting fault-finder, and your company has changed its pages to match my ranting fault-finder ways.

    \Mueller\
    That's fine. But you are bound to the speed of your disk, aren't you? That's the point. The 20-50 millis above is only an example.
    \Mueller\

    That's not what your site docs said. Here's exactly what your docs said on Friday:

    \Swift\
    One disk sync takes between 20 and 50 milliseconds, depending on the disk speed. Thus, your performance with disk sync will go down to approx. 20 messages per second. By the way, this value applies to all JMS providers. If a provider sells higher values to you, he surely does not implement a disk sync on every transaction. So, look closely. We recommend disabling the disk sync of the transaction logs
    \Swift\

    Those docs clearly imply that disk syncs take 20-50 millis. They concretely say you'll go down to "approx 20 messages per second". Based on what I read, it sure sounded to me like someone selling 50 TPS would be out of your "approx" range.

    \Mueller\
    It wasn't a follow-up on that question but on your initial rant on JTA - that the spec requires 2PC to be done synchronously. Some posts later you state that you don't know that. Same with the d-spheres, btw. You rant about them but later state that you haven't read their work.
    \Mueller\

    Funny, I thought I quoted the spec differences between X/Open and JTA, and said I was unsure about one point.

    You sir, on the other hand, haven't quoted a damn thing from any spec.

    On d-spheres, I have read the work, just not completely and in depth. There are on the order of 10 detailed papers on D-Spheres on the IBM sites, and it's a complex subject. I did read enough to ask some questions, and those were answered quite ably.

    Please do not confuse ranting with serious questions on a subject I happen to know a lot about. In the end, I think you've made far more mis-statements in this thread than I have, _by far_. Undoubtedly you feel I'm attacking your product, and by extension yourself, and you feel justified in labelling me a ranter and fault-finder. But consider: what if the faults I'm finding are accurate?

    And by the way - I'm still waiting for you to name that transaction manager which handles XAResources asynchronously.

    \Mueller\
    Same with SwiftMQ. You rant and rant but don't read what I write. Almost every JMS vendor disables disk force; even enterprise DBMS vendors do that. The reason is to get an initial high throughput during the first evaluation. That's why all of us, incl. DBMS vendors, disable disk forces. We are in a speed competition. If we published a benchmark with disk forces, we would end up in comparisons like this one, where Fiorano compared SonicMQ with disk forces against FioranoMQ without forces (they even have no documented switch to enable it). That's lying!
    \Mueller\

    FioranoMQ is not the entire RDBMS & JMS space. And frankly, others have already pointed out that your position on disk forces in the industry is bull. Thankfully most vendors take data integrity a bit more seriously than you do.

    \Mueller\
    But we don't lie, as you like to demonstrate here. We don't do our business that way. Our docs state the same (if not more) as e.g. database vendors state on this issue. Some of those docs only describe a flag and a period.

    Meditate over it. Currently you seem to me like a fault-finder.
    \Mueller\

    I wouldn't say that you lie, or your company does - but I would say that you've made a number of statements that haven't been very accurate. And perhaps if you're a bit more open to other people's views and experience in the future, you might back-check your data, and might learn something. The alternative of labelling people (and in the process doing far more ranting than anyone else here) isn't doing yourself or your company any favors. Think of the effect your words here have had on people who are considering SwiftMQ but are on the fence.

         -Mike
  33. SwiftMQ's blazing performance[ Go to top ]

    In any case, your site docs around disk forcing and performance seem to have changed drastically in the past few days, and IMHO for the better. So I guess I'm a ranting fault-finder, and your company has changed its pages to match my ranting fault-finder ways.

    No. I only realized that it is better to remove everything other than the facts, to avoid ranting fault-finders like you spreading FUD over SwiftMQ.

    And perhaps if you're a bit more open to other people's views and experience in the future, you might back-check your data, and might learn something. The alternative of labelling people (and in the process doing far more ranting than anyone else here) isn't doing yourself or your company any favors. Think of the effect your words here have had on people who are considering SwiftMQ but are on the fence.

    I know that you are a 'difficult' person ;-) since I read your title "SwiftMQ's blazing performance == loss of consistency". I'm glad not to have you as my team mate. The persistent operations are only a part of the benchmark, and what we do is what I still consider usual. Someone mentioned 10'000 messages per second, but that's not interesting for you. You prefer to hack on disk syncs and state that we use stock PCs and so on. I'm a friend of clear speech, so for me you are just an a****le.

    So long.

    -- Andreas
  34. SwiftMQ's blazing performance[ Go to top ]

    \Mueller\
    No. I only realized that it is better to remove everything other than the facts, to avoid ranting fault-finders like you spreading FUD over SwiftMQ.
    \Mueller\

    FUD? Ranting? Your docs were wrong in some places, and highly misleading in other places. Now they're more accurate - and you seem unhappy that they're more accurate!!

    \Mueller\
    I know that you are a 'difficult' person ;-) since I read your title "SwiftMQ's blazing performance == loss of consistency". I'm glad not to have you as my team mate. The persistent operations are only a part of the benchmark, and what we do is what I still consider usual. Someone mentioned 10'000 messages per second, but that's not interesting for you. You prefer to hack on disk syncs and state that we use stock PCs and so on. I'm a friend of clear speech, so for me you are just an a****le.
    \Mueller\

    Very nice.

    Insults and name calling aside, your biggest issue here seems to be that you made inaccurate statements here, and your site also made inaccurate statements, and you can't seem to live with being wrong once in a while.

    We're talking about enterprise software here. Getting a piece of it right, or even the vast majority of it right, isn't enough. It's also very complex, and getting everything right will never happen. But we should strive to come as close as we can, to correct problems as we find them, and to effect change in specs if they're holding us back in some way. Talking about async XA, I'm trying to improve the spec and see if we can make transaction managers faster. In the process, I got some of it right, and some of it wrong. But I think everyone has a clearer insight into the issues.

    On SwiftMQ - yeah, you guys have gotten a lot right. By all indications it seems to be a great product. But that doesn't mean it's perfect. Likewise, documentation is never perfect, but ever-evolving. You appear to despise me for pointing out errors in your docs, and you appear to despise me more now that your documentation is better as a result of this exchange!

        -Mike
  35. disk sync[ Go to top ]

    FYI, an excerpt from the prop file of Sun ONE Message Queue 3.0.1 (Enterprise Software?!) just to show you what's usual:

    # Controls whether persistence operations synchronize in-memory state with
    # the physical storage device. When this is enabled, data loss due to system
    # crash will be eliminated at the cost of performance.
    #
    # Default: false
    imq.persist.file.sync.enabled=false

    Have a nice day.

    -- Andreas
  36. disk sync[ Go to top ]

    Congrats, you found one! I wonder how long it took to find that one?

    Here's what BEA says:

    \Bea\
    A user-defined policy that determines how the JMS file store writes data to disk. This policy also affects the JMS file store's performance, scalability, and reliability. The valid policy options are:

    Disabled - Transactions are complete as soon as their writes are cached in memory, instead of waiting for the writes to successfully reach the disk. This policy is the fastest, but the least reliable (that is, the least transactionally safe). It can be more than 100 times faster than the other policies, but power outages or operating system failures can cause lost and/or duplicate messages.

    Cache-Flush - Transactions cannot complete until all of their writes have been flushed down to disk. This policy is reliable and scales well as the number of simultaneous users increases.

    Direct-Write - File store writes are written directly to disk. This policy is supported on Solaris and Windows. If this policy is set on an unsupported platform, the file store automatically uses the Cache-Flush policy instead.
    \Bea\

    (The default is cache-flush, which is reliable).

    Sonic says:

    \Sonic\
    Warning: Evaluation Mode is selected on evaluation installations. This setting relaxes the timing of hard disk write operations to improve performance for persistent and transacted messages, but can allow messages to be lost in the case of system failure.

    This mode allows SonicMQ to emulate the default behavior of certain competing messaging products that exhibit this potentially dangerous behavior, and enables SonicMQ to be directly compared with these products.

    Sonic Software Corporation does not recommend that you select Evaluation Mode in production or for product comparisons where true guaranteed delivery is required. The use of nonpersistent messages is recommended only when message loss is acceptable during a system failure.
    \Sonic\

    IBM doesn't seem to even consider the possibility of not forcing to disk. I couldn't find any option to do so. However, they do have a lot of docs on improving disk performance. This is from (http://www7b.boulder.ibm.com/wsdd/library/techarticles/0111_dunn/0111_dunn.html):

    \IBM\
    I/O - Disks will go so fast and then no faster. Although it is possible to reduce I/O times by using solid state disks or disks with caches, the fastest I/O is no I/O! I/O happens with the WebSphere MQ Integrator product because of:
    - Queue data being written to or read from disk.
    - Queue manager logging to provide recovery.
    - Data being written to or read from a database, which causes the database to perform I/O.
    \IBM\

    I won't show all of the information the above three vendors document on data consistency and integrity (I don't think I could, physically), but the two that have an option to turn off disk forcing go to great lengths to persuade you not to do it. IBM makes this far more pointed - they say that if I/O performance is slowing you down, then you need to rethink what you make persistent (not make it kinda-sorta reliable for a limited number of failure scenarios).

    Would you like me to go on and point out yet more examples, or do you get the point?

    As for RDBMS vendors, I'm not even going to bother.

    So are you still going to cling to the belief that turning off disk forces is "usual"? Are you going to keep selling people on unbelievably fast messaging rates at the cost of losing or corrupting data?

        -Mike
  37. disk sync[ Go to top ]

    Congrats, you found one! I wonder how long it took to find that one?


    It was very quick. I only had to check the prop file from a recent download.

    Since it was Sun ONE Message Queue, a well known JMS provider, I'm wondering whether Tom from BEA will tell 'em that this is non-compliant JMS 1.0.2 behavior. My guess is he will not... ;-)

    Disabling disk sync by default is usual. Some don't, of course, however, the majority does. Even Sonic does it. In earlier releases they had it enabled, and I guess due to their bad experience from competitive benchmarks and customer questions "Why is that so slow?" they decided to disable it. Btw, that's the same reason why we have disabled it by default.

    I guess I told you that already in one of my recent postings.

    Have I told you that you can enable disk sync with our product?

    We are still talking about default behavior, nit-picker.

    Enabling/disabling disk sync is a matter of cost. How much does it cost me to either live with lost data or more-than-once delivered data in the case of a system failure? Am I able to reconstruct lost data, e.g. by resending messages, and how much does that cost? What is the probability of such a system failure in my environment? What is the cost of a permanent performance loss (= the difference between a sync'ed and a non-sync'ed disk)? Am I able to reach my throughput goals per client with extended hardware and disk sync? What is the cost for that? What is my timeline? Can I live for some time without disk sync? How much does it cost to wait another month before going into production? Project budget?

    Many questions, nit-picker. It seems you have a very limited view. There are other projects out there, others than yours.

    Anyway, since I know that the only point of your nit-picking effort of spending your weekend googling is to put a pot on my head, I'm tired of your cock-crows and tired of talking with you further on the emotional level of a teenager, as you've stated very clearly. Grow up a bit, do some more projects, gain some more experience (this makes you calm and pragmatic) and - maybe - we can talk again.

    EOD.

    -- Andreas
  38. Spille[ Go to top ]

    Mike, are you the Usenet troll listed at the link below? Discussing in endless threads who said what? ;-)

    blahblah

    -- Andreas
  39. Mueller[ Go to top ]

    Andreas,

    the only person I can see on this thread who has degenerated into a troll is you.

    I find the information Mike provided very useful and helpful. The fact that you started name calling creates very little respect for you or your product.

    HTH

    /T
  40. disk sync[ Go to top ]

    \Mueller\
    Since it was Sun ONE Message Queue, a well known JMS provider, I'm wondering whether Tom from BEA will tell 'em that this is non-compliant JMS 1.0.2 behavior. My guess is he will not... ;-)
    \Mueller\

    If a JMS provider tells a publisher that a persistent message has been successfully published, and the JMS provider then crashes and comes back up, and loses the message in the process, then you're out of spec. Period.

    Vendors may provide unsafe optimizations, but those unsafe optimizations are specifically out of spec.
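
    (In code terms, the guarantee attaches the moment send() returns for a PERSISTENT message - a sketch, with illustrative JNDI names:)

    import javax.jms.DeliveryMode;
    import javax.jms.Queue;
    import javax.jms.QueueConnection;
    import javax.jms.QueueConnectionFactory;
    import javax.jms.QueueSender;
    import javax.jms.QueueSession;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    // The moment send() returns for a PERSISTENT message, the spec says the
    // message must survive a provider crash - which is exactly what a
    // non-forced write cache cannot promise.
    public class PersistentSend {
        public static void main(String[] args) throws Exception {
            InitialContext ctx = new InitialContext();
            QueueConnectionFactory qcf =
                (QueueConnectionFactory) ctx.lookup("ConnectionFactory");
            Queue queue = (Queue) ctx.lookup("queue/transfers");
            QueueConnection conn = qcf.createQueueConnection();
            QueueSession session =
                conn.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            QueueSender sender = session.createSender(queue);
            TextMessage msg = session.createTextMessage("debit 100, account 42");
            // priority 4 (the default), no expiry; the delivery mode is what matters
            sender.send(msg, DeliveryMode.PERSISTENT, 4, 0);
            // If the broker crashes *after* this line, the message must still be
            // delivered (without duplication) when it comes back up.
            conn.close();
        }
    }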

    \Mueller\
    Disabling disk sync by default is usual. Some don't, of course, however, the majority does.
    \Mueller\

    No, the majority does not. So far you've found one, which is hardly a majority.

    \Mueller\
    Even Sonic does it. In earlier releases they had it enabled, and I guess due to their bad experience from competitive benchmarks and customer questions "Why is that so slow?" they decided to disable it. Btw, that's the same reason why we have disabled it by default.
    \Mueller\

    I guess you didn't read my message through. I'll repeat Sonic's advice on this:

    \Sonic\
    Warning: Evaluation Mode is selected on evaluation installations. This setting relaxes the timing of hard disk write operations to improve performance for persistent and transacted messages, but can allow messages to be lost in the case of system failure.

    This mode allows SonicMQ to emulate the default behavior of certain competing messaging products that exhibit this potentially dangerous behavior, and enables SonicMQ to be directly compared with these products.

    Sonic Software Corporation does not recommend that you select Evaluation Mode in production or for product comparisons where true guaranteed delivery is required. The use of nonpersistent messages is recommended only when message loss is acceptable during a system failure.
    \Sonic\

    Please note their words of caution - don't do this in production, potentially dangerous, explicitly for evaluation only.

    \Mueller\
    Have I told you that you can enable disk sync with our product?

    We are still talking about default behavior, nit-picker.

    \Mueller\

    Well, we were actually discussing default behavior and vendor recommendations. Your company's original recommendation was to use the default of no sync. You haven't changed your default, but at least the tuning pages on Swift's sites are now more balanced in presenting the issue. To this I say bravo.

    \Mueller\
    Enabling/disabling disk sync is a matter of cost. How much does it cost me to either live with lost data or more-than-once delivered data in the case of a system failure? Am I able to reconstruct lost data, e.g. by resending messages, and how much does that cost? What is the probability of such a system failure in my environment? What is the cost of a permanent performance loss (= the difference between a sync'ed and a non-sync'ed disk)? Am I able to reach my throughput goals per client with extended hardware and disk sync? What is the cost for that? What is my timeline? Can I live for some time without disk sync? How much does it cost to wait another month before going into production? Project budget?
    \Mueller\

    The companies I work for and with are aware of these sorts of issues, as am I. And the choices for them are very simple and completely within the spec. If they're not concerned with message loss - they can easily reconstruct information, they have alternative sources, the messages aren't that important - they use non-persistent messaging. If they are concerned with message loss, then they use persistent messages.

    What they don't do is accept half-assed persistence.

    The same is true of XA vs. not-XA.

    I don't think you really understand what "persistent messaging" means. It means the developers are saying "this data must survive crashes, shut downs, etc". It's not advice or a hint - it's an absolute.

    To do something like turning off disk syncs is stupid because it gives people a false sense of security. They think they have reliability - until the kernel panics one day, or someone trips over the UPS plug.

    People shouldn't be deciding "should I sync or not sync?". They should be deciding "what do I really need persistent? Is the messaging layer the right place for persistence? Can I afford more hardware, and will that give me reliability and performance?".

    Your attitude sounds very reasonable sir, and many people do the sorts of analysis you describe, and say "I can live with sync off". And then one day the server crashes, an operator accidentally does a kill -9 or whatever, and they lose data. And lose money. And people lose jobs.

    You keep saying "it's a matter of cost", but that's not always true. Some companies need persistent data to be persistent - period. It's a hard requirement. Other companies think they need persistence, but it's more of a convenience than anything else. For them, they should go whole-hog and not use persistent messaging. This way they get great performance and aren't kidding themselves about their reliability.

    \Mueller\
    Many questions, nit-picker. It seems you have a very limited view. There are other projects out there, others than yours.

    Anyway, since I know that the only point of your nit-picking effort of spending your weekend googling is to put a pot on my head, I'm tired of your cock-crows and tired of talking with you further on the emotional level of a teenager, as you've stated very clearly. Grow up a bit, do some more projects, gain some more experience (this makes you calm and pragmatic) and - maybe - we can talk again.
    \Mueller\

    You may perhaps consider turning this advice to yourself. With every posting you're losing more and more credibility. And if you keep this up I wouldn't be surprised if you start losing a potential customer or two who reads this forum.

        -Mike
  41. disk sync[ Go to top ]

    No, the majority does not. So far you've found one, which is hardly a majority.

    Not true. My guess was that only MQ forces syncs, because they are simply out of competition with 75% market share. News to me is that BEA has done it as well ever since they've had file stores at all (since 6.0).

    I know of no other JMS vendor that has disk syncs enabled. As you might imagine, I know a lot of them. I pointed you to Sun ONE only to show you one enterprise vendor.

    You are welcome to show me some others with disk sync enabled by default.

    I don't think you really understand what "persistent messaging" means. It means the developers are saying "this data must survive crashes, shut downs, etc". It's not advice or a hint - it's an absolute.

    If you have such absolute requirements, then use disk sync. But then do it everywhere. After every file write, do an fd.sync(). Don't save your Java sources without a sync. Beat your IDE vendor if he doesn't do that. Check everything you use - cvs, perforce, ant tasks, etc. - whether it syncs after writes. Because persisting data is not advice or a hint! It's an ABSOLUTE! Live what you preach, buddy!

    Or do you relax your absolutism sometimes? More like a relative absolutism? Hmmm?

    Look, same with persistent messaging. Under normal operations, without stupid operators and without crashes, persistent data is still present after a reboot. SwiftMQ even writes a last checkpoint during the shutdown hook, so you don't even have a recovery phase. Most projects can live with that. It's what I said: a matter of cost. My very personal opinion is to avoid disk syncs if you can, because you have a magnitude higher throughput then. If you can handle crash situations, this is the cheapest way to go. Use disk syncs only if you must. It's just the reverse of what you state. And I'm not a beginner. I've done this job for > 20 years and have seen countless projects, from the early 80's to now.

    What do you think we have write caches for? It's not the devil you preach here. It's a good technique to have both persistence and performance.

    I don't know how high your throughput requirements per client currently are. Say 70 TPS, and you reach 100 TPS now. Say you have 1000 concurrent clients and you reach this rate. You have bought whatever hardware possible; cost doesn't matter. Thus, you have 70% load on your system and 30% left. You are happy now, because you fulfill your boss's requirements and your own absolute view as well.

    You are a hero, and you have fun bashing JMS vendors in public forums over their default settings. You know it all, you've done it. You are the MAN!

    But in half a year, when the economy starts booming again, your boss will tell you that they have hired 200 new traders and the system now needs 150 TPS per client for 2000 clients. Mike, handle it. You have 2 weeks to go into production; otherwise we lose 100M per day. At least!

    Ooops!

    But wait, what was with the disk sync...? You tell it to your boss. Your boss states "Mike, why do you ask me that? Disable it and take care of a recovery scenario, just for the case, you know?".

    And you are still a hero, Mike.

    I've worked in the financial industry myself, where it was just normal to release buggy software or to fix bugs directly in production, because otherwise they would have lost tons of money. Again, a matter of cost.

    You may perhaps consider turning this advice to yourself. With every posting you're losing more and more credibility. And if you keep this up I wouldn't be surprised if you start losing a potential customer or two who reads this forum.

    I suggest to everyone who is concerned about me: do not buy SwiftMQ. It's that easy. I always state my opinion, good or bad. Some like it, some don't. Those who don't might contact a sales person from our competitors. They will always tell you what you like to hear, because you pay for it.

    -- Andreas
  42. disk sync[ Go to top ]

    \Mueller\
    Not true. My guess was that only MQ forces syncs, because they are simply out of competition with 75% market share. News to me is that BEA has done it as well ever since they've had file stores at all (since 6.0).

    I know of no other JMS vendor that has disk syncs enabled. As you might imagine, I know a lot of them. I pointed you to Sun ONE only to show you one enterprise vendor.

    You are welcome to show me some others with disk sync enabled by default.
    \Mueller\

    BEA & IBM MQ have disk sync on by default. SonicMQ only allows disabling disk syncs in EVALUATION MODE (their caps, not mine), and urges users to turn off evaluation mode for production use. What's the market share of those 3 JMS providers right there?

    So far, you have two examples - your own product, and Sun One.

    As to why IBM MQ forces to disk - it has nothing to do with market share. It's because they value correctness, and a big part of the sell of MQ is its reliability in high-load environments.

    When I get a chance I'll look around at the remaining JMS providers (can't do it now - I do have other things to do :-)

    \Mueller\
    If you have such absolute requirements, then use disk sync. But then do it everywhere. After every file write, do an fd.sync(). Don't save your Java sources without a sync. Beat your IDE vendor if he doesn't do that. Check everything you use - cvs, perforce, ant tasks, etc. - whether it syncs after writes. Because persisting data is not advice or a hint! It's an ABSOLUTE! Live what you preach, buddy!

    Or do you relax your absolutism sometimes? More like a relative absolutism? Hmmm?

    \Mueller\

    That's the most irrelevant argument I've seen posted anywhere in a long time.

    Are you seriously comparing persistent messaging in an enterprise messaging system to IDEs and the like?

    All right, fine - let's go with your example. There are obviously situations where convenience is important, and absolute persistence isn't - like an individual developer's environment. There, their system may crash and they lose some of their work. This is accepted for the convenience.

    The equivalent to this in JMS is non-persistent messaging. As long as everything works, the messages will get there. But if something crashes, you may lose stuff.

    A source control check-in, on the other hand, should be giving you higher-level guarantees. When you check something into source control, you're not doing it just for versioning, but also because it's your secure, safe repository. I'm willing to take the hit for check-ins being slower than my IDE writing files out, because I need guarantees.

    This is like persistent messaging - and you should only apply it when you really, really need persistent messages.

    \Mueller\
    Look, same with persistent messaging. Under normal operations, without stupid operators and without crashes, persistent data is still present after a reboot. SwiftMQ even writes a last checkpoint during the shutdown hook, so you don't even have a recovery phase. Most projects can live with that. It's what I said: a matter of cost. My very personal opinion is to avoid disk syncs if you can, because you have a magnitude higher throughput then. If you can handle crash situations, this is the cheapest way to go. Use disk syncs only if you must. It's just the reverse of what you state.

    \Mueller\

    Ah, I see - assuming nothing goes wrong, you don't need to sync.

    Yes, I've seen this attitude many times. Until the operator does screw up, or the system does crash, and the cost of the cleanup is far higher than anyone realized.

    \Mueller\
    And I'm not a beginner. I've done this job for > 20 years and have seen countless projects, from the early 80's to now.
    \Mueller\

    Congratulations, you have me beat. I only have 14 years experience. You win, you win!

    \Mueller\
    What do you think we have write caches for? It's not the devil you preach here. It's a good technique to have both persistence and performance.

    I don't know how high your throughput requirements per client currently are. Say 70 TPS, and you reach 100 TPS now. Say you have 1000 concurrent clients and you reach this rate. You have bought whatever hardware possible; cost doesn't matter. Thus, you have 70% load on your system and 30% left. You are happy now, because you fulfill your boss's requirements and your own absolute view as well.
    \Mueller\

    It's a good technique until your first crash.

    As for your example - no, my bosses wouldn't be happy. Where I work no system is ever supposed to be loaded above 50% in normal operations. This allows for load spikes on the remaining machines when a machine fails, and for general capacity planning.

    \Mueller\
    You are a hero, and you have fun bashing JMS vendors in public forums over their default settings. You know it all, you've done it. You are the MAN!

    But in half a year, when the economy starts booming again, your boss will tell you that they have hired 200 new traders and the system now needs 150 TPS per client for 2000 clients. Mike, handle it. You have 2 weeks to go into production; otherwise we lose 100M per day. At least!
    \Mueller\

    First, let me say you're flying seriously off the handle here. Why don't you take some deep breaths before posting again.

    As to your example - very interesting. You know traders that put out 150 transactions per second? Man are those sum-a-bitches fast!!!

    Here's something you may want to apply your 20 years of experience to: no GUI client is going to drive any more than a few TPS. Automated systems and feeds obviously can drive far more - but those systems can be split into multiple threads, each of which appears as a "client" as far as the messaging provider is concerned. This is why scalability is more important than individual client rates - because scalability is always where the volume comes in. If any given client starts over-driving your messaging system on a single-client basis, you can split that load to make it appear as multiple clients.

    This obviously introduces complexities in the implementation, and you have to watch out for ordering issues and related things. But this sort of approach not only works, but it holds onto the reliability guarantees.

    I've seen some companies say "one client needs to hit 150 TPS", or whatever. They rarely mean that a single thread in a single process needs to hit 150 TPS. They mean "we need 150 TPS from A to B" - and you can usually do that with multiple threads.
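
    (Concretely, "splitting the client" usually just means one session and sender per thread, since a JMS session is a single-threaded context. A sketch - queue name and thread count are arbitrary:)

    import javax.jms.DeliveryMode;
    import javax.jms.Queue;
    import javax.jms.QueueConnection;
    import javax.jms.QueueConnectionFactory;
    import javax.jms.QueueSender;
    import javax.jms.QueueSession;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    // "One client at 150 TPS" usually really means "150 TPS from A to B".
    // One transacted session+sender per thread makes each thread look like
    // another client to the broker, whose group commit can then batch their
    // forces. Caveat from above: ordering is no longer global across threads.
    public class ParallelFeed {
        public static void main(String[] args) throws Exception {
            int threads = 8;                                  // tune to taste
            InitialContext ctx = new InitialContext();
            QueueConnectionFactory qcf =
                (QueueConnectionFactory) ctx.lookup("ConnectionFactory");
            final Queue queue = (Queue) ctx.lookup("queue/trades");
            final QueueConnection conn = qcf.createQueueConnection();
            for (int i = 0; i < threads; i++) {
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            QueueSession session = conn.createQueueSession(
                                    true, Session.AUTO_ACKNOWLEDGE); // transacted;
                                    // ack mode is ignored for transacted sessions
                            QueueSender sender = session.createSender(queue);
                            for (int j = 0; j < 1000; j++) {
                                sender.send(session.createTextMessage("trade"),
                                        DeliveryMode.PERSISTENT, 4, 0);
                                session.commit();  // each thread is its own "client"
                            }
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }).start();
            }
        }
    }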

    \Mueller\
    Ooops!

    But wait, what was with the disk sync...? You tell it to your boss. Your boss states "Mike, why do you ask me that? Disable it and take care of a recovery scenario, just for the case, you know?".

    And you are still a hero, Mike.
    \Mueller\

    "Disable it and take care of a recovery scenario". Yes, you're right - it's that easy. Just "take care of a recovery scenario". It's so simple I don't know why I didn't think of it.

    \Mueller\
    I've worked in the financial industry myself, where it was just normal to release buggy software or to fix bugs directly in production, because otherwise they would have lost tons of money. Again, a matter of cost.
    \Mueller\

    OK. I myself have spent many years in the financial industry. And I can tell you that you are full of crap here. For the companies I've worked for and with, tampering in any way with production is an absolute last resort. It's taboo - often, developers cannot access production in any way except by phone, talking to an operator.

    Operators have big, fat run books (you've worked in finance, you know what a run book is, right?) that cover all sorts of contingencies, procedures to run when various events happen - including a server going down. Unplanned releases into production typically mean a Senior Managing Director is going to be breathing down everyone's ass the next day.

    Further - you have heard of "the books and records of the firm", right? Every major trading system has one database which is designated the official database of record - and this sucker typically runs on massively redundant hardware, almost always Tandem systems or Mainframes, occasionally really huge Unix systems. I can assure you, sir, that this gear is very expensive, and the reason for it is not only speed, but reliability. And you can bet your ass that they have disk syncing turned on.

    Now - most financial applications that I've seen don't bother with persistent messaging at all. There's no point - the message is typically hardened elsewhere, because they want the messaging system fast and there's a database of record running on multi-million dollar hardware.

    And this is where your own ranting shows its flaws. When you're talking about applications that care about data consistency and failure recovery, just about everyone hardens to a big, giant database somewhere and doesn't bother with persistent messaging. They don't need it in the messaging layer because it already exists elsewhere. In the very rare cases with unbelievably high volume feeds, they even forgo the database and use a custom store (which looks a lot like a write-forward transaction log). But still - they don't rely on the messaging layer to provide persistence, they do it elsewhere.

    Incidentally - having worked in finance, you've quoted figures for running too slow. Well sir - do you know the numbers for taking too long to recover from a failure? Do you know how much it costs a firm to fail a trade? How much it costs to keep the fed wire open past 3pm? You know, people who worry about such things don't want to hear about doubt. They don't want to hear "maybe the JMS layer lost data, maybe it didn't" if the messaging server went down. What they want to hear is "OK, we failed over to the backup machine. We've got the app logs. These 30 trades failed and need to be rebooked, everything else is good".

    What you're peddling, sir, is silver bullets based on shaky premises. There is no silver bullet - turning off disk syncing isn't a magic pill that you can take. It doesn't miraculously up your messaging rate at zero cost. There's always a cost - TANSTAAFL.

    \Mueller\
    I suggest to everyone who is concerned about me: do not buy SwiftMQ. It's that easy. I always state my opinion, good or bad. Some like it, some don't. Those who don't might contact a sales person from our competitors. They will always tell you what you like to hear, because you pay for it.
    \Mueller\

    The issues dealt with here haven't been differences of opinion. They've been ignorance of the facts.

        -Mike
  43. disk sync[ Go to top ]

    That's the most irrelevant argument I've seen posted anywhere in a long time. Are you seriously comparing persistent messaging in an enterprise messaging system to IDEs and the like?

    I? No. I just wanted to show you the stupidity of your absolutism. But it seems that you don't even understand that.

    Every OS comes with write caches enabled. SwiftMQ does just the same by default. It uses write caches. That's all, buddy! If you don't want it, then switch it off.

    I've seen many zealots over the years - Structured Programming, SA, SADT, CASE et al, OO, etc. - but you are definitely the first disk sync zealot I've seen so far. I had to spend my working time in endless meetings, fighting with zealots like you. The results were less than nothing. It's counterproductive and frustrating. Take 2 guys who know what they do, and they will use disk sync if they need it, and turn it off if they don't.

    There's one thing I learned over time: I just have to wait until the zealots have their burn-out. Then they just disappear. Although I already gave you an EOD, the reason why I answer you further is simply that this thread is still on TSS' front page. When it has scrolled off, you can FUD whatever you want - it doesn't interest me a dime.

    Do you understand that?

    Here are some numbers I've collected with SwiftMQ 4.5.1 on my dev box (15K RPM Ultra-SCSI, highly fragmented, dual 2 GHz Xeon). Disk sync enabled, 1 KB message size, PTP, pair-wise with each pair on a dedicated queue; the TPS rate is that of the router; everything runs on this machine. It's just FYI, so no need to tell me that I didn't use an E10K here.

    2 clients, 80 TPS
    10 clients, 440 TPS
    40 clients, 800 TPS
    100 clients, 1600 TPS

    Btw, to get the TPS rate of our performance profile, you just have to double it (except 1:n), because what you see there is the consuming rate, but there is always the same number of producers running in parallel. So 10'000 msgs/s is 20'000 TPS. SwiftMQ can also run apps intra-VM. It then uses intra-VM connections but runs the whole SMQP protocol over them. On my machine I get > 15'000 msgs/s or 30'000 TPS, respectively. This is with default settings (1 session thread).

    The chance you've definitely missed due to your FUD is to collect some hints on "how to get this rate with the JMS acking model" to bump up your OpenJMS derivative.

    -- Andreas
  44. disk sync[ Go to top ]

    10 clients, 440 TPS

    A mistake:

    20 clients, 440 TPS

    -- Andreas
  45. disk sync[ Go to top ]

    \Spille\
    That's the most irrelevant argument I've seen posted anywhere in a long time. Are you seriously comparing persistent messaging in an enterprise messaging system to IDEs and the like?
    \Spille\

    \Mueller\
    I? No. I just wanted to show you the stupidity of your absolutism. But it seems that you don't even understand that.
    \Mueller\

    I was pointing out the fallacy of comparing enterprise messaging with an IDE saving a file to disk.

    Enterprise solutions tend to have a note of "absolutism", as you label it, in them because the cost of failures in business terms is high.

    \Mueller\
    Every OS comes with write caches enabled. SwiftMQ does just the same by default. It uses write caches. That's all, buddy! If you don't want it, then switch it off.
    \Mueller\

    You seem to have forgotten that we're talking about JMS. Not IDEs. Not operating systems. JMS says that a persistent message must survive failures. _Any_ kind of failure (other than the physical disk itself). Not just an orderly shutdown, but many other types of failures.

    If you don't care about messages surviving failures, then don't use persistent messages. It's really that simple.

    \Mueller\
    I've seen many zealots over the years - Structured Programming, SA, SADT, CASE et al, OO, etc. - but you are definitely the first disk sync zealot I've seen so far. I had to spend my working time in endless meetings, fighting with zealots like you. The results were less than nothing. It's counterproductive and frustrating. Take 2 guys who know what they do, and they will use disk sync if they need it, and turn it off if they don't.
    \Mueller\

    I can see why you get into a lot of fighting. You seem to abhor resilient systems in favor of pure speed. And the industry is on my side. Look at what IBM says, what Sonic says, what BEA says - they all say that your argument is crap, that not syncing is dangerous and you shouldn't do it. They all say that if you don't care about failover recovery, don't use persistent messaging.

    It's zealots like you, sir, who get pats on the back for the blazing speed of the system so long as everything works. And it's people like you who get cursed out when a real failure happens and the firm finds their "persistent" data really wasn't, because someone like you said it was OK to turn disk syncing off.

    \Mueller\
    There's one thing I learned over time: I just have to wait until the zealots have their burn-out. Then they just disappear. Although I already gave you an EOD, the reason why I answer you further is simply that this thread is still on TSS' front page. When it has scrolled off, you can FUD whatever you want - it doesn't interest me a dime.
    \Mueller\

    I'm not sure who or what this is directed at. All I can say is that you haven't quoted from any specs. When I corrected your statements about how financial firms are really run and their systems operated, you didn't say anything - I wonder if you do know what a run book is, or anything else about production financial systems. I've asked you about app servers that do async XA - and you've never responded to that either. You keep saying that your viewpoint is "usual" and customary in the industry, and the _only_ example you've produced is SunONE - with BEA, IBM, and Sonic clearly contradicting you.

    You just seem to have a very difficult time admitting any mistake whatsoever.

    \Mueller\
    Here are some numbers I've collected with SwiftMQ 4.5.1 on my dev box (15K RPM Ultra-SCSI, highly fragmented, dual 2 GHz Xeon). Disk sync enabled, 1 KB message size, PTP, pair-wise with each pair on a dedicated queue; the TPS rate is that of the router; everything runs on this machine. It's just FYI, so no need to tell me that I didn't use an E10K here.
    2 clients, 80 TPS
    20 clients, 440 TPS [corrected by your followup]
    40 clients, 800 TPS
    100 clients, 1600 TPS
    \Mueller\

    First, those are very good numbers. Your group commit algorithm appears to be as close to ideal as you can get.

    Now, based on the above, I'm assuming each transaction in the above is in fact hardened to disk, and that you're group committing multiple transactions at once.

    In the 2 client case, assuming ideal batching you're doing 40 disk forces a second, so each tran is taking an average of 25 millis per sync. (The identity behind all of these is simply: average transaction latency = concurrent clients / aggregate TPS.) For 20 clients, you've got 45 millis per transaction. For 40 clients, 50 millis per transaction. For 100 clients, 62 millis per transaction.

    Another way to look at it - at 2 clients each client was averaging 40 TPS. At 20 clients, each client was averaging 22 TPS. At 40 clients, 20 TPS apiece. At 100 clients, each was getting 16 TPS.

    So, first to Mr. Thistle's point - in how many applications do you need a single application thread to drive more than 20 or so TPS? Or conversely, to get better than a 50 millisecond average response time?

    Secondly - for the small number of apps that need to go higher, do they really need the messages to be persistent? Most of the examples I can think of - maybe a legacy single-threaded stock ticker feed - have high volumes, but there's no need to persist the messages.

    The point being - it seems to me, based on your numbers, that keeping disk syncing on is perfectly fine for a broad range of applications. For the very small number of applications where you need to have persistent messages and very high TPS (or, conversely, very low response times), then I'd argue that improving your disk performance is a much better idea than turning syncing off.

    If you add XA into the mix, the performance numbers of course all go down, because you need to sync twice per transaction, plus your work is being done serially with other XA Resources. In my experience, you'll typically only get a third of the TPS rate.

    Here also you seem to advocate not using disk sync. But consider - why in the world are people using XA at all? Usually, they're using XA because they very, very highly value the consistency of their data across all resources. Here the problem is knottier, though. XA is mandating many disk syncs, and this will slow everything down significantly. But again - I argue that people doing XA may want high performance, but they really want very high consistency as well. And turning off disk syncing isn't going to get them high consistency.

    In this case, going to better disks (maybe even all the way up to EMC arrays or the equivalent) is going to get them the performance they need and the consistency they need. And if App Server vendors start parallelizing their XA 2PC work so that they're hitting all resources in parallel, this will add another benefit in terms of performance of 25%-33% (or even better if you have a lot of XA resources).
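
    To make "hitting all resources in parallel" concrete, here's a rough sketch of a parallel prepare phase. Purely illustrative - this is not how any particular app server does it, and it ignores XA thread-association rules, heuristics, timeouts, and decision logging:

    import javax.transaction.xa.XAResource;
    import javax.transaction.xa.Xid;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.*;

    // Issue the prepare() calls of a 2PC concurrently, so total prepare
    // time approaches the slowest resource instead of the sum of all.
    public class ParallelPrepare {
        private final ExecutorService pool = Executors.newCachedThreadPool();

        public boolean prepareAll(List<XAResource> resources, Xid xid)
                throws InterruptedException {
            List<Future<Integer>> votes = new ArrayList<>();
            for (XAResource res : resources) {
                votes.add(pool.submit(() -> res.prepare(xid)));
            }
            for (Future<Integer> vote : votes) {
                try {
                    vote.get();                // XA_OK or XA_RDONLY
                } catch (ExecutionException e) {
                    return false;              // any 'no' vote: roll back everywhere
                }
            }
            return true; // all voted yes: log the commit decision, then commit
        }
    }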

    The point of all of this? People who are serious about Enterprise systems don't just give up. They don't throw away their performance requirements to favor consistency, and they don't throw away consistency requirements in favor of performance. They certainly don't turn off disk syncs. Instead, they find creative ways to get around the bottlenecks but keep their consistency high.

    Now tell me sir - is this FUD? Am I ranting here?

    \Mueller\
    Btw, to get the TPS rate of our performance profile, you just have to double it (except 1:n), because what you see there is the consuming rate, but there is always the same number of producers in parallel. So 10'000 msgs/s is 20'000 TPS. SwiftMQ can also run apps intra-VM. It then uses intra-VM connections but runs the whole SMQP protocol over them. On my machine I get > 15'000 msgs/s or 30'000 TPS, respectively. This is with default settings (1 session thread).

    The chance you've definitely missed due to your FUD is to collect some hints "how to get this rate with the JMS acking model" to bump up your OpenJMS derivative.
    \Mueller\

    An ACKing model is only going to show problems when you're dealing with true network links at 100MBit/s or lower rates. It can be particularly acute on 10MBit networks, or WANs. I acknowledge that it's not an issue if you're using gigabit ethernet, or running everything on the same machine (or running Intra-VM for that matter). But for slower networks ACKs can start taking up a significant part of your bandwidth.

    This also is not FUD - it's very well known in the messaging industry. Take a look at many non-JMS solutions, and you'll see that they favor message gap detection at the client over ACKing models.

         -Mike
  46. disk sync[ Go to top ]

    Dumb question, but if these messages are small (say 2k), and being committed on some boundary (4k or 8k or something) so that your message density / disk density percentage is say 50%, then your log per 1000 messages would be about 4MB. If you can put a RAID-5 of say 7x 16GB flash devices in place, you can theoretically queue 100GB of undelivered messages, i.e. 25 million messages. Your disk commit times should be under 1ms for such a scenario, correct? We're only talking about a US$50000 "disk" subsystem here, right?
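
    (To spell out my arithmetic - assuming 6 of the 7 spindles hold data in the RAID-5 - something like:)

    // Sanity-check the sizing above: 2k messages at 50% density,
    // RAID-5 of 7x 16GB leaving roughly 6 disks' worth of usable space.
    public class FlashSizing {
        public static void main(String[] args) {
            long perMsgOnDisk = (long) ((2 * 1024) / 0.5);     // 4k on disk per message
            long logPer1000   = perMsgOnDisk * 1000;           // ~4 MB per 1000 messages
            long usableBytes  = 6L * 16 * 1024 * 1024 * 1024;  // ~96 GB usable
            System.out.println("log per 1000 msgs: ~"
                    + logPer1000 / 1000000 + " MB");
            System.out.println("queueable msgs: ~"
                    + usableBytes / perMsgOnDisk / 1000000 + " million");
        }
    }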

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  47. \Purdy\
    Dumb question, but if these messages are small (say 2k), and being committed on some boundary (4k or 8k or something) so that your message density / disk density percentage is say 50%, then your log per 1000 messages would be about 4MB. If you can put a RAID-5 of say 7x 16GB flash devices in place, you can theoretically queue 100GB of undelivered messages, i.e. 25 million messages. Your disk commit times should be under 1ms for such a scenario, correct? We're only talking about a US$50000 "disk" subsystem here, right?
    \Purdy\

    That sounds pretty spot on. But I don't know much about what's essentially a completely solid-state disk. I'd be interested in knowing the costs of such systems, what kind of controllers they work on, and the biggie - their write characteristics. I know in the past the biggest problem was a limited number of re-writes that could be done, re-writing being very slow. But I'm pretty ignorant of flash memory's characteristics.

    Are there any such drives commercially available?

    Assuming it's workable, such a system would be ideal for transaction logs and for storage of small messages. On the tran-log side, the "disks" can be pretty small - many J2EE tran logs only need a few megabytes. And it seems it would be pretty fast.

    On small message size, typically you'd only need to hold a day's worth of messages. In the financial arena, with so-called T+0 coming (same-day settlement), the window might need to be widened to 5 days' worth of data (with migration off of solid state over the weekend).

    But the key is the writing characteristics (and also reliability, but I'd imagine a solid state disk would tend to be more reliable than spinning magnetic platters!).

         -Mike
  48. disk sync[ Go to top ]

    This requires some correction:

    \Spille\
    The equivalent to this in JMS is non-persistent messaging. As long as everything works, the messages will get there. But if something crashes, you may lose stuff.
    \Spille\

    Non-persistent is at most once. A JMS provider can drop non-persistent messages. Usually, if you restart a JMS provider, non-persistent messages are lost. So your assumption is wrong.

    -- Andreas
  49. disk sync[ Go to top ]

    \Mueller\
    Non-persistent is at most once. A JMS provider can drop non-persistent messages. Usually, if you restart a JMS provider, non-persistent messages are lost. So your assumption is wrong.
    \Mueller\

    Sigh - as I originally mentioned, your original comparison was already a very big stretch. I stretched it even further to try to illustrate the concept of guarantees and their effects.

    Back to messaging - non-persistent messages work fine so long as your JMS provider doesn't go down. If it does, messages will be lost.

    Persistent messages are supposed to always be delivered - even if the provider goes down.

    In your scenario of not disk forcing, persistent messages are delivered when operations are "normal", and persistent messages are delivered if the provider is shut down in an orderly manner. In any other failure scenario, messages can be lost - just like non-persistent messaging.

    You appear to discount those other failure scenarios. As a result, you advocate kinda-sorta persistent messaging that injects a lot of doubt into persistent messaging. Doubt that's not supposed to exist.

    Given that machines really do fail, operators really do sometimes make mistakes, and even OSes go down on occasion (gasp - possibly even the JMS provider might crash) - given that, from a failure/recovery perspective your advocated "persistent messaging" scheme provides no more guarantees than non-persistent messages. Sometimes the messages get through, sometimes they get lost. That, sir, is the definition of non-persistent messages - in case you forgot.

        -Mike
  50. disk sync[ Go to top ]

    \Spille\
    from a failure/recovery perspective your advocated "persistent messaging" scheme provides no more guarantees than non-persistent messages. Sometimes the messages get through, sometimes they get lost. That, sir, is the definition of non-persistent messages - in case you forgot.
    \Spille\

    LOL. pick...pick...pick...

    Basically there are 3 reliability options:

    1) np: very fast
    2) p, sync off: reliable under normal operations, fast
    3) p, sync on: 2) + survives crashes, damned slow

    You admit that 1) and 3) exist but not 2). Even BEA lists the above options as valid options, otherwise they wouldn't make it configurable. The only difference is that we choose 2) as default. I don't want to repeat what's usual, but what do you think - if an enterprise DBMS, used as the JDBC store for SwiftMQ, reaches a 300 or even 1000 msgs/s consuming rate (600 and 2000 TPS, respectively), does it sync or not, given it was just cleanly installed without changing any config, except creating a database with the message schema?

    Don't you think it's enough now? I mean, if you have nothing else to do (e.g. improving OpenJMS), I can serve as your answer box and repeat it over and over as long as the thread isn't rolled off TSS' front page...

    -- Andreas
  51. disk sync[ Go to top ]

    \Mike\
    from a failure/recovery perspective your advocated "persistent messaging" scheme provides no more guarantees than non-persistent messages. Sometimes the messages get through, sometimes they get lost. That, sir, is the definition of non-persistent messages - in case you forgot.
    \Mike\

    \Mueller\
    LOL. pick...pick...pick...

    Basically there are 3 reliability options:

    1) np: very fast
    2) p, sync off: reliable under normal operations, fast
    3) p, sync on: 2) + survives crashes, damned slow

    You admit that 1) and 3) exist but not 2).
    \Mueller\

    #2 does not exist in the JMS spec. Period. All the major JMS vendors I've looked into other than SunONE and Swift speak directly on point to #2 - and say it's dangerous and you should consider non-persistent messages instead of something like #2. IBM doesn't even give you an option for #2.

    The flags in JMS are DeliveryMode.PERSISTENT and DeliveryMode.NON_PERSISTENT. As the Javadoc for DeliveryMode states "Clients use delivery mode to tell a JMS provider how to balance message transport reliability with throughput". And there ain't no DeliveryMode.MOSTLY_PERSISTENT.
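
    For anyone following along at home, this is the entirety of the client-side knob (plain JMS, nothing vendor-specific - connection and session setup omitted):

    import javax.jms.*;

    // The only persistence control the JMS spec hands a client:
    // PERSISTENT vs. NON_PERSISTENT. Whether PERSISTENT actually forces
    // the provider's log to disk is exactly the vendor behavior being
    // argued about in this thread.
    public class ModeExample {
        public static void send(QueueSession session, Queue queue, String text)
                throws JMSException {
            QueueSender sender = session.createSender(queue);
            sender.setDeliveryMode(DeliveryMode.PERSISTENT); // also the spec default
            sender.send(session.createTextMessage(text));
        }
    }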

    \Mueller\
    Even BEA lists the above options as valid options, otherwise they wouldn't make it configurable.
    \Mueller\

    Here's what Tom from BEA had to say on the subject:

    \Tom-BEA\
    All versions of BEA WebLogic JMS, as well as BEA's other two queuing products, use synchronous writes by default. We see no need to document this, as synchronous writes are the only way to achieve safe transactional behavior - something our customers assume they are getting when they buy our products. In JMS 6.0SP? and in 6.1, we added a command-line -D switch to optionally disable sync writes. In 7.0 and 8.1, you can configure via the console. The -D switch is mentioned in the JMS performance guide white-paper on dev2dev.bea.com, not sure where else. Wherever we document the switch, we also document the QOS trade-offs of using it.
    \Tom-BEA\

    As Tom says "synchronous writes are the only way to achieve safe transactional behavior - something our customers assume they are getting when they buy our products".

    As to why the option exists at all, that's very clear if you just peruse the various JMS sites. The various vendors are competing for marketshare, and benchmark numbers are a piece of this competition. And some products seem to give outstanding performance in transactional/persistent situations - and that performance comes strictly from turning off disk syncing. Sonic directly addresses this - they have "EVALUATION MODE" which turns off disk syncing. And the only reason evaluation mode exists is to show comparable numbers to products like _yours_. They advise you turn evaluation mode off. As does BEA. And IBM doesn't even let you turn it off.

    And now, I think I will stop. I've tried to address every single point you've brought up. I've tried to show different scenarios. I've given up tons of data and references. I've tried to accurately explain and explore the issues.

    Meanwhile, I've repeatedly asked you 5 or 6 questions which you've just ignored. You've repeated what you call "usual" without any evidence to back up usual. You claim experience in finance services companies but don't seem to know how systems are run in those companies. You often completely ignore various other posters who are directly contradicting you. And when you're really backed against a wall you resort to the old stand-bys of name calling and insults.

    While other people in this forum have tried to honestly explore the issues and talk about creative solutions to problems, you just go on whining about how you're right about everything. Well, I for one am sick of it. And have fun reconstructing messages for your clients when they have a hardware crash.

        -Mike
  52. disk sync[ Go to top ]

    \Spille\
    And now, I think I will stop.
    \Spille\

    Yes, good idea. And thanks for the plug.

    -- Andreas
  53. disk sync[ Go to top ]

    \Spille\
    Sonic directly addresses this - they have "EVALUATION MODE" which turns off disk syncing. And the only reason evaluation mode exists is to show comparable numbers to products like _yours_.
    \Spille\

    Well, this point needs to be corrected.

    We have never compared SwiftMQ against SonicMQ. Never! If they have changed their disk sync policy due to another vendor, then it is because of Fiorano. They are well known as the "bad guys of JMS" [(c) messageq.com] and they have pestered Sonic a lot (vice versa, btw).

    When we created the performance profile, I asked Dave Chappell for permission to publish a competitive SwiftMQ/SonicMQ benchmark. And I stated it would be fair - that is, we'd use SonicMQ in eval mode (disk sync off) and SwiftMQ in default mode (disk sync off). Dave said "no, sorry" and that was it.

    Dave knows that I'm a fair sportsman. If we were to lose such a benchmark, then we would improve. You might check the old JMS-INTEREST mail archive at sun.com, digger, and you will notice that I was the one who was always on Sonic's side (as a competitor!) whenever Fiorano started a new pestering round.

    But that's all old stuff.

    Your assumption is - again - wrong. As are many of your assumptions about my unanswered questions. I will tell you only what I want to tell you. Mike, the nosy. Your intention was and is to damage me and my company. That's why I call you nit-picker, fault-finder, a****le, etc. But don't take it too seriously, I do it with a smile. :-)

    Cheers, mate.

    -- Andreas
  54. disk sync[ Go to top ]

    \Spille\

    Sonic directly addresses this - they have "EVALUATION MODE" which turns off disk syncing. And the only reason evaluation mode exists is to show comparable numbers to products like _yours_.
    \Spille\

    \Mueller\
    Well, this point needs to be corrected.

    We have never compared SwiftMQ against SonicMQ. Never! If they have changed their disk sync policy due to another vendor, then it is because of Fiorano. They are well known as the "bad guys of JMS" [(c) messageq.com] and they have pestered Sonic a lot (vice versa, btw).
    \Mueller\

    I won't even pretend to know the intricacies of the turf wars in the JMS space. I do know that SonicMQ competes against everybody else, and they don't want their product compared unfavorably against products that don't disk force, or performance reports (such as yours) that show high TPS rates but also don't disk force.

    \Mueller\
    When we created the performance profile, I asked Dave Chappell for permission to publish a competitive SwiftMQ/SonicMQ benchmark. And I stated it would be fair - that is, we'd use SonicMQ in eval mode (disk sync off) and SwiftMQ in default mode (disk sync off). Dave said "no, sorry" and that was it.

    Dave knows that I'm a fair sportsman. If we were to lose such a benchmark, then we would improve. You might check the old JMS-INTEREST mail archive at sun.com, digger, and you will notice that I was the one who was always on Sonic's side (as a competitor!) whenever Fiorano started a new pestering round.
    \Mueller\

    I'm not very interested in the politics of JMS vendors. All I know is that BEA, Sonic, and IBM state very clearly, over and over again in their documentation, that turning off disk forcing is a very bad idea. Sonic and BEA have added options to allow disk forcing to be turned off - and they've done it under pressure to compete with other vendors. Perhaps Fiorano is some sort of loose cannon, but you clearly favor not forcing, and use that as your default. And as such, you're forcing other vendors to provide similar options against their better judgement, just as much as any other vendor doing the same thing is.

    You may perceive yourself as some sort of good guy, and Fiorano as a demonized company of some sort, but actions speak louder than words. Your own published benchmarks for your own product - which require permission from no one else, obviously - do not use disk forcing. And you didn't even disclose this fact until last week. Remember that performance report that had a big giant section on persistence - "Notes on Message Persistence"? You make a big deal about reliability and consistency on your performance page, and then didn't reveal that you turned disk forcing off. You called it a "nit pick". Well, how does Sonic feel about your nit pick?

    \Mueller\
    But that's all old stuff.

    Your assumption is - again - wrong. As are many of your assumptions about my unanswered questions. I will tell you only what I want to tell you. Mike, the nosy. Your intention was and is to damage me and my company. That's why I call you nit-picker, fault-finder, a****le, etc. But don't take it too seriously, I do it with a smile
    \Mueller\

    My intention is to damage you and your company? Please, sir, take a deep breath and a dose of reality along with it. Before last week I had only peripheral awareness of SwiftMQ, and no concept at all of who you were (or that you even existed). I frankly couldn't care less whether you make 20 million dollars next week and retire in Fiji or go bankrupt this afternoon. I don't compete with you in any way, and I get no pleasure out of engaging with abusive people such as yourself.

    But you are a vendor in a space that I care about, and when you make statements that are just plain wrong I'm not going to just sit back and not say anything. We're in a technical forum, and you've made a surprising number of technical mis-statements in this thread that should not go uncorrected. And, as always, rather than sticking to the facts, you always end up resorting to name calling.
    Yes, sir, you're obviously quite a professional.

         -Mike
  55. disk sync[ Go to top ]

    \Spille\
    But you are a vendor in a space that I care about, and when you make statements that are just plain wrong I'm not going to just sit back and not say anything.
    \Spille\

    You have said enough. More than enough. You don't know when to stop. Scroll up, my dear. There you will find a reply to Tom from BEA where I wrote that we will rethink our policy for the next release. Thereafter the only point I had was that it is usual to disable syncs and that my very own opinion is to avoid syncs.

    Since I know what your intention is (to look good at my cost), I told you already that I don't take our conversation seriously. I'm calm and relaxed, waiting for the thread to scroll out. You continuously take the one point "usual" to create a chain of arguments which concludes that we aren't serious. Remember, a bunch of your messages contain FUD like "stock PCs" and so on. In fact, you started with a thread title that is FUD per se. You tell me that I don't answer questions but you continuously ignore what I state about the defaults of some enterprise DBMS.

    It is easy to bash a small vendor, e.g. like Tom from BEA stating that disk syncs off aren't compliant, which implies that we aren't compliant. I don't think that he will state that in a public forum about Sun ONE Message Queue. Why not? It is actionable in court. The only instance which decides about compliance is the CTS. And Sun ONE Message Queue passed that. In fact, it is an administrative issue to configure lazy writes, as well as any other issues which may even lead to a drop of persistent messages or to a change of the persistence setting of a message on the fly.

    IMO, Mike, you are a greenhorn with respect to the JMS market. Some of the "good" guys you mentioned here are very lucky to have you as a buddy. You state that you are not interested in vendor wars but this is a game and you entered it already. There's no good and bad. We all compete with each other. You only have to dig a bit and you'll find issues which are bad from your view as a reliability freak. For example, one of the "good" vendors you mentioned had non-persistent as the default for message producers (might be they have changed that in the meantime, don't know), although the spec states that persistent is the default. I guess you'd be able to create a chain of FUD on that issue because it affects reliability and initial speed a lot! Another disables the message id per default to get 5 more messages per second. And so on.

    Cheers, mate.

    -- Andreas
  56. disk sync[ Go to top ]

    \Spille\
    But you are a vendor in a space that I care about, and when you make statements that are just plain wrong I'm not going to just sit back and not say anything.
    \Spille\

    \Mueller\
    You have said enough. More than enough. You don't know when to stop. Scroll up, my dear. There you will find a reply to Tom from BEA where I wrote that we will rethink our policy for the next release. Thereafter the only point I had was that it is usual to disable syncs and that my very own opinion is to avoid syncs.
    \Mueller\

    See where I said "just plain wrong"? In reference to that, your point "that is usual to disable syncs" is _just plain wrong_. I've provided many references in defense of that - and the only reference you have provided is SunONE.

    \Mueller\
    Since I know what your intention is (to look good at my cost), I told you already that I don't take our conversation seriously. I'm calm and relaxed, waiting for the thread to scroll out. You continuously take the one point "usual" to create a chain of arguments which concludes that we aren't serious. Remember, a bunch of your messages contain FUD like "stock PCs" and so on. In fact, you started with a thread title that is FUD per se. You tell me that I don't answer questions but you continuously ignore what I state about the defaults of some enterprise DBMS.
    \Mueller\

    You know my intentions?

    You're calm and relaxed when you call me names?

    As for your "usual" argument - that's been thoroughly debunked. You haven't provided any references about enterprise DBMS systems. All you've done is make assertions that you're right, and everyone should believe you, and the only reference you've provided is SunONE. If you want people to believe you, make specific references - demonstrate with specific information that your assertion is correct.

    Chanting the same thing over and over again without evidence does not make it true.

    I back up what I say with concrete references that anyone can verify. You make vague assertions and say things just like "some enterprise DBMS". Which is the creator of FUD here? The name caller who makes vague assertions, or the individual who takes the time to point out specific examples of what he means?

    \Mueller\
    It is easy to bash a small vendor, e.g. like Tom from BEA stating that disk syncs off aren't compliant, which implies that we aren't compliant. I don't think that he will state that in a public forum about Sun ONE Message Queue. Why not? It is actionable in court. The only instance which decides about compliance is the CTS. And Sun ONE Message Queue passed that. In fact, it is an administrative issue to configure lazy writes, as well as any other issues which may even lead to a drop of persistent messages or to a change of the persistence setting of a message on the fly
    \Mueller\

    Vendors can choose to hide behind courts if they wish, but it's not very effective. Anyone can go to a vendor's site and read their documented defaults, and can likewise read the JMS spec. Anyone can publish discrepancies between the two. I've quoted the spec and referenced your web site and demonstrated that.

    I realize that you can't do things like directly use a product and publish findings based on that - there's licensing involved in doing that. But you can comment on publicly available information. I can't comment on whether you've passed CTS or not (or BEA, or IBM, or anyone else) - but I can point to your publicly available documentation and correlate it to a publicly available specification.

    Do you see anyone suing me? You claim I'm damaging you somehow - so make it Actionable. If what I'm doing violates some civil, contract, or criminal law, then go for it.

    If vendors wish to avoid this sort of scrutiny, they should not make this sort of information publicly available. This applies equally to any vendor, big or small.

    \Mueller\
    IMO, Mike, you are a greenhorn with respect to the JMS market. Some of the "good" guys you mentioned here are very lucky to have you as a buddy. You state that you are not interested in vendor wars but this is a game and you entered it already. There's no good and bad. We all compete with each other.
    \Mueller\

    I've entered it? In what way? As I said, I don't compete with you. My interest in this matter is as a user of JMS products, and a user of the JMS specification.

    \Mueller\
    You only have to dig a bit and you'll find issues which are bad from your view as a reliability freak. For example, one of the "good" vendors you mentioned had non-persistent as the default for message producers (might be they have changed that in the meantime, don't know), although the spec states that persistent is the default. I guess you'd be able to create a chain of FUD on that issue because it affects reliability and initial speed a lot! Another disables the message id per default to get 5 more messages per second. And so on.
    \Mueller\

    From the perspective of a user of J2EE, as someone who uses it in large part because of the guarantees it gives me with respect to portability among multiple products, I want to know where products are not in spec, or where default parameters may get me in trouble. I'm not a JMS vendor, so statements by vendors that they have passed some conformance suite mean little to me. I want to know how they behave, with specific references to the publicly available JMS spec that developers use to create their systems.

    In terms of the J2EE marketplace, vendors that push products that break or significantly bend specifications weaken the market, and weaken the guarantees that J2EE application developers need. This applies to any vendor - every time a developer gets whacked by a product that doesn't implement the spec correctly, or gets whacked by a default configuration setting buried in a sea of such settings - every time this occurs, the credibility of J2EE is damaged.

    I understand competitive pressures. I understand the wiggle room in various specifications. I understand the need to show "value" compared to competing products. I personally believe that any wiggling against the spec should be honestly documented by vendors. I believe that value-added features outside of the spec should not be enabled by default - or there should be a clear and easily recognizable switch to enable full spec compliance. Out-of-spec features are necessary - necessary to fix real problems not addressed by a specification, and necessary for the space to grow and adapt to changing conditions. But the baseline behavior should always be the spec. You can and should provide mechanisms to go outside of the spec when it makes sense, and a conscientious vendor should lobby to get high-value mechanisms of this type into the spec. BEA's early work with EJBs and its out-of-spec optimizations are a prime example of this.

    And none of this is "FUD". I'm not out to damage anyone. What I'm looking for is vendors to provide the consistency that the J2EE specifications offer. When they don't - when they get into vendor wars, when they tune their products to marketing needs rather than specification demands, when they do that sort of thing, users get hurt, and the credibility and viability of the marketplace is threatened.

    The entire premise of J2EE is interoperability, and portability between products, all enabled by comprehensive specifications that cover both high level and low level behavior. In this sphere there's plenty of room for vendors to add value and differentiate themselves. But when they violate the spec by default, they're doing a disservice to their customers as well as to the premise of J2EE itself.

    You may not care about this, but there are a dizzying number of multi-million dollar projects throughout business which are based upon J2EE. J2EE is the foundation on which they're creating their IT infrastructures. And many of these projects have failed because the products they chose let them down. Others are in jeopardy because marketing claims and initial evaluations didn't match the reality of using the product in production. People lose their jobs, their credibility, because they believed a vendor's claim to compliance and found out the hard way that it was an empty legalism with little resemblance to the published specifications.

    Customers have no way to verify the entire JMS conformance of a given vendor in a reasonable time frame. This is even more true for more complex areas of J2EE. Given the plethora of configuration options of typical products and the complexity of the specs, customers have no hope of finding all the holes and non-conforming defaults in a product while they're evaluating it. Bugs happen - but a vendor providing numerous default values out of spec is doing so deliberately. Customers have to rely on vendors to verify the majority of the spec compliance - and when vendors purposely evade the spec by default, the customers get hurt.

    In your specific case - there's no reason to disable disk syncs by default. You're violating the spec as a default, and it's not immediately obvious when evaluating the product. And a customer can get hit hard - very hard - if they don't catch this and lose data in production. Instead - offer this as a non-compliant optimization, and clearly outline the benefits and drawbacks of this optimization. Its proper place is in tuning performance, something to possibly enable at a later time if the risks are worthwhile. Its proper place is not as the default.

         -Mike
  57. disk sync[ Go to top ]

    \Spille\
    See where I said "just plain wrong"? In reference to that, your point "that is usual to disable syncs" is _just plain wrong_. I've provided many references in defense of that - and the only reference you have provided is SunONE.
    \Spille\

    You have provided exactly 1 [ONE] reference of Sonic which disables disk sync by default. BEA itself stated it does (that was new for me) and MQ is my reference. It would be nice if you could post just one other reference of a JMS provider whose default persistent store is a file store with a sync policy of enabled.

    I repeat: All others I know disable it. You want me to tell you the names? No, sorry. I'm not an anonymous Mike Spille who thinks he can post what he wants.

    However, you prefer to implicitly make me a liar.

    \Spille\
    You haven't provided any references about enterprise DBMS systems.
    \Spille\

    Of course, I have. Maybe you have missed it due to your load of oversized postings. I suggest you enable "mind sync" sometimes.
     
    \Spille\
    Do you see anyone suing me? You claim I'm damaging you somehow - so make it Actionable. If what I'm doing violates some civil, contract, or criminal law, then go for it.
    \Spille\

    At the level of discrediting our product that you have reached (e.g. publicly calling it "violates the spec"), this is indeed a valid option. I suggest you check further postings with your lawyer.

    -- Andreas
  58. disk sync[ Go to top ]

    \Mueller\
    You have provided exactly 1 [ONE] reference of Sonic which disables disk sync by default. BEA itself stated it does (that was new for me) and MQ is my reference. It would be nice if you could post just one other reference of a JMS provider whose default persistent store is a file store with a sync policy of enabled.
    \Mueller\

    Actually, I specifically referenced Sonic, BEA, and IBM MQ. I did the work and went to the sites and posted the references.

    Those are three verified references.

    Beyond your own product, you've referenced one.

    I don't see how you can refute that.

    \Mueller\
    I repeat: All others I know disable it. You want me to tell you the names? No, sorry. I'm not an anonymous Mike Spille who thinks he can post what he wants.

    However, you prefer to implicitly make me a liar.
    \Mueller\

    How very convenient for you. You know but you can't tell. As for being a liar - I don't call you a liar. But I see no evidence to back your claims, and plenty of contrary evidence against your statements. And given your attitude it's very, very difficult to take your word for anything.

    As for my being anonymous - it's amusing that you call me by name and then call me anonymous. Anyone can do a google search on '"Mike Spille" Bio' and find out who I am and my professional experience. I'm the second google result for that search.

    \Spille\
    You haven't provided any references about enterprise DBMS systems.
    \Spille\

    \Mueller\
    Of course, I have. Maybe you have missed it due to your load of oversized postings. I suggest you enable "mind sync" sometimes.
    \Mueller\

    Name them, please. Web references, if you have the time to track them down. All you've said is "I have tested it with Oracle, DB2, TimesTen....One of them has a throughput of 300, one of 30, one of 1000 msgs/s (I don't say which one has which result)". This isn't a reference, it's an anecdote. Please provide a reference.

    \Spille\
    Do you see anyone suing me? You claim I'm damaging you somehow - so make it Actionable. If what I'm doing violates some civil, contract, or criminal law, then go for it.
    \Spille\

    \Mueller\
    At the level of discrediting our product that you have reached (e.g. publicly calling it "violates the spec"), this is indeed a valid option. I suggest you check further postings with your lawyer.
    \Mueller\

    The documents on your publicly available web site state that your default behavior is to not disk sync for persistent messages and XA transactions, and this is contrary to the JMS specifications publicly available on java.sun.com. Citing these references with small pieces of text from each is fair use under copyright law. And I have never used or downloaded your software, so your end user licensing agreement does not apply to me.

    So your only recourse would be libel law, under which you must prove that what I've stated in a non-verbal medium is false, and also prove that such a false statement was made with malicious intent to damage you in some way. You'll have a hard enough time proving falsehood. You'll have a much harder time proving that I believe it to be a falsehood. And you'll have a devil of a time trying to show that what I've said was willfully malicious and intended to damage you.

    And any court of law I think would get a hearty chuckle out of the postings here submitted into evidence - accusing me of damage while your own postings are abusive, include repeated cursing, and various other ad hominem attacks.

    For the record - I'm not out to damage you or your company. I did not even pull Swift into this - another poster did. All that I've done is point out that statements you have represented as fact are contradicted by published specifications. And the fact that your own publicly available documentation has changed as a result of this exchange only reinforces this.

        -Mike
  59. disk sync[ Go to top ]

    \Spille to Mueller\
    For the record - I'm not out to damage you or your company [Swift].
    \Spille to Mueller\

    For the record, I would like to say the same thing.

    And although WebLogic JMS is a great product, I understand that there are sometimes valid reasons to use another vendor's JMS. In fact, BEA has devoted significant resources to make it *easier* for non-WebLogic JMS vendors to integrate into the application server.

    As for the topic at hand, I *personally disagree* with Mueller that it is OK to disable sync writes by default. I also *personally agree* with Mueller that there are use cases for disabling sync writes (unlike Spille).
  60. disk sync[ Go to top ]

    \Barnes\
    As for the topic at hand, I *personally disagree* with Mueller that it is OK to disable sync writes by default. I also *personally agree* with Mueller that there are use cases for disabling sync writes (unlike Spille).
    \Barnes\

    I think there are scenarios where disabling sync writes makes sense. But I don't think there are many such scenarios. In many situations where someone might consider not syncing, I think there are superior alternatives - discovering that persistent messaging isn't needed at all at the messaging layer, or that XA wasn't the right choice. Or changing single-threaded publishers to multi-threaded. Investigating higher-end disk sub-systems. Etc. In other words, you can shift the persistence burden or change your hardware without sacrificing consistency.

        -Mike
  61. disk sync[ Go to top ]

    \Barnes\
    If one defines JMS compliant as passing the CTS, they are fine. If one defines it as having the default config conform to the JMS 1.0.2 spec, which calls out reliability guarantees for persistent messages (no duplicates!), then they are not. ;-)
    \Barnes\

    Barnes, the emoticon at the end implies that you vote for the latter, doesn't it? See below. ;-)

    \Barnes\
    And although WebLogic JMS is a great product, I understand that there are sometimes valid reasons to use another vendor's JMS. In fact, BEA has devoted significant resources to make it *easier* for non-WebLogic JMS vendors to integrate into the application server.
    \Barnes\

    Having been the one fighting with your JMS groups to integrate SwiftMQ XA-wise (which wasn't possible in 6.x), I can tell you a story about your integration, Barnes. I'm not talking about inbound messaging with XA, which is easy now. I'm talking about outbound messaging with XA. I had a long fight with a well-known colleague of yours about the handling of XA resources and session close. We both independently contacted the JMS spec guys from Sun on this issue. Due to your handling it was not possible to use a foreign JMS provider in WLS with XA.

    Well, I told your colleague that I feel this is not JMS compliant from WLS. Should I tell you what he told me, Barnes:

    JMS compliance is only confirmed by the CTS, and WLS has passed the CTS! If I were to further state that WLS isn't JMS compliant, this would be ACTIONABLE IN COURT!

    I suggest you, Barnes, check back with your colleague on this issue to arrive at one consistent opinion. It can't be CTS == compliant if WLS is involved and CTS != compliant if you see a chance to hack on another vendor.

    LOL.

    -- Andreas
  62. disk sync[ Go to top ]

    Andreas,

    Your passion is admirable, but distracting. I will respond here, but this is getting off topic. If you have WebLogic JMS specific feedback, I suggest google searching and posting to the weblogic.developer.interest.jms newsgroup on newsgroup server newsgroups.bea.com.

    Tom


    \barnes\
    And although WebLogic JMS is a great product, I understand that there are sometimes valid reasons to use another vendor's JMS. In fact, BEA has devoted significant resources to make it *easier* for non-WebLogic JMS vendors to integrate into the application server.
    \barnes\

    \mueller\
    Having been the one fighting with your JMS groups to integrate SwiftMQ XA-wise (which wasn't possible in 6.x), I can tell you a story about your integration, Barnes. I'm not talking about inbound messaging with XA, which is easy now. I'm talking about outbound messaging with XA. I had a long fight with a well-known colleague of yours about the handling of XA resources and session close. We both independently contacted the JMS spec guys from Sun on this issue. Due to your handling it was not possible to use a foreign JMS provider in WLS with XA.
    \mueller\

    I'm more than a bit surprised, considering other JMS vendors have had success. Inbound and outbound XA integration was most certainly possible with 6.0 and 6.1. Inbound is even easier in 7.0, and outbound even easier in 8.1. Note that BEA's messaging bridge, which is available in 6.1, uses the standard J2EE JMS and XA APIs for both "inbound and outbound messaging", and works with several vendors. Before assuming the worst, I suggest reading the dev2dev.bea.com white papers on the topic, writing things down, and contacting BEA again through customer support.


    \mueller\
    I suggest you, Barnes, check back with your colleague on this issue to arrive at one consistent opinion. It can't be CTS == compliant if WLS is involved and CTS != compliant if you see a chance to hack on another vendor.
    \mueller\

    On a daily basis, I work directly with the people that contributed to the original XA specifications, that wrote BEA JMS, BEA MessageQ, and BEA Tuxedo /Q, that wrote the JMS integration code, and that wrote the WebLogic transaction monitor. Call it hubris, but I trust my own opinions a bit. But they are just that, *opinions*!

    I *personally* think it is reasonable to assume that the CTS suites do not cover all of the bases. For instance - they do not check to make sure that transacted sessions are truly transactional in nature. The CTS would need to crash "things" (the JMS server, the client, the O/S, and even the hardware) multiple times to check for conformance.

    I *personally* disagree with certain design decisions in other products, and even certain BEA products, but on the other hand, sometimes such products have to go with the flow. Conversely, I would not be the least bit surprised if you disagree with certain design approaches in BEA JMS. Not everything has to be the same. Not everyone has to agree.
  63. disk sync[ Go to top ]

    \Barnes\
    Your passion is admirable, but distracting. I will respond here, but this is getting off topic. If you have WebLogic JMS specific feedback, I suggest google searching and posting to the weblogic.developer.interest.jms newsgroup on newsgroup server newsgroups.bea.com.
    \Barnes\

    My intention was not to distract. I only wanted to show you the difference between your colleague's statement and yours in the context of what "JMS compliance" means, and who bashes whom on that definition. JMS compliance per se is very difficult to define, IMHO. I think to say "passes CTS" is ok.

    Anyway, I'm not a nit-picker and don't want to roll into those XA problems again.

    Since the thread has been rolled out already, I'll stop here.

    -- Andreas
  64. JMS Compliance[ Go to top ]

    \Mueller\
    JMS compliance per se is very difficult to define, IMHO. I think to say "passes CTS" is ok.
    \Mueller\

    Ignoring corporate branding initiatives, threats of law suits and marketplace competition for a moment, it's pretty easy to define JMS compliance. Compliance is determined by adherence to the specification.

    This may not match J2EE branding strategies, or conformance tests, or what someone wants to put on a marketing brochure, but it's the test that every application developer in the world goes by. In fact it's the only test we can go by. So if some behavior in a J2EE product doesn't match the published specification, guess what - the product is out of spec. The product may have a little sticker on the box bestowed by Sun, but that doesn't mean much to the millions of Java developers out there. It's far more important that the product actually does what the group of published J2EE specifications say it's supposed to do.

    Otherwise - why bother publishing a spec?!?!

        -Mike
  65. disk sync[ Go to top ]

    Hi,

    Both Mike & Andreas have made some good points on the JMS spec and disk sync. Bravo guys... I think both of you are real experts in the subject matter.

    I would like to ask a few questions about JMS. From the top of this message thread down, all I can see is debate on the performance of text messaging, which burdens a lot of I/O. Have you guys ever heard of or implemented picture/*.wav/*.avi messaging? Is this type of messaging also supported by JMS? Does any other vendor implement this kind of mechanism?

    Thanks.
  66. disk sync[ Go to top ]

    I would like to ask a few questions about JMS. From the top of this message thread down, all I can see is debate on the performance of text messaging, which burdens a lot of I/O. Have you guys ever heard of or implemented picture/*.wav/*.avi messaging? Is this type of messaging also supported by JMS? Does any other vendor implement this kind of mechanism?

    JMS defines 5 message types, one of which is the text message. You can transfer whatever you want in, e.g., a BytesMessage. However, you would have to chunk it yourself if the content is oversized. JMS has no support for streaming.
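
    Something like this (untested, off the top of my head; the property names are made up for the example, and the consumer has to reassemble by reading them back):

    import javax.jms.*;

    // Chunk an arbitrary byte array into a sequence of BytesMessages.
    public class Chunker {
        static final int CHUNK_SIZE = 64 * 1024;

        public static void sendChunked(QueueSession session, QueueSender sender,
                                       byte[] content) throws JMSException {
            int chunks = (content.length + CHUNK_SIZE - 1) / CHUNK_SIZE;
            for (int i = 0; i < chunks; i++) {
                BytesMessage msg = session.createBytesMessage();
                int off = i * CHUNK_SIZE;
                int len = Math.min(CHUNK_SIZE, content.length - off);
                msg.writeBytes(content, off, len);
                msg.setIntProperty("chunkNo", i);
                msg.setBooleanProperty("lastChunk", i == chunks - 1);
                sender.send(msg);
            }
        }
    }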

    -- Andreas
  67. How much speed is needed?[ Go to top ]

    Some points of this discussion are interesting (mostly near the beginning, before it turned into a big flame war) ;-) but I'd like to ask a further question in a different direction.

    There has been a lot of talk about speed here - how disks are slow, how if you are lucky you can get like 50 TPS, etc, etc. So my question is: how many TPS do you need? I often have a related argument with a friend of mine over the speed of dynamic versus static content. He jumps through all these hoops, has all these complicated pre-generation schemes (this is just an alternative form of caching, I would argue with him), sacrificing all manner of maintainability in the name of speed. I tend to take the opposite approach, always striving for simplicity and worrying about performance as secondary. To him performance is the most important thing no matter what (an opinion I think he probably shares with a lot of developers on this site). To me, performance is only important if your system breaks some usability threshold (i.e. 5 seconds to load a page, as an arbitrary example). I haven't had a problem yet, but maybe I've just been lucky.

    Anyway, so let's say 1 million transactions over a 5 hr period (actual transactions, not just database access) is a crap load of transactions that very few companies in the real world would ever have to deal with. Am I wrong here? I don't know, I've worked for some pretty big companies and I've never seen a need for anything higher than a few TPS -- ever, but this is very anecdotal and maybe I am out to lunch here. Remember I'm talking actual transactions, not just arbitrary DB access.

    So if we agree that this is a lot of transactions (which is far from certain ;-) then 1 million transactions over a 5 hr period is roughly 55 TPS. So for 99% of companies out there (including the one I work for) anything that can handle 50 TPS is WAY WAY more than we will EVER need.

    I don't know your situation Mike. Sounds like you work for a big bank or something, but I suspect (and I am curious and welcome information to the contrary) that for 99% of companies and developers out there, a requirement to handle more than 50 TPS is just something they are unlikely to ever see, and that for most, this whole speed debate around disk write limitations is irrelevant?

    What do you think?
  68. How much speed is needed?[ Go to top ]

    I'd say you're about 90% right based on my experience. Certainly, individual clients don't usually need more than 10 TPS unless they're doing something very complex or very unusual. In fact, if you've had the endurance to read the past couple of days back-and-forth messages, you'll see that I make largely the same point that you do - that individual transaction times don't mean much (so long as they're no more than a couple hundred millis).

    That said, scalability is important. Any given guy might only drive 10 TPS, but in many situations you might have literally thousands of such guys. Here, disk times come into play. On a typical disk, with grouped commits, you can in my experience drive up to 140-200 TPS on a single server going to a single transaction log. But around 150 TPS, you can start seriously degrading the response times of individual transactions. A 100-200 milli transaction can start to creep up to 500, 600 millis and higher due to waiting for a disk sync to happen (because disk forcing is single-threaded, and there's likely a lot of people in line in front of you).
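
    For those who haven't seen it, "grouped commits" boils down to something like this - a crude sketch of the idea, not any particular product's implementation (record I/O and error handling omitted):

    import java.io.IOException;
    import java.nio.channels.FileChannel;

    // Many threads append log records, but each force() hardens every
    // record appended before it, so concurrent committers share disk
    // syncs instead of paying for one each.
    public class GroupCommitLog {
        private final FileChannel log;
        private long appended; // sequence number of last record written
        private long forced;   // sequence number covered by the last force()

        public GroupCommitLog(FileChannel log) { this.log = log; }

        public void commit(byte[] record) throws IOException {
            long myTicket;
            synchronized (this) {
                // ... write 'record' into the log file here (omitted) ...
                myTicket = ++appended;
            }
            synchronized (this) {
                if (forced < myTicket) {   // not yet hardened by someone else
                    log.force(false);      // one sync covers the whole batch
                    forced = appended;
                }
            }
        }
    }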

    Even this might sound irrelevant - until you consider "bursty" traffic. Out of thousands of users, typically not many are actively doing something in a given second. But sometimes an event happens to cause an unusually large number of people to hit the server simultaneously - a hot e-mail, a press conference that spurs traders to buy/sell, whatever. These bursts are where you can really get hurt if you're starting to max out on disk commits.

    So - most people aren't concerned with sustained TPS rates like we're talking about over the course of hours. Instead, it's being able to handle unusual events that create spikes in usage (e.g. bursty traffic) while keeping good response times.

    In addition - a single process in the system can throw a monkey wrench into this. For example, there might be a daemon taking a feed from another system that can go very fast, and be very bursty. If it accesses your messaging system (or database) at its fastest possible rate, it can swamp the system. This is why a lot of systems include "throttles" that limit how fast a given client can pump data into the system (or receive it).
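
    A throttle can be as dumb as this (toy version of my own - real ones are usually token buckets with a configurable burst allowance):

    // Blocks callers so a single client can't exceed maxPerSecond.
    public class Throttle {
        private final long intervalNanos;
        private long nextAllowed = System.nanoTime();

        public Throttle(int maxPerSecond) {
            this.intervalNanos = 1000000000L / maxPerSecond;
        }

        public synchronized void acquire() throws InterruptedException {
            long now = System.nanoTime();
            if (nextAllowed > now) {
                Thread.sleep((nextAllowed - now) / 1000000L);
            }
            nextAllowed = Math.max(now, nextAllowed) + intervalNanos;
        }
    }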

    For a lot of people, I understand that this doesn't matter too much. There are a lot of systems out there with only a hundred or two users where "bursty" means 5 TPS. But you'd be surprised how many applications are out there (yes, especially in finance) where the volume numbers get really big, really fast.

    To tie it back into the most recent debate - following what you're saying (and you're likely right for the majority of systems), disk syncing should stay on because no one will see the performance difference, and effectively get the benefit for free (assuming they use persistent messages at all). For systems that need to scale up to 500/1000/5000 TPS, then it's much more difficult.

        -Mike
  69. How much speed is needed?[ Go to top ]

    Let me add in that there are some applications, like batch jobs, which can be very adversely affected by doing things like adding in XA. For example - a typical one-phase transaction might take 50 millis for a system to do. Add XA in, and that number can balloon out to 250 millis, even if the second resource isn't doing much work. The bulk of the added extra time is from disk syncing. This may not seem like a big deal - unless your batch job needs to run 10,000 transactions. If you could optimize the syncing time (like doing it async in parallel), then you can effectively speed up this type of job without any coding changes.

    The alternative, if you must use XA, is to parallelize the work if you can.
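
    E.g., something along these lines (sketch only - it assumes the batch items are independent of each other, and that each Runnable does its own begin/work/commit):

    import java.util.List;
    import java.util.concurrent.*;

    // Run a big batch as N parallel streams so the per-transaction
    // XA sync cost is paid concurrently instead of serially.
    public class ParallelBatch {
        public static void run(List<Runnable> items, int workers)
                throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(workers);
            for (Runnable item : items) {
                pool.execute(item);   // one XA transaction per item
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
        }
    }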

        -Mike
  70. How much speed is needed?[ Go to top ]

    Interesting points.

    Yes, probably message queues should have sync on by default for transactions since, as you point out, in most cases it won't make a difference.

    It's an interesting and complex problem. This discussion has spurred many creative ideas in my brain about how one might do fast transactions (most too silly and bizarre to bring up here, however). Someone needs to make a reliable RAM disk so this problem can go away (maybe this is what EMC does). Or maybe a different technique altogether. To think that in 2003 we are still relying on mechanical devices to store data ...

    Cheers.
  71. disk sync[ Go to top ]

    \Mueller\
    Since it was Sun ONE Message Queue, a well known JMS provider, I'm wondering whether Tom from BEA will tell'em that this is a non-compliant JMS 1.0.2 behavior. My guess is he will not... ;-)
    \Mueller\

    If one defines JMS compliant as passing the CTS, they are fine. If one defines it as having the default config conform to the JMS 1.0.2 spec, which calls out reliability guarantees for persistent messages (no duplicates!), then they are not. ;-)

    \Mueller\
    Enabling/disabling of disk sync is a matter of cost. How much does it cost me to either live with lost data or more-than-once delivered data in a case of a system failure? Am I able to reconstruct lost data, e.g. by resending messages, and how much does it cost? What is the probability of such a system failure in my environment? What is the cost of a permanent performance loss (= difference between sync'ed / non-sync'ed disk)? Am I able to reach my throughput goals per client with extended hardware and disk sync? What is the cost for that? What is my timeline? Can I live for some time without disk sync? How much does it cost to wait another month before going into production? Project budget?
    \Mueller\

    I agree. I think these questions are obvious. Just realize that there are definite complexity trade-offs for using compensating transactions instead of synchronizing writes - compensating transactions result in additional code and can make developing an app far more complex. How does one "undeposit" already spent money, or "give back" a traded stock? How does one even detect that things are in an inconsistent state?

    Tom Barnes
  72. disk sync[ Go to top ]

    \Barnes\
    I agree. I think these questions are obvious. Just realize that there are definite complexity trade-offs for using compensating transactions instead of synchronizing writes - compensating transactions result in additional code and can make developing an app far more complex. How does one "undeposit" already spent money, or "give back" a traded stock? How does one even detect that things are in an inconsistent state?
    \Barnes\

    The most important point is your last one. In case of a failure, how do you know what went through and what didn't?

    A major point of persistence isn't that it's 100% reliable all the time. The point is that you know when failures occur, and what transactions they affected. If a machine fails, some transactions in-flight are going to fail. But the originating client is going to know what transactions those are. And for transactions that reported "succeeded", they can be confident that they really did.

    If your transactions aren't really hardened to disk, then this is what can happen in the event of a failure: for a bunch of in-flights you'll get a fail. But for another bunch of transactions, you'll get success returned to the originating client. But because the transactions aren't hardened at the time the response to the originator is returned, some of those "successful" transactions will have been committed, and others will have vanished.

    As for compensating transactions - they can be used to get mostly the same effect as truly atomic transactions. But they're avoided when possible for the reasons you state - you need application code to do it, and often this additional code is quite complex. A side effect of this is that there's a possibility of bugs in the compensating code (higher than some might think - because compensators are rarely tested as often as normal-path code).

    Finally - you didn't mention this Tom, but it's still an important issue to me. The issue comes back to defaults. As Mr. Mueller states, companies and their developers can make trade-offs and do risk assessments and try to strike a balance. However - what if the developers don't know that disk syncing is off? It's an easy thing to miss. In a perfect world, of course, you want to look at every configuration parameter and tune everything to suit your needs, and everyone should know the product's ins and outs intimately. But what if this one parameter is missed, and the app goes into production lacking the safety that the developers assumed was there?

         -Mike
  73. disk sync[ Go to top ]

    I'm sorry, I missed one. This is from the SwiftMQ Performance page:

    \Swift\
    Notes on Message Persistence
    The JMS specification only requires to write persistent messages to disk, it doesn't state anything on how it should be done. Unfortunately this prefers JMS providers in benchmarks which have a poor persistent store implementation, because the less a provider does during persistent operations, the faster he is.
    SwiftMQ's persistent store implementation is a page-oriented, checkpoint-based message database with write-ahead logging. That is, changes on pages are journalized on byte level and then written to a transaction log during commit before any data page is flushed to disk (write ahead). Thus, a consistent state can be reconstructed out of the transaction log at any time. The actual database (the page file) contains only those changes which have been logged to the transaction log. This is a state-of-the-art store implementation and standard for database management systems since decades. It is reliable on the first place and then fast, based on reliability, because it mainly writes to the fast transaction log
    \Swift\

    Someone somewhere seems to value reliability. What's really ironic is that this is a paper on performance that's also stressing reliability - and you've now helpfully added the following:

    \Swift\
    SwiftMQ 3.2.0 Router Standard
    java -server -Xmx512M
    transaction log size 200 MB
    disk sync disabled (force-sync="false")
    connection factory "plainsocket@router1" were used for non-recover tests
    connection factory "plainsocket_recover@router1" were used for recover tests
    \Swift\

    So after going on and on about persistent store reliability, and castigating competitors who aren't reliably writing ("Unfortunately this prefers JMS providers in benchmarks which have a poor persistent store implementation"), you add a little note saying: oh, by the way, these tests were run with just the sort of settings you were telling us were so bad! The paper says "It is reliable on the first place and then fast, based on reliability, because it mainly writes to the fast transaction log" - and then you purposely turn off that reliability in your own tests.

    This is priceless!!!!

        -Mike
  74. blazing performance[ Go to top ]

    Mike,

    I don't know much about the topic, but I'm curious about some of the numbers being discussed here by you and Andreas.

    Mike: Swift says the disk forces incur the following: "One disk sync takes between 20 and 50 milliseconds, concerning to the disk speed". This is patently, absolutely false. Even without EMC, with just a cheap RAID array you can easily do a lot better than 20-50 millis per force. Even without an array, on my HP-UX setup without RAID I get 10-20 millis per force. With an array, it's 3-10 millis per force.

    If you have 15k RPM drives, it requires 4ms just to get the platter to pass fully under (or over) a head, right? So if the head is pre-positioned correctly, you can position rotationally in a minimum average of 2ms and then flush (depends) and probably read-verify (depends). That's _if_ the head is already in the right place, which usually takes between 5-6ms on a 15k RPM high-end drive. (We have some IBM U160 drives rated at 5.3ms IIRC.) My understanding is that enterprise-class drives do a read-verify on the same rotation as the write itself. That means that your best-case average flush is 2ms+ (where the + is an unknown) and typical is 7ms+ (5ms for the head movement and 2ms for the average 1/2 rotation). Depending on how much information you are flushing, that + component could be very significant. (That's probably one reason why MSMQ limits messages to 4k max!) If you've got multiple sectors to flush (something that tx logs avoid), it would of course be worse. And you have to count API, OS, driver, cable, SCSI processor, cable, command set interpretation, etc. into the cost.
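
    For anyone who wants to check that arithmetic, here's a quick back-of-the-envelope version using the figures quoted above (the 5.3ms seek is the U160 rating mentioned; all numbers are illustrative, not measurements):

        // Back-of-the-envelope latency for a 15k RPM drive, using the
        // figures quoted above. Illustrative only.
        public class DiskLatency {
            public static void main(String[] args) {
                double rpm = 15000.0;
                double fullRotationMs = 60000.0 / rpm;        // 4.0 ms per revolution
                double avgRotationalMs = fullRotationMs / 2;  // 2.0 ms average wait
                double seekMs = 5.3;                          // rated average seek

                System.out.printf("best case (head in place): %.1f ms + transfer%n",
                        avgRotationalMs);                     // ~2 ms+
                System.out.printf("typical (seek + rotate):   %.1f ms + transfer%n",
                        seekMs + avgRotationalMs);            // ~7.3 ms+
            }
        }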

    Mike: With an array, it's 3-10 millis per force.

    Can you suggest why a RAID array would be faster? Have you ever heard the joke: If one pregnant woman can produce a baby in 9 months, how long would it take if nine women were pregnant? (It's not one month ;-)

    My guess is that the RAID array is caching. I know some cache writes, even when given a force-to-disk request. (Ever seen the batteries mounted on the RAID adapters to keep the RAM alive? You'd think they'd use flash RAM more for write-behind caching. I think I've got more "flash cache" in my MP3 player than most servers have on their RAID arrays ;-)

    One other possibility is that your 3ms force-to-disk waits actually don't force anything to disk because there is nothing in the queue to write, right? (I'm just trying to figure out where the big discrepancy comes from. Maybe HP-UX isn't as lazy with write-behinds as Windows, for example. And on that topic, what is with Solaris? It seems to only force anything to disk when its cache is full or you actually issue a "sync" ... so if you do something and lose power 24 hours later, you can lose that 24-hour-old work!)

    Lastly, Mike, you've apparently done quite a bit of work in this space ... why not download and try the SwiftMQ product? I'm pretty sure you can still grab an eval of it for free off their site (iit.de?) and tell us how it compares to what you've used / seen elsewhere. I'm curious.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  75. blazing performance[ Go to top ]

    Yep. The seek speed also matters, especially if you don't have a continuous file space. That's the reason why SwiftMQ maintains different free-pages chains to re-use pages with lower page numbers as often as possible. Just compare a sync after writes at the beginning of a file against a sync after writes at the end of a 1 GB file. This is a significant difference, at least on some platforms.
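
    The reuse policy described here - hand out the lowest-numbered free page first, so the sync'ed writes stay near the start of the file - can be sketched with nothing more than a priority queue. This is a toy illustration of the idea, not SwiftMQ's actual implementation:

        import java.util.PriorityQueue;

        // Toy sketch of a "lowest page number first" free-page chain,
        // illustrating the reuse policy described above.
        public class FreePageChain {
            private final PriorityQueue<Integer> freePages = new PriorityQueue<>();
            private int nextNewPage = 0;

            /** Reuse the lowest-numbered free page if any, else grow the file. */
            public synchronized int allocate() {
                Integer reused = freePages.poll();
                return (reused != null) ? reused : nextNewPage++;
            }

            public synchronized void free(int pageNo) {
                freePages.add(pageNo);
            }
        }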

    -- Andreas
  76. blazing performance[ Go to top ]

    \Purdy\
    If you have 15k RPM drives, it requires 4ms just to get the platter to pass fully under (or over) a head, right? So if the head is pre-positioned correctly, you can position rotationally in a minimum average of 2ms and then flush (depends) and probably read-verify (depends). That's _if_ the head is already in the right place, which usually takes between 5-6ms on a 15k RPM high-end drive.
    \Purdy\

    I haven't verified this, but it sounds about right. For transaction log type of applications, you typically try to dedicate the drive to the tran log. Once you do this, the write-forward-only aspect of the tran log comes into play: high-performance transaction log implementations want the tran log to be a contiguous file on disk, and they only write forward - they never seek on the disk, except when switching logs or checkpointing.
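
    As a rough illustration of that "write forward, never seek" behavior, here is a toy append-only log in Java that forces each record before acknowledging a commit. The class and method names are made up; real logs pre-allocate the file, batch records, and rotate between multiple log files:

        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.channels.FileChannel;
        import java.nio.file.Path;
        import java.nio.file.StandardOpenOption;

        // Toy append-only transaction log: records are only ever appended,
        // and force() hardens them before the commit is acknowledged.
        public class TranLog implements AutoCloseable {
            private final FileChannel channel;

            public TranLog(Path file) throws IOException {
                this.channel = FileChannel.open(file,
                        StandardOpenOption.CREATE,
                        StandardOpenOption.WRITE,
                        StandardOpenOption.APPEND);
            }

            /** Append one record and force it to disk before returning. */
            public void commit(byte[] record) throws IOException {
                ByteBuffer buf = ByteBuffer.wrap(record);
                while (buf.hasRemaining()) {
                    channel.write(buf);
                }
                channel.force(false);   // the expensive part this thread is about
            }

            @Override
            public void close() throws IOException {
                channel.close();
            }
        }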

    \Purdy\
     (We have some IBM U160 drives rated at 5.3ms IIRC.) My understanding is that enterprise-class drives do a read-verify on the same rotation as the write itself. That means that your best-case average flush is 2ms+ (where the + is an unknown) and typical is 7ms+ (5ms for the head movement and 2ms for the average 1/2 rotation). Depending on how much information you are flushing, that + component could be very significant. (That's probably one reason why MSMQ limits messages to 4k max!) If you've got multiple sectors to flush (something that tx logs avoid), it would of course be worse. And you have to count API, OS, driver, cable, SCSI processor, cable, command set interpretation, etc. into the cost.
    \Purdy\

    A lot depends on the specific combination of the operating system, the disk controller, and the physical disk itself.

    For a tran log, most of the data you're flushing is very small (on the order of a few hundred bytes max). When dealing with a JMS provider, if your tran log includes the published messages (or your message store is elsewhere but acts like a tran log), then the data size can be important. In the financial space where I do most of my work, we strongly advise people to keep their messages as small as possible. This is generally achieved because technical financial people have been aware of messaging performance for a long time, and small messages are built into the culture. Anyway, where I'm working now a 2K message is commonly referred to as "really big". If you allow much bigger messages than this and you're pumping out hundreds or thousands of messages a second during burst activity, you can quickly swamp your I/O channels, run out of disk space, or run out of memory.

    \Purdy\
    Can you suggest why a RAID array would be faster? Have you ever heard the joke: If one pregnant woman can produce a baby in 9 months, how long would it take if nine women were pregnant? (It's not one month ;-)
    \Purdy\

    I'm not a RAID expert, but I'd suggest part of it is because you're striping your data over multiple drives. For high-end RAIDs, they do indeed cache to memory (I know EMCs do this). But at this really high end there are a lot of different pieces coming into play that all contribute to both performance and reliability (fast disks, striped writes, RAM caches, highly optimized server->controller->disk links, etc).

    \Purdy\
    My guess is that the RAID array is caching. I know some cache writes, even when given a force-to-disk request. (Ever seen the batteries mounted on the RAID adapters to keep the RAM alive? You'd think they'd use flash RAM more for write-behind caching. I think I've got more "flash cache" in my MP3 player than most servers have on their RAID arrays ;-)
    \Purdy\

    From what I've seen, it depends on the array. The space is very mature and very well-defined, and there are many price points with increasing feature sets & performance.

    And your performance for something like a transaction log is going to be tightly tied to the whole I/O system. SCSI systems seem to do poorly with high forcing rates for small amounts of data; IDE, believe it or not, does better forcing tiny amounts of data. On a variety of non-RAID hardware, I've seen forces that range from 10 millis up to 60 millis (as I said, on a plain old disk on HP-UX my forces range from 10-20 millis). If someone else is using the disk, these numbers are going to go up a lot, as many more disk seeks are required. Windows vs. Linux vs. HP-UX vs. Solaris are likely to have small differences due to how the VMs work and the drivers involved.

    Certainly the raw specs of the disk are only one piece of the puzzle.

    \Purdy\
    One other possibility is that your 3ms force-to-disk waits actually don't force anything to disk because there is nothing in the queue to write, right? (I'm just trying to figure out where the big discrepency comes from. Maybe HP/UX isn't as lazy with write-behinds as Windows, for example. And on that topic, what is with Solaris? It seems to only force anything to disk when its cache is full or you actually issue a "sync" ... so if you do something and lose power 24 hours later, you can lose that 24-hour-old work!)
    \Purdy\

    Well, at the application level this can happen because of grouped disk commits, but it should be relatively rare.

    For most of the OS differences, I'd have to say "don't know". I do know that Solaris can force partial data, since I've done it, but I don't know the particulars or have any numbers handy.

    \Purdy\
    Lastly, Mike, you've apparently done quite a bit of work in this space ... why not download and try the SwiftMQ product? I'm pretty sure you can still grab an eval of it for free off their site (iit.de?) and tell us how it compares to what you've used / seen elsewhere. I'm curious.
    \Purdy\

    Actually, comparing a number of JMS implementations from a performance perspective is on my todo list, and Swift has been on the list. But I gotta get a release out the door first. :-)

    I can say I have no beefs with Swift products in general. I've heard some good things about them, and I know in some areas they're giving IBM MQ a run for its money. But their documentation rankles me in a number of areas, as you might have already guessed! And the cause of this rankling is that, invariably, I end up in extended meetings repeatedly with people who point to the vendor documentation and say "See, they're the experts and they say do it this way!" - even when I know that way is wrong for the application at hand. I've had this happen to me many, many times over the years. I know it's a losing battle to try to effect change (vendors are vendors, and their docs are inevitably going to be twisted a bit to match marketing & sales pressures).

    One of the things I want to investigate sometime in the future is raw partitions, to see what kind of speed increase you get as opposed to mounting a regular Unix file system. I know Oracle et al recommend this if you're looking for high performance, but I don't know how big of a pop you get from it.

         -Mike
  77. SwiftMQ's blazing performance[ Go to top ]

    \Mike\
    Do you know any application servers that drive 2PC asynchronously right now? I've only seen synchronous 2PC to date. The problem here is that with the current spec, an app server would need an awful lot of threads to handle multiple simultaneous transactions asynchronously. If someone's done it without creating a gazillion threads, I'd love to take a look at it.
    \Mike\

    I am not aware of any TP monitor that implemented support for async resource-managers - this was an optional part of the original XA specification. And, as you correctly pointed out earlier, the J2EE JTA specification did not include these optional asynchronous APIs.

    I too would like to see asynchronous resource managers, but at least with Java, multi-threading is easier. This in turn makes it easier for a well-written transaction monitor (and well-written resource managers) to aggregate simultaneous transactions into single forced writes, allowing multiple simultaneous transactions to commit in the time of one.
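
    A rough sketch of that aggregation idea (often called "group commit"): concurrent committers append their records under a lock, and a single disk force covers everyone who appended before it ran. The RawLog interface here is hypothetical; a real resource manager adds batching limits, timeouts, and error handling:

        import java.io.IOException;

        // Sketch of group commit: concurrent committers piggyback on one
        // disk force. RawLog is a hypothetical stand-in for the actual store.
        public class GroupCommitLog {
            interface RawLog {
                void append(byte[] record) throws IOException;
                void force() throws IOException;
            }

            private final Object lock = new Object();
            private final RawLog log;
            private long appended = 0;   // sequence number of last appended record
            private long forcedUpTo = 0; // sequence number covered by last force

            public GroupCommitLog(RawLog log) { this.log = log; }

            public void commit(byte[] record) throws IOException {
                long mySeq;
                synchronized (lock) {
                    log.append(record);
                    mySeq = ++appended;
                }
                synchronized (lock) {
                    if (forcedUpTo >= mySeq) return; // another thread's force covered us
                    log.force();                     // one force for all records so far
                    forcedUpTo = appended;
                }
            }
        }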

    Tom, BEA
  78. Performance reports[ Go to top ]

    \Mueller\
    Sure. Go, tell me the URL where I can see such a detailed performance profile from another JMS vendor. I would be happy with a single platform. You'll not find anything, even not the big ones like IBM or Tibco or BEA. So what you are talking about?
    \Mueller\

    Yeah, you're right, I spent 20 minutes doing google searches, I couldn't find any detailed performance profiles on any JMS vendors. Oh, except for these:

    http://www-3.ibm.com/software/ts/mqseries/txppacs/ip11.html
    http://wwws.sun.com/software/products/message_queue/wp_JMSperformance.pdf
    http://www-3.ibm.com/software/ts/mqseries/txppacs/mp77.html
    http://www7b.boulder.ibm.com/wsdd/library/techarticles/0111_dunn/0111_dunn.html
    http://dev2dev.bea.com/articles/zadrozny_002.jsp
    http://dev2dev.bea.com/resourcelibrary/whitepapers/WL_JMS_Perform_GD.jsp
    http://e-docs.bea.com/wls/docs81/ConsoleHelp/jms_tuning.html
    http://www.sonicsoftware.com/products/sonicmq/benchmarks/index.ssp
    http://www.sonicsoftware.com/sonic/sonicmq/whitepapers/benchmarking_ebus_mes_prov.pdf
    http://www.my-channels.com/developers/nirvana/documentation/sonicVnirvana.pdf

    Again, this is a brief list from a short search. There are also many excellent independent performance reports, like this:

    http://wstonline.bitpipe.com/data/rlist?t=soft_10_50_20&orgtypegrp=ALL

    Several of the reports also include detailed information on things like CPU utilization and disk utilization. My brief searches didn't find many multi-platform reports (except for IBM, which has many, many individual reports for different platforms), but many of the included links at least did their tests on real Enterprise hardware, not just a dual-Pentium III PC.

    TIBCO is very closed-mouthed - I couldn't find anything useful on them.

         -Mike
  79. Performance reports[ Go to top ]

    > Yeah, you're right, I spent 20 minutes doing google searches, I couldn't find any detailed performance profiles on any JMS vendors. Oh, except for these:

    >
    > http://www-3.ibm.com/software/ts/mqseries/txppacs/ip11.html
    > http://wwws.sun.com/software/products/message_queue/wp_JMSperformance.pdf
    > http://www-3.ibm.com/software/ts/mqseries/txppacs/mp77.html
    > http://www7b.boulder.ibm.com/wsdd/library/techarticles/0111_dunn/0111_dunn.html
    > http://dev2dev.bea.com/articles/zadrozny_002.jsp
    > http://dev2dev.bea.com/resourcelibrary/whitepapers/WL_JMS_Perform_GD.jsp
    > http://e-docs.bea.com/wls/docs81/ConsoleHelp/jms_tuning.html
    > http://www.sonicsoftware.com/products/sonicmq/benchmarks/index.ssp
    > http://www.sonicsoftware.com/sonic/sonicmq/whitepapers/benchmarking_ebus_mes_prov.pdf
    > http://www.my-channels.com/developers/nirvana/documentation/sonicVnirvana.pdf
    >
    > Again, this is a brief list from a short search. There are also many excellent indepedent performance reports, like this:
    >
    > http://wstonline.bitpipe.com/data/rlist?t=soft_10_50_20&orgtypegrp=ALL

    Well, it seems that you don't want to follow my point. Our perf profile covers the whole JMS functionality with 14'700 single load tests, incl. rollback/recover, excl. XA/ASF, for our product, and shows you how SwiftMQ performs under it. I haven't seen anything comparable yet. The above links (I've checked them all; most of them I know already [we are a JMS vendor]) are either marketing papers (product A vs product B), general tuning guides, performance packs (MQ) or even (the last link) just a search result of marketing WPs (which, again, I already know).

    > but many of the included links at least did their tests on
    > real Enterprise hardware, not just a dual-Pentium III PC.

    Look, this is the point. It's just not interesting for you what the benchmark tells you. What you like to do is put a pot on my head by continuously suggesting that we buy our hardware from a computer shop around the corner and that we are not serious. That's what you like to suggest with your posting. The truth is that we have of course performed our benchmarks on several platforms. The box above, a Dell dual 1 GHz PIII on Linux, simply had the best results. Better than an 8-CPU Sun box.

    -- Andreas
  80. SwiftMQ's blazing performance[ Go to top ]

    \Mike\
    The problem here is that with the current spec, an app server would need an awful lot of threads to handle multiple simultaneous transactions asynchronously. If someone's done it without creating a gazillion threads, I'd love to take a look at it.
    \Mike\

    You know what a thread pool is?? ;-)

    -- Andreas
  81. Async XA[ Go to top ]

    \Mueller\
    For the JTA stuff. I don't think it's forbidden to drive the 2PC async from a Tx manager as long you access the XAResource from the same thread, means, dispatch all prepares on different threads and wait until all have been completed. Then do the same with commit. I can't imagine that the JTA spec forces to drive the 2PC synchronously. That would be too slow. If you have other infos, please post it.
    \Mueller\

    Here's what I've found from going through the JTA spec in detail. This is from the 1.0.1B version of the spec...

    First off, JTA removes all threading restrictions that were present in the X/Open spec. From the JTA spec on differences from the X/Open spec....

    'The DTP concept of "Thread of Control" maps to all Java threads that are given access to the XAResource and Connection objects. For example, it is legal (although in practice rarely used) for two different Java threads to perform the start and end operations on the same XAResource object.'

    and further...

    '3.4.3 Thread of control
    The X/Open XA interface specifies that the transaction association related XA calls must be invoked from the same thread context. This thread of control requirement is not applicable to the object-oriented component based application run-time environment, in which application threads are dispatched dynamically at method invocation time. [...] Thus the XAResource interface specified in this document requires that the resource managers be able to support the two-phase commit protocol from any thread context.'

    I additionally don't see any references in the spec to serial access to the resources - in fact, the JTA spec is pretty sparse in general - you really have to read both JTA and the X/Open spec to understand the real specification.

    In any case, it appears you can prepare and commit resources in parallel - but given the current JTA spec, this can only be done by multiple threads within the global transaction manager. This can put a high burden on the TM, which probably already has a lot of threads doing work :-).

    For example - let's say 10 transactions, each touching two resources, are ready to prepare at a given moment. It seems to me that if the TM wanted to process them all at once in an async manner, it would need 30 threads to accomplish it - one per resource, plus a master thread per transaction waiting for those resource threads to complete. That's compared to 10 threads in a non-async model. That's alotta threads! If there's another way to do it that wouldn't need three threads per tran, I'd be very interested in hearing about it.

    Compare this to the X/Open model. In this model, the TM can effectively say:

       handle1 = resource1.prepare (ASYNC);
       handle2 = resource2.prepare (ASYNC);
       xa_complete (handle1);
       xa_complete (handle2);

    (Note: it's not syntactically complete, but written with clarity in mind).

    In the above, the two prepare calls fire off their requests but return immediately without waiting for a response. The complete calls then wait for each response in turn. This means both resources are preparing in parallel, and the wall clock time for this is effectively the length of the longest prepare. The same can be done on commit.

    Given that the resources are almost always remote, this means only one thread is required within the TM process. In this model, the individual XA resource "client" drivers would need to fire off prepares (or commits) to their servers without waiting for a response, and then gather the responses when complete is called. This gathering could happen with a single thread in the XA client driver, since the driver is presumed to know the internal protocol. The TM, on the other hand, can't use a single thread, because its only knowledge of the resource is via the XAResource and connection/session interfaces, and in the current spec that prepare call will always block to completion.
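
    For what it's worth, here's roughly what the thread-pool workaround looks like under the current spec. XAResource and Xid are the real javax.transaction.xa types; everything else is a sketch, and note that each blocking prepare() still ties up a pool thread - exactly the overhead the X/Open async calls would have avoided:

        import java.util.ArrayList;
        import java.util.List;
        import java.util.concurrent.ExecutorService;
        import java.util.concurrent.Future;
        import javax.transaction.xa.XAResource;
        import javax.transaction.xa.Xid;

        // Sketch of parallel prepare under current JTA: each prepare() call
        // blocks, so the TM burns one pool thread per resource to overlap them.
        public class ParallelPrepare {

            /** Returns true if every resource voted XA_OK or XA_RDONLY. */
            static boolean prepareAll(ExecutorService pool, Xid xid,
                                      List<XAResource> resources) throws Exception {
                List<Future<Integer>> votes = new ArrayList<>();
                for (XAResource r : resources) {
                    votes.add(pool.submit(() -> r.prepare(xid))); // blocks a pool thread
                }
                for (Future<Integer> vote : votes) {
                    int v = vote.get();  // wall-clock time ~= slowest prepare
                    if (v != XAResource.XA_OK && v != XAResource.XA_RDONLY) {
                        return false;    // a real TM would now roll everything back
                    }
                }
                return true;
            }
        }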

        -Mike
  82. Async XA[ Go to top ]

    \Mike\
    In any case, it appears you can prepare and commit resources in parallel - but given the current JTA spec, this can only be done by multiple threads within the global transaction manager.
    \Mike\

    Look, this is another point (I read the JTA spec as well, since I implemented SwiftMQ's XA). Your initial rant was that async stuff is forbidden. And now you tell us it's not, so you were wrong. But instead of just saying that you were wrong, you try to open another issue, the threads. But that wasn't the issue.

    \Mike\
    This can put a high burden on the TM, which probably already has a lot of threads doing work :-).
    \Mike\

    Usually one would use a thread pool to minimize the number of threads. A dedicated new thread per operation is just stupid.

    -- Andreas
  83. Async XA[ Go to top ]

    \Mueller\
    Look, this is another point (I read the JTA spec as well since I have implemented SwiftMQ's XA). Your initial rant was that async stuff is forbidden. And now you tell it's not, so you were wrong. But instead of just saying that you were wrong you try to open another issue, the threads. But that wasn't the issue.
    \Mueller\

    My memory was inaccurate. I checked the spec, found I was wrong, and shared the relevant information with this forum. You make that sound like a bad thing.

    \Mueller\
    Usually one would use a threadpool to minimize the number of threads. A dedicated new thread per operation is just stupid.
    \Mueller\

    Yes, I understand this. Do you agree that thread pool or no, you're increasing the overall threading burden on the server?

    Do you also understand that if JTA directly supported async calls, it would reduce this burden on the app server? Don't you think this would be a good thing?

    It seems to me you're more interested in besting me in some manner than discussing the issues and (gasp) trying to suggest improvements.

        -Mike
  84. ¿MQ uses DB2?[ Go to top ]

    /Purdy/
    Even MQ Series uses DB2 inside it.
    /Purdy/

    This is not true.
    MQ has its own logging, transaction management, storage, data file management, etc.
  85. ¿MQ uses DB2?[ Go to top ]

    Cameron: Even MQ Series uses DB2 inside it.

    Dino: This is not true. MQ has its own logging, transaction management, storage, data file management, etc.

    According to one of the MQ Series developers (who should know), the MQ Series product embeds a version of DB2. (I didn't mean to imply that it requires a separate copy of DB2.)

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Easily share live data across a cluster!
  86. Thought this might interest many on this thread.

    Larry Jacobs contributed to week 6 of the Middleware Architecture Series on "Distributed Transactions". He discusses the global consistency promises of two-phase commitment protocols and transactional messaging systems and how architects can select the right technology for the application.

    Check out the Middleware Architecture Series
    Distributed Transactions: What you need to know to stay out of trouble

    Middleware Arch Series: http://otn.oracle.com/middleware

    -
    Sudhakar Ramakrishnan
    Oracle Corp.
  87. \Ramakrishnan\
    Thought this might interest many on this thread.
    Larry Jacobs contributed to week 6 of the Middleware Architecture Series on "Distributed Transactions". He discusses the global consistency promises of two-phase commitment protocols and transactional messaging systems and how architects can select the right technology for the application.
    \Ramakrishnan\

    Well, here's what I got out of that article:

    Number one point - Use Oracle products for everything!!!

    According to the article, you can avoid 2PC by having the database and the messaging provider one and the same. Um - this only works if every single database/messaging combination in my system happens to be Oracle. Not bloody likely!

    If my database and messaging system aren't Oracle, we're back to the in-doubt transaction problem.

    The article's bottom line: "In just about every case you and your system will be better off without any distributed two-phase commit". The author lost a tremendous amount of credibility with this one. He's advocating moving systems towards messaging-in-the-middle - and thereby completely losing the "Isolation" part of ACID you get with true global transactions. Using the example from the article, your New York system will _always_ be out of sync with your San Fran one (because for any given transaction, San Fran will be committed and just sending the message over to NY; NY will commit later, when the message comes in).

    On point-in-time recovery: almost complete nonsense. In most recovery scenarios, the transaction manager has sufficient information to coordinate recovery. If the tran manager or one of the XA resources becomes "corrupted", as the article says, you're pretty well screwed whether you're using his messaging solution or a 2PC solution. In the end, you've still got two databases, one of them is inconsistent due to "buggy" app code, and some poor schmoe is going to have to manually slog through both sides to figure out what to do.

    On availability, the article says "Failure isolation, and therefore availability, distinguishes this message-based system from the earlier 2PC system. The participants can continue executing even when another participant fails....". This makes it sound like this messaging solution is a drop-in replacement for 2PC. It's not. This comes back to isolation, atomicity, and consistency. In a typical 2PC situation, you're doing 2PC because the resources involved must be in sync at all times. To go out of sync is very, very bad. But in the messaging scenario, the author effectively boasts that you can keep doing transactions on database A while database B is down. If you're relying on consistency, this is unbelievably bad - when database B comes back, it will be horrendously out of date with database A! This may not seem like a problem, until you realize that data likely isn't just flowing from A to B, but may be going from B to A as well. If the data flow is bi-directional, the inconsistencies will eventually leave both databases in an inconsistent state.

    There are some real problems that the author does a good job of pointing out - such as held database locks for in-doubt transactions in the case of a failure, and performance problems. But a responsible author would advocate hardening the hardware for the in-doubt case (make sure your friggin' hardware is highly available!), and offer suggestions on the performance side (memory-caching disk arrays, the possibility of async XA, etc). To imply that you can drop a messaging solution in place of a 2PC solution is just simply wrong.

         -Mike
  88. It's good to read some sense on these forums, thanks Mike. However, I can't find the prohibition you mention in JTA against preparing/committing all the XAResources in parallel. Since I've been considering implementing exactly that, I'd appreciate a more specific reference.
  89. \Jencks\
    It's good to read some sense on these forums, thanks Mike. However, I can't find the prohibition you mention in JTA against preparing/committing all the XAResources in parallel. Since I've been considering implementing exactly that, I'd appreciate a more specific reference.
    \Jencks\

    I don't have the spec handy so I can't give you a definitive reference. However, from memory - the original X/Open spec has explicit support for asynchronous 2PC. Fully within the presented framework, you can fire off asynchronous requests and then wait for all resources to return responses (sort of a select/poll equivalent at a much higher level of abstraction).

    The Java spec intentionally does not include async support - it says so in the text. The reasoning behind this isn't at all clear to me (it seems as ambiguous and empty as the single-threading of Sessions in JMS...).

    Given that - I suppose it's possible for a global transaction manager to create specialized threads to fire off transactions, and have a coordinating thread wait for those threads to receive their synchronous responses from the XA resources. But I don't recall offhand if the JTA spec allows or disallows this (or, more likely, is silent on the issue). I'll have to have a closer look at the spec later to see if I can track down anything specific...

        -Mike
  90. If time is relative...[ Go to top ]

    So much of the needs and decisions seem to depend on the specific real-world application. There seems to be a common thread of balancing speed and reliability.

    If we take the example of a large, active bank account where the transaction sequence order is a necessary component of the data save, the efficiency savings that Mr. Jacobs describes seem to be a function of network latency.

    Considering that database vendors have been working for 30+ years on the reliability of data storage and retrieval in the context of the TPC races, is the message service's sole function (in this specific example) one of reducing network latency?

    What type of data store model do banks use for large, active accounts with many independent users that create hundreds to thousands of transactions per day?
  91. True to form[ Go to top ]

    Once again a "discussion" on TSS has descended into loudmouthed bickering, name calling, ad hominem attacks, grandstanding, boasting, and spreading of rumors.

    True to form.