Article: Concurrency Testing in Java Applications



  1. Testing and optimizing Java code to handle concurrent activities is difficult without test automation. Even with test automation, being able to correlate the test activity from the client side to observations of thread, memory, object and database connection use on the server side is difficult at best. In this article, Frank Cohen describes methods for concurrency testing in Java applications and shows a new technique to correlate what a Java application server is doing on the server side while a load test automation tool drives a test on the client side.
    There are some things you should expect in today's test automation tools. For instance, many test automation tools now use unit tests in ways that help surface concurrency problems. Each unit test operates a particular method or class. Stringing unit tests together into a sequence forms a functional test of a business flow. You should also expect the test tool to provide libraries that extend unit tests with protocol handlers that speak the native protocols of the application or service under test, including HTTP, SOAP, Ajax, REST, email, and custom network protocols. Additionally, you should expect the test tool to extend functional unit tests with a Data Production Library (DPL) that provides test data as input to the unit test and data to validate the service or class response. It is reasonable to expect a test automation framework to wrap unit tests into functional tests that may be operated concurrently to identify concurrency issues in an application. I coined the term TestScenario in my open-source test tool (TestMaker) to name this wrapper. A TestScenario operates the unit tests as a functional test by running a sequence of unit tests once, as a load test by running sequences of the unit tests in concurrently running threads, and as a service monitor by running the unit tests periodically. The TestScenario gives us the flexibility to try multiple sequences of business flows concurrently.
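The three ways of operating the same unit tests described above (once, concurrently, periodically) can be sketched in plain Java. This is only a minimal illustration under assumptions: `ScenarioSketch` and its method names are hypothetical and are not TestMaker's API, and each "unit test" is reduced to a `Runnable` that throws when it fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the TestMaker TestScenario implementation.
final class ScenarioSketch {

    /** Functional test: run the sequence of unit-test steps once, in order. */
    static void runOnce(List<Runnable> steps) {
        for (Runnable step : steps) {
            step.run();
        }
    }

    /** Load test: run the same sequence in concurrent threads; returns the number of failed threads. */
    static int runAsLoad(List<Runnable> steps, int threads, int repetitions) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<?>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            futures.add(pool.submit(() -> {
                start.await(); // release all workers together to maximise overlap
                for (int i = 0; i < repetitions; i++) {
                    runOnce(steps);
                }
                return null;
            }));
        }
        start.countDown();
        int failures = 0;
        try {
            for (Future<?> f : futures) {
                try {
                    f.get();
                } catch (ExecutionException e) {
                    failures++; // a step threw in this worker
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return failures;
    }

    /** Service monitor: run the sequence periodically until the returned scheduler is shut down. */
    static ScheduledExecutorService runAsMonitor(List<Runnable> steps, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Note: scheduleAtFixedRate stops rescheduling if a run throws.
        scheduler.scheduleAtFixedRate(() -> runOnce(steps), 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        List<Runnable> flow = List.of(calls::incrementAndGet, calls::incrementAndGet);
        runOnce(flow);
        int failed = runAsLoad(flow, 4, 10);
        System.out.println("step executions: " + calls.get() + ", failed threads: " + failed);
    }
}
```

A load run like this only exposes a concurrency problem if the steps actually share state on the system under test; the start latch is there to make the threads overlap rather than run back to back.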

    Threaded Messages (25)

  2. I think the title is rather misleading. This article is about using a specific tool that has really nothing to do with concurrent testing (which is a complex problem, but is not discussed). Just my 2 cents, Nikita Ivanov. GridGain - Grid Computing Made Simple
  3. being able to correlate the test activity from the client side to observations of thread, memory, object and database connection use on the server side is difficult at best.
    Difficult? Yes, very difficult when using the wrong tool(s). A distributed tracing and profiling solution can make this extremely easy, making the correlation of server activity and resource consumption a breeze even across many nodes within a grid computing platform.
    GridGain: Parallel & Remote Tracing
    Resource Metering: Across Clusters, Hosts, Processes and Threads
    Coherence: DataGrid Traffic Analysis
    What I find so amusing about this article, which is clearly poorly named, is that none of the tools mentioned could even pinpoint a concurrency problem and locate the code block or external resource interaction patterns causing the performance problems. The tools cannot even measure monitor contention (blocking), a basic requirement for any Java performance & concurrency analysis tool, never mind database concurrency problems.
    Database Concurrency
    Database Lock Contention
    regards, William
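For readers unfamiliar with the term, monitor contention (blocking) can be observed from inside a JVM with the standard `java.lang.management` API. The sketch below is only an illustration of the measurement itself, not of how JXInsight or any other tool in this thread implements it; `ContentionProbe` and its method name are hypothetical. It forces one worker thread to wait on a monitor and then reads that thread's blocked count and blocked time.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of reading monitor-contention counters for one thread.
final class ContentionProbe {

    /** Returns {blockedCount, blockedTimeMillis} for a worker that was forced to wait on a monitor. */
    static long[] measureBlockedWorker() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (mx.isThreadContentionMonitoringSupported()) {
            // Without this, getBlockedTime() returns -1; the blocked count is still maintained.
            mx.setThreadContentionMonitoringEnabled(true);
        }
        Object lock = new Object();
        CountDownLatch acquired = new CountDownLatch(1);
        CountDownLatch inspected = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                acquired.countDown();
            }
            try {
                inspected.await(); // stay alive until the counters have been read
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "contended-worker");
        try {
            synchronized (lock) {
                worker.start();
                while (worker.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(1); // wait until the worker is really blocked on the monitor
                }
                Thread.sleep(20); // hold the lock a little longer so blocked time is measurable
            }
            acquired.await();
            ThreadInfo info = mx.getThreadInfo(worker.getId());
            return new long[] { info.getBlockedCount(), info.getBlockedTime() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            inspected.countDown();
        }
    }

    public static void main(String[] args) {
        long[] result = measureBlockedWorker();
        System.out.println("blocked count=" + result[0] + ", blocked ms=" + result[1]);
    }
}
```

These counters say which threads waited and for how long; attributing the wait to a code block or to a particular request is the harder part that the comment above is about.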
  4. I think concurrency is ambiguous, and Frank in this case shows how to test concurrency that is mostly related to application load. This is different from, say, true multithreading concurrency testing. Load concurrency and scalability can be tested in multi-process environments as well, and such tests do not truly test the multithreading safety of the application. This low level of testing can be done with TestNG at a nice level of granularity, though more granular controls and more deterministic multi-threaded tests would be welcome. Good article btw Frank. Ilya
  5. Some thoughts

    The title might have been better as "A call for help to the load testing and concurrency analysis tools vendors on behalf of Java developers needing to do concurrency testing." Don't we developers deserve better tools to debug the concurrency problems we are encountering in building REST-ful and interdependent applications? The tools should work in our existing environments without requiring us to change platforms. I test other people's applications, so normally I can't put Coherence, GridGain or others into the system I am testing (even if my customers would greatly benefit from getting good test instrumentation when using these platforms). I suspect that most developers and testers are in this situation. The article assumes that you are not on these platforms and still need to do concurrency testing. JXInsight looks good as an agent-based concurrency analysis tool. However, there is no load generating component to it. One assumes that the server is already under load. My article argues for a way to correlate the activity at the server side when you generate load using a load testing tool. Finally, I guess I have to live with shameless plugs from GridGain since I'm guilty of it myself. Of course, that doesn't make either of us right. -Frank Cohen
  6. Enjoyed the Article

    Frank, I enjoyed your article and appreciated how you showed the value of Glassbox. I'm excited at the opportunity for organizations to put together solutions based on open source software, which they can use pervasively without licensing fees or concern that a small vendor might go out of business. As a factual note, of course Glassbox does detect thread contention and other database problems. However, we are delivering a tool that non-experts can benefit from, and that experts can extend to provide holistic management. Best Regards, Ron
  7. Re: Enjoyed the Article

    Hi Ron,
    I'm excited at the opportunity for organizations to put together solutions based on open source software, which they can use pervasively without licensing fees or concern that a small vendor might go out of business
    Do you say this every morning when looking in the mirror? Ron, no matter how many times you repeat yourself (over the last 2 years) it will not reflect or impact reality. How long are you going to play the open source card when this is all a charade? Most customers choose products that ** work ** and are under active development. Active, by the way, does not equate to a code base in beta for more than a year with very little improvement in feature set or quality. Licensing generally comes into the equation when these basic conditions are met.
    As a factual note, of course Glassbox does detect thread contention and other database problems.
    Please tell us all how this is accomplished so that we understand what you actually mean by this.
    However, we are delivering a tool that non-experts can benefit from, and that experts can extend to provide holistic management.
    Non-expert? I know many Java and AOP "experts" who do not understand what is meant by "thread contention" and "database problems", but I will wait for your response to the above before going into more detail. regards, William
  8. Finished

    I will let users decide for themselves the quality and improvements in our offerings. Over & out, Ron
  9. Re: Enjoyed the Article

    I'm excited at the opportunity for organizations to put together solutions based on open source software, which they can use pervasively without licensing fees or concern that a small vendor might go out of business.
    I would agree here that statements like that won't impress the audience of TSS, in my humble opinion. I would also agree with others that although the article is interesting, it should have been made clear by the editorial staff what it is about and what it is not. There's absolutely nothing wrong with promoting a product, but hiding it under the wrong title is another matter. Just my 2 cents, Nikita Ivanov. GridGain - Grid Computing Made Simple
  10. Product promotion

    I like the article and the discussion in this thread, but don't you think that not only the article but the whole discussion is a "hidden" promotion of products like JXInsight and GridGain? Why do you post a message with a link to your product, including a nice slogan? Mirko
  11. Re: Product promotion

    Mirko, Yes, but there are links referencing other products, including Coherence and iTKO Lisa, which you failed to mention. But the difference here is that (1) it is open in its intention and (2) it is relevant to the contentious points within the article. I do try to only post replies on topics when I feel that the content and conclusions are largely subject to debate, which was the case here. There is no hidden promotion here, and my language is to the point and certainly not subtle. If I was really here to promote my product and not my true honest opinions on performance management (which are naturally reflected in my product's design and blog postings), then I would be referring to the other posters as "the honorable gentleman" and dropping subtle clues to their underlying motives. I leave that to others who apparently have much more experience and patience. regards, William
  12. Re: Product promotion

    William, I have much respect for your knowledge and enjoy reading your comments and blogs. What I wanted to say is that posting messages and articles with links to products is very good "hidden" marketing, and that there is nothing wrong with this. Mirko
  13. Re: Product promotion

    Hi Mirko, Agreed. The links do generate extra traffic to my blog and increase awareness of JXInsight, though probably not as much as one would imagine. I am always surprised to find from our web stats that the majority of our repeat visitors come from our commercial competitors. Apparently we are doing a good job living up to the company name of J*INSPIRED*. I would like to say that I reference these articles because every time I come across a problem in the field that took a little longer to solve than the usual "William" time, I bring the solution back into the product and publish an article. So both the blog and the product represent a journal of my experiences with real-world (and very difficult) performance problems as well as my current technology interests such as grid computing. regards, William
  14. Hi Mirko and William: What do you guys think about the thesis of the article: that test automation needs load generation activities correlated to back-end observations to deliver actionable knowledge? Is this important to your development and test efforts? How would you accomplish such correlation with today's tools? What would you advise the tools vendors to do? The article is my statement to the tools vendors that they need to do more to deliver useful, actionable knowledge. -Frank
  15. Load generation

    Hi Frank, I can agree with you that it is difficult to test concurrency problems. If we troubleshoot such problems for our customers, the process depends on the problem and the tools we have:
    - If it's a synchronization or deadlock problem in production, then we use Java core dumps to analyze the problem, or we use tools like dynaTrace diagnostics for analysis. One advantage of dynatrace is that you get every single request path, including the request and method parameters (of course this depends on filter settings). This is very useful if the synch/deadlock depends on the input data.
    - If we cannot do a production analysis (or maybe the application is not yet in production), we try to reproduce the problem in pre-production, using load testing tools (JMeter, Loadrunner, SilkPerformer, etc.). The problem is that it is sometimes very difficult to get the right use cases and data. It would be very nice to "reverse-engineer" load-test cases and data from production, or to use Apache log files, Tomcat Request Dump Valve output, and similar logs to get better test cases. If the problem can be reproduced, then it is often very easy to analyse the root cause with tools like dynaTrace, JXInsight, PerformaSure etc.
    Correlating diagnosis data directly with the load test is a great feature and is already realized with SilkPerformer/dynaTrace, and I think Loadrunner also has an integration with the HP/Mercury analysis tools (Topaz). It would be nice to have "open" standardized interfaces for load-test tools, to plug in diagnosis/monitoring tools. Mirko
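As a lightweight complement to the dump-based deadlock analysis described above, a running JVM can also be asked directly whether any of its threads are deadlocked, using the standard `ThreadMXBean`. The following is a minimal sketch under assumptions, not how dynaTrace or any other tool named here works: `DeadlockProbe` and its methods are hypothetical names, and the deliberately deadlocked daemon threads exist only to give the probe something to find.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of in-process deadlock detection.
final class DeadlockProbe {

    /** Returns one line per deadlocked thread; empty when the JVM reports no deadlock. */
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            report.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    /** Demo only: starts two daemon threads that take two locks in opposite order. */
    static void startDeadlockedPair() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon("demo-1", a, b, bothHoldFirstLock);
        startDaemon("demo-2", b, a, bothHoldFirstLock);
    }

    private static void startDaemon(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // make sure each thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds this lock
                }
            }
        }, name);
        t.setDaemon(true); // do not keep the JVM alive because of the demo
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlockedPair();
        Thread.sleep(200); // give the threads time to block
        describeDeadlocks().forEach(System.out::println);
    }
}
```

A check like this could be polled during a load test so that a deadlock is reported while the test is still running, instead of being inferred afterwards from stalled response times.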
  16. Re: Load generation

    Hi Mirko, Taking a core dump, which is basically a diagnostic image, is a fact of life with some production environments and technology stacks, but I hope you would agree that having to do this in production is a failure to some degree, though not necessarily a reflection of the testing team's efforts, as they might simply not have had access to the right tooling, a comprehensive test suite, or all the time in the world for running different mixes. I will disagree on one point you raise, which I suspect reflects more your product usage and who you have been listening to, and that is method argument capture. The benefit of method argument capture, which by the way JXInsight has supported since May 2006 and much more so with JXInsight Diagnostics, is overstated in the context of concurrency testing, unless of course you are a vendor who has more experience in developing load generators and less in production environments, where at the end of the day concurrency boils down to contention for particular system and software resources. The art here is trying to understand the situations (transactional mixes) that cause the contention behavior to alter drastically. In software performance engineering this is called the system execution model. Load generator vendors are naturally obsessed with parameters because they are trying to simulate real-world behavior, but from a software performance engineering perspective this is not paramount, at least not for all method invocations, and when it is, it is mainly at the entry point. At this stage most engineers do recognize that contextual tracing (which we pioneered in the Java space) is important as a means to identify the particular component instance the software is interacting with, such as the SQL used in creating the prepared statement or the URL for the inbound or outbound servlet request.
    But let's be honest here: how many times has anyone come across a case where sending in the value of 2 or 4 drastically altered the performance of a web application? What are you going to do? Tell all users not to use 2 or 4 as query parameters? If the parameter values do alter the software's behavior, it is generally because they alter the resource interaction (SQL statement) and less so the execution path, though there will always be a case to prove me wrong, but this has rarely happened (in fact never in my experience). Parameter values always leave a software execution trail (path), which is a concept we introduced more than 5 years ago, and it is better to focus on the path (software interaction) and less on the parameter set (which could be huge), as an execution path is normally created in the software for a reason. Understand the execution path and then tune for this case whilst being careful not to add overhead to more common execution paths. regards, William
  17. Method argument capture

    Hi William, I agree that the JXInsight timeline analysis is a very helpful feature to analyze the system execution model and to identify extensive resource consumption, but you have to remember: many roads lead to Rome... there is not just the "William road" :-) Mirko
  18. Re: Method argument capture

    Well, I think I will be proven right next year when more products release similar timeline analysis views. We will then have to read how such companies set a "new standard" once again in performance management via their own "innovation" ripped completely from my blog. Two weeks ago, while attending the Software Test & Performance conference in Boston, I met up with a customer of DynaTrace who told me that they had requested that the "JXInsight Timeline" be added. They were only using the product in development, so maybe their requirements are different. By the way, I just published a blog entry relevant to our current discussion of concurrent performance analysis: "Benchmark Analysis: Guice vs Spring". regards, William
  19. Tool Comparison

    Hi William, the discussion shows that it is very difficult to compare monitoring/profiling tools because of the complexity of performance monitoring/tuning. I am thinking about designing an evaluation matrix for this kind of tool. As I don't like feature comparison matrices, I would prefer to define a list of analysis/troubleshooting use cases which the tool has to solve. A benchmarking application for these tools would also be nice, to measure the "real" overhead for different analysis levels. I've experienced serious differences in overhead with different tools, and almost all state that they have the lowest overhead in the market. Another point is that overhead should be measured not just in clock time or CPU time but also in memory. I know a tool that has low performance overhead but a huge memory overhead (> 30%). What do you think about it?
  20. Re: Tool Comparison

    Overhead is a term abused by non-technical sales and marketing staff and by engineers with questionable integrity. Last year I met with some sales people who had left Precise and moved to you know who. When I asked them why they moved, they stated that this new company and product had a much better proposition because of the low overhead. Well, I was taken aback by this because they had previously stated that Precise In-depth for J2EE had a very low overhead (less than 5%). I asked whether it could get even lower than 5%, and they said that was not entirely the case and that this new offering was indeed 5%. Would I trust these guys? NO. The question is how you accurately measure overhead when overhead is relative to the performance of the existing system. An application with severe performance issues in terms of database response times is going to make any CPU overhead added by instrumentation look extremely low, like less than 5%. On the other hand, a system that is highly tuned and CPU bound is going to show severe performance degradation, unless of course the application is not using all available CPUs and you offload the work onto the free CPU via a separate queue and workers. The overhead is there, but now it is hidden from the response times; if the application does change its workload, then this overhead might become more obvious and much higher than that of another tool that adds a smaller overhead to the calling thread. I have seen vendors claim zero overhead just by using this technique on a test kit box, all carefully crafted to fool the customer. Moving the overhead to another box is another form of hiding the overhead, though a little more transparent. At least the response times are less impacted, though you will find the data collection is much flatter and less contextual, as some of the analysis work is executed outside of the calling context.
    All performance engineers and architects make trade-offs between working memory size and performance overhead. This is a difficult task, especially when applications are so diverse. I understood this from the very start of the development of JDBInsight and JXInsight and added system properties at practically every major execution location in the code where there was a trade-off to be made. Today we ship with a properties file (jxinsight..config) that contains 4,000 system properties. About 500 of these can tip the scale between memory footprint and performance overhead. I tell customers to initially run with the defaults for prod, dev, or test until they have a good understanding of the software execution model of the application being managed. When they are about to start pre-production testing, we work through the snapshots with them, identifying what data is really of interest in a production environment in terms of service management and incident management (including problem management), and then we start turning off features. Of course, all these system properties add complexity, especially if you do explore each one in turn. The DynaTrace customer I talked to said that "if you have an IQ over 150, the tool [JXInsight] can do just about anything you want it to - it is very powerful". I would like to think he exaggerated a little on the IQ requirement, but I do recognize that giving so much power (extensibility, data collection coverage) to the normal user has its drawbacks. But I do not think we need to dumb down a tool so much that it becomes a toy that has very little educational value but is easy to sell to higher management who do not have to solve real-world performance problems themselves. By the way, our automated analysis Inspections was an attempt to help the novice tool user and novice performance engineer.
    JXInsight 5.0.8 - High Concurrent Table Access Analysis Inspections Blog Entries
    It is extremely hard to automate everything, as can be seen from the benchmark blog entry. regards, William
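The point that overhead is relative to the baseline can be made concrete with a little arithmetic. The figures in this sketch are invented for illustration only and do not come from any tool or measurement in this thread; `OverheadMath` is a hypothetical name.

```java
// Hypothetical worked example: the same absolute instrumentation cost
// expressed relative to two very different request baselines.
final class OverheadMath {

    /** Added time as a percentage of the uninstrumented response time. */
    static double relativeOverheadPercent(double baselineMillis, double addedMillis) {
        return 100.0 * addedMillis / baselineMillis;
    }

    public static void main(String[] args) {
        double addedMillis = 2.0;        // assumed instrumentation cost per request
        double databaseBound = 400.0;    // assumed slow, database-bound request
        double cpuBound = 10.0;          // assumed highly tuned, CPU-bound request
        System.out.println("database-bound: " + relativeOverheadPercent(databaseBound, addedMillis) + "%");
        System.out.println("cpu-bound:      " + relativeOverheadPercent(cpuBound, addedMillis) + "%");
    }
}
```

With these assumed numbers the same 2 ms of instrumentation is 0.5% of the database-bound request and 20% of the CPU-bound one, which is why a single quoted overhead percentage says little without the workload it was measured against.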
  21. Hi Frank, I initially thought such test case correlation would be useful, and that is why I developed the iTKO Lisa integration largely on my own and with very little input from the guys at iTKO. I did try to contact Borland on a number of occasions and they never responded (so much for being an ex-employee deemed one of its "most precious assets"). The lack of response is more than likely a result of an incestuous business relationship between Borland and DynaTrace, where each CTO previously held the same position, CTO, at the same company. Apparently there is nothing open about Borland's Open ALM. Getting back to the issues with test correlation. Let's replace the "test case" with "user request", which is basically what a load generator is trying to simulate: generating comparative user workloads for a production environment. Now let's place ourselves in a production environment where one or more performance problems have been reported. Do you think operations or service management are going to attempt to correlate each resource consumption back to every single individual user request, all the way back to the browser? What would be the benefit, when most of the time there is a clear entry point into the system, which already identifies the actual use/test case being performed, and the underlying cause (assuming it is not a network issue between the user and application server) is present within one or more of the computing nodes in the back-end? The goal should be to understand the changes in the performance model as the workload, generated by either a user or a load test tool, varies in terms of volume, activities, current capacity, ... (I am ignoring user experience management here). I personally focus on the resource contention, as normally it is not possible to alter the user behavior without going through a lengthy development and change management cycle, at least in the short term and when the situation is still manageable (risk management).
    Tracing or correlating work (requests, transactions, ...) back to a user (or client) is useful when you are focused on the software execution model. What piece of code executed by the client resulted in 10,000 transactions being executed? The following link shows the main activities of SPE (Software Performance Engineering), which is the basis for many of the concepts and features in our product related to performance management.
    Software Test & Performance Conference 2007 User Experience Test Ajax: A Performance Problem in the Making Software Execution Model meets System Execution Model
    regards, William
  22. Re: Some thoughts

    Frank, the proposed title still does not reflect the article content and underlying intentions. Come on, do you honestly think people on TSS are stupid enough to believe that this article was a call to tool vendors when you yourself are a tool vendor and you promote another tool vendor? How many tools did you actually investigate before you decided to make a call? JXInsight was the first product on the market to include thread monitor contention and thread monitor waiting times alongside clock times, gc, and cpu times for every traced interval. That was nearly 7 years ago. Recently, new entrants into this market have copied this by including waiting call times in their products, even going so far as to promote such features as new and innovative. In a normal market, future customer needs are generally anticipated by product designers. Why do (did) you think this would be any different in your case?
    Don't we developers deserve better tools to debug the concurrency problems we are encountering in building REST-ful and interdependent applications? The tools should work in our existing environments without requiring us to change platforms.
    Maybe my initial response led to some confusion. Distributed execution analysis is not the same as concurrent thread execution analysis, though in most cases the distribution of work is used to increase concurrent execution at a cluster level. JXInsight can trace across vendor middleware platforms, not just Coherence, and it can also be used to perform concurrency analysis even of standalone applications, which by the way is not really the target of the tools you mentioned. JXInsight places no restrictions on the execution environment; we have had customers use our distributed tracing facilities to trace calls from handhelds->CORBA->Java EJB->Database. We are not the only ones providing such capabilities; at least one other commercial performance vendor offers similar capabilities.
    JXInsight looks good as an agent-based concurrency analysis tool. However, there is no load generating component to it. One assumes that the server is already under load.
    JXInsight is not a load generator. JXInsight is used by performance engineers and testers to understand what the results of a test actually mean, rather than to prove that under a particular workload everything collapses. Testing should be used as a way to acquire a better understanding of the software & system execution models, which means you need to observe the component and resource interaction inside the application under load testing: test to learn rather than learn to test. All that said, we do in fact offer integration with load test generators, providing ** true ** resource usage (and contention) correlation of test cases via our distributed tracing. The following article shows how we can accurately profile a test case. In fact, we are profiling the complete execution flow from the load test generator JVM across the middleware (RMI) to the EJB component running in a JBoss container. I am surprised you did not come across this article before, since it does mention a competitor and your favorite phrase, "SOA Testing". The article: "iTKO Lisa SOA Testing Integration - Part 2". I have no problem with someone promoting their own product (or consultancy service) on TSS via an article, but can we at least have fewer of these blanket statements ("difficult at best") about the state of the other tools on the market, when in fact it is obvious the author has done very little research outside of his own or partner product(s) in formulating a conclusion that reinforces an ill-formed perception of reality. Joseph, is this not your job? regards, William
  23. Re: Some thoughts

    One other issue with concurrent testing and analysis which is absent from the article is the tracing of work scheduled by one thread and executed by another. I am once again surprised by such an admission, especially as this issue is obvious (at least to me) when looking at Brian's article referenced at the start of your own article. Brian even mentions work queues and demonstrates the use of a future. The problem here in terms of performance analysis is not simply one of thread contention (this need not necessarily be the case) but more one of correlating resource consumption across threads and disconnected caller stacks, even in a collocation context. This will become increasingly important with the greater usage of the Java concurrent package in applications and middleware products. For your information, GridGain and Oracle Coherence both use various work queue structures and thread pools in executing a client data or work request. Profiling a distributed client request is a near impossible task, not because of the remote procedure call protocol but because of the number of threads that interact with the work as it moves up and down the technology stack within each JVM. We could not do distributed trace analysis today for such products unless we could first trace the execution of a work unit across threads within a single JVM. Here is an article I wrote at the beginning of 2006 showing this issue in relation to the Swing/AWT event queuing mechanism. Yes, even the very, very difficult problems have been solved by other solutions. The only real call to vendors should be one of better education and marketing of existing product capabilities. I would like to think I did do this, but maybe you were not listening at the time. William
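The problem of work scheduled by one thread and executed by another can be illustrated with a small sketch: the submitting thread's trace identifier is captured when the task is created and re-installed on whichever pool thread eventually runs it. This is an assumption-laden illustration of the general idea, not the mechanism used by JXInsight, GridGain or Coherence; `TraceContext`, `wrap` and the `request-42` identifier are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: carrying a trace id from the scheduling thread to the executing thread.
final class TraceContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void set(String traceId) {
        CURRENT.set(traceId);
    }

    static String get() {
        return CURRENT.get();
    }

    /** Captures the caller's trace id now and installs it on the thread that later runs the task. */
    static Runnable wrap(Runnable task) {
        final String captured = CURRENT.get();
        return () -> {
            String previous = CURRENT.get();
            CURRENT.set(captured);
            try {
                task.run();
            } finally {
                // Restore the pool thread's own state so ids do not leak between tasks.
                if (previous == null) {
                    CURRENT.remove();
                } else {
                    CURRENT.set(previous);
                }
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            set("request-42");
            pool.submit(wrap(() -> System.out.println("wrapped task sees: " + get()))).get();
            pool.submit((Runnable) () -> System.out.println("plain task sees: " + get())).get();
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the wrapper, resource consumption measured on the pool thread has no link back to the request that caused it, which is the disconnected caller stack described above.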
  24. Re: Some thoughts

    admission => omission
  25. Re: Some thoughts

    Frank, I do not mean to beat you up on every aspect of the article, but I think you have largely misrepresented correlation in relation to the tooling. You talk about the tool being able to correlate the slowness of a request with a slow database operation, when in fact all that is said or could be said (at least from my knowledge of the product and codebase) is that during this particular request a number of database operations made by the ** same ** executing thread executed slowly or slower than normal. This is basic profiling or tracing. This is not the same as "the database was running slowly overall". Such a statement implies that the slowness was experienced by other threads as well, including possibly those in other JVMs. I still cannot find out how you actually perform correlation across these tools. Would this simply be a case of looking at the output from one tool and manually comparing it with output from the other tool? If so, then I can understand the editor's reference to "difficult at best": a lot of manual effort, and hardly very accurate when it is not based on even a single statistical correlation method. The following articles might inspire you to revisit your approach to correlation and your view of the performance test & management solutions market: "Correlated Metric Inspections" and "Beautiful Evidence: Metric Monitoring". regards, William
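As one concrete example of a statistical correlation method, the Pearson coefficient can be computed between a client-side series (for instance response times reported by the load tool) and a server-side series (for instance a database wait metric) sampled over the same intervals. This is a generic textbook calculation offered as a sketch, not the method used by any tool discussed in this thread; `MetricCorrelation` and the sample data are hypothetical.

```java
// Hypothetical sketch: Pearson correlation between two equally sampled metric series.
final class MetricCorrelation {

    /** Returns a value in [-1, 1]; NaN if either series is constant. */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equally long series with at least 2 samples");
        }
        double meanX = 0;
        double meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double covariance = 0;
        double varianceX = 0;
        double varianceY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            covariance += dx * dy;
            varianceX += dx * dx;
            varianceY += dy * dy;
        }
        return covariance / Math.sqrt(varianceX * varianceY);
    }

    public static void main(String[] args) {
        // Invented sample data: one value per sampling interval.
        double[] clientResponseMillis = { 120, 150, 310, 420, 400 };
        double[] serverDbWaitMillis = { 20, 35, 180, 260, 250 };
        System.out.println("r = " + pearson(clientResponseMillis, serverDbWaitMillis));
    }
}
```

A high coefficient only says that the two series move together over the sampled intervals; it does not by itself show that the database caused the slow requests, which is the distinction drawn in the comment above.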
  26. Concurrency testing in Java applications is definitely a tricky affair. The article makes some good points about concurrency testing and the tool features to look for, and I would have loved to read more on this. But the article lost me as an audience in the later parts, where the author digressed into the product used for monitoring the appServer bottlenecks, which I believe falls more under routine load testing and performance tuning. -Nikhil.