Lucy Monahan is a Principal Performance QA Engineer at Novell, and helps to manage their distributed Agile process. In this article
she describes Step-By-Step how Novell can trace intermittent errors in a distributed enterprise application while running load tests as part of Novell's agile process.
Intermittent problems can be difficult to solve:
* the value may be corrupted potentially at any time before the value is used
* the transaction may involve multiple enterprise servers using remote calls and the root cause could be on any server
* to identify the point of corruption, one may need to monitor return values and arguments
* many test runs may be needed to reproduce the problem
* heavyweight tools may impede ability to reproduce the problem, particularly if the problem is a race condition in a multi-threaded application
these concerns can be collapsed by using features that will provide the relevant debug data with low impact on performance. Here we look at an example where there are two servers involved in a transaction. server1 is running a Java application containing REST endpoints and calls server2 remotely via SOAP. Occasionally a bad value is returned to the client along with a NumberFormatException. But where is the value being corrupted?
Read the full article at: Tracing Intermittent Errors