I was surprised this week to see a posting questioning the results produced by a call stack sampling profiler used to monitor a low latency trading application. The poster was clearly confused by the results, but not as much as I was confused as to why anyone would consider such an approach in a low latency environment in the first place. Obviously he did not fully understand what is involved in monitoring performance and how best to apply the various performance measurement techniques. When asked what led him to believe that sampling was suitable, he responded that he had been told to ensure little or no impact on latency and throughput. He naturally, but incorrectly, assumed that sampling was appropriate. When he finally got to look at the performance data collected, he was deeply disappointed that it offered very little useful information.
“It is telling me what I already know…we have 20 threads executing the same call path repeatedly and concurrently…what I need to know is the frequency of particular method/path executions and the latency ranges of their execution over time…this [sampling data] is redundant…useless”.
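To make concrete what he was asking for, here is a minimal sketch of event-based metering that records how often a named method executes and which latency range each execution falls into. Everything in it is an assumption for illustration: the `LatencyMeter` class, its method names, and the bucket boundaries are hypothetical and do not come from the poster's system or from any particular profiler.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical helper for illustration only; not the API of any real profiler.
final class LatencyMeter {
    // Assumed bucket upper bounds in nanoseconds: <1us, <10us, <100us, <1ms.
    // One extra, open-ended bucket catches everything at or above 1ms.
    private static final long[] BOUNDS_NS = {1_000L, 10_000L, 100_000L, 1_000_000L};

    // One histogram of counters per method name, safe for concurrent threads.
    private final ConcurrentHashMap<String, AtomicLongArray> histograms =
            new ConcurrentHashMap<>();

    // Record a single execution of the named method and its elapsed time.
    void record(String method, long elapsedNs) {
        histograms
            .computeIfAbsent(method, k -> new AtomicLongArray(BOUNDS_NS.length + 1))
            .incrementAndGet(bucketOf(elapsedNs));
    }

    // Index of the first bucket whose upper bound exceeds the elapsed time.
    static int bucketOf(long elapsedNs) {
        for (int i = 0; i < BOUNDS_NS.length; i++) {
            if (elapsedNs < BOUNDS_NS[i]) {
                return i;
            }
        }
        return BOUNDS_NS.length;
    }

    // Total number of recorded executions: the frequency sampling cannot give.
    long count(String method) {
        AtomicLongArray h = histograms.get(method);
        if (h == null) {
            return 0L;
        }
        long total = 0L;
        for (int i = 0; i < h.length(); i++) {
            total += h.get(i);
        }
        return total;
    }

    // Number of recorded executions that fell into one latency range.
    long countInBucket(String method, int bucket) {
        AtomicLongArray h = histograms.get(method);
        return h == null ? 0L : h.get(bucket);
    }
}
```

A caller would take `System.nanoTime()` before and after the call of interest and pass the difference to `record`. Reading the counters at a fixed interval and diffing successive readings would give the "over time" view he mentions. A sampler cannot produce either number, because it only observes which frames happen to be on the stack at each sample.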