To be honest, I have never found CPU-sampling-based approaches to application performance management terribly effective, because of (1) the inaccurate mechanism used to calculate method timings and (2) the loss of performance model data such as invocation counts, timing averages, and local and global minimums and maximums. The latter is extremely important considering how most service level agreements (SLAs) are currently defined and how users in general perceive the performance of an application.
Yes, sampling does have its advantages, in that the overhead can be extremely low if the number of threads and the depth of the call stack are kept within reasonable limits. But is it not misleading to report timings based on a statistical sampling of the call stack of every thread, when it is not always possible to ascertain whether a method present in two consecutive samples was actually active and consuming CPU time throughout the interval? Personally, I would prefer to report the sample counts along with the consecutive sample counts to the user.
So how does one achieve the high precision of event profiling/tracing with the low overhead of sampling? Simple: provide a hybrid solution that delivers the best of both approaches and more, namely probe metering strategies.
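To make the idea of reporting sample counts alongside consecutive sample counts concrete, here is a minimal sketch. It assumes the sampler hands over one list of method names per sampling tick for a single thread; the class `SampleCounts` and its fields are hypothetical names invented for illustration, not part of any profiler's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: counts, per method, the ticks it was seen on and
// the ticks on which it was also present on the immediately preceding tick.
class SampleCounts {
    final Map<String, Integer> samples = new HashMap<>();
    final Map<String, Integer> consecutive = new HashMap<>();

    // Each inner list is the call stack of one thread at one sampling tick.
    static SampleCounts of(List<List<String>> stackSamples) {
        SampleCounts counts = new SampleCounts();
        Set<String> previous = new HashSet<>();
        for (List<String> stack : stackSamples) {
            // A set, so a recursive method is counted once per tick.
            Set<String> current = new HashSet<>(stack);
            for (String method : current) {
                counts.samples.merge(method, 1, Integer::sum);
                if (previous.contains(method)) {
                    counts.consecutive.merge(method, 1, Integer::sum);
                }
            }
            previous = current;
        }
        return counts;
    }
}

class SampleCountsDemo {
    public static void main(String[] args) {
        SampleCounts counts = SampleCounts.of(Arrays.asList(
                Arrays.asList("main", "a"),
                Arrays.asList("main", "a"),
                Arrays.asList("main", "b"),
                Arrays.asList("main", "a")));
        // prints "a: 3 samples, 1 consecutive"
        System.out.println("a: " + counts.samples.getOrDefault("a", 0) + " samples, "
                + counts.consecutive.getOrDefault("a", 0) + " consecutive");
    }
}
```

Reporting both numbers lets the reader see that `a` was observed three times but only once on back-to-back ticks, rather than being handed a derived timing that hides the difference.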
Read on: http://blog.jinspired.com/?p=190
JXInsight 5.5 provides an innovative approach to metering that allows multiple profiling strategies to be applied during a measurement window, such as sampling every Nth probe or spacing out the time interval between meterings for a particular probe name. Truly unique is the ability to configure each profiling strategy separately and chain them together.
Basically, probe strategies provide rules about when to sample, based on probe type.
A Probe is fired when the Probe.begin() method is called. When a probe is fired, the probes runtime checks whether one or more metering Strategy objects are installed. If at least one strategy is installed, the runtime asks the strategy instance to vote on whether the firing should be metered.
If the metering strategy (or strategies) votes YES, a Reading is created for each installed Meter, and the Metering associated with the probe Group is updated when Probe.end() is called.
If the vote is NO, the meters are not read and the associated group metering is not updated. This can reduce the overhead significantly, as the most expensive operation within the probes runtime is accessing the underlying counter associated with a meter...
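The begin/vote/end flow described above can be sketched roughly as follows. This is not the JXInsight API: `VotingStrategy` and `SketchProbesRuntime` are hypothetical stand-ins, the counter access is modelled as a simple increment, and the rule that every installed strategy must vote YES is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: one vote per probe firing (not the JXInsight API).
interface VotingStrategy {
    boolean vote(String probeName);
}

// Hypothetical, much simplified stand-in for the probes runtime.
class SketchProbesRuntime {
    private final List<VotingStrategy> strategies = new ArrayList<>();
    long counterReads;   // how often the (expensive) meter counter was accessed
    long meteredFirings; // how many firings updated the group metering

    void install(VotingStrategy strategy) {
        strategies.add(strategy);
    }

    // Models Probe.begin(): returns true when the firing is metered.
    boolean begin(String probeName) {
        for (VotingStrategy strategy : strategies) {
            if (!strategy.vote(probeName)) {
                return false; // NO vote: no reading is taken (assumed: all must vote YES)
            }
        }
        counterReads++; // reading taken at the start of the firing
        return true;
    }

    // Models Probe.end(): only metered firings touch the counter again.
    void end(boolean metered) {
        if (metered) {
            counterReads++;
            meteredFirings++;
        }
    }
}

class VotingDemo {
    public static void main(String[] args) {
        SketchProbesRuntime runtime = new SketchProbesRuntime();
        int[] fired = {0};
        runtime.install(name -> ++fired[0] % 2 == 0); // YES on every 2nd firing
        for (int i = 0; i < 10; i++) {
            runtime.end(runtime.begin("order.submit"));
        }
        // prints "5 metered, 10 counter reads"
        System.out.println(runtime.meteredFirings + " metered, "
                + runtime.counterReads + " counter reads");
    }
}
```

With no strategy installed, the same ten firings would cost twenty counter reads in this model, which is where the saving comes from.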
The strategy types provided are:
- Frequency Strategy: This name-based strategy meters every Nth firing of a probe.
- Interval Strategy: This name-based strategy meters a firing of a probe only once an interval of N milliseconds has passed since the last metering.
- WarmUp Strategy: This name-based strategy starts metering after an initial N firings of a probe.
- Burst Strategy: This global strategy meters the firing of probes within bursts. The strategy is very similar to the frequency strategy but with the option of having more than one firing metered after N probe firings. The gap property specifies the number of firings before entering the metering phase. The run property specifies the number of fired probes metered during the burst phase.
- Sample Strategy: This global metering strategy meters the firing of probes within time intervals. The strategy is very similar to the interval strategy but with the option of having more than one probe metered after the interval. The gap property specifies the time interval between metering phases. The run property specifies the number of fired probes metered during the sample phase.
- Delay Strategy: This global metering strategy meters the firing of probes after an initial time delay. The value property specifies the time from the initialization of the probes runtime to the metering phase.
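As a rough illustration of the rules in this list, the sketch below models five of the strategies as plain Java classes. All class names and constructors are hypothetical, the clock is injected so the time-based rules can be exercised deterministically, the first firing under the interval rule is assumed to be metered, and none of it is thread-safe. A Sample strategy would follow the same shape as the burst one with a time-based gap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical interface; names and signatures are illustrative only.
interface SketchStrategy {
    boolean vote(String probeName);
}

// Name based: meters every Nth firing of each probe name.
class FrequencySketch implements SketchStrategy {
    private final int n;
    private final Map<String, Long> firings = new HashMap<>();
    FrequencySketch(int n) { this.n = n; }
    public boolean vote(String probeName) {
        return firings.merge(probeName, 1L, Long::sum) % n == 0;
    }
}

// Name based: meters a firing once N ms have passed since the last metering.
class IntervalSketch implements SketchStrategy {
    private final long intervalMillis;
    private final LongSupplier clock;
    private final Map<String, Long> lastMetered = new HashMap<>();
    IntervalSketch(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
    }
    public boolean vote(String probeName) {
        long now = clock.getAsLong();
        Long last = lastMetered.get(probeName);
        if (last != null && now - last < intervalMillis) {
            return false;
        }
        lastMetered.put(probeName, now); // assumed: the very first firing is metered
        return true;
    }
}

// Name based: skips the first N firings of each probe name.
class WarmUpSketch implements SketchStrategy {
    private final int n;
    private final Map<String, Long> firings = new HashMap<>();
    WarmUpSketch(int n) { this.n = n; }
    public boolean vote(String probeName) {
        return firings.merge(probeName, 1L, Long::sum) > n;
    }
}

// Global: skips `gap` firings, meters the next `run` firings, then repeats.
class BurstSketch implements SketchStrategy {
    private final int gap;
    private final int run;
    private long firings;
    BurstSketch(int gap, int run) { this.gap = gap; this.run = run; }
    public boolean vote(String probeName) {
        long position = firings++ % (gap + run);
        return position >= gap;
    }
}

// Global: meters nothing until `delayMillis` have passed since construction,
// which stands in here for the initialization of the probes runtime.
class DelaySketch implements SketchStrategy {
    private final long startMillis;
    private final long delayMillis;
    private final LongSupplier clock;
    DelaySketch(long delayMillis, LongSupplier clock) {
        this.delayMillis = delayMillis;
        this.clock = clock;
        this.startMillis = clock.getAsLong();
    }
    public boolean vote(String probeName) {
        return clock.getAsLong() - startMillis >= delayMillis;
    }
}

class BurstDemo {
    public static void main(String[] args) {
        SketchStrategy burst = new BurstSketch(3, 2);
        StringBuilder pattern = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            pattern.append(burst.vote("any.probe") ? '+' : '-');
        }
        System.out.println(pattern); // prints "---++---++"
    }
}
```

In real use the clock argument would be something like `System::currentTimeMillis`; the name-based classes keep one counter or timestamp per probe name, while the global ones share a single state across all probes.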
You can also chain strategies together to get even more flexibility in your sample set.
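One way to picture chaining is as a combinator that consults strategies in order and stops at the first NO. The sketch below assumes exactly that, which means a later strategy only sees firings the earlier ones let through; how JXInsight actually orders and combines votes is not stated here, and `ChainableStrategy`, `chain`, `warmUp` and `frequency` are hypothetical names. For brevity the two rules use a single global counter each rather than one per probe name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical interface with illustrative factory methods.
interface ChainableStrategy {
    boolean vote(String probeName);

    // Consults the links in order and stops at the first NO vote (assumed semantics).
    static ChainableStrategy chain(ChainableStrategy... links) {
        List<ChainableStrategy> ordered = Arrays.asList(links);
        return probeName -> {
            for (ChainableStrategy link : ordered) {
                if (!link.vote(probeName)) {
                    return false;
                }
            }
            return true;
        };
    }

    // Votes NO for the first n firings it sees, YES afterwards.
    static ChainableStrategy warmUp(int n) {
        long[] fired = {0};
        return probeName -> ++fired[0] > n;
    }

    // Votes YES on every nth firing it sees.
    static ChainableStrategy frequency(int n) {
        long[] seen = {0};
        return probeName -> ++seen[0] % n == 0;
    }
}

class ChainDemo {
    public static void main(String[] args) {
        ChainableStrategy chained = ChainableStrategy.chain(
                ChainableStrategy.warmUp(3), ChainableStrategy.frequency(2));
        StringBuilder metered = new StringBuilder();
        for (int firing = 1; firing <= 9; firing++) {
            if (chained.vote("order.submit")) {
                metered.append(firing).append(' ');
            }
        }
        System.out.println(metered.toString().trim()); // prints "5 7 9"
    }
}
```

Here the warm-up rule swallows firings 1 to 3, and the frequency rule then meters every second firing of what remains. Swapping the order of the two links would give a different sample set, which is why configuring each link separately matters.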