Every once in a while, each of us tries to measure the performance of some part of the code. It can be either a fully functional application measured against a specific number of user transactions per minute, or a microbenchmark you have written to prove that your fellow developer is wrong. Especially in the latter case, it is beneficial to be aware of the pitfalls that skew your benchmark results:
- Noisy neighbours. Running on a desktop? Are you sure you have switched off the torrent client? What about Windows updates? Or disk defrag? Running on a server? Are you virtualized? Are you sure the other instances on the same hardware are not doing something that is tough to virtualize properly, such as intense I/O?
- Garbage collection pauses. Do you know the pause times introduced during the test? Are you sure they haven't tampered with your results?
- JIT optimizations. Are you running your measurements on optimized or unoptimized code? How do you know?
- Class loading. Are all the classes loaded and initialized before the measurement starts? Or do you suffer from the consequences of lazy loading?
- Different architecture. Are you running on the same platform used in production? Same number of cores? Same 64-bit architecture? Are you sure?
- Measurements. Are your measurements in nanoseconds? Sure you are not in the wrong territory? Maybe you can still create a more coarse-grained test and reach into the safer "seconds and tens of seconds" territory instead of the nano-level?
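A few of these pitfalls can at least be made visible from inside the test itself. The following is a minimal sketch in Java, not a proper benchmarking harness: it runs a warm-up phase so that class loading and JIT compilation have a chance to happen before the clock starts, times many iterations as one coarse block instead of timing single calls, and reads the standard `GarbageCollectorMXBean` counters before and after to see whether garbage collection ran during the measurement. The class name `CoarseBenchmark`, the `workload` method, and the iteration counts are all hypothetical placeholders chosen for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class CoarseBenchmark {

    // Hypothetical workload standing in for the code you actually want to measure.
    static long workload(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += Integer.toString(i).hashCode();
        }
        return sum;
    }

    // Total number of collections reported by all collectors.
    // A collector returns -1 when the count is undefined, so those are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds across all collectors.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long sink = 0; // accumulated and printed so the JIT cannot discard the work

        // Warm-up: arbitrary count, assumed large enough to trigger
        // class loading and JIT compilation of the workload.
        for (int i = 0; i < 20; i++) {
            sink += workload(100_000);
        }

        long gcCountBefore = totalGcCount();
        long gcTimeBefore = totalGcTimeMillis();
        long start = System.nanoTime();

        // Measured phase: one coarse block of many iterations.
        int iterations = 200;
        for (int i = 0; i < iterations; i++) {
            sink += workload(100_000);
        }

        long elapsedNanos = System.nanoTime() - start;
        long gcCount = totalGcCount() - gcCountBefore;
        long gcTime = totalGcTimeMillis() - gcTimeBefore;

        System.out.printf("total: %.1f ms over %d iterations%n", elapsedNanos / 1e6, iterations);
        System.out.printf("GC during measurement: %d collections, ~%d ms%n", gcCount, gcTime);
        System.out.println("sink = " + sink);
    }
}
```

If the reported GC time is a noticeable share of the total, the number you measured is partly a garbage collection number. The sketch does nothing about noisy neighbours or a mismatch between the test machine and production; those still have to be checked by hand.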
All in all - if this frightened you off from writing your next microbenchmark, then good. Because most often you shouldn't. But if you are still into it, you can check out the more detailed post on the subject on this blog.