While virtualization and the JVM's abstraction let development and administration proceed without much concern for the underlying system that hosts an application, software architects and performance specialists are realizing, now more than ever, that optimal performance depends on a strong understanding of the CPU, the hardware and the system as a whole.
In today’s software development world, there is no shortage of seminars, summits, presentations and round tables on all sorts of compelling topics. It’s tempting to bounce from one hot new idea to the next, hobnobbing about the potential of the latest 4G mobile app or how embedded technology might soon interface directly with the human brain. But smart Java developers know that sometimes it’s the process of getting back to the basics that enables the greatest strides forward.
Hardware performance is still key
Richard Warburton, lead of the Java SE track at Devoxx UK, is one of the visionaries helping Java developers return to the fundamentals. CPU caching is one of the topics he discussed at Devoxx. He’s helping developers dig past their Java code, through the virtualization layer and right down into the hardware. Richard says the community is starting to grasp how these layers relate to one another in application development: “You want to work in sympathy with your hardware. You want to understand the behavior of your JVM and the way a lot of these systems interact together.”
Warburton also describes the shift that’s occurring in the concept of what performant code looks like. “It’s an interesting situation. Historically, people have this perception that you have to write this butt-ugly, awful code to make things perform well. Actually, we are realizing you don’t need to brutalize your code base.” He says you can use nice, clean and elegant code “as long as you understand clearly what’s going on when your code actually runs – and understand it in enough of an overview to ensure you are writing things in a manner that’s going to work well with your hardware.”
Code versus hardware
As it turns out, many of the performance issues developers run into stem directly from how their code interacts with hardware resources. Richard says recurring problems plague developers who:
- Don’t size their thread pools properly in servlet containers. This leads to cache thrash and time lost to context switching.
- Choose the wrong data structure for the job. For example, they may use a LinkedList, which isn’t cache-friendly, when an ArrayList might be a better choice.
- Aren’t aware of how even basic objects get laid out in memory. You can’t optimize what you don’t understand.
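To make the LinkedList versus ArrayList point concrete, here is a minimal sketch that sums the same values through both list types. The class and method names are hypothetical and not from Warburton's talk. An ArrayList keeps its element references in one contiguous array, while a LinkedList allocates a separate node per element, so traversal tends to jump around the heap. Note that the boxed Integer values sit on the heap in both cases, and that the wall-clock timing here is crude; a harness such as JMH would be the better choice for real measurements.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical illustration; names are assumptions, not from the article.
final class ListTraversalSketch {

    // Fills the given list with the values 0..n-1 and returns it.
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Sums every element by walking the list with its iterator.
    static long sum(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> arrayList = fill(new ArrayList<>(), n);
        List<Integer> linkedList = fill(new LinkedList<>(), n);

        // Warm up so the JIT has a chance to compile sum() before timing.
        for (int i = 0; i < 5; i++) {
            sum(arrayList);
            sum(linkedList);
        }

        long start = System.nanoTime();
        long arraySum = sum(arrayList);
        long arrayNanos = System.nanoTime() - start;

        start = System.nanoTime();
        long linkedSum = sum(linkedList);
        long linkedNanos = System.nanoTime() - start;

        // Results vary by machine and JVM, so no expected timings are given.
        System.out.println("ArrayList  sum=" + arraySum + " ns=" + arrayNanos);
        System.out.println("LinkedList sum=" + linkedSum + " ns=" + linkedNanos);
    }
}
```

Both traversals do the same logical work, so any gap between the two timings comes from how the elements are reached in memory rather than from the algorithm itself.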
According to Warburton, it’s a big mistake to try to correct a problem without measuring what’s actually happening first. Yet all too often, that’s what happens. Instead of uncovering the root cause of performance problems, developers try to fix things by rewriting a bunch of code. “That’s not the quickest route to solving your performance problem. It’s frequently a situation where you want to go in there and look at something like a GC log or even the MXBeans. Use something like VisualVM to monitor what’s going on and get a clear picture of what’s actually happening with your memory profile rather than going in and changing code.”
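As a rough illustration of what looking at the MXBeans can mean in practice, the sketch below reads heap usage and per-collector counters through the standard `java.lang.management` API. The class name and output format are hypothetical; this is one simple way to take a snapshot, not a substitute for a GC log or a tool like VisualVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical illustration; class name and output format are assumptions.
final class GcSnapshotSketch {

    // Builds a short text snapshot of heap usage and GC activity so far.
    static String describe() {
        StringBuilder out = new StringBuilder();

        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined.
        out.append("heap used=").append(heap.getUsed())
           .append(" committed=").append(heap.getCommitted())
           .append(" max=").append(heap.getMax())
           .append('\n');

        // One bean per collector; which collectors appear depends on the JVM.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append(gc.getName())
               .append(": collections=").append(gc.getCollectionCount())
               .append(" timeMs=").append(gc.getCollectionTime())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Calling `describe()` before and after a suspect operation gives a quick sense of whether collections are piling up, which is the kind of evidence worth gathering before rewriting any code.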
Some things just don’t change
If you’re thinking that cloud computing and modern programming languages will save you from needing to know this stuff, you’re wrong. Richard points out that cloud computing doesn’t eliminate the need to have an understanding of how code interacts with hardware. “It doesn’t fundamentally change the situation. It just adds in another layer that you might need to think about.” Developers have been working with virtualization for a long time now. The fact that it happens in the cloud now doesn’t make the performance issues go away. It just makes it a little more challenging to understand the relationship. For example, the same hardware may be in use by many different developers doing completely different things at the same time.
What about languages like Scala and Clojure that are supposed to make writing performant code easier? They do offer some benefits for writing code that readily scales out to more CPUs. However, the foundation remains the same. You still have the same hardware, JVM, JIT, and garbage collectors. If you want to diagnose a performance problem, you’ll use the same tooling. What might those tools be? Here are Richard’s quick recommendations:
- For Linux, use the vmstat command to identify where your bottlenecks are and whether you’ve got a lot of paging going on.
- Censum by jClarity is a good GC log analyzer to help you discover long pause times before you start messing with your application code (full disclosure, Warburton works for jClarity).
- Any old profiler. You don’t need an expensive one; the simple profiler that comes with VisualVM frequently does the job perfectly well.
In the end, software architects and system developers must understand that despite the great leaps forward made in virtualization and cloud computing, abstraction is never a good excuse for not understanding how the software they develop interacts with the underlying hardware.
What tools do you use to make the most of your hardware so you don’t waste time rewriting code? Let us know in the comments.