Gregg Sporar asks: "Isn't everyone a profiling guy?"

  1. Gregg Sporar, an avid JFluid user, had an article about JFluid forwarded to a project's "profiling guy," and asked: isn't everyone a profiling guy? His point was that JFluid is very easy to use, but his question raises an interesting issue: how much should any given developer profile?

    In projects where I've been "the profiling guy," developers did their own pigeonhole optimizations, but since I tend to work on large apps with many interrelated components, the developers typically didn't have a chance to see how their modules actually affected the entire application (or, for that matter, the entire application server). Having a performance architect can be a very smart thing in that case, because that's the only architect who has to know the entire lifecycle of the application, from development all the way through deployment and into system management.

    So what do you think? How much should developers be profiling, and to what end - memory, speed, or size of deployable objects? Mr. Sporar is obviously advocating JFluid's user experience, but what tools do you find to be "must-haves?"

    Threaded Messages (62)

  2. Nope, not if you're lucky

    As we continue to beat the drum of "make it right, then make it fast," developers should be writing code using the most practical algorithms and idioms that balance productivity, performance, and maintainability.

    Then, let the specialized profiling folks find the hot spots and then get back to the developers when they find offending code.

    Developers need to get themselves into a groove where they're focused more on getting the task done than on second-guessing and micro-optimizing everything they write.

    Let the folks with the tools, tests, frameworks and workflows to measure and evaluate the code spend the time hunting down the 10-20% that shows up later rather than having the developers fixate on the 100% all the time.

    Performance is and always will be a tuning technique that has to account for all factors of the application, not necessarily the few little bits that the respective developers are focused on.

    Mind, this doesn't mean they should be using crummy algorithms or techniques. Over time you learn what mostly works and what mostly doesn't, so you rely on your past experience and apply it with confidence in your work, because confident work is more correct and more productive work. If and when you learn later that something perhaps should have been written differently for better performance, fix it, put another notch on your pistol grip, and move on, applying the newfound knowledge to future work.
  3. What is a good profiler for web applications?
    any links?
  4. Java Open Source To The Rescue!

    Eclipse has excellent support for profiling web apps.

    Simply double click your web app source file in the IDE. This will bring up Notepad, the default JSP editor for Eclipse.

    Using Notepad, change each method in your JSP from this:

    ...Your JSP method code ...

    to this:

    Date start = null;
    Date end;

    if (profiling)
       start = new Date();

    ... Your JSP method code ...

    if (profiling) {
       end = new Date();
       System.out.println("MyJSPMethod executed in " + (end.getTime() - start.getTime()) + " ms");
    }
  5. That's available in any development environment, for one thing, and isn't a good "profiler." It only measures start and stop times, gives no indication of WHY something might not be executing quickly, and doesn't measure system impact.

    "Open source to the rescue," indeed. If I used IDEA (which I do) or any other commercial IDE, would I thus be justified in saying "Closed source to the rescue"?
  6. Here: http://eclipsecolorer.sourceforge.net/index_profiler.html
  7. bummer

    No IDE is required to do what you are suggesting.
  8. If you're not happy with the Eclipse/Notepad method, you can use JFluid and NetBeans to profile your web apps.

    To do this, install NetBeans and JFluid and change your startup JVM by modifying catalina.bat to this:

    set JAVA_HOME=c:\Documents and Settings\gs\.netbeans\4.0\modules\profiler-ea-vm\

    Change the path to match your install. You get charts and graphs of memory usage, CPU usage, object lifecycle, and so on.

    A tutorial, including a sample web app to profile, can be found here:

    http://www.netbeans.org/kb/articles/nb-profiler-tutor-5.html.
  9. So if I have my app up on a remote J2EE server (in my case an Orion server on a Linux box), do I need to install NetBeans && JFluid to profile my web app?
  10. Yes, you have to install NB+JFluid on at least one machine. If you want to do remote profiling, running your app on another box, you will also have to install a smaller JFluid "server" package on that machine.
  11. Nope, not if you're lucky

    Performance is and always will be a tuning technique that has to account for all factors of the application, not necessarily the few little bits that the respective developers are focused on.

    Kinda obvious, but also probably the most important point so far in this discussion ...


    That said, there's a good site for tactical performance optimization:

    http://www.javaperformancetuning.com/

    PJ Murray
    CodeFutures Software
    Java Code Generation for Data Persistence
  12. I've got a cube neighbor who has a sheet up with quotes from these supposed "gurus" (I won't mention any names) imbuing us with their great wisdom about how evil it is to worry about performance up front.

    It's so tiring to keep hearing that you should just worry about performance somewhere near the end of a project. It's just a tuning thing, right? NO. It's NOT just a tuning thing. First off, 90% of the programming public doesn't have a clue what effect anything they do has on a CPU. Even if they do, it's very easy to design an app such that "tuning" it later could require significant rearchitecture.

    Can someone explain to me why it's not a normal part of Agile development? I think it's very valuable to find out what makes you slow early instead of later. (If you've ever tried it, you know why.) Iterative development doesn't mean you iterate everything but performance tuning. You iterate EVERYTHING. Waterfalling your performance can be an extremely bad mistake.
  13. I agree with you 100%. Performance tests should be run every day so problems can be detected and corrected early.

    It is interesting to note that Linus Torvalds has recently recommended that all mods to the Linux kernel undergo daily performance testing. Previously, Linux underwent performance testing only before a release. The result was that no one could tell where performance hits crept into the code.
  14. I couldn't agree with you more. There's nothing worse than a finished product that "all of a sudden" becomes slow. Bad designs pop up during profiling and should be part of the iterative process. Waiting for the finished product and then hoping to optimize on the fly is a huge mistake.

    The importance of profiling is also heavily dependent on the architecture. Try tuning a distributed system with downstream dependencies, or working within a transactional context. Performance problems in one small subsystem can have a domino effect throughout, not to mention the complexity of debugging such problems.

    Performance should never be an afterthought, although I am sure that some people/clients are more "performance averse" than others.
  15. YOU are the profiling guy

    One of the things I think is really promising about NetBeans integration with JFluid (except that it is too buggy to use) is that it makes profiling a trivially easy part of one's standard development practices. I think it's best if the developer can be responsible for the performance of his own code. Saying "I'll let the guy with the profiling tool find the hot spots" is a cop-out. That mentality reeks of "roles". The fact is that the "role" of a successful developer is to deliver code that works well, and profiling your code is necessary when you need to weed out performance bottlenecks.
  16. When you need to?

    Maybe simplistic, but isn't the answer:
    When you need to?
    You have to think a little about what the app is doing (I don't believe in ignoring facts), but then get something out, and profile based on need.

    I don't want to sit around for a week writing my own implementation of .contains(..) with nicer big-O behavior if that isn't a bottleneck, and if the app is running fast enough :)

    Of course, if you are developing a huge enterprise app, you want to be smart about it too. Cameron can't just hack away on Coherence expecting it to work fast enough at the end ;)

    Dion
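
    Dion's .contains(..) aside can be made concrete: the win usually comes from picking the right collection rather than hand-rolling a cleverer one. A minimal, illustrative sketch (the class and data are made up, not from this thread):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsDemo {
    // List.contains is O(n): it scans the elements one by one.
    static boolean inList(List<Integer> list, int value) {
        return list.contains(value);
    }

    // HashSet.contains is O(1) on average: it hashes straight to a bucket.
    static boolean inSet(Set<Integer> set, int value) {
        return set.contains(value);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) list.add(i);
        Set<Integer> set = new HashSet<>(list);

        // Same answer either way; only the cost differs as n grows.
        System.out.println(inList(list, 99_999)); // true
        System.out.println(inSet(set, 99_999));   // true
    }
}
```

    In other words: if a linear scan ever does show up as a bottleneck, swapping the collection is usually cheaper than writing a smarter .contains(..).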
  17. Of course

    IF you have defined performance requirements (i.e., measurable metrics) for your application...

    and IF you have created proper unit tests which measure the performance of your application against these requirements...

    and IF (actually WHEN) you (the developer) break one of the unit tests related to performance...

    THEN of course you have to be familiar with the tools required to fix the problems you created.

    Isn't that what everyone does? ;-)
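
    The "unit tests which measure performance" step above can be sketched as a budget assertion; the workload, budget, and names below are invented for illustration:

```java
// Sketch of a performance-budget test: fail loudly when an operation
// exceeds its agreed budget. Budget and workload are illustrative only.
public class PerfBudgetTest {

    // Stand-in for real application code under test.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) total += (long) i * i;
        return total;
    }

    // Elapsed wall-clock time of a task, in milliseconds.
    static long timeIt(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = timeIt(() -> sumOfSquares(1_000_000));
        long budgetMs = 500; // the "measurable metric" agreed up front
        if (elapsed > budgetMs) {
            throw new AssertionError("Budget blown: " + elapsed + "ms > " + budgetMs + "ms");
        }
        System.out.println("Within budget: " + elapsed + "ms");
    }
}
```

    Wall-clock budgets are noisy on shared build machines, so they need generous headroom, but the principle -- performance regressions break the build -- is exactly the point of the post.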
  18. Marketing realism would be nice too

    LOL, point WELL taken. Most of these discussions in the projects I have experienced are premature for exactly these reasons, and the defined performance requirements are 'as fast as possible' ... OK, task complete ;-)

    JFluid in particular, of course, will remain an obscure and little-used technology as long as it is attached only to 'NetBeans', which is presumably, as so often, the plan of the Sun sellers.

    Too bad, in my view.
  19. Realism about IDEs would be nice too

    LOL, point WELL taken. Most of these discussions in the projects I have experienced are premature for exactly these reasons, and the defined performance requirements are 'as fast as possible' ... OK, task complete ;-)
    JFluid in particular, of course, will remain an obscure and little-used technology as long as it is attached only to 'NetBeans', which is presumably, as so often, the plan of the Sun sellers.
    Too bad, in my view.

    Ah, that 'obscure and little-used NetBeans' that recently won the 2005 developer.com Open Source Tool of the Year award!

    If you are going to profile either client-side GUI applications or web applications, or J2ME applications, surely it makes sense to use an IDE like NetBeans that comes with pre-installed support for all of these.
  20. For most applications I've worked with, performance is determined in the architecture, and that's why it needs to be considered up front. Performance changes by orders of magnitude depending on architectural decisions, not code-level optimizations.

    Once you've written a lot of code to a given architecture, you are locked into it (it is difficult to change the architecture late in the game). Given that, you're only talking about 10-50% performance improvements from profiling optimizations, and it's frustrating when there's nothing more you can do on a bottleneck because the architecture prevents it from speeding up.

    I think architecture performance modeling needs to be the "profiling" part up front, so that you can model what's going to happen for transactions across the entire system.
  21. In a multi-developer Agile/XP programming environment, architecture alone will not suffice. If you write a small program, what you said would be OK. Some of the code-level stuff, like selecting a proper collection object, is left to the whims of a developer in an XP env (he could end up using a Vector in lieu of, say, a better choice like ArrayList).

    So you can't discount profiling and say everything can be caught in the architecture... which to me is an idealized view of the world of programming.
  22. In a multi-developer Agile/XP programming environment, architecture alone will not suffice. If you write a small program, what you said would be OK. Some of the code-level stuff, like selecting a proper collection object, is left to the whims of a developer in an XP env (he could end up using a Vector in lieu of, say, a better choice like ArrayList).

    Well, I don't know your definition of XP, but the last time I checked, pair-programming was fundamental to it. Why? So that no "code-level stuff like selecting a proper collection object [is] left to the whims of a developer", as the partner can check the coder's decisions instantly and correct bad choices.

    Alas, if you have a big project going, code-level optimization has very little influence in most cases, as it follows the 80/20 rule. To find the hotspots where it really makes sense to replace Vectors with ArrayLists without harming the thread safety of the code, you use a profiler -- but in my opinion this can well be put off to the end of a development cycle, as swapping a List implementation is easy if you stuck to the basics and programmed against the interface.

    On the other hand, verifying your architecture early and regularly with a profiler is much more important, as refactoring a bad architecture already set in stone is hard. In most cases, this will mandate API changes as well, so that you'll have to adapt the complete application. The problem is that most developers (even the "profiling guys") don't care to use realistic test data. Performance profiling won't work with a toy project -- this is going to be about as significant as profiling the use of a StringBuffer against String concatenation in a loop with two iterations. You need real data, real queries, real traffic, from many users (best double the estimated amount of all these values, as the system will reach this limit easily over time).

    Just my two cents,
    Lars
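
    Lars's "programmed against the interface" point can be sketched as follows; the class and data are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Declaring variables as List means the implementation can be swapped
// in one place after profiling, without touching any callers.
public class SwapImplDemo {

    // Callers depend only on the List interface.
    static int countLongNames(List<String> names) {
        int count = 0;
        for (String n : names) if (n.length() > 4) count++;
        return count;
    }

    public static void main(String[] args) {
        // The synchronized legacy choice...
        List<String> names = new Vector<>(List.of("Gregg", "Misha", "Lars"));
        System.out.println(countLongNames(names)); // 2

        // ...swapped for ArrayList once profiling shows no thread-safety need.
        names = new ArrayList<>(names);
        System.out.println(countLongNames(names)); // 2 -- callers unchanged
    }
}
```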
  23. I started writing code back when it made sense to put #asm directives in your code and read magazine articles discussing the pros and cons of various compiler options on performance. (Was it Walter Bright's compiler that was able to pass parameters in a register instead of on the stack?) So I "grew up" with a certain amount of sensitivity to performance, and I've kept that with me.

    With that introduction, my approach to performance and memory optimization is usually to (1) not duplicate effort or data, and (2) make appropriate choices based on a sense of the time and space costs of algorithms and data structures, keeping in mind the other factors involved: overall development time, code readability and maintenance, etc. In everyday code I might make a small extra effort up front for the sake of performance, but not at the cost of readability or maintainability.

    If I'm using a profiler, it's usually customer driven. I rarely pull out the profiler on my own unless it's clear that performance or memory usage is outrageous and the system is complex enough that I'm afraid to just dive in. Profiling is a specialty that involves knowing how to generate valid data, knowing how to interpret that data, and knowing what to change.

    My profiling is usually performance-oriented. Almost always, actually. But that's because the code I work on is not memory-hungry. If I were in a different domain, I can easily see that changing.

    IMO, good coding practices up front minimize the need to bring up the profiler in many situations. Use the right data structures, the right algorithms, and don't wastefully do the same thing over and over repeatedly again and again.

    Regards,
    Thomas
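
    Thomas's "don't do the same thing over and over" advice is, in code terms, often just a cache in front of a deterministic computation. A hedged sketch (the class and the "expensive" stand-in are invented):

```java
import java.util.HashMap;
import java.util.Map;

// Memoization: compute each answer once, serve repeats from a cache.
public class MemoDemo {
    private final Map<String, Integer> cache = new HashMap<>();
    int computations = 0; // exposed only so the example can show the savings

    // Stand-in for an expensive, deterministic computation.
    int expensiveLength(String s) {
        return cache.computeIfAbsent(s, key -> {
            computations++; // only runs on a cache miss
            return key.length();
        });
    }

    public static void main(String[] args) {
        MemoDemo demo = new MemoDemo();
        demo.expensiveLength("profiling");
        demo.expensiveLength("profiling"); // served from the cache
        System.out.println(demo.computations); // 1
    }
}
```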
  24. IMO, good coding practices up front minimize the need to bring up the profiler in many situations. Use the right data structures, the right algorithms, and don't wastefully do the same thing over and over repeatedly again and again.

    ...and...
    Can someone explain to me why it's not a normal part of Agile development? I think it's very valuable to find out what makes you slow early instead of later. (If you've ever tried it, you know why.) Iterative development doesn't mean you iterate everything but performance tuning. You iterate EVERYTHING. Waterfalling your performance can be an extremely bad mistake.

    Yes, of course it can be a bad mistake, which is why you want experienced folks in the project doing the things mentioned above, so you don't end up at the end, getting ready to throw the big, creaky, sparky knife switch, only to find that when you do, the system seizes up and dies as the accumulation of bad design and coding choices all collapses in a heap.

    Most folks that have some projects under their belts have some good ideas about writing code with a "reasonable expectation of performance" right off the bat, code that represents more linear performance than degenerative performance as the data loads pop up. Write enough of this stuff, and this becomes routine practice.

    Next, you have the whole issue of perceived performance. There is a great metaphor that artists in theater use when performing. When they make a mistake, don't broadcast the mistake to the audience, because they won't recognize it one way or the other. Just because you the developer think or "know" it's slow, doesn't mean it's unacceptable to the user.

    Save for real-time data projects, or projects with ENORMOUS data sets, performance is measured by the benchmark that every user carries with them: the seat of their pants. This is a HUGE gray area that cannot be measured with stopwatches. It's either "fine" or it's "slow", and it always "depends". "Fine" and "slow" are not hard numbers, and that's what performance tools give you: hard numbers that can be overanalyzed. "This is 100% slower than last time" -- yeah, it went from .1 sec to .2. "This report takes 15.26 minutes to run" -- yes, and it's run once a week on Sunday morning at 2am. "Our threshold response time is .15 secs, this one took .17" -- that screen is used by an administrator once a quarter... you really want me to spend another day trying to speed that up? This is where Dilbert gets much of his material.

    But the problem is that when something transitions from "fine" to "slow", you better have something up your sleeve to give that user a 50-100% or better boost in performance, because once it's "slow", it is always slow until you make it really fast. That seat of the pants is very forgiving until it has been violated, and at that point it becomes very critical. Thankfully, this is usually the case once you know what horrible thing snuck in to your code base. (Index on a database, changing an algorithm, adding a little caching, etc.)

    However, given that, even "slow" systems are better than broken or incomplete systems. Give a user a fast broken system and a slow working system, and they will take Door #2 every time, because there is nothing slower than a system that doesn't work. I'd rather have a Hyundai on the street than a Porsche in the garage.

    Similarly, no specification or implemented IT project survives first contact with the users (part of the whole Agile model anyway). "This is exactly what I asked for and not what I wanted" pops up as they start using the code in all sorts of ways not envisioned by the dev team.

    So, learn to write good code, learn to design good code, implement your feature set, test it for functionality, and set it free. Egregious performance problems will come back as "bugs", and you fix them. If you have reasonable coverage on your tests and QA, you'll hit a lot of these early on, and only the edge cases or dramatic shifts in data volume will screw you up.

    Don't fixate on hard numbers or relative changes. Write good code, fix bad code. When good code goes bad, make it better.
  25. I have no idea wtf you are talking about. What does anything you said have to do with including performance tuning in every iteration?

    Are you saying that since no spec survives first contact with the customers, write good code and everything will be good? I have no idea what this means.

    Seriously, since user expectations for a given application can change during the development of the application (including how it performs), doesn't that beg for performance tuning on an iteration schedule?
  26. However, given that, even "slow" systems are better than broken or incomplete systems. Give a user a fast broken system and a slow working system, and they will take Door #2 every time, because there is nothing slower than a system that doesn't work.

    This is not true. Performance is often part of "correctness". Walk into a financial company and drop a slow app on a trader's desktop, and see how fast you perform the walk to the exit door.

    It doesn't stop at financial companies however. Time is money for most everyone. Saddling them with an application that is slow doesn't do anybody any good.

    How can anyone propose a solution without considering performance in the first place? It cannot be done in a vacuum, and it determines everything from technology choices down to time to delivery.

    I will agree with one thing in this thread: not everyone is a profiling guy, and certain teams have experts in many different areas. People in those situations should rely on the expertise of others around them to help optimize their systems (for instance, get the database team involved in profiling the impact of your system on theirs). It should never be an afterthought, though, or even allowed to be the last stage in a delivered product.
  27. I think the author is taking a somewhat narrow view of software performance engineering (SPE). Building high performance enterprise applications is not assured simply by attempting to optimize the relatively small amount of application code that might be executed during a complex business operation involving external resources and significant container-to-component communication, control, and co-ordination code. A poor decision made much earlier in the project (architecture, component design, ...) has a much greater impact and is, unfortunately, more probable. Of course unit testing some system code should involve profiling, especially if the code plays a significant role in the overall application software execution model. But this profiling exercise is focused on a small subset.

    Tools that enable the architect and senior developer to better understand the prescribed or actual software execution model are important, especially when they correlate much more execution context information than a typical Java code profiler. Typical contextual information items are remote requests and parameters (CORBA, RMI, HTTP), component-level call stacks, component configuration (codesources, deployment files), persistence frameworks and mappings, transaction boundaries, datasources and message queues, as well as SQL and message commands. If the same tools can then be used during analysis of the system execution model (possibly deployment architecture validation under high workload), in assessing capacity requirements, and in identifying concurrency issues, then when moving into production, architects and operations have a higher degree of confidence in meeting business requirements.


    Regards,

    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  28. In relation to memory analysis with the purpose of detecting and pinpointing memory leaks or excessive allocation sizes, tools such as YourKit (http://www.yourkit.com) and JProfiler (http://www.jprofiler.com) should be made available to development staff.

    Personally I never use such tools for Java code performance profiling (times and allocs) because they are too generic and do ** not ** understand the application code, JVM runtime and Java language constructs better than I do. Using JXInsight's Tracer I can be much more selective in the parts to be profiled, attach more contextual information, and have accurate statistics for each traced segment (not necessarily a method or class). So many times the tools tell me what I already know but cannot do anything about.

    Also, JXInsight's Tracer is the only profiling solution that I am aware of that adjusts the clock times based on GC events and provides side-by-side analysis of blocking, waiting, allocation sizes, wall clock and cpu times. At sensitive points within our code we have code blocks:

    Tracer.start("myid"); // id could be derived from execution context
    try {
      ...
      ...
    } finally {
      Tracer.stop();
    }

    With the above I get memory allocations, thread blocking and waiting, high resolution clock times, gc time as well as cpu usage - beats System.current.....


    Regards,


    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, tracing and testing with JXInsight"
    http://www.jinspired.com
  29. Profile Tracing Introductions

    Hi,

    I wanted to add that such tracing blocks should be kept out of the actual domain execution parts of the application. I once went to a customer on a profiling assignment where I was warned that my performance assessment might be complicated by the excessive use of trace and log statements in nearly every method (more like on every alternate line). Hmmm... guess what was one of the performance problems.

    Instead, such trace instrumentation should be added via event notifications, interceptors (CORBA, EJB3), filters (Servlets), proxying, and bytecode instrumentation (AOP... AspectJ).
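
    The proxying option can be sketched with a plain JDK dynamic proxy, which adds timing around an interface without editing the implementation. The interface and classes here are hypothetical, and a real tool would do far more than print:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class TimingProxyDemo {

    interface OrderService {
        int placeOrder(String item);
    }

    static class RealOrderService implements OrderService {
        public int placeOrder(String item) { return item.length(); }
    }

    // Wraps any OrderService so every call is timed transparently.
    static OrderService timed(OrderService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println(method.getName() + " took " + micros + "us");
            }
        };
        return (OrderService) Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[] { OrderService.class },
                handler);
    }

    public static void main(String[] args) {
        OrderService service = timed(new RealOrderService());
        System.out.println(service.placeOrder("widget")); // 6
    }
}
```

    The domain code never sees the timing, which is exactly the separation the post argues for.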

    <very-naughty>
    JXInsight provides already trace extensions for popular component architectures, middleware and persistence technologies. I am currently finishing up JXInsight's trace extension for SolarMetric's Kodo JDO which will be made available next week. In the meantime if you are interested in the current JXInsight trace extensions and how they are visualized within our award winning UI please visit:
    http://www.jinspired.com/products/jdbinsight/downloads/index.html

    Windows
    http://www.jinspired.com/products/jdbinsight/downloads/newgrange/windows/JXInsight.3.1.8.1.exe
    Solaris
    http://www.jinspired.com/products/jdbinsight/downloads/newgrange/solaris/JXInsight.3.1.8.1.bin
    Linux
    http://www.jinspired.com/products/jdbinsight/downloads/newgrange/linux/JXInsight.3.1.8.1.bin
    Mac OSX
    http://www.jinspired.com/products/jdbinsight/downloads/newgrange/osx/JXInsight.3.1.8.1.zip

    Note: Native Agent libraries for the above platforms as well as HP-UX and AIX are distributed with each platform installer.

    </very-naughty>


    Regards,


    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, tracing and testing with JXInsight"
    http://www.jinspired.com
  30. I find this article kind of trivial, unless it is written for college freshmen.

    Regarding the wider topic of profiling, here are my two cents:

    1) JFluid is, unfortunately, useless for my project because we use JDK 1.5 for AMD64 on our servers, and, according to their own docs, JFluid wants to use its own 32-bit JVM 1.4.2 hack. Simply incompatible.

    2) Live production servers are exactly where I want profiling the most, if there is a problem - not in my IDE.

    Offline 1-user load tests (from the IDE or not) are insufficient in 85% of cases for web projects.

    Neither do tools like Borland OptimizeIt help, as they introduce too much overhead and slow down the live site. With tools like that you cannot profile a high-load environment (load server or test server, it does not matter).

    In-JVM, low-overhead profilers (like JFluid, actually) are the only help when run on a live site, but most of them (like JFluid again) ask for a hacked JVM and do not run in my environment of 64-bit JDK 1.5.

    At least, I could not find any, and would be very grateful if somebody knows a good one. My impression so far has been that supply runs way short of demand in the area of Java/J2EE profiling.
  31. And what do you think about JProbe?

    Dmitry
    http://www.servletsuite.com
  32. JFluid and JDK 1.5

    Hi Irakli,

    To answer your questions:

    1) The next release of JFluid, Milestone 6 (should be out in a few days), is going to support JDK 5/6. Well, actually, the first production JDK that JFluid works with will be JDK 5.0_04, which is to be released at the end of June. However, JFluid also works with JDK 6 Early Access, which can be downloaded from java.sun.com right now. So quite soon, our customized VM will be only for those who are still stuck with JDK 1.4.2 for some reason.

    2) Nothing prevents you from using JFluid with live production servers. You don't have to launch your application from the IDE - we offer two methods to attach to an application launched separately.

    In fact, one of the goals of JFluid while it was a research project was to explore how applications running in production mode can be introspected unobtrusively. And though I can't pretend we solved that problem completely, some things we came up with, such as attaching to a JVM launched without any special options, partial call graph CPU profiling, continuous (no heap snapshots) memory leak debugging, etc., were hopefully steps in the right direction.

    Misha Dmitriev,
    JFluid/NetBeans Profiler Team Lead
  33. JFluid and JDK 1.5

    Misha, thank you for your answer.

    If I understand it correctly the next version of JFluid will be able to work with the original JDK 5s (like my Sun JDK for AMD64) - no need for hacks?

    That's major. Looking forward to that.

    Reg. your second comment: I need NetBeans+JFluid on the client to monitor and a server JFluid component on the live server. Does that mean that monitoring can only happen in a live mode?

    Can I start the JFluid server component, go to sleep, come in 12 hours (no, I do not sleep that long :) ) and get a summary what were the slowest methods during these 12 hours, what was average number of threads, average memory consumption and which class of objects caused it etc.?
  34. Re: JFluid and JDK 1.5

    If I understand it correctly the next version of JFluid will be able to work with the original JDK 5s (like my Sun JDK for AMD64) - no need for hacks?

    Correct, it will be able to work with original JDK 5 and 6 (though, again, the first release of *JDK 5* that supports JFluid is not yet available).
    That's major. Looking forward to that.
    Reg. your second comment: I need NetBeans+JFluid on the client to monitor and a server JFluid component on the live server. Does that mean that monitoring can only happen in a live mode?

    Not sure I understood the question completely. Monitoring by definition is something that's done on a live application?
    Can I start the JFluid server component, go to sleep, come in 12 hours (no, I do not sleep that long :) ) and get a summary what were the slowest methods during these 12 hours, what was average number of threads, average memory consumption and which class of objects caused it etc.?

    Ok, the JFluid server component is just a couple of libraries that the monitored/profiled JVM needs to load locally to connect with the remote JFluid tool, exchange information with it, etc. To really collect profiling info, you have to start the JFluid tool, connect it to the profiled JVM, and tell it what data you want to gather.

    The picture that you presented above is how things should work in the ideal world; I doubt any existing tool can or will be able to work like that, at least if you want the info to be detailed enough (as opposed to traditional, quite superficial monitoring). For one thing, CPU profiling and any sort of detailed memory profiling can't be done together if you don't want your CPU profiling results to be distorted, possibly significantly, by the additional activity introduced by memory profiling instrumentation. Another thing is, detailed info collection may not always work "out of the box" - typically you may need to adjust some parameters etc., especially if your application is big and you care about the profiling overhead.

    Once you accept this, the other things you describe are of course possible, except for some of the "average" metrics you mention (those look like they would be useful - thank you for hinting at this!)
  35. Live production servers are exactly where I want profiling the most if there is a problem - not in my IDE.
    At least, I could not find any and would be very grateful if somebody knows a good one.

    Some free advertising from a satisfied customer:
    I am running the YourKit profiler in a few production systems and it works fine. The profiler is rather rudimentary, but it is rock solid and has 0% overhead when not in use. CPU profiling is also fast, but I haven't done any comparisons.

    I especially like its memory dump, as it completes even after several OutOfMemoryErrors from the profiled application. This is priceless when dealing with mysterious, unreproducible (is there any other kind?) memory leaks.
  36. Application Profiling[ Go to top ]

    I work as a Java performance consultant for a very, very large vendor; for the past 5 years I have worked pretty much exclusively on Java performance and profiling.

    My tool of choice is a specialised version of JInsight; I find that its unique graphical views highlight both micro and macro performance and design issues.

    I have seen the nature of performance problems change over the years, mainly due to the fact that up to 95% of the code running in a JVM today was not written by the application developer but rather by vendors and the open source community.

    Industry peer and project pressures are forcing developers to implicitly trust this code, and the technology "layering" is making problems more macro than micro.

    I guess I only get to see the bad stuff, but after over 100 engagements I have seen certain patterns emerge. Without getting into specifics, here is my list of to-dos for any project.

    1) Performance, a "non-functional" requirement, is as important as the functional requirements.

    2) Performance Engineering is a discipline, but it is also a responsibility for all members of the team.

    3) Set real goals, "I want this to run at 100 TPS per CPU", or "My budget is $500,000"

    4) Design and Architecture is the most critical phase, especially when abstract models are translated to physical implementations, technology, standards and topology (A data model with 4000 entities and many relationships only becomes a problem when you use Entity beans or a persistence layer to deliver it)
     

    5) DO NOT! implicitly trust any code you did not write yourselves (this will be most of it); examine technology before it is used, understand its cost in normal use cases and try to predict the ab-use cases.

    6) Measure, with technology and procedures, the performance of a growing design/development against your goals on a daily basis. This way changes in the performance can be accurately attributed to recent changes.

    7) Developers must use profiling tools, this is not an external task unless the person performing it is a skillful developer and can both understand and potentially argue against certain developer/architect decisions.

    8) FORGET! vusers, think-time, scripts, runs and the whole LoadRunner mentality, go back to the axioms of performance, TPS, Resource Utilisation, ITR, ETR. Once understood these base numbers can then be used to translate back to user based targets, it's virtually impossible the other way around.

    9) Remember that time marches on, the latest technology that is a "must have" or will give you ultimate flexibility and be "future proof", will, in fact, probably be out of date within 12 months. Change is the only constant in this business but the real basics stay the same.

    10) Don't be ashamed to change, do not be too defensive, I have had many heated discussions with developers who could not understand, in fact would not even entertain the idea, that their perfect baby could have been over architected or poorly implemented.

    11) Avoid the temptation to leave a poor application unchanged and push responsibility to the infrastructure, or look to reduce the ticket price of the deployment platform, this increases TCO and reduces QOS.

    It's always good to make things faster and cheaper, in Java projects today, developers are the main key holders to this ability, run with it.

    Oh and use JInsight or Hyades/TPTP although this is real slow, they have really got to go back to a binary trace file format.
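    One way to make point 8 above concrete is Little's Law, which relates throughput-based targets (TPS) back to user counts: concurrency = throughput × time each user spends per cycle. A minimal sketch, with made-up numbers - the class name and figures here are purely illustrative, not from the post:

```java
// Translating a TPS target back to a concurrent-user count via Little's Law.
// All numbers are hypothetical, for illustration only.
public class CapacityMath {

    // Little's Law: concurrency = throughput * time in system per user cycle.
    static double concurrentUsers(double tps, double responseTimeSec, double thinkTimeSec) {
        return tps * (responseTimeSec + thinkTimeSec);
    }

    public static void main(String[] args) {
        // Hypothetical goal: 100 TPS, 0.5 s response time, 9.5 s think time.
        System.out.println("Concurrent users supported: "
                + concurrentUsers(100.0, 0.5, 9.5)); // 1000.0
    }
}
```

    Going the other way (from a vuser count to TPS) requires guessing think times and script pacing, which is why the base numbers are easier to reason from.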

    Paul Anderson
  37. Application Profiling[ Go to top ]

    Paul, well said! *applaud*.

    How much is your daily rate? We may have a need for somebody like you :)
  38. Daily rate[ Go to top ]

    155 Euros per Hour plus expenses, which is actually a very good rate :-)
  39. Application Profiling - Part 1[ Go to top ]

    I would like to add some additional pointers to Paul's list.

    When I am first called in on a performance assignment for a typical J2EE application, the first thing to do after hearing all the horror stories is to start recording and compiling enterprise performance metrics and execution patterns for each use case in single user mode. The metrics I am looking for are not the ones a typical Java code profiler provides.

    Partial List of Java Enterprise Metrics Per Use Case
    -----------------
    Number of roundtrips (CORBA, HTTP, RMI) between client and server.
    Number of roundtrips (SQL) between server and database or message queue.
    Number of resource (JDBC, JMS, JCA) transactions executed.
    Number of nested transactions executed.
    The maximum number of active transactions during execution.
    Number of records or messages processed during transactional activity.
    Number of tables accessed per SQL executed and per DML type.
    Number of columns accessed per SQL executed and per DML type.
    Average object allocations created in both client and server at various levels (use case, per RPC, per transaction, per component interaction).
    Number of inter-component (EJB) method invocations (only at the boundaries and not the bean class itself).

    I do also record some timing information but I tend to not give this much attention as I am typically running this in unrealistic environment conditions (collocation of client and server or server and database on same hardware and ONE USER !!!!).
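    The counters above can be imagined as a small registry incremented at technology boundaries (RPC, SQL, transaction). The sketch below is a hypothetical illustration of that idea - the class and metric names are invented here, not part of JXInsight or any other product:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Minimal sketch of a per-use-case counter registry of the kind a performance
// engineer might increment at technology boundaries. Metric names are hypothetical.
public class UseCaseMetrics {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Bump a named counter (thread-safe, low contention via LongAdder).
    public void increment(String metric) {
        counters.computeIfAbsent(metric, k -> new LongAdder()).increment();
    }

    // Read a counter; missing metrics read as zero.
    public long get(String metric) {
        LongAdder a = counters.get(metric);
        return a == null ? 0 : a.sum();
    }

    public static void main(String[] args) {
        UseCaseMetrics m = new UseCaseMetrics();
        // Simulated boundary crossings for one use case.
        m.increment("rpc.roundtrips");
        m.increment("sql.executions");
        m.increment("sql.executions");
        System.out.println("SQL executions: " + m.get("sql.executions")); // 2
    }
}
```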


    Regards,


    William Louth
    JXInsight Product Architect
    JInspired


    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  40. Application Profiling - Part 2[ Go to top ]

    After collecting use case execution snapshots and deriving the above metrics, the performance engineer arranges meetings with the architects and developers of each subsystem or component to discuss the findings and demonstrate the execution behavior via offline analysis of the snapshots. This tends to be a fruitful period in which much of the low hanging fruit, such as inefficient component interactions and persistence mappings/configuration, is identified, much to the amazement of the attendees.

    Architects and developers are in general surprised by the magnitude of some metrics, especially transaction counts, SQL executions (repeated reads) and object allocations (much more lightweight than a profiler - only counts per trace window).

    Over the next few weeks many of the issues found in the simple one user use case testing are addressed by the development teams, until finally a suitable baseline is derived that will be used for future software execution model analysis. It is very important to get rid of the low hanging fruit before moving to the system execution model analysis, as it is likely to mask other underlying issues that pop up under heavy concurrency and workload.

    I have seen some complex (and extreme) use cases start with 20,000 transactions and finally end up at less than 200, which reduced the time from minutes to seconds. Normally the reduction in RPCs, transactions, or SQL executions is 50%.

    Regards,

    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  41. Truer words have never been said about everyone being surprised at what the code is doing. I have actually found a whole bunch of logic bugs just by running a code profiler and looking at ratios of method counts in the call trees to expose logical errors, e.g. "If a person can only have 3 addresses, why do you call the loadAddress() method 80 times?".
    You need to get rid of all the code that doesn't need to be there - the bonehead mistakes and the "low hanging fruit" - before you even start true performance testing. As my DBA friends like to say: "The best performing SQL statement is the one that never gets executed!"
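    The ratio check described above can even be approximated without a full profiler, by counting invocations with a JDK dynamic proxy. A sketch of the idea - the AddressService interface and loadAddress() method are hypothetical names echoing the example, not real code from the thread:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative sketch: count method invocations on any interface with a JDK
// dynamic proxy, then eyeball the ratios for suspicious call counts.
public class CallCounter implements InvocationHandler {
    private final Object target;
    final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

    CallCounter(Object target) { this.target = target; }

    // Wrap a target in a counting proxy for the given interface.
    @SuppressWarnings("unchecked")
    static <T> T proxy(Class<T> iface, CallCounter handler) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[]{iface}, handler);
    }

    @Override
    public Object invoke(Object p, Method m, Object[] args) throws Throwable {
        counts.computeIfAbsent(m.getName(), k -> new LongAdder()).increment();
        return m.invoke(target, args);   // delegate to the real object
    }

    // Hypothetical service from the "3 addresses" example.
    interface AddressService { String loadAddress(int personId); }

    public static void main(String[] args) {
        AddressService real = id -> "addr-" + id;
        CallCounter counter = new CallCounter(real);
        AddressService svc = proxy(AddressService.class, counter);
        for (int i = 0; i < 80; i++) svc.loadAddress(1);   // far more than 3 addresses warrant
        System.out.println("loadAddress calls: "
                + counter.counts.get("loadAddress").sum()); // 80
    }
}
```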

    -Andy Faibishenko
    http://www.istekonline.com/PerformanceTuning.html
  42. Application Profiling - Part 1[ Go to top ]

    The metrics I am looking for are not the ones a typical Java code profiler provides.

    I would add that different types of applications tend to suffer from different classes of problems. In the large-scale mega-ultra Rolf-hates-it big-azz enterprise application category, a profiler is almost useless for solving production bottlenecks, not because it can't do its job, but rather because the problem is rarely related to bottlenecks in Java code that a profiler could expose. From my own experience, at least 80-90% of large-application bottlenecks are related to the use (or more correctly, abuse) of the database(s) that the application relies on.

    Peace,

    Cameron Purdy
    Tangosol, Inc.
    Coherence: Cluster your POJOs!
  43. Application Profiling - Part 1[ Go to top ]

    Just because the problem isn't in the Java code, doesn't mean you cannot find it using a Java profiler. Especially in the case you just provided of a non-performant database. Just because the profiler won't tell you exactly where the problem is, doesn't mean it's not useful to know that the problem is in the database or how you are using the database. At least you know where to focus your time.

    Most of the problems with profilers have more to do with people a) not knowing how to actually use their profiler, and b) some profilers suck (I won't mention any names).
  44. Re: Application Profiling - Part 1[ Go to top ]

    Hi Marc,

    I agree that a large majority of developers are given profiling tools with little training in performance engineering and the tool itself but this is not the problem when discussing the application of a Java code profiler in a pre-production or production environment.

    A Java code profiler will probably tell you that time is spent in the JDBC driver (with call stack info), but it will not tell you the SQL statement, the transaction contexts active on the thread's stack, and, very importantly, the actual sequence of component calls and SQL statements that make up the transaction activity pattern. I can tell you that when it comes to tuning J2EE applications, the actual sequence (and not the call stack tree) is the most important piece of information in understanding the user/business/resource transaction(s) and locating places for improvement.
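    The difference between a sequence and an aggregated call tree can be shown in miniature. In the sketch below (event names invented for illustration, not from any product), the ordered sequence exposes a repeated-read loop that aggregation collapses away:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: aggregating calls (as a call-stack tree does) collapses repetition,
// while the raw event sequence exposes the loop of repeated SQL executions.
public class SequenceVsTree {

    // Collapse an ordered event sequence into per-event counts.
    static Map<String, Integer> aggregate(List<String> sequence) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String event : sequence) counts.merge(event, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<String> sequence = new ArrayList<>();
        for (int i = 0; i < 3; i++) {                  // a repeated-read loop
            sequence.add("ejb:OrderBean.getItem");
            sequence.add("sql:SELECT * FROM ITEM WHERE ID=?");
        }
        System.out.println("Sequence events:  " + sequence.size());           // 6 - loop visible
        System.out.println("Distinct calls:   " + aggregate(sequence).size()); // 2 - loop hidden
    }
}
```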

    JXInsight is unique in providing transaction pattern analysis (sequencing, global and local loop detection) as well as a timeline analysis mode with database transaction, SQL, and table execution concurrency graphics. Our timeline visualizations are probably among the most powerful (and industry-first) features of the tool, but at the same time the least used, because the low hanging fruit identified in our profile analysis mode keeps architects, developers, testers and DBAs busy.

    Again the qualification is "typical enterprise J2EE application running on top of a typical enterprise J2EE container accessing an enterprise database backend".

    More information on timeline analysis:
    http://www.jinspired.com/products/jdbinsight/downloads/new-in-2.5.html
    http://www.jinspired.com/products/jdbinsight/downloads/index.html

    Regards,

    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  45. Application Profiling - Part 1[ Go to top ]

    From my own experience, at least 80-90% of large-application bottlenecks are related to the use (or more correctly, abuse) of the database(s) that the application relies on.

    Me too. And the other 10-20% is the network and browser. So I have eliminated both from my Java app and it runs Great! :)

    (I wanted to post that in the "How to do web apps" thread but since there are so few posts I thought I would wait a bit and not seem like a horse's arse)
  46. Additional Pointers[ Go to top ]

    William,
         An excellent addition; introducing real world considerations and limitations into the design and architecture phase is critical, but, as you know, very difficult.
  47. JInsight[ Go to top ]

    Hi Paul,

    is there still a public version of JInsight? I liked its views...but as I am not an IBM employee as you are, it is difficult to get it, isn't it? An additional advantage was that it works in zOS environments.

    My favorite profilers are now:

    JProfiler and JProbe. IMHO one of the best tools for tracing J2EE app servers is PerformaSure - it doesn't overuse the JVMPI and therefore has a low overhead, which makes it a good choice for profiling under load or (if necessary) in production.

    I also like JXInsight a lot - very easy to use and you get to the right (defected) code quickly. I had some problems using it in a Linux environment with Oracle 9 and the OCI driver...

    My experience is that the main problem with performance bottlenecks is that the different roles in a project (DBA, developer, sys-admin, ...) do not communicate, and if something goes wrong they point fingers at each other. A dedicated performance team or consultant in a project is therefore good for avoiding these useless discussions.
    Another point is that performance measurements are often done at the end of a project (or maybe after production), and if the problem is in the architecture or design it costs a lot of money to fix.

    - Mirko -
    codecentric
    "Your code is our source."
  48. JInsight[ Go to top ]

    JProfiler and JProbe - IMHO one of the best tools for tracing J2EE AppServers is PerformaSure - it doesn't overuse the JVMPI and therefore has a low overhead which makes it a good decision for profiling under load or (if necessary) in production.
    I thought JVMPI is deprecated in JDK 5. Is there a newer version of PerformaSure that is able to use JVMTI?
  49. JVMPI and Java 5[ Go to top ]

    JVMPI has been deprecated but it still works with Java 5 runtimes. In fact there have been performance improvements to JVMPI, especially with regard to call frame access, as well as fixes for long-standing bugs around thread times - probably as a result of the internal JVMTI work and the mapping of the old JVMPI interface onto the newer JVMTI interfaces where possible.

    Regards,

    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  50. PerformaSure does not use JVMPI. JProbe does.
  51. Hi Steve,

    I assume that you are still the development director for the Java product line at Quest and have in-depth knowledge of the product, but if PerformaSure never used JVMPI, then how does it provide GC statistics and object allocation size counters? The last time I read the docs, when they were publicly available, I recall reading about having an agent installed in the running JVM. Was this simply for loading native code (that does not make any calls into the JVMPI API)?

    I would be very worried if PerformaSure did not provide any GC statistics; otherwise the clock statistics reported in the tool are subject to large inaccuracies, which would have developers chasing down the wrong path.

    Can you provide a clear statement of whether your product provides GC times alongside clock times in your entry point transaction analysis views?


    Regards,


    William Louth
    JXInsight Product Architect
    JInspired


    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  52. Jinsight[ Go to top ]

    Mirko,
         The old version is still available, I think, from alphaWorks, but the internal version is better (client/server). Jinsight was, in effect, open sourced with the Eclipse Hyades project, although it is a little slow to visualise, probably due to the XML data format.

    I have not really used any other tools - I'm just so used to the Jinsight Execution view - although JXInsight does look very similar in some respects. Spooky, William? :-)

    Jinsight Old and new and Hyades are all available for z/OS.

    I fully agree with your view on the problems you see; communication is always a problem, and in J2EE projects I find language is an issue too - not many non-developers are keen to admit they have no idea what recursive reflective serialisation is. And you correctly point out that the main, in fact only, problem is that people do not manage performance from the start of a project.
  53. Hi Paul,

    I had a look at old screenshots of JInsight for the first time today (I have never worked for IBM), and I would like to believe that we are sufficiently different in those visualizations (at least in terms of graphical excellence and beauty) that are shared across products, and which I believe most other profiler tools share as well. There are only so many ways to present a call trace: it is either a table, tree, sequence diagram, graph, or tree map. We provide all of them and more.

    If you take a look at our product release pages (including screenshots) since JDBInsight 2.0, you will see that our visualizations go far beyond JInsight and Hyades today, and will for at least the coming year, especially with regard to transactional database activity and timeline analysis for traces, transactions and SQL statements. As the visual designer for all JXInsight/JDBInsight views, graphics, and icons, I can assure you that my only sources of inspiration were real world experience, reading lots of Edward Tufte books and, of course, a design background. I will admit the UI console's view menus resemble Eclipse's. Please note that "Insight" is a common substring of many performance management products.


    Regards,


    William Louth
    JXInsight Product Architect
    JInspired


    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  54. Ooops[ Go to top ]

    William,
       Sorry, I was trying to be humorous; I can clearly see that there is no real relationship between the two. I commend your efforts in representing performance information in new and exciting ways. I strongly believe that developers are actually generally very "visual" people with excellent pattern recognition capabilities. Even if developers take full responsibility for "code" performance they will need tools to be able to do this; we are a little while off the nirvana of on-the-fly static code analysis where the colour of a line of Java changes depending on its cost.

    Maybe we should take a more detailed chat offline.

    Paul.
  55. Ooops[ Go to top ]

    Hi Paul,

    I did not mean to come across as being offensive. We are just coming up to the release date of JXInsight 3.2 (JDO and JMS trace extensions), and, as is always the case, you feel there is not enough time to get all your product ideas in and tested. I should probably not be spending so much time on TSS, but this topic (and good discussion) does not come around often enough.

    William
  56. I have tried the NetBeans Profiler several times, but every time I came to the conclusion that it is not suited to the things I want to do with it.

    I want to profile J2EE applications UNDER LOAD, i.e. I simulate 10-50 concurrent users clicking through our applications and then check out which methods took the most (self) time. When I read that the NetBeans Profiler uses bytecode instrumentation, this seemed achievable.

    But: the NetBeans Profiler crashes my JBoss process when trying to do a performance profiling (entire application) of a JBoss 3.2.6 application. At first everything seems to work, because the profiler connects; it even starts instrumentation, as I can see on the NetBeans status line ("55000 (or so) methods instrumented"), but then the JAVA(!) process of JBoss DIES and I'm getting "Profiled Application Status: Stopped - Instrumentation: None". There's no point in pressing Ctrl+Brk again, since that process has died!

    So my question: Does it work for you? Can you do an entire application performance profile of a JBoss Application Server including one or more deployed applications? Any hints?
  57. Hi Michael,

    Can you tell me what you consider the most expensive operations performed by your J2EE application? Is it CPU bound? Is it IO bound? Does it suffer from large and excessive GC events?

    I have a hard time justifying the introduction of a general code profiler in a testing or pre-production environment for a J2EE application, where I would more than likely see much of the time and allocation costs in sending or receiving HTTP requests and responses, container-to-component dispatches, and database transaction activity.

    I really do not want to use a memory profiler just so that it can tell me that my JDBC driver is creating an enormous number of TCItem objects. I look for times and invocation counts across technology boundaries (Servlets, SOAP, CORBA, EJB, JDO, JDBC, JMS, JCA, JTA, JTS). A general Java code profiler is overkill for this.

    An important limitation of a Java code profiler is that it can only see classes and methods and is not aware of the actual data parameters and execution context (RMI-IIOP Requests, HTTP Query Strings, EJB MetaData, XA-Transaction parameters, JMS Message Properties, JDO query strings, SQL strings...).

    I have one customer, a very, very large software/hardware vendor, that has a generic application server framework (active model pattern) with the same server-side call stack for nearly all requests. In this case it is the parameters (query, selection, filters, actions) that need to be profiled, at least in terms of aggregation of measurements. We provide a JXInsight trace extension that creates this picture within our console. For this customer our trace stack is more important than the Java call stack.


    Regards,

    William Louth
    JXInsight Product Architect
    JInspired

    "J2EE tuning, testing and tracing with JXInsight"
    http://www.jinspired.com
  58. Hello William,

    The performance of this system is CPU bound; that means the first barrier you run into is CPU. But from a testing point of view I want to get a big picture of where the CPU time is spent, without narrowing (filtering) in advance.

    Of course in call stacks you will always see the top-level methods, but normally those methods are just delegating to others. Therefore you should not group your methods by inclusive time (time including the time of internal method calls) but by self time (time excluding the time of internal method calls).

    This way you will see the hotspots, and then you can decide which architectural software piece is the bottleneck. As I am also a developer, I am not afraid of technical details ...
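    The self-time idea above amounts to subtracting the children's inclusive times from a node's inclusive time. A small sketch with an invented call tree (the method names and millisecond figures are hypothetical, chosen only to show why inclusive time would blame the wrong method):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of self time: a node's inclusive time minus the inclusive time of
// its direct children. The tree below is made up for illustration.
public class SelfTime {

    static class Node {
        final String method;
        final long inclusiveMs;
        final List<Node> children = new ArrayList<>();
        Node(String method, long inclusiveMs) {
            this.method = method;
            this.inclusiveMs = inclusiveMs;
        }
    }

    // self = inclusive - sum of children's inclusive times
    static long selfTime(Node n) {
        long childTotal = 0;
        for (Node c : n.children) childTotal += c.inclusiveMs;
        return n.inclusiveMs - childTotal;
    }

    public static void main(String[] args) {
        Node handler = new Node("handleRequest", 100);   // delegates most of its work
        handler.children.add(new Node("renderView", 60));
        handler.children.add(new Node("queryDb", 35));
        // Sorting by inclusive time would blame handleRequest (100 ms);
        // self time shows it only spends 5 ms of its own.
        System.out.println("self(handleRequest) = " + selfTime(handler) + " ms"); // 5 ms
    }
}
```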
  59. Michael,

    This outcome may have two explanations:
    (a) Bug in our customized JVM
    (b) Bug in our tool

    Regarding (a) - there is one known issue that leads to pretty much the same outcome that you describe. It usually happens sporadically, and for this reason, as well as due to the complexity of the code involved (it really seems to be somewhere very deep in the original HotSpot VM code), we came to the conclusion that the effort of fixing it is too much compared to the benefit. Especially given that JFluid is finally going to run on standard JDK 5/6 JVMs very soon.

    Regarding (b). It may be a fixed issue - we have fixed quite a few in the forthcoming Milestone 6 release of JFluid. Or it may be something new, in which case we would be really interested in getting a more detailed description of the problem (e.g. both the profiled VM and the NetBeans logs) from you. As you may see from our mailing list archives, we usually respond to bug reports promptly, and often fix them within days. Then we can send you a patch to make the tool work for you before the next release. So my suggestion is: try the current Milestone 5 release if by any chance you are using something older. If the problem is there, wait for Milestone 6 (just a few days), or, if you have an ftp site, send us details (feedback at profiler dot netbeans dot org) and we can send you an M6 preview. If the problem is still there, we should be able to fix it in a few days once we have enough input and some assistance from you (or, ideally, just your binary code).

    Regards,

    Misha Dmitriev,
    JFluid/NetBeans Profiler Team Lead
  60. Hi Misha,

    thanks for your reply. Somehow TheServerSide lost my reply to your post from yesterday.

    (a) very nice to hear that soon we can use standard JVMs.
    (b) I'll try M5 and M6 and if the problem should persist I will file a bug report. At least I should then be able to help you reproduce the problem.

    Regards
    Michael
  61. You are welcome. FYI, Milestone 6 has just been released. Please try it and let us know if there are any problems.

    Regards,

    Misha
  62. Wily Introscope?[ Go to top ]

    I haven't looked at JFluid yet, but from an initial glance, it sounds like it might be similar to Wily's Introscope profiler, which does byte code instrumentation. They claim it only adds about a 2% overhead & can be left running in a production environment for profiling down to arbitrary class level.

    I would be interested to hear of any experiences anyone has had with this & how it compares to JFluid.
  63. Wily Introscope?[ Go to top ]

    Introscope's actually pretty good, but it doesn't really profile the application - it measures interaction between components and resource allocations, which can be very useful but isn't quite the same as a profiler like JFluid or OptimizeIt.