Wily Technology published a survey on performance and availability in J2EE. The results showed that existing Java-based production applications average only about 88 percent availability. The results were collected from a select group of 360 enterprises representing 16 industries and 43 countries, equally divided between large and small organizations.
According to an article by eWeek, "problems originate in the application environment and in connected systems. In order of frequency after application code bugs were: issues with configuration and tuning, architecture, database connections, design problems, memory leaks, capacity planning miscalculations, Java virtual machine issues..."
"The low level of availability is actually, unfortunately consistent with what you see everyday on the Web," reacted industry analyst Jean Pierre Garbani with Forrester Research in Cambridge, Mass.
Despite such experiences, the respondents indicated a strong commitment to continuing Java development. One third of the respondents indicated they intended to spend more than 75 percent of their application development budget on J2EE.
Read Survey Questions Java App Reliability
What does Wily Technology sell? Interesting survey results... Something to keep in mind.
Interesting survey indeed, although I really question the source of the data. I just came off a project where management had been sold by Wily on using their product, Introscope. One of the worst pieces of software I have ever had to deal with. Based on a great idea of instrumenting the bytecode and collecting performance and other metrics, the implementation is horrendous. The enterprise manager keeps dying, the client workstation needs Deep Blue-level hardware to operate and things constantly break. If they can't get their product together, who are they to talk about the uptime of Java applications??
In any case, there are many factors related to uptime, and I don't see a strong correlation between the technology (Java, C++, .Not) and the uptime. If anything, Java apps have a better chance to survive under stress and when running poorly written code.
PS. Whatever you do, be sure to evaluate WILY products carefully in a PRODUCTION environment before you decide to spend money on them. Or better yet, look at competing products from Borland or Sitraka or other well-known vendors.
I am with Wily and familiar with the results of our online survey. Full results are available by attending one of our seminars presented in 9 cities across North America. Regarding the survey data, more than 360 people took the survey representing a diverse group of companies and roles. I believe the data represents a good sample of real user experiences in deploying and supporting J2EE apps.
To address comments regarding Wily's products, more than 200 enterprise customers use Wily application management software products to monitor, improve and manage their production applications. Wily's Introscope was designed for production. The company and products are recognized by most independent analysts as the leaders in this area. I would invite anyone evaluating application management software to compare Wily with any other vendor by piloting the products in their real, production environment. We would welcome the opportunity to demonstrate what real J2EE production management is all about.
Vice President of Marketing
Wily Technology, Inc.
Nice to see that someone from your company is reading TSS. Interesting to see that the person is from the marketing team, not development. Your post is the type of information that leads managers to believe that the product is solid. Again, I think the *idea* behind the product is solid, but the implementation CLEARLY SUCKS. Do you guys even try to monitor any of your internal applications with it? The product didn't have logging until the latest release 3 months ago, for god's sake. And what kind of hardware does one need to bring up the workstation tree? We have 4 servers and even setting max memory to 512 did not work half the time.
The product was very difficult to set up, requires a whole bunch of custom work for persistent collections (and what good is it without them?) and the uptime was horrible. I suggest you divert more money into the dev team to ensure UI standards, good memory usage and user-friendliness. Then you'll see more positive posts from people who actually USE it, not from analysts who make their money by writing surveys for their companies.
Bottom line: there is hope, just get a couple of strong J2EE guys to fix the deficiencies and things will improve.
It's nice to see initiatives in the application management space. Is there no way people outside the US can get this?
- Wily's product has market share
- Wily's product is well known
Truths gained from talking with Wily customers at the JavaOne conference this year.
- Wily's console product is probably the worst UI on the particular market, though I have seen some contenders for this coveted title. It does not work and provides little support for communication across teams.
- Wily has not innovated in this space since its early entry. Visualization to Wily is poorly rendered large green and red circles.
- Wily's attempts at copying the competition's features have been pretty dismal.
- Wily has some very good sales people, as it's hard to understand how they have come to lead this industry.
Better products though not without their own issues:
- Sitraka PerformaSure (has its own identity crisis)
- J2EE Indepth - you will spend a large amount of time clicking a URL, and another URL and another URL before eventually getting to a SQL statement, and when you do you won't remember why and how you got there.
- JDBInsight has real transaction analysis and less overhead than any of the others. Best UI, but then again I would say that. It lacks bytecode instrumentation which is coming in 2004 with a new product, JEEPP, that covers the complete spectrum of Java specifications/technologies. Bear in mind that 80% of problems relate to database transaction management (apart from financial systems based on messaging).
Now for the challenge proposed by Wily:
Mike Java, I would like you to run SPECJAppserver2002 on WebLogic 8.1 on Solaris and/or Windows and report the overhead incurred, as well as provide an execution profile snapshot which can be downloaded along with your console to see how operations, architects and developers could identify performance issues. Please don't have us fill out some survey - show us what you have to offer in the way of analysis of a real world system instead of PetStore or MedRec. Even better would be to allow access to the management system so that we could confirm your production readiness. By the way, JInspired's JDBInsight ships with an execution profile snapshot of SPECJAppServer for WebLogic which can be downloaded easily from our website.
Looking forward to your proposed co-operation.
"Tune and Test with Insight"
Another product that was left off the list:
You left off Mercury Interactive's Topaz & Deep Diagnostics:
Topaz and Deep Diagnostics
Topaz comes with a web-based monitoring of all your services, including JMX, Java code (using instrumentation) and all other critical processes outside the J2EE sphere. For more in-depth diagnostics, you can use the Deep Diagnostics application (formerly Performant's Optibench), which uses lightweight bytecode instrumentation on production-level applications. The reporting features are extremely detailed and useful for diagnosing performance-related issues in a production environment.
This article paints a pretty grim picture. If people work on systems with this kind of downtime, they should look for a new software and system architect.
Also IMHO defining application downtime is a very complex issue. We rely a lot on external systems and have worked very hard from an architecture standpoint to make sure that if we have an external system go down on us (which happens all the time) it has minimal impact on other areas of the application.
Monitoring tools like Wily are very important. We originally had a home-grown JMX/ping type of application for monitoring the application. We recently got the budget to buy a tool, which we are currently integrating for the holiday season.
It's not that simple. Regardless of technology, the main source of downtime is human, whether it's operational problems, procedures, or application architecture (forward/backward compatibility etc). I just gave a keynote on continuous availability in China and very little of it was about J2EE/WebSphere.
A better investment would be in training, developing processes etc to improve availability. It's a culture thing. There are some incredibly available systems that are built using some pretty ordinary hardware, and it's due to knowing what the goals are and then putting a culture in place from the start to document, test, train, design processes etc to achieve that goal. Most of this is technology independent.
I don't think anyone worth his salt would say "availability" or any of the other "ilities" are concepts that are unique to J2EE projects.
These "ility" goals are indeed an incredibly important part of any project and need to be addressed on a cultural level.
Firstly I haven't seen the results of the entire survey so I may be talking out of my arse (but hey, when did even reading an article become a pre-requisite to posting an opinion).
I'm not sure how useful it is averaging this indiscriminately - clearly there may be apps in the sample where 4.5 hours of downtime (or even 16 hrs) is completely acceptable or even desirable - what would've been useful is a gap analysis of uptime to determine the difference between customers' desired uptime and achieved uptime for each of their apps - maybe the full survey results show this?
Maybe, for the business needs, the quoted availabilities are fine and in line with the level of investment they are prepared to put in versus the downtime cost.
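To make the gap analysis idea concrete, here is a minimal Java sketch. It converts an availability percentage into hours of downtime and compares a target against the achieved figure. Only the 88 percent figure comes from the article; the app names, targets and achieved numbers in `main` are made up for illustration, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UptimeGap {

    /** Hours of downtime over a period implied by an availability percentage. */
    static double downtimeHours(double availabilityPercent, double periodHours) {
        return periodHours * (1.0 - availabilityPercent / 100.0);
    }

    /** Positive result: the app misses its target by that many percentage points. */
    static double gap(double targetPercent, double achievedPercent) {
        return targetPercent - achievedPercent;
    }

    public static void main(String[] args) {
        double hoursPerYear = 365 * 24; // 8760 hours, assuming a 24x7 service

        // The survey's headline figure expressed as yearly downtime.
        System.out.printf("88%% availability = %.1f hours down per year%n",
                downtimeHours(88.0, hoursPerYear));

        // Hypothetical apps: {desired uptime %, achieved uptime %}. Invented numbers.
        Map<String, double[]> apps = new LinkedHashMap<>();
        apps.put("order-entry", new double[] {99.9, 97.5});
        apps.put("nightly-reporting", new double[] {85.0, 88.0});

        for (Map.Entry<String, double[]> e : apps.entrySet()) {
            double target = e.getValue()[0];
            double achieved = e.getValue()[1];
            System.out.printf("%s: target %.1f%%, achieved %.1f%%, gap %.1f points%n",
                    e.getKey(), target, achieved, gap(target, achieved));
        }
    }
}
```

With numbers like these, the second app beats its target even though its raw availability looks poor, which is the point about averaging across apps with different needs.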
They must have included javablogs.com as one of the apps in the study ;-)
Clustered JCache for Grid Computing!
blog-city is just as bad
The availability for the J2EE applications I write and support has been greater than 99%. I can't think of a single instance of downtime that was not related to a hardware failure or some sort of human error. In particular I support a JMX service that is under constant, heavy load from streaming real time data. That JMX service has not dropped a packet in over a year.
Just out of curiosity, how do you measure the availability of your systems? With some kind of management software? We use a home-grown "intelligent" ping.
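For anyone wondering what a ping-style measurement might look like, here is a rough Java sketch. It is not the poster's tool. It assumes availability is simply the share of successful probes, and the class name, method names and the fake probe in `main` are all hypothetical. The HTTP check uses the standard `java.net.HttpURLConnection` API.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.function.BooleanSupplier;

public class AvailabilityProbe {

    /** Returns true if the URL answers with a 2xx or 3xx status within the timeout. */
    static boolean httpUp(String url, int timeoutMillis) {
        try {
            HttpURLConnection c =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            c.setConnectTimeout(timeoutMillis);
            c.setReadTimeout(timeoutMillis);
            c.setRequestMethod("GET");
            int code = c.getResponseCode();
            c.disconnect();
            return code >= 200 && code < 400;
        } catch (IOException | RuntimeException e) {
            return false; // unreachable, timed out, or not an HTTP URL: count as down
        }
    }

    /** Runs the probe the given number of times and returns the percentage of successes. */
    static double availabilityPercent(BooleanSupplier probe, int samples) {
        if (samples <= 0) {
            throw new IllegalArgumentException("samples must be positive");
        }
        int up = 0;
        for (int i = 0; i < samples; i++) {
            if (probe.getAsBoolean()) {
                up++;
            }
        }
        return 100.0 * up / samples;
    }

    public static void main(String[] args) {
        // Fake probe so the sketch runs offline: every 10th check fails.
        int[] calls = {0};
        double pct = availabilityPercent(() -> ++calls[0] % 10 != 0, 100);
        System.out.println("Measured availability: " + pct + "%");
    }
}
```

A real monitor would run `httpUp` on a schedule against a real URL rather than in a tight loop, and a plain status check says nothing about whether the page content is correct.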
We've had enormous problems with availability, and the current figures are in the high nineties. The aim is a modest two nines, which we rarely reach.
The problems have to do with all the things mentioned in the post -- especially the very complicated environments. One of the main problems seems to be that our environments are over-consolidated and our service components over-centralized. Not to mention that our test/transition process sucks :)
I still don't understand why people build/buy products which are proprietary in nature when there are very well defined standards available.
I am referring to JMX for application management [http://java.sun.com/products/JavaManagement/]. When JMX is a standard defined just to do this, with a very extensible architecture, there are vendors like Wily trying to push some proprietary stuff, spending big marketing money (they even claim to do some stuff in JMX, of course a check mark item for them, I guess, to help their sales).
I think JMX is the way to go for app management, and the fact that JMX is going to be part of J2SE 1.5 will boost the standard. Say goodbye to these expensive million dollar solutions and look for an open standards-based approach. There are even some free implementations/tools available in JMX like...
Also there seem to be some commercial vendors like AdventNet with offerings at a reasonable price.
Folks....look at standards. If not, you'll have a tough time.
I think you should know well that JMX is not going to help resolve problems but merely indicate there is a problem. The JMX tooling I have seen does not even do any complex event pattern analysis to at least attempt to resolve and determine the cause of a performance problem.
What is availability and reliability to you, in the context of J2EE and JMX? Can you tell me how the current crop of JMX events (notifications) emitted from appservers will enable operations to determine why particular users are reporting performance degradation leading to non-availability (from the user perspective)?
I like JMX and currently have a prototype console in the works but I really cannot see it helping problem resolution for ALL cases that I see on customer sites.
Operations do care about standards but at the same time they have a job to do and staring at a black box is not always the most efficient way to work.
"Tune and Test with Insight"
IMHO JMX is a standard for enabling application management. Hence we can build products/applications and provide manageability around this standard.
WebLogic, JBoss etc. expose interesting performance stats through JMX: some container stats, runtime stats for EJBs and so on. Third-party vendors can provide more visibility into this JMX data via their consoles.
Sorry, I did not have a look at the report mentioned in this posting, but I have a question.
Do people really want to know in-depth EJB and servlet statistics (method level performance etc) of a J2EE application running in production, or will they be happy with just a URL-based check for application availability and response time? This is something many vendors talk about.
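As a rough illustration of reading stats over JMX, here is a small Java sketch. It does not use any WebLogic or JBoss MBeans, since their object names are not given in this thread. Instead it reads two standard JVM platform MBeans (`java.lang:type=Threading` and `java.lang:type=Memory`) from the platform MBean server. The class and method names are made up for the example.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class JmxStatsReader {

    /** Reads the live thread count from the JVM's Threading MBean. */
    static int threadCount(MBeanServer server) {
        try {
            ObjectName name = new ObjectName("java.lang:type=Threading");
            return (Integer) server.getAttribute(name, "ThreadCount");
        } catch (JMException e) {
            throw new IllegalStateException("could not read ThreadCount", e);
        }
    }

    /** Reads used heap bytes; HeapMemoryUsage is exposed as CompositeData. */
    static long heapUsedBytes(MBeanServer server) {
        try {
            ObjectName name = new ObjectName("java.lang:type=Memory");
            CompositeData usage =
                    (CompositeData) server.getAttribute(name, "HeapMemoryUsage");
            return (Long) usage.get("used");
        } catch (JMException e) {
            throw new IllegalStateException("could not read HeapMemoryUsage", e);
        }
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println("Threads:   " + threadCount(server));
        System.out.println("Heap used: " + heapUsedBytes(server) + " bytes");
    }
}
```

A console does essentially the same thing over a remote connector, with the vendor's own object names in place of the JVM ones.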
People talk about application management in different perspectives. Any insight into this ?
Thanks & Regards
There are 2 phases here. 1) Instrumenting the application to send notifications to existing consoles like OpenView or Tivoli etc., so that the existing consoles can be re-used.
2) The consoles themselves.
Wily combines both technologies under one hood. Most of the commercial consoles have fault management modules, correlation engines etc by default. The promise of JMX is that these consoles can be re-used. So any enterprise can re-use its existing console infrastructure.
Now, we are assuming that console infrastructure is already available. What matters then is how deep you want to go in managing app infrastructure like EJBs, servlets etc. To do that, what we need is a developer tool that can automate the process of generating JMX MBean instrumentation on the EJBs and associated infrastructure like servlets/log files/DB monitoring etc. Once you have the MBeans you can apply any protocol adapter like SNMP/HTTP to access these MBeans. JMX clearly specifies this kind of architecture.
So for your application, do the following
1) Generate the MBean and the SNMP/HTTP adapter.
2) Integrate with any console.
3) Follow the console's admin guide to automate the correlation logic for incoming notifications.
4) If the notifications occur, configure actions to rectify the problem by executing config tasks from the console on the server infrastructure, using the generated SNMP/HTTP adapters.
There is no shortcut here, nor any mantra. You have to follow these steps to get what you want. If you believe someone else has already done it for your specific application, they are cheating you.
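Step 1 above, written by hand rather than generated, might look roughly like the Java sketch below. It only covers defining a standard MBean and registering it with the JVM's platform MBean server. The SNMP/HTTP adapters and the console integration are not shown, since those come from vendor tooling. The names `JmxSketch`, `RequestStats` and the `com.example.sketch` domain are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxSketch {

    /** Management interface: a standard MBean interface is the class name plus "MBean". */
    public interface RequestStatsMBean {
        long getRequestCount();
        long getErrorCount();
        void reset();
    }

    /** The instrumented component; application code calls record() per request. */
    public static class RequestStats implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();
        private final AtomicLong errors = new AtomicLong();

        public void record(boolean ok) {
            requests.incrementAndGet();
            if (!ok) {
                errors.incrementAndGet();
            }
        }

        @Override public long getRequestCount() { return requests.get(); }
        @Override public long getErrorCount() { return errors.get(); }
        @Override public void reset() { requests.set(0); errors.set(0); }
    }

    /** Registers a fresh RequestStats MBean under the given object name. */
    static RequestStats register(String objectName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name); // allow re-running in the same JVM
            }
            RequestStats stats = new RequestStats();
            server.registerMBean(stats, name);
            return stats;
        } catch (JMException e) {
            throw new IllegalStateException("MBean registration failed", e);
        }
    }

    /** Reads a long-valued attribute the way a management client would. */
    static long readAttribute(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return (Long) server.getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            throw new IllegalStateException("could not read " + attribute, e);
        }
    }

    public static void main(String[] args) {
        String name = "com.example.sketch:type=RequestStats";
        RequestStats stats = register(name);
        stats.record(true);
        stats.record(false);
        System.out.println("RequestCount via JMX: " + readAttribute(name, "RequestCount"));
        System.out.println("ErrorCount via JMX:   " + readAttribute(name, "ErrorCount"));
    }
}
```

Once an MBean like this is registered, any JMX-aware console or adapter can read its attributes or invoke `reset` without further changes to the application code.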
Why bother with all your steps? We have a prototype of a VM agent and AspectJ tool that can automatically generate an MBean for any Java class, especially common classifications - EJBs, servlets, etc. I think you will find that though it will help identify a problem, it will not necessarily resolve the root cause.
I have yet to see user code (EJB) directly cause a performance bottleneck in a J2EE system in pre-production or production stages. If you look at the call stack for any remote invocation on an EJB component you will quickly realize that engineering effort has to be spent in minimising container or resource code execution (non-user code). It's about architectural, component, and resource interaction design.
Though it could be possible to have JMX provide more detailed monitoring and contextual execution information, I do not think this is what its intended usage is.
"Tune and Test with Insight"
William: Suyo, Why bother with all your steps?
Without judging whether or not the steps described are the best approach, I think that Suyo works for AdventNet and they make a product that automates those steps, which I believe answers your question as to "why bother".
Clustered JCache for Grid Computing!
Some things I do to address "issues with configuration and tuning, architecture, database connections, design problems, memory leaks, capacity planning miscalculations, Java virtual machine issues...":
I use a container DataSource via JNDI (so no connection issues possible), a simple MVC implementation, the JRockit VM, and fast x86-based hardware such as AMD to mitigate these issues.
For monitoring I use JaMon (OSS)
"I blame it on the fact that we have shielded the developer from reality without giving them a direct contact with the reality of the platform"
"I blame platform independence for poor performance"
I suppose the solution is to pick a platform and develop for it. Hmm. What if the platform doesn't perform? What if performance requirements change?
While the issues discussed are legitimate and kind of interesting, I don't really see how the criticisms made are specific to J2EE applications. Whether your application is built in .NET, Delphi or Java, you're almost always going to have reliability issues with enterprise applications.
One major issue is that many enterprise apps are chained together such that failures in service in one system will cascade to other downstream applications. Is this J2EE specific? Maybe: Java apps are often being built as more manageable front-ends to legacy systems that are a nightmare to maintain. Is this Java's fault?
Another interesting point about the 88% reliability: the article doesn't specify that the applications are intended to be run 24x7. If that's not the SLA then a little downtime is probably acceptable.
As far as execs and customers getting systems that aren't what they envisioned, isn't that the responsibility of the development team management and the business liaisons? Until the day comes that you can speak the requirements into the system and the application magically pops out, I hardly think that you can place any blame on the platform for generating systems that don't meet customer expectations.
I've never used Wily's app, but based on the comments here I'll never recommend it to any of my customers and colleagues. Looks like this little publicity stunt kinda backfired.
> One major issue is that many enterprise apps are chained together such that failures in service in one system will cascade to other downstream applications. Is this J2EE specific? Maybe: Java apps are often being built as more manageable front-ends to legacy systems that are a nightmare to maintain. Is this Java's fault?
None of this is J2EE/Java specific. It's just that Wily plays in the J2EE space and therefore wanted to make a point about how people should be implementing monitoring on the application servers. I personally think the survey was aimed more at CTOs/CEOs in their ivory towers than at architects/developers, basically trying to scare them into spending some money.
Their VP of marketing's post is just plain funny. It has the same old lack of substance you get from any marketing group. Not very impressive on a technology-oriented message board.
As for the more important discussion on availability/reliability with respect to a system's SLA targets, if things are really that bad in the J2EE space compared to non-J2EE projects, maybe we need to think of better training for J2EE architects.
Finally, in terms of monitoring, we recently purchased and are currently implementing http://www.panacya.com