Performance and scalability: Performance issue with retrieval of large chunks of data

  1. Guys, I would like to get some opinions on a performance issue we are facing.

     1. We have a 2/3-tier J2EE application that uses Struts, SLSBs, and plain JDBC (wrapped in DAOs) on JBoss 4.0.4 on Windows to extract data.
     2. A few reports require at least 2 to 3 years of transaction data, and we add 300-500 records to the DB every day. When we run these reports we get an out-of-memory error. From what we can see, it happens most often at the JDBC layer.

     Questions:

     1. First, how do you find the bottleneck in a system like this? Are there any specific tools you would recommend?
     2. How would you change or restructure the DAOs in question so that they scale (fetching 10,000 records or more)?
     3. Is there any other approach, such as replacing the JDBC wrapper DAOs with Hibernate, or adding caching?

     Any suggestions are welcome. Cheers, Vishal http://startups.sharmavishal.com
  2. Large data retrieval

    Hi, are you retrieving and displaying the complete result set in one go? That would surely result in an OutOfMemoryError. You need to implement some sort of paging technique for large reports.
  3. Re: Large data retrieval

    Thanks for that. Are there any frameworks that can be used in that layer for this scenario, or should we implement paging on our own? Any suggestions?
  4. Memory arguments

    What are the memory arguments that are set for the JVM in JBoss?
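One way to answer this from inside the running server is to ask the JVM itself. The following is only a minimal sketch, assuming Java 5 or later (for `java.lang.management`); the class name `JvmMemoryInfo` is made up for illustration and is not part of JBoss.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: reports the heap limit and startup flags of the current JVM.
class JvmMemoryInfo {

    // Arguments passed to the JVM at startup, e.g. -Xms / -Xmx if they were set.
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Maximum amount of heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // One-line summary suitable for a log statement.
    static String describe() {
        return "max heap = " + maxHeapMegabytes() + " MB, JVM args = " + jvmArguments();
    }
}
```

Logging `JvmMemoryInfo.describe()` once at startup would show whether the heap settings you think you configured are the ones actually in effect.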
  5. Performance problems are tricky... Before you start changing pieces of your application or trying to optimize, I would STRONGLY SUGGEST that you analyze the application to see where the memory is really going, and then decide what you can do about it. Sometimes the fix is as simple as changing your garbage collection algorithm, or finding some unknown object that is keeping a reference to another object.

     Most JDBC drivers can go through millions or billions of rows without getting an out-of-memory error, so most likely it is something in your code. A paging scheme may help the most, but you are better off identifying the problem first. Java Persistence API (JPA) implementations such as Oracle TopLink Essentials, EclipseLink, OpenJPA, JPOX, and Hibernate are great tools and may improve the issue, if you are lucky enough for the problem to be in the JDBC domain and the light layer above it.

     I suggest using the free memory profiler that is part of the NetBeans IDE. I know, 90% of developers use Eclipse, but I suggest using NetBeans for this purpose. You can switch back when you are done. It takes about half a day to learn how to use it (tutorials), about 2 hours to configure a J(2)EE project (assuming you are new to NetBeans), and within a day you can put your finger directly on the problem.

     If it is a problem with the JDBC driver, you can send it to the vendor (but I would not think that is the case). If it is your code, you can then make a better decision on how to fix the problem. Run the profiler again and verify that the problem is resolved.

     If you need some real help with profiling and optimizing your applications, there are professionals who specialize in this. They are worth their weight in gold if you have serious problems, because they are also good at refactoring solutions. You will find them when you need them.

     Best of luck, Tony McClay Sun Certified Enterprise Architect JEE 5 (SCEA CS-310-052) Sun Certified Business Components Developer JEE5 (SCBCD CX-310-091) Sun Certified Web Components Developer (SCWCD CX-310-081) Sun Certified Java Programmer 5 (SCJP CX-310-056)
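To illustrate the point that a JDBC driver can walk a very large result without exhausting memory as long as the calling code does not hold on to every row, here is a rough sketch. It assumes Java 7 or later and a hypothetical `transactions` table with an `amount` column; the class and method names are invented for this example. `setFetchSize` is only a hint, and whether a given driver honors it depends on the driver.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical report DAO: aggregates while iterating instead of building a List of rows.
class StreamingReport {

    // Running totals; memory use stays constant regardless of the row count.
    static final class Totals {
        long rows;
        BigDecimal sum = BigDecimal.ZERO;

        void add(BigDecimal amount) {
            rows++;
            if (amount != null) {          // SQL NULL comes back as a Java null
                sum = sum.add(amount);
            }
        }
    }

    // Table and column names are assumptions, not taken from the original application.
    static Totals summarize(Connection con) throws SQLException {
        Totals totals = new Totals();
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT amount FROM transactions",
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(500);          // hint: fetch rows from the server in chunks
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    totals.add(rs.getBigDecimal(1));   // process the row, then let it go
                }
            }
        }
        return totals;
    }
}
```

If a DAO instead copies every row into a collection before the report is built, the heap has to hold the entire result at once, which is a common source of the error described in this thread.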
  6. Hi Vishal; first of all, I have an argument! At any given point in time a result set should not be dealing with more than 100 active records, for efficiency's sake, whether for reporting or any other purpose. I am a firm advocate of paging data: there is nothing spectacular to be gained by loading 10000 records in a single query versus paging the 10000 into smaller chunks, say batches of 1000, and processing them. Paging the data can make things much easier, because valuable system resources are better used for business processing than for holding the data itself. Fetching 10000 records or more in a single fetch is beyond my imagination; I would rather page the query and process it in batches. I still don't understand the full scope of your situation, but I am more than happy to use paging for reporting or any other purpose where I deal with potentially more than 1000 records. Another way is to have a stored procedure with a cursor do the processing that involves the tens of thousands of records, and just send the necessary information back to your Java layer.
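The batch approach described in this post could be sketched roughly as follows. This is not taken from the original application: the `transactions` table, its `id` and `amount_cents` columns, and all class and method names are assumptions for illustration. It pages by key (`id > ?`) and caps each page with the standard JDBC `setMaxRows`, so it does not depend on a vendor-specific LIMIT syntax. Java 8 or later is assumed.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch reader: only one page of rows is held in memory at a time.
class PagedProcessor {

    static final class Txn {
        final long id;
        final long amountCents;
        Txn(long id, long amountCents) { this.id = id; this.amountCents = amountCents; }
    }

    // Returns up to `limit` rows with id > afterId, ordered by id.
    interface PageFetcher { List<Txn> fetch(long afterId, int limit); }

    interface BatchHandler { void handle(List<Txn> batch); }

    // Fetches page after page until the source is exhausted; returns the row count.
    static int processAll(PageFetcher fetcher, int batchSize, BatchHandler handler) {
        long lastId = Long.MIN_VALUE;
        int total = 0;
        while (true) {
            List<Txn> page = fetcher.fetch(lastId, batchSize);
            if (page.isEmpty()) break;
            handler.handle(page);
            total += page.size();
            lastId = page.get(page.size() - 1).id;   // next page starts after this key
            if (page.size() < batchSize) break;      // short page means no more rows
        }
        return total;
    }

    // JDBC-backed fetcher; table and column names are assumptions.
    static PageFetcher jdbcFetcher(Connection con) {
        return (afterId, limit) -> {
            String sql = "SELECT id, amount_cents FROM transactions WHERE id > ? ORDER BY id";
            try (PreparedStatement ps = con.prepareStatement(sql)) {
                ps.setMaxRows(limit);                // cap the rows returned for this page
                ps.setLong(1, afterId);
                try (ResultSet rs = ps.executeQuery()) {
                    List<Txn> page = new ArrayList<>();
                    while (rs.next()) page.add(new Txn(rs.getLong(1), rs.getLong(2)));
                    return page;
                }
            } catch (SQLException e) {
                throw new IllegalStateException("page query failed", e);
            }
        };
    }
}
```

A report would call something like `PagedProcessor.processAll(PagedProcessor.jdbcFetcher(con), 1000, batch -> ...)` and update its totals inside the handler, so each batch of 1000 can be garbage collected before the next one is fetched.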
  7. 1. There are different types of out-of-memory error. I think you mean the heap-space variant here (java.lang.OutOfMemoryError: Java heap space). For this type of error, increasing the maximum heap size is just a temporary solution. I suggest you first capture a heap dump by adding "-XX:+HeapDumpOnOutOfMemoryError" to the JVM options of your deployed Java application, and then use SAP Memory Analyzer to investigate it.

     2. I am not very sure about your reporting case, but I think retrieving a large number of rows from the DB and then doing the calculation in your DAO layer is not a good approach. Use a stored procedure to generate your report instead, and return the output or a result cursor to the DAO layer.

     3. If you insist on generating reports in the DAO layer, you can split your data into pieces by month: generate sub-reports for Jan, Feb, and so on, store them in local storage, release the memory after each one, and finally build your overall report from these pieces. Use this only if you have no other choice.
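The month-by-month split from the last suggestion could start from a helper like the one below. It is only a sketch under assumptions: Java 8 or later (for `java.time`), and the class and method names are invented here. It only computes the date windows; the query, the storage of each sub-report, and the final merge are left out.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts a reporting period into calendar-month windows.
class MonthlyChunks {

    // Half-open window [from, toExclusive), convenient for "date >= ? AND date < ?".
    static final class Range {
        final LocalDate from;
        final LocalDate toExclusive;
        Range(LocalDate from, LocalDate toExclusive) {
            this.from = from;
            this.toExclusive = toExclusive;
        }
    }

    // Splits [from, toExclusive) at month boundaries; returns an empty list if the period is empty.
    static List<Range> split(LocalDate from, LocalDate toExclusive) {
        List<Range> out = new ArrayList<>();
        LocalDate cursor = from;
        while (cursor.isBefore(toExclusive)) {
            LocalDate nextMonth = YearMonth.from(cursor).plusMonths(1).atDay(1);
            LocalDate end = nextMonth.isBefore(toExclusive) ? nextMonth : toExclusive;
            out.add(new Range(cursor, end));
            cursor = end;
        }
        return out;
    }
}
```

Each `Range` would be bound to one query, its sub-report written to local storage, and the rows dropped before the next month is processed, so a three-year report becomes 36 small queries rather than one large one.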