Large Data Set Retrieval using Entity Beans



  1. Large Data Set Retrieval using Entity Beans (2 messages)

    I'm new to this site and to EJB, but experienced in perf & scalability issues, Java programming, and server-side programming in general. This is the first of probably a bunch of questions I am going to ask to quickly get up to speed on assessing the use of EJBs for a project.

    This question is about large data set retrieval. An example explains it best. Say I have a UserProfile entity bean. Most of the time I want to retrieve a single user profile and perform some application-specific operations on it. No problem. Every once in a while, though, some major new thing happens at the site, and they want to scan through the whole set of users (programmatically) and perhaps change a few attributes for each user.

    Now, I notice that the find methods of entity beans (CMP, since I want to reduce data access code) return either single objects or collections. Here is my question: if I had a find method that returned ALL user profiles (so I would use the Collection return type), and there were, say, a million of them in the database, what would happen? Would all million user profiles first be loaded into memory, bloating and probably killing the server, not to mention the huge load time? Or would the user profiles be streamed to me, e.g. in the form of an iterator that extracts them on demand? Obviously I prefer the latter option. I just don't see an easy way to have such "streaming" methods in CMP entity beans. Please enlighten.

  2. Forget Entity Beans; you're best off using a Session Bean and calling JDBC directly. You need to put together a paging mechanism as well; the Value List Handler pattern will help. Start here...

    You might want to look at the DAO pattern as well.
  3. See this thread.
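
The paging approach suggested in reply 2 could look roughly like the sketch below. It is only an illustration of the Value List Handler idea, not code from any of the linked material: the `PageFetcher` interface, the `ValueListHandler` and `UserProfileIdFetcher` classes, and the `user_profile` table with its `user_id` column are all hypothetical names, and the `LIMIT ... OFFSET` clause is an assumption about the database's SQL dialect. A session bean would hold (or create) the handler and return one page per call, so only `pageSize` rows are in memory at a time.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Fetches one page of rows. Hypothetical interface for this sketch. */
interface PageFetcher<T> {
    List<T> fetch(int offset, int limit);
}

/** Minimal Value List Handler: hands out the result set one page at a time. */
class ValueListHandler<T> {
    private final PageFetcher<T> fetcher;
    private final int pageSize;
    private int offset = 0;
    private boolean exhausted = false;

    ValueListHandler(PageFetcher<T> fetcher, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    /** Returns the next page, or an empty list once all rows have been read. */
    List<T> nextPage() {
        if (exhausted) {
            return new ArrayList<>();
        }
        List<T> page = fetcher.fetch(offset, pageSize);
        offset += page.size();
        // A short (or empty) page means there is nothing left to read.
        if (page.size() < pageSize) {
            exhausted = true;
        }
        return page;
    }
}

/** JDBC-backed fetcher. Table and column names are assumptions. */
class UserProfileIdFetcher implements PageFetcher<Long> {
    private final Connection conn;

    UserProfileIdFetcher(Connection conn) {
        this.conn = conn;
    }

    @Override
    public List<Long> fetch(int offset, int limit) {
        // LIMIT/OFFSET is not supported by every database; adjust to the dialect in use.
        String sql = "SELECT user_id FROM user_profile ORDER BY user_id LIMIT ? OFFSET ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, limit);
            ps.setInt(2, offset);
            try (ResultSet rs = ps.executeQuery()) {
                List<Long> ids = new ArrayList<>();
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
                return ids;
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Failed to fetch page at offset " + offset, e);
        }
    }
}
```

A caller would loop on `nextPage()` until it returns an empty list, updating each batch of users before asking for the next one. Offset-based paging gets slower as the offset grows, so for a scan over a million rows it may be worth paging on the key instead (`WHERE user_id > ?` with the last ID seen); that variation is left out to keep the sketch short.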