Discussions

EJB design: Reading Large Data files into Arrays

  1. Reading Large Data files into Arrays (1 message)

    I have a file with about 7000+ lines. I want to read it into a collection of objects to process it. What is the best way to handle this?
    How is performance affected by creating so many objects?
    Is there a way to search through files quickly without loading them into a collection and then searching the collection?
    Thanks a lot
    -Prash
  2. Reading Large Data files into Arrays

    If the data in the file is not interdependent, the best way to process it is line by line. That way, you never have more than one line in memory at a time.
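
    For example, here is a minimal sketch using java.io.BufferedReader (the file name "data.txt" and the process() method are placeholders for your own file and logic):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class LineProcessor {
        public static void main(String[] args) throws IOException {
            BufferedReader in = new BufferedReader(new FileReader("data.txt"));
            try {
                String line;
                // readLine() returns null at end of file, so only one
                // line is ever held in memory at a time
                while ((line = in.readLine()) != null) {
                    process(line);
                }
            } finally {
                in.close();
            }
        }

        private static void process(String line) {
            // replace with your real per-line logic
            System.out.println(line);
        }
    }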

    If you want to locate specific data within the file, you can again read one line at a time, search each line for a keyword, and process only the lines that match.
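
    For instance, the read loop above only needs one extra check (the keyword is a made-up example):

    String keyword = "ORDER-42"; // placeholder search term
    String line;
    while ((line = in.readLine()) != null) {
        // indexOf() returns -1 when the keyword is absent, so
        // non-matching lines are discarded immediately
        if (line.indexOf(keyword) != -1) {
            process(line);
        }
    }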

    All of this assumes some sort of batch operation. If the data is interdependent, or you need to keep it in memory, you are probably best off caching it as an array or collection of strings or parsed objects.
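
    A rough sketch of the caching approach, loading every line into an ArrayList so it can be searched repeatedly without re-reading the file:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;
    import java.util.List;

    // Read the file once and keep all 7000+ lines in memory.
    List lines = new ArrayList();
    BufferedReader in = new BufferedReader(new FileReader("data.txt"));
    try {
        String line;
        while ((line = in.readLine()) != null) {
            lines.add(line); // or wrap each line in a parsed value object
        }
    } finally {
        in.close();
    }
    // lines can now be scanned as often as needed without touching the disk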

    The memory usage for strings is roughly 2 bytes per character (Java strings are 16-bit Unicode) x 7000 lines x the average line length. Assuming 80-character lines, that is 2 x 7000 x 80 = about 1.1 MB, which is not too bad. Parsed objects will probably take up a similar amount of memory.

    If you process the data frequently, you might be better off copying it into a database and using SQL for lookups. There are plenty of free databases available (e.g. MySQL).
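
    A rough JDBC sketch of that idea; the table, column, and connection details are all invented for illustration, and you would first create something like CREATE TABLE lines (content VARCHAR(255)):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    Class.forName("com.mysql.jdbc.Driver"); // register the MySQL driver
    Connection con = DriverManager.getConnection(
            "jdbc:mysql://localhost/mydb", "user", "password");

    // Load the file once: insert each line as a row.
    PreparedStatement insert =
            con.prepareStatement("INSERT INTO lines (content) VALUES (?)");
    // ... call insert.setString(1, line) and insert.executeUpdate()
    //     inside the same read loop shown earlier ...

    // Then let the database do the searching.
    PreparedStatement query = con.prepareStatement(
            "SELECT content FROM lines WHERE content LIKE ?");
    query.setString(1, "%keyword%"); // placeholder pattern
    ResultSet rs = query.executeQuery();
    while (rs.next()) {
        process(rs.getString(1));
    }
    con.close();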