I need to read a huge XML file (100+ MB), parse it with StAX record by record (the record boundaries are defined by the business logic) into an intermediate data structure (a HashMap), and then write the contents of that structure to a file.
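To make the question concrete, here is a minimal single-threaded sketch of that flow. The element name `record`, the flat one-level field layout, and the one-line-per-record output are placeholders for illustration only; the real boundaries and output format come from the business logic.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class RecordStreamer {
    // Hypothetical record boundary; the real one is defined by the business logic.
    static final String RECORD_ELEMENT = "record";

    /** Streams records from xml to out, one line per record, and returns the record count. */
    static int convert(Reader xml, Writer out) throws XMLStreamException, IOException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        Map<String, String> record = new LinkedHashMap<>(); // reused, so only one record is held at a time
        StringBuilder text = new StringBuilder();
        String currentField = null;
        boolean inRecord = false;
        int count = 0;
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (RECORD_ELEMENT.equals(reader.getLocalName())) {
                            inRecord = true;
                            record.clear();
                        } else if (inRecord) {
                            // Assumes flat records: each child element is one field.
                            currentField = reader.getLocalName();
                            text.setLength(0);
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        if (currentField != null) {
                            text.append(reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (RECORD_ELEMENT.equals(reader.getLocalName())) {
                            // Placeholder output format: the map's toString() on its own line.
                            out.write(record.toString());
                            out.write('\n');
                            count++;
                            inRecord = false;
                        } else if (currentField != null && currentField.equals(reader.getLocalName())) {
                            record.put(currentField, text.toString());
                            currentField = null;
                        }
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }
}
```

Because each record is written as soon as its closing tag is seen, memory use in this version should depend on the size of one record rather than on the size of the file.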
What would be the optimal approach?
Should I use an array of HashMaps as the intermediate structure, with one thread parsing records from the XML into the structure and another thread reading from the structure and writing to the file? The constraint is that the method doing this work can return only once all the data in the XML has been written to the file. The method is invoked from a web application, and I cannot move it to a background job for the time being.
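A rough sketch of the two-thread idea, using a bounded `BlockingQueue` in place of an array of HashMaps (my substitution, not something I have settled on). The class name `PipelinedWriter`, the `Iterator` standing in for the StAX parser, and the `capacity` parameter are all hypothetical. The calling thread acts as the producer and joins the writer thread before returning, so the method still returns only after everything has been written.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class PipelinedWriter {
    // Sentinel ("poison pill") that tells the writer thread there are no more records.
    private static final Map<String, String> END = new HashMap<>();

    /** Writes every record to out on a second thread; returns the count once all are written. */
    static int writeAll(Iterator<Map<String, String>> records, Writer out, int capacity)
            throws InterruptedException, IOException {
        // Bounded queue: the parser blocks when the writer falls behind, capping memory use.
        BlockingQueue<Map<String, String>> queue = new ArrayBlockingQueue<>(capacity);
        int[] written = {0};
        IOException[] failure = {null};

        Thread writer = new Thread(() -> {
            try {
                for (Map<String, String> r = queue.take(); r != END; r = queue.take()) {
                    if (failure[0] != null) {
                        continue; // keep draining after an error so the producer never blocks forever
                    }
                    try {
                        out.write(r.toString());
                        out.write('\n');
                        written[0]++;
                    } catch (IOException e) {
                        failure[0] = e;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        try {
            while (records.hasNext()) {
                queue.put(records.next()); // in the real code this would be the StAX loop
            }
        } finally {
            queue.put(END);  // always release the writer thread
            writer.join();   // block the caller until the file is fully written
        }
        if (failure[0] != null) {
            throw failure[0];
        }
        return written[0];
    }
}
```

Each record must be a fresh map here, since the parser can no longer reuse one instance once another thread holds a reference to it.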
Further, should I use memory-mapped files (java.nio) for reading the XML file and writing the output file?
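For the reading side, this is roughly what I imagine the memory-mapped variant would look like. `MappedInput` and `ByteBufferInputStream` are hypothetical names of my own; the wrapper exists only because StAX wants an `InputStream` rather than a `ByteBuffer`. Whether this gains anything over a plain buffered stream for a purely sequential read is exactly what I am unsure about.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedInput {
    /** Maps the whole file read-only and exposes it as an InputStream. */
    static InputStream open(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes; fine for ~100 MB.
            // The mapping remains valid after the channel is closed.
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return new ByteBufferInputStream(buffer);
        }
    }

    /** Minimal InputStream view over a ByteBuffer. */
    static final class ByteBufferInputStream extends InputStream {
        private final ByteBuffer buffer;

        ByteBufferInputStream(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        @Override
        public int read() {
            return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] dst, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (!buffer.hasRemaining()) {
                return -1;
            }
            int n = Math.min(len, buffer.remaining());
            buffer.get(dst, off, n);
            return n;
        }
    }
}
```

The returned stream could then be passed to `XMLInputFactory.createXMLStreamReader(InputStream)`.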
Is there a way to monitor memory usage before invoking this method, midway through it, and when it finishes?
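The only approach I know of is to sample the JVM heap from inside the process, along these lines. `MemoryProbe` and its method names are my own; the underlying calls are the standard `Runtime` and `MemoryMXBean` APIs. The numbers include garbage that has not been collected yet, so I assume they are only a rough indication.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class MemoryProbe {
    /** Heap currently in use, in bytes, according to Runtime. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One-line heap snapshot with a caller-supplied label such as "before" or "midway". */
    static String snapshot(String label) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum is undefined.
        return String.format("%s: used=%d KB, committed=%d KB, max=%d KB",
                label, heap.getUsed() / 1024, heap.getCommitted() / 1024, heap.getMax() / 1024);
    }
}
```

I would log `MemoryProbe.snapshot("before")` ahead of the call, `snapshot("midway")` every so many records inside the parse loop, and `snapshot("after")` once the method returns.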