Jun 1st, 2010, 05:08 PM
Problem while reading 1.5 million records from DB with Spring Batch ItemReader
My requirement is to read 1.5 million records from a database (Oracle 10g), process them, and write them to a flat file. I was using Spring Batch 2.0.3.
I developed a custom stored procedure with a REF cursor as the OUT parameter. Before the step, I executed the stored procedure, which returns a Map<String, ArrayList<StudentRec>>. I configured my CustomItemReader (which is a ListItemReader) to read the ArrayList<StudentRec>. This approach works well when the data volume is 100,000 records.
When I ran the job with 1.5 million records, it failed with a Java heap space error. I increased the heap size to 2 GB, but no luck.
I switched to Spring Batch 2.1.1 and used StoredProcedureItemReader for reading the data, again with the REF cursor as the OUT parameter. Even with this approach the job was not able to handle 1.5 million records.
Can you please suggest the best approach to handle this problem in Spring Batch?
Thanks in Advance.
Jun 2nd, 2010, 12:44 AM
Did you try setting the fetch size on the reader? It won't necessarily help (it depends on your JDBC driver, not on Spring Batch). If it doesn't, you will need to take a paging approach instead of a cursor-based one.
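As a minimal sketch of the fetch-size suggestion: StoredProcedureItemReader inherits setFetchSize from the cursor-based reader base class, which passes the value through as a JDBC hint. The procedure name, parameter name, table contents, StudentRec type, and the dataSource/studentRowMapper references below are my own illustrative assumptions, not from the original post.

```java
// Cursor-based reading with an explicit fetch size.
// SP_GET_STUDENTS, StudentRec, dataSource and studentRowMapper are
// hypothetical names used for illustration only.
StoredProcedureItemReader<StudentRec> reader = new StoredProcedureItemReader<>();
reader.setDataSource(dataSource);
reader.setProcedureName("SP_GET_STUDENTS");
reader.setParameters(new SqlParameter[] {
    new SqlOutParameter("p_cursor", OracleTypes.CURSOR)
});
reader.setRefCursorPosition(1); // position of the REF cursor OUT parameter
reader.setRowMapper(studentRowMapper);
// Hint to the JDBC driver to stream rows in batches of 1000 per round
// trip instead of buffering the whole result set. Whether this actually
// bounds memory depends on the driver, as noted above.
reader.setFetchSize(1000);
```

Configuration fragment only; it assumes a live Oracle DataSource and the Spring Batch and Oracle JDBC jars on the classpath.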
Jun 2nd, 2010, 02:09 AM
I have similar scenarios, and they are working well for me.
I'm using an org.springframework.batch.item.database.StoredProcedureItemReader as the reader.
By controlling the commit-interval I can control the number of buffered read objects.
The problem could be a memory leak in the custom reader (the reader must release the buffered objects after the write, i.e. instantiate a new Map for every new chunk).
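To illustrate the leak pattern being described: a reader that keeps the full result list alive for the whole job pins every record in memory, while one that drains its buffer as it hands items out lets processed records become garbage-collectable. This is a plain-Java sketch of the idea, not Spring Batch code; DrainingReader and the row values are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Sketch: a reader that removes each item from its buffer as it is
// read, so references to already-processed records are not retained.
class DrainingReader {
    private final Deque<String> buffer;

    DrainingReader(List<String> source) {
        this.buffer = new ArrayDeque<>(source);
    }

    // Returns the next item, or null when exhausted
    // (the same end-of-input convention Spring Batch readers use).
    String read() {
        return buffer.pollFirst();
    }

    int buffered() {
        return buffer.size();
    }
}

public class Demo {
    public static void main(String[] args) {
        DrainingReader reader = new DrainingReader(List.of("row0", "row1", "row2"));
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        System.out.println(count);            // 3
        System.out.println(reader.buffered()); // 0 -- nothing retained afterwards
    }
}
```

The same principle applies to the Map returned by the stored procedure: once a chunk has been written, drop the reference so the garbage collector can reclaim it.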
Jun 3rd, 2010, 09:02 AM
I tried changing the fetch size and it does not help. I will try the paging approach.
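For what it's worth, a paging reader over the same data might look like the sketch below, assuming the rows can also be queried directly from a table with a unique sort key (rather than only through the stored procedure). The STUDENTS table, its columns, StudentRec, dataSource, and studentRowMapper are all illustrative assumptions.

```java
// Hypothetical paging setup: only one page of rows is held in memory
// at a time, and each page is fetched with a fresh query.
SqlPagingQueryProviderFactoryBean providerFactory = new SqlPagingQueryProviderFactoryBean();
providerFactory.setDataSource(dataSource);
providerFactory.setSelectClause("SELECT STUDENT_ID, NAME, GRADE");
providerFactory.setFromClause("FROM STUDENTS");
providerFactory.setSortKey("STUDENT_ID"); // must be unique for stable paging

JdbcPagingItemReader<StudentRec> reader = new JdbcPagingItemReader<>();
reader.setDataSource(dataSource);
reader.setQueryProvider(providerFactory.getObject());
reader.setRowMapper(studentRowMapper);
reader.setPageSize(1000);
```

Configuration fragment only; it requires a live DataSource and the Spring Batch jars, and getObject() declares a checked exception that the surrounding configuration would have to handle.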