Splitting large files into smaller files
Hi,
I am using Spring Integration along with Spring Batch to process batch files. Spring Integration polls a directory and, if a file matching a particular name pattern is found, uses spring-batch-integration and the JobLauncher to launch a job. We have been facing issues with Spring Batch when processing huge files of more than 5.5 million rows; such a job runs for about 9 hours. To improve performance, we decided to use partitioning in Spring Batch and split the incoming file into smaller chunks that can then be processed in parallel. Due to time constraints, we wrote a Bourne shell script, invoked from a Java class, that performs the split after validating the number of records given in the footer.

However, I would like to know whether there is a way in Spring Integration to specify the number of files the master file should be split into and the folder where the split files should be placed. Also, is there a performance hit in doing it this way versus using shell scripts? I understand that shell scripts would probably be faster, and I am prepared to live with a minor performance decrease as long as I can centralize my logic in one place instead of maintaining a separate shell script, with the additional burden of having to find another approach for Windows servers.

I would be very interested in your experience and suggestions on this. If there is no easy way of doing it at present, could there be a FileSplitter component that serves this purpose?
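To make the request concrete, here is a rough sketch in plain Java of the kind of splitting I have in mind. It is not an existing Spring Integration or Spring Batch API: the class name `MasterFileSplitter`, the `split` method, the `.partN` naming scheme and the UTF-8 encoding are all assumptions made for illustration. It hands lines out round-robin so the file is read only once, and it does not treat header or footer lines specially.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of any Spring project.
final class MasterFileSplitter {

    /**
     * Distributes the lines of the master file round-robin over partCount
     * files in outputDir and returns the paths of the part files.
     */
    static List<Path> split(Path master, Path outputDir, int partCount) throws IOException {
        if (partCount < 1) {
            throw new IllegalArgumentException("partCount must be >= 1");
        }
        Files.createDirectories(outputDir);

        List<Path> parts = new ArrayList<>();
        List<BufferedWriter> writers = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(master, StandardCharsets.UTF_8)) {
            // Open one writer per part file (assumed naming: <master name>.part<i>).
            for (int i = 0; i < partCount; i++) {
                Path part = outputDir.resolve(master.getFileName() + ".part" + i);
                parts.add(part);
                writers.add(Files.newBufferedWriter(part, StandardCharsets.UTF_8));
            }
            // Single pass over the master file: line n goes to part n % partCount.
            String line;
            long lineNo = 0;
            while ((line = reader.readLine()) != null) {
                BufferedWriter writer = writers.get((int) (lineNo++ % partCount));
                writer.write(line);
                writer.newLine();
            }
        } finally {
            // Flush and close every part file that was opened.
            for (BufferedWriter writer : writers) {
                writer.close();
            }
        }
        return parts;
    }
}
```

A call such as `MasterFileSplitter.split(masterPath, splitDir, 8)` would then produce eight part files in `splitDir`, which is roughly what the shell script does for us today, minus the footer validation.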
Regards,
Anoop