First of all, apologies, as I was not sure whether this is the right place to post this question (I'm quite new to the SpringSource forums).
Currently, we have a requirement to design a DD payment reconciliation batch on one of our midrange systems, which does the following job:
"On a daily basis we receive a huge flat file (over ~0.5 million records) from one of our backend systems. It has a header (specifying the date in the format dd|mm|yyyy), a body of transactional records, each consisting of account number|sort code|product code|payment due date, and finally a footer containing the count of transaction records in the body. In terms of processing the file, there was an initial requirement to get the account numbers and sort codes of the customers verified against one of our scalable web services (SOAP)."
As part of our initial design we have planned the steps below, and we would like advice on the features (and their best practices) we could employ from existing Spring technologies such as Spring Batch 2.x, Spring OXM and Spring Integration.
Step_1 Use a FlatFileItemReader with a custom LineMapper and persist the entire data set to a database table with a column validateFlag ('N' by default).
Step_2 On successful completion of Step_1, the idea is to use a custom controller that reads the staged rows with a JdbcCursorItemReader (or JdbcPagingItemReader), plus a largely custom-built layer that uses a marshaller (such as JiBX or XMLBeans) and makes a plain HTTP POST call to an available endpoint. This would run as a producer/consumer (async) setup with a multi-threaded execution pool, where the messages are validated via a LinkedQueue and, ultimately, the respective flag in the database is updated according to the key id.
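For illustration, the Step_2 flow could look roughly like the plain-Java sketch below. All names here are hypothetical: AccountValidator stands in for the marshalling plus HTTP POST to the SOAP endpoint, StagedRow for a row read from the staging table, and the returned map for the per-key results that would drive the validateFlag update. The fixed thread pool's internal work queue plays the role of the producer/consumer queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ValidationPoolSketch {

    /** Hypothetical stand-in for the remote SOAP validation call. */
    public interface AccountValidator {
        boolean isValid(String accountNumber, String sortCode);
    }

    /** Hypothetical row from the staging table, identified by its key id. */
    public static final class StagedRow {
        public final long id;
        public final String accountNumber;
        public final String sortCode;

        public StagedRow(long id, String accountNumber, String sortCode) {
            this.id = id;
            this.accountNumber = accountNumber;
            this.sortCode = sortCode;
        }
    }

    /**
     * Submits one validation task per row to a fixed-size pool and collects
     * the outcome per key id. The caller would then update the flag column.
     */
    public static Map<Long, Boolean> validateAll(List<StagedRow> rows,
                                                 AccountValidator validator,
                                                 int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Map<Long, Boolean> results = new ConcurrentHashMap<>();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (StagedRow row : rows) {
                // Producer side: queue the work; pool threads consume it.
                pending.add(pool.submit(() -> {
                    boolean ok = validator.isValid(row.accountNumber, row.sortCode);
                    results.put(row.id, ok);
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any task failure
            }
        } catch (ExecutionException e) {
            throw new IllegalStateException("Validation task failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while validating", e);
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

With ~0.5 million rows, the rows would be fed in pages rather than submitted all at once, since the pool's queue in this sketch is unbounded.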
Step_1 looks fairly straightforward to us using the basic features of Spring Batch. However, since Step_2 relies on existing custom code, we would like to know whether there are any built-in features in the Spring projects mentioned above that we could put to best use instead, and which could help us benchmark them.
NOTE: We are especially considering batch processing strategies such as remote chunking or partitioning (and their merits and demerits), assuming we can share the DB resources concurrently between the batch nodes (threads).
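As an illustration of what partitioning would mean for the staged table, the plain-Java sketch below splits a key range into contiguous sub-ranges, one per node or thread. The names are hypothetical and it assumes the staging table has a reasonably dense numeric key id. In Spring Batch the same idea would be expressed as a Partitioner implementation returning one ExecutionContext per range.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RangePartitionSketch {

    /**
     * Splits the inclusive key range [minId, maxId] into at most gridSize
     * contiguous sub-ranges. Each value is a two-element array {start, end}.
     */
    public static Map<String, long[]> partition(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("Invalid range or grid size");
        }
        Map<String, long[]> result = new LinkedHashMap<>();
        long total = maxId - minId + 1;
        long size = (total + gridSize - 1) / gridSize; // ceiling division
        long start = minId;
        int n = 0;
        while (start <= maxId) {
            long end = Math.min(start + size - 1, maxId);
            result.put("partition" + n, new long[] {start, end});
            start = end + 1;
            n++;
        }
        return result;
    }
}
```

Each worker would then read only its own range (for example with a WHERE clause on the key id), so workers share the database but never touch the same rows.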
Apologies for any semantic errors.
Thanks in advance!