Dec 6th, 2007, 08:14 AM
Are DB set operations in opposition to the Spring Batch philosophy?
The tasklet design assumes that a tasklet processes a single line of data, while iteration is managed by the framework.
While managing iteration and all related concerns in the infrastructure saves the developer time and headaches, this approach performs poorly for database read/write operations: it is much more efficient to invoke a single SQL statement that applies to N records than to invoke N statements that each apply to a single record.
While I guess the set-based approach is feasible (I could implement Tasklet directly), it bypasses many of the nice Spring Batch features. Is this design in opposition to the Spring Batch philosophy?
Dec 6th, 2007, 08:46 AM
Regarding the write part, it depends on how you implement your ItemProcessor or OutputSource. You could create one that caches your updates, so that, for instance, you write 50 or 100 records (depending on your commit interval) to the database at once.
So yes, it processes one line at a time, but the actual handling depends on your ItemProcessor/OutputSource implementation.
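The buffering idea can be sketched in plain Java. This is not the actual Spring Batch OutputSource API of the time; the class and callback names here are illustrative. The writer collects items one at a time and, once the commit interval is reached, flushes them in a single batch, standing in for one multi-row SQL statement (e.g. a JDBC batch update):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative buffering writer: items arrive one at a time, but are
// written to the database in batches of `commitInterval`.
public class BufferingWriter<T> {
    private final int commitInterval;
    private final Consumer<List<T>> batchFlush; // stands in for a JDBC batch update
    private final List<T> buffer = new ArrayList<>();

    public BufferingWriter(int commitInterval, Consumer<List<T>> batchFlush) {
        this.commitInterval = commitInterval;
        this.batchFlush = batchFlush;
    }

    // Called once per item, as the framework hands lines to the writer.
    public void write(T item) {
        buffer.add(item);
        if (buffer.size() >= commitInterval) {
            flush();
        }
    }

    // Push the whole buffer to the database in one round trip,
    // then clear it for the next batch.
    public void flush() {
        if (!buffer.isEmpty()) {
            batchFlush.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

A final flush() call is needed at the end of the step, since the last batch is usually smaller than the commit interval.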
Dec 6th, 2007, 09:17 AM
We are making a bit of a shift in this regard in Milestone 4. Rather than reading one line and then writing that line out until the commit interval has been reached, all of the input for a 'chunk' will be read in, then passed to a processor that writes out all of those items. For example, in the file-to-database scenario with a chunk size of 5, five lines would be read in and translated to Items (objects), and those five items would then be processed and written out via a DAO.
You would still need the solution mdeinum mentioned to ensure the written records are batched and then committed, so the overall approach isn't changing, but it should be a bit easier to understand where and how it should be applied. As a workaround in Milestone 3, a RepeatInterceptor on the ChunkOperations should tell you when the 'chunk' is closing and allow you to write out your chunk.
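The read-a-whole-chunk-then-write flow described above can be sketched generically. The reader, mapper, and writer interfaces here are assumptions for illustration, not the Milestone 4 API: the driver reads up to chunkSize lines, maps each to an item, and only then hands the whole chunk to the writer (e.g. a DAO doing a batched insert):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Illustrative chunk driver: read in all input for a chunk,
// then write the whole chunk out in one pass.
public class ChunkDriver {
    public static <R, I> void run(Iterator<R> reader,
                                  Function<R, I> mapper,
                                  Consumer<List<I>> writer,
                                  int chunkSize) {
        while (reader.hasNext()) {
            List<I> chunk = new ArrayList<>(chunkSize);
            // Read phase: fill the chunk, translating each line to an Item.
            while (reader.hasNext() && chunk.size() < chunkSize) {
                chunk.add(mapper.apply(reader.next()));
            }
            // Write phase: the writer sees the whole chunk at once,
            // so it can issue a single batched database operation.
            writer.accept(chunk);
        }
    }
}
```

With 12 input lines and a chunk size of 5, the writer would be invoked three times, with chunks of 5, 5, and 2 items.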