Jul 4th, 2010, 11:03 AM
Built-in recovery mechanisms provided by the framework
I just discovered Spring Batch and I want to evaluate it before adopting it.
For batch processing in general, when you have to deal with high data volumes, it is very important to choose a robust solution that can restart from a recovery point rather than from the beginning. Does the Spring Batch framework offer built-in mechanisms for recovery, and how do they work?
It is always possible to use custom tables with indicators, but what other solutions are there?
Jul 6th, 2010, 02:31 AM
You should find plenty to get your teeth into. Try http://static.springsource.org/sprin...igureStep.html and the sample jobs with names like *Skip* *Restart* *Retry*.
Jul 7th, 2010, 01:22 PM
I have a similar question regarding the recovery mechanism in Spring Batch. I am trying to make my batch processing resilient to extreme failure situations, e.g. a power outage.
If the power goes out during a job execution, some steps will already have completed, others will have started, and so on.
Right now my batch is fired by a Quartz job, and I'm thinking of using the jobOperator before starting the Quartz job to see whether any job is running (because if the power goes out, it will still have the batch status "Started"), and then restarting that job execution. I tried to implement this, but the jobOperator tells me that the job is already running, which makes sense. Do I have to manually set the failed status on my job and step executions to achieve this? Is there a more efficient way to restart an already "started" job?
Jul 8th, 2010, 01:03 AM
If the power goes out before a job completes, you are correct that it will remain with status STARTED, so there is no way for the framework to detect automatically that it is not still running. It is a business decision, and you can use the JobExplorer (or JobOperator) to poke the corpse and try to detect signs of life. If it is still running and it is chunk-oriented, then a step execution will be updated at the end of every chunk, so the last modified time of a step execution is often a key ingredient in the decision. Once you have decided that it is not running, or that it may be running but you don't want to wait for it, you can manually set the step and job execution statuses to FAILED and then restart.
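To make the "signs of life" decision above concrete, here is a minimal sketch in plain Java. It deliberately does not use the Spring Batch API: StepRecord, StaleExecutionCheck and the maxSilence threshold are hypothetical names made up for illustration. In a real job you would presumably read the equivalent data from the JobExplorer (the last updated time of each step execution) and persist the FAILED status through the JobRepository, but treat that mapping as an assumption to check against your Spring Batch version.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a step execution row; NOT a Spring Batch class. */
final class StepRecord {
    final String name;
    String status;             // e.g. "STARTED", "COMPLETED", "FAILED"
    final Instant lastUpdated; // updated at the end of every chunk

    StepRecord(String name, String status, Instant lastUpdated) {
        this.name = name;
        this.status = status;
        this.lastUpdated = lastUpdated;
    }
}

/** Hypothetical helper illustrating the staleness decision. */
final class StaleExecutionCheck {

    /**
     * Returns true when the most recent step update is older than maxSilence,
     * i.e. no chunk has been committed for longer than we are willing to wait.
     * With no step records there is nothing to judge by, so this returns false.
     */
    static boolean isPresumedDead(List<StepRecord> steps, Instant now, Duration maxSilence) {
        Instant latest = null;
        for (StepRecord step : steps) {
            if (latest == null || step.lastUpdated.isAfter(latest)) {
                latest = step.lastUpdated;
            }
        }
        if (latest == null) {
            return false;
        }
        return Duration.between(latest, now).compareTo(maxSilence) > 0;
    }

    /** Marks every step still in STARTED as FAILED so that a restart becomes possible. */
    static void markFailed(List<StepRecord> steps) {
        for (StepRecord step : steps) {
            if ("STARTED".equals(step.status)) {
                step.status = "FAILED";
            }
        }
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<StepRecord> steps = Arrays.asList(
                new StepRecord("load", "COMPLETED", now.minus(Duration.ofHours(1))),
                new StepRecord("process", "STARTED", now.minus(Duration.ofMinutes(30))));

        // Assumed business rule: 10 minutes without a chunk commit means the job is dead.
        if (isPresumedDead(steps, now, Duration.ofMinutes(10))) {
            markFailed(steps);
        }
        for (StepRecord step : steps) {
            System.out.println(step.name + " -> " + step.status);
        }
    }
}
```

The threshold is the business decision mentioned above: it should be comfortably longer than the slowest chunk you expect, otherwise a slow but healthy job could be marked FAILED while it is still running.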