Dec 28th, 2009, 06:39 AM
how to implement a multi-process step execution
I am trying to define the best way to execute a step using a cluster. All the implementations that I have seen so far use a request/reply paradigm where the afterStep callback method is used to block the thread until all replies are received. Besides blocking a thread, a loop with a timeout value won't do, especially with a mix of short-running and long-running steps.
What I would envision is an architecture where the step is stuck in the executing part until all "chunks" are processed. Of course it means that another thread, fired by a controller, is able to change the step from executing to success and fire the next step, if any.
Is this a core capability of Spring Batch (i.e. a flow that can be resumed from a thread other than the caller)? How would one deal with a synchronous job launcher in such a case?
Dec 28th, 2009, 09:43 AM
Stephane, your scenario sounds reminiscent of the "Asynchronous Aggregator" from Spring Batch Integration (the unimplemented use case), but only vaguely so.
We decided during the development of 2.0 that somewhere in the stack there has to be a blocking contract. We chose to make it the Step, but that decision isn't cast in stone (except it would be hard to change for 2.x). All you do if you allow the boundary to move is force someone else to poll or block, so I don't see how it is more efficient or less fragile in the long run.
I also think you are conflating chunks and steps - chunks are always small enough that they shouldn't take long to process, and this is in the control of the developer.
In some form or another looping with a timeout is a reality of distributed computing - you never really know if your remote worker is still alive or busy. Your point about the mix of long-running / short-running steps doesn't have enough detail to make any impact yet. You simply need to set the timeout to the longest time you are prepared to wait for a reply before you can reasonably assume that there was an undetected failure. Anything that happens quicker than that is not inefficient - it just completes earlier.
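To make the timeout point concrete, here is a minimal sketch in plain java.util.concurrent, not Spring Batch API. The names ReplyWaiter, replyReceived, awaitReplies and Outcome are hypothetical, and it assumes the number of expected replies is known up front. The caller blocks once with an upper bound; if the replies arrive sooner, the wait returns immediately.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Spring Batch: blocks the caller until
// every expected reply has arrived or the timeout elapses.
class ReplyWaiter {

    enum Outcome { COMPLETED, TIMED_OUT }

    private final CountDownLatch pending;

    ReplyWaiter(int expectedReplies) {
        this.pending = new CountDownLatch(expectedReplies);
    }

    // Called from whichever thread receives a reply from a remote worker.
    void replyReceived() {
        pending.countDown();
    }

    // Returns as soon as all replies are in; the timeout is only an upper
    // bound after which an undetected failure is assumed.
    Outcome awaitReplies(long timeout, TimeUnit unit) throws InterruptedException {
        return pending.await(timeout, unit) ? Outcome.COMPLETED : Outcome.TIMED_OUT;
    }

    public static void main(String[] args) throws InterruptedException {
        ReplyWaiter waiter = new ReplyWaiter(2);
        ExecutorService workers = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 2; i++) {
            // Each task stands in for a remote worker sending its reply.
            workers.execute(waiter::replyReceived);
        }
        Outcome outcome = waiter.awaitReplies(5, TimeUnit.SECONDS);
        workers.shutdown();
        System.out.println(outcome);
    }
}
```

With a short-running step the wait ends as soon as the last reply comes in, so a generous timeout only costs time when a worker has actually gone missing.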