I am trying to run several Hadoop jobs serially using Spring Hadoop, but the main thread does not wait for each job to complete before proceeding to the next one. The XML I'm using to define each job runner looks like this:
<hdp:job-runner
id="runner"
job-ref="my-job"
run-at-startup="true"
post-action="cleanup-script"
wait-for-completion="true"
/>
The wait-for-completion attribute doesn't seem to have the desired effect, though. I've tried setting the "executor" attribute, as described in the docs:
As the Hadoop job submission and execution (when wait-for-completion is true) is blocking, JobRunner uses a JDK Executor to start (or stop) a job. The default implementation, SimpleAsyncTaskExecutor creates a new Thread for each new task. Before going into production, it is recommended to double-check whether this strategy is suitable or whether a throttled or pooled implementation is better. One can customize the behaviour through executor parameter.
but I can't get the configuration right -- it's looking for an Executor instance but I can only set executor to a String.
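
To make the behaviour described in the quoted passage concrete, here is a minimal plain-JDK sketch of the difference between an executor that starts a new thread per task and one that runs the task on the calling thread. It is not Spring Hadoop code: the class name ExecutorSketch and both executor constants are made up for illustration, and the thread-per-task lambda only imitates the strategy the docs attribute to SimpleAsyncTaskExecutor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicReference;

public class ExecutorSketch {

    // Hypothetical stand-in for a thread-per-task strategy:
    // execute() returns immediately and the task runs on a new thread.
    static final Executor THREAD_PER_TASK = task -> new Thread(task).start();

    // Hypothetical caller-thread executor: execute() runs the task
    // directly, so it only returns once the task has finished.
    static final Executor CALLER_THREAD = Runnable::run;

    /** Returns true if the executor ran the task on the calling thread. */
    static boolean runsOnCallerThread(Executor executor) {
        AtomicReference<Thread> worker = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            worker.set(Thread.currentThread());
            done.countDown();
        });
        try {
            // Wait for the task so the check is deterministic for both executors.
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.get() == Thread.currentThread();
    }

    public static void main(String[] args) {
        System.out.println("thread-per-task runs on caller: "
                + runsOnCallerThread(THREAD_PER_TASK));
        System.out.println("caller-thread runs on caller:   "
                + runsOnCallerThread(CALLER_THREAD));
    }
}
```

With a thread-per-task executor, a blocking job only blocks the newly created thread, so the thread that submitted it carries on. That would be consistent with the symptom above, although I have not confirmed that this is what JobRunner does internally.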


