Feb 8th, 2008, 12:29 AM
Defer architectural decision
One of the major advantages of using Spring Batch is that it allows us to defer scalability decisions to a later stage. Are there samples of each of the following for the same batch job?
1. running the job as a single process in a single JVM
2. running the job as multiple processes in a single JVM
3. running the job as multiple processes in multiple JVMs
Feb 8th, 2008, 01:23 AM
Options 1 and 3 are covered by the fixedLengthImportJob and parallelJob samples, respectively. Option 2 is not planned as a sample for 1.0, but we are confident it can be implemented straightforwardly (more so with m5).
Feb 8th, 2008, 10:48 AM
Is there any documentation for these samples?
Actually, I thought the parallelJob sample was (2). Why is it multiple processes in multiple JVMs?
Feb 8th, 2008, 12:27 PM
Oops, sorry, you are right. The parallel job is #2. #3 is the one we are deferring, but planning quite carefully. There is some documentation for the samples in the User Guide, but it's a bit old, so it won't have the parallel job in it. Javadocs are the main source for now, and we will do some more work for 1.0 on the user guide and samples in particular. What did you need to know?
Feb 8th, 2008, 08:52 PM
Is #3 a target for v1.0? How about #4, running multiple processes in a clustered JVM?
I'm going to develop batch processing for my project, but I won't know which option to go for until after the load test. Since I want to defer my scalability decision, I want to see how 'easy' it is to switch from one option to another when required.
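To make the kind of switch in question concrete, here is a minimal sketch of moving between options 1 and 2 while leaving the per-item logic untouched. It uses plain java.util.concurrent only and is not Spring Batch API; the class name DeferredScalingSketch and the methods processItem, runSerial and runParallel are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: plain JDK concurrency, not Spring Batch classes.
public class DeferredScalingSketch {

    // Hypothetical business logic; it is the same under every execution strategy.
    static int processItem(String line) {
        return line.trim().length();
    }

    // Option 1: a single thread processes all items in order.
    static List<Integer> runSerial(List<String> items) {
        List<Integer> results = new ArrayList<>();
        for (String item : items) {
            results.add(processItem(item));
        }
        return results;
    }

    // Option 2: a fixed thread pool inside the same JVM.
    // invokeAll returns futures in the same order as the submitted tasks,
    // so the results line up with the input order.
    static List<Integer> runParallel(List<String> items, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String item : items) {
                tasks.add(() -> processItem(item));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Item processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("alpha", " beta ", "gamma delta");
        System.out.println(runSerial(items));
        System.out.println(runParallel(items, 3));
    }
}
```

In this sketch only the method that drives the loop changes, which is the property I am hoping a framework would preserve. Going to multiple JVMs (option 3) would also need some way for the processes to share work and results, which this sketch does not attempt.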
Feb 9th, 2008, 03:38 AM
#3 will not be implemented in 1.0, as a sample or with the necessary additional plumbing for the remote processes to communicate. But it will be a pretty short step, if people want to try it out. And we will begin work on it right away, so there should be some concrete stuff coming out quite quickly, probably as samples to start with.
#4 is with Terracotta (what is a "clustered JVM" otherwise?). No plans right now to try that, but I would be interested to hear how it goes if anyone tackles it.
We are going to be putting some effort into grid implementations (GigaSpaces, ObjectGrid, Coherence, etc.), but since they are less common than ordinary messaging middleware, we are going to tackle JMS first. (GigaSpaces has a JMS API, so that would work with no additional effort. The others will be slightly different.)
Feb 9th, 2008, 03:52 AM
Actually, I saw your presentation on http://www.parleys.com/display/PARLE...wComments=true
And the 13th slide mentions #3 (multiple JVMs) and #4 (multiple clustered JVMs). What is the difference between the two? Is #3 on a single machine and #4 on multiple machines?