Hello,
We are planning a big new project, and so far we are leaning towards Spring Batch because it is remarkably robust and complete.
We have a functional requirement to process items one by one in some cases. After some tests we found that the overhead of a database-backed job repository is far too high when processing single items: all the inserts/updates made to the BATCH_ tables increase the processing time considerably, and that is definitely not something we want.
Our idea right now is to use a master step to partition our data and then process it with N threads. Our partitioner will assign just one record to each partition, so our slave step will process one record at a time. Is this the correct approach, or is there another "best practice" that we should follow here?
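To make the idea concrete, here is a rough plain-Java sketch of what we have in mind. It deliberately uses no Spring Batch classes: the class name, the "recordId" key and the "partitionN" names are made up for illustration, and the inner map only stands in for the per-partition execution context that a real Partitioner would return.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java illustration only; all names here are hypothetical.
class SingleRecordPartitioning {

    // Master side: create one partition per record, keyed by partition name.
    // Each inner map stands in for that partition's execution context.
    static Map<String, Map<String, Object>> partition(List<Long> recordIds) {
        Map<String, Map<String, Object>> partitions = new LinkedHashMap<>();
        for (int i = 0; i < recordIds.size(); i++) {
            Map<String, Object> context = new LinkedHashMap<>();
            context.put("recordId", recordIds.get(i));
            partitions.put("partition" + i, context);
        }
        return partitions;
    }

    // Slave side: process the single record held by one partition.
    static String processOne(Map<String, Object> context) {
        return "processed " + context.get("recordId");
    }

    // Run every partition on a fixed pool of N threads and
    // collect the results in partition order.
    static List<String> processAll(Map<String, Map<String, Object>> partitions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (Map<String, Object> context : partitions.values()) {
                futures.add(CompletableFuture.supplyAsync(() -> processOne(context), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : futures) {
                results.add(future.join()); // wait for this partition to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> partitions = partition(List.of(10L, 20L, 30L));
        System.out.println(processAll(partitions, 2));
    }
}
```

In the real job, the partition method would correspond to our Partitioner implementation and processOne to the slave step; the sketch only shows the one-record-per-partition shape, not any job repository interaction.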
If we do not need the job/step state to survive a VM restart, is there any other drawback to using the in-memory job repository instead of the database one?
We were also wondering whether there is any benefit or difference between using the in-memory job repository and setting up an in-memory database with the database job repository pointing at it.
If there is indeed a benefit to using an in-memory database, is there one that is proven to work well with Spring Batch?
Thanks in advance.

