Spring Batch Hardware Infrastructure and Scalability



thowell
Jan 19th, 2010, 09:15 AM
I am planning on using Spring Batch to replace some Mainframe processing.

Is there a recommended hardware configuration that is easily scalable and recommended for Spring Batch?

We want to design a central Spring Batch server platform where we can scale the underlying hardware without impacting existing Spring Batch jobs. The ideal solution would be similar to a cloud that can expand and contract as needed. In addition, there should be no special coding to accommodate the hardware, as is the case in grid computing. The mainframe allows us to add CPU engines as needed.

My fear is that we implement Spring Batch jobs (possibly hundreds) on a server which then runs out of capacity or fails. We would then have to bring up a new server for any new Spring Batch jobs, leaving us with two batch environments to maintain.

Has anyone had experience in this area or can offer some suggestions?

Jul
Jan 19th, 2010, 03:16 PM
I am planning on using Spring Batch to replace some Mainframe processing.

Is this mainframe processing implemented in Java or, e.g., COBOL?


Is there a recommended hardware configuration that is easily scalable and recommended for Spring Batch?
As far as I know, it's not easy. From your description, it looks like you expect a fairly heavy load. In that case I assume (also from a scalability point of view) that you want to process jobs on more than one box and be able to attach new boxes (units). Your jobs would then need to support some distributed management mechanism (e.g. MQ). Scaling this kind of solution is expensive from a maintenance point of view.
In fact, the mainframe has the most flexible environment, since it can be extended with additional CPUs. In that case a Spring Batch job needs to be partitioned (even in the simplest way) and processed by many threads. I know this solution will be more expensive (the hardware is expensive), but it is also the most efficient.
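
To make the "partition the job and process it with many threads" idea concrete, here is a minimal sketch in plain Java. It is not the Spring Batch API; the class name PartitionSketch, the methods split, processRange and processAll, and the record counts are all hypothetical and only illustrate the principle. The work is split into contiguous ranges and each range is handled by its own thread, so adding CPUs (and raising the thread count) needs no change to the processing logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; this does not use Spring Batch classes.
public class PartitionSketch {

    // Split record ids 0..total-1 into contiguous [start, end) ranges.
    public static List<long[]> split(long total, int partitions) {
        List<long[]> ranges = new ArrayList<>();
        long base = total / partitions;
        long extra = total % partitions; // the first 'extra' ranges get one more record
        long start = 0;
        for (int i = 0; i < partitions; i++) {
            long size = base + (i < extra ? 1 : 0);
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    // Stand-in for the real work (read, process, write); it only counts records.
    public static long processRange(long start, long end) {
        long count = 0;
        for (long id = start; id < end; id++) {
            count++;
        }
        return count;
    }

    // Run one partition per thread and return the total number of records handled.
    public static long processAll(long total, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (long[] r : split(total, threads)) {
                Callable<Long> task = () -> processRange(r[0], r[1]);
                results.add(pool.submit(task));
            }
            long processed = 0;
            for (Future<Long> f : results) {
                processed += f.get(); // blocks until that partition has finished
            }
            return processed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Sizing the pool from the available CPUs is an assumption of this sketch.
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println("records processed: " + processAll(1_000, threads));
    }
}
```

In a real job the partition boundaries would typically come from the data (for example key ranges in a table), and Spring Batch's own partitioning and multi-threaded step support would replace the hand-written thread pool.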

thowell
Jan 20th, 2010, 08:43 PM
Is this mainframe processing implemented in Java or, e.g., COBOL?
We are converting COBOL and Natural into Spring Batch.

You are correct that I am seeking a hardware solution that will allow me to add CPU / server nodes as needed. Web applications use an application server to form clusters which allow us to add nodes as needed. I would like a similar capability but with batch.

I was curious whether anyone has implemented a large number of Spring Batch jobs and what infrastructure they used. Maybe I am overcomplicating it. Ideally, the infrastructure would be easy to support, easy to scale, and just work (basically a mainframe style environment but without the mainframe lol).

Jul
Jan 21st, 2010, 04:03 AM
We are converting COBOL and Natural into Spring Batch.

Do you plan to deal with DB2 as well?


(basically a mainframe style environment but without the mainframe lol).

I think Spring Batch has everything you need for this. But Spring Batch is a set of features that have to be combined into complex jobs. How you combine the steps, what kind of partitioning implementation you use, and how you control the jobs/steps is up to you. Unfortunately I have no such experience with SB and mainframes.

You can also take a look at: http://www.terracotta.org/

HTH,
Jul

thowell
Jan 21st, 2010, 07:20 AM
No DB2, just Oracle databases. It will be a complete move from the mainframe.

We've used parallel and multithreaded processing in our Spring Batch proof of concept with great results. Now I am trying to figure out how I would manage a large-scale environment.

Dave Syer
Jan 25th, 2010, 12:28 PM
Most people just use a regular app server (e.g. WAR deployments containing Batch workloads). You could also consider lower level OS virtualization I suppose.

thowell
Jan 25th, 2010, 01:32 PM
Thanks Dave. We are debating whether to use an application server product versus the lower level OS virtualization solution (aka a cloud of some sort). We'll experiment with the WAR option for now and see how well it works for us.