Thread: Batch Processing

  1. #1
    Join Date
    Feb 2013
    Posts
    21

    Batch Processing

    Hi. I am implementing a batch job with Spring Batch for our company.
    I have a reader, a processor and a writer:
    The reader reads from an XML file.
    The processor looks up objects in the database and applies the changes from the XML.
    The writer saves to the database.

    Everything works, but the processor hits the database for every single item. I need to run the processor the same way the writer runs: once per chunk.

    I am surprised that the framework does not allow this, since it is a common use case.

    How can I process a batch of items at once instead of one item at a time, given that the reader returns one item at a time?

    Thanks.
    Last edited by elbek; Feb 11th, 2013 at 04:14 PM. Reason: comma

  2. #2
    Join Date
    Sep 2008
    Location
    Chicagoland, IL
    Posts
    338

    Chunk-based processing in Spring Batch executes the ItemReader and ItemProcessor once per item and the ItemWriter once per chunk. This is because the typical area for optimization is writing. That being said, there are a number of ways to limit how often an ItemProcessor hits the database. Options I've used in the past include:

    1. File sorting: Sort the file by the key of the item that needs to be retrieved from the database. Query the database only when that key changes (this concept is called a control break). In this case, you hit the database once each time the control value changes.
    2. Cache database values: If the dataset is small enough, something like Ehcache can be used to cache database results in memory so that repeated lookups don't hit the database again.
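Both options can be sketched as thin wrappers around the lookup. The code below is a minimal illustration in plain Java, not Spring Batch API: `ControlBreakLookup`, `CachedLookup`, and the `loader` function (a stand-in for the real database query) are hypothetical names, and the plain `HashMap` stands in for a real cache such as Ehcache. In a real job, the `get` call would sit inside the ItemProcessor.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

/** Option 1: control break. Assumes the input is sorted by the lookup key. */
class ControlBreakLookup<K, V> {
    private final Function<K, V> loader; // hypothetical stand-in for the database query
    private K currentKey;
    private V currentValue;
    private boolean loaded = false;
    private int loadCount = 0;

    ControlBreakLookup(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // Query again only when the key differs from the previous item's key.
        if (!loaded || !Objects.equals(currentKey, key)) {
            currentValue = loader.apply(key);
            currentKey = key;
            loaded = true;
            loadCount++;
        }
        return currentValue;
    }

    int getLoadCount() {
        return loadCount;
    }
}

/** Option 2: cache. Works for unsorted input, but keeps every loaded value in memory. */
class CachedLookup<K, V> {
    private final Function<K, V> loader; // hypothetical stand-in for the database query
    private final Map<K, V> cache = new HashMap<>(); // no eviction; assumes a small dataset
    private int loadCount = 0;

    CachedLookup(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // The loader runs only on a cache miss; null results are not cached.
        return cache.computeIfAbsent(key, k -> {
            loadCount++;
            return loader.apply(k);
        });
    }

    int getLoadCount() {
        return loadCount;
    }
}
```

The control-break version holds a single row at a time, so it suits large sorted files. The cached version tolerates any input order at the cost of memory, and this sketch is not thread-safe.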


    Another option would be to restructure your item so that it is an aggregation of several of your current items, and commit after each aggregated item. This requires you to write a lot of extra code to handle the aggregation on the read, the looping in the processing and the line aggregation in the write...but it would probably work.
    Michael Minella
    Spring Batch Lead
    Author - Pro Spring Batch
    http://www.michaelminella.com
    Twitter: @MichaelMinella
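The read side of the aggregation option mentioned in the reply above could look roughly like the sketch below. It is plain Java so that it runs on its own: `AggregatingReader` is a hypothetical name, and the `Supplier` stands in for a delegate ItemReader, following the same contract of returning `null` once the input is exhausted. Each call to `read()` returns a whole list, so the processor receives many items at once and can fetch them with a single query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Groups single items from a delegate into lists of up to batchSize items. */
class AggregatingReader<T> {
    private final Supplier<T> delegate; // hypothetical delegate; returns null when exhausted
    private final int batchSize;

    AggregatingReader(Supplier<T> delegate, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        this.delegate = delegate;
        this.batchSize = batchSize;
    }

    /** Returns the next batch, or null when there is nothing left to read. */
    List<T> read() {
        List<T> batch = new ArrayList<>(batchSize);
        T item;
        // Check the size first so no item is consumed once the batch is full.
        while (batch.size() < batchSize && (item = delegate.get()) != null) {
            batch.add(item);
        }
        return batch.isEmpty() ? null : batch;
    }
}
```

With this shape the step's item type becomes a list, which is why the processor and writer then need their own loops and why the commit interval drops to one aggregated item.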

  3. #3
    Join Date
    Aug 2006
    Posts
    129

    This is exactly what I used to look up keys in the database:

    Code:
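    import java.util.Map;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.batch.core.ChunkListener;
    import org.springframework.batch.core.listener.ItemListenerSupport;
    import org.springframework.beans.factory.InitializingBean;
    import org.springframework.util.Assert;

    // IdKeyLookup and UUIDConverter are not shown in this post; import them from wherever they live in your project.
    public class IdCountListener extends ItemListenerSupport<Map<String, Object>, Map<String, Object>> implements ChunkListener, InitializingBean {

        private IdKeyLookup lookup;

        private Logger logger = LoggerFactory.getLogger(getClass());

        public void setLookup(IdKeyLookup lookup) {
            this.lookup = lookup;
        }

        // Register the key of every item as it is read, so the lookup knows the keys of the whole chunk.
        public void afterRead(Map<String, Object> item) {
            lookup.addKey(UUIDConverter.getUUID(item.toString()).toString());
        }

        // Start every chunk with an empty key set.
        // (No-argument chunk callbacks as in Spring Batch 2.1; later versions pass a ChunkContext.)
        public void beforeChunk() {
            lookup.reset();
        }

        // Nothing to clean up after the chunk.
        public void afterChunk() {
        }

        // Fail fast at startup if the lookup was not injected.
        public void afterPropertiesSet() throws Exception {
            Assert.notNull(lookup, "lookup must be set");
        }

    }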
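The listener above only calls `addKey` and `reset`; the lookup class itself is not shown. One way such a lookup might turn the collected keys into a single query per chunk is sketched below. This is an assumption, not the poster's implementation: `BulkKeyLookup`, `getId`, and the `bulkLoader` function (standing in for one `WHERE key IN (...)` query) are hypothetical. It relies on Spring Batch reading all items of a chunk before it starts processing them, so every key is registered by the time the processor asks for the first one.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

/** Collects keys during the read phase and resolves them all with one bulk call. */
class BulkKeyLookup {
    // Hypothetical stand-in for a single bulk database query over all keys of the chunk.
    private final Function<Set<String>, Map<String, Long>> bulkLoader;
    private final Set<String> pendingKeys = new LinkedHashSet<>();
    private Map<String, Long> resolved; // null until the first lookup of the chunk

    BulkKeyLookup(Function<Set<String>, Map<String, Long>> bulkLoader) {
        this.bulkLoader = bulkLoader;
    }

    /** Called before each chunk: forget the previous chunk's keys and results. */
    void reset() {
        pendingKeys.clear();
        resolved = null;
    }

    /** Called after each read: remember the key for the upcoming bulk query. */
    void addKey(String key) {
        pendingKeys.add(key);
    }

    /** Called from the processor: the first call in a chunk triggers the bulk load. */
    Long getId(String key) {
        if (resolved == null) {
            resolved = new HashMap<>(bulkLoader.apply(Collections.unmodifiableSet(pendingKeys)));
        }
        return resolved.get(key);
    }
}
```

The processor would then call `getId` per item while the database is queried once per chunk, which is the behaviour asked for in the first post.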
