Hi Emilio,
This can be achieved, with some work, using a two-step flow.
First, check this part of the documentation.
Following this, I took a concrete example with a simple object called "Author" that has 2 properties (first and last name).
I defined an XSD schema for it. You presumably already have one for your own object; this one is just for the demo:
HTML Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://com.authors/batch"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://com.authors/batch"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
    <xsd:element name="Authors">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element ref="Author" maxOccurs="unbounded"/>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
    <xsd:element name="Author">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="FirstName" type="xsd:string" />
                <xsd:element name="LastName" type="xsd:string" />
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
I will also use a sample input file like this:
HTML Code:
<?xml version="1.0" encoding="UTF-8"?>
<Authors xmlns="http://com.authors/batch">
    <Author>
        <FirstName>Chopin</FirstName>
        <LastName>Kate</LastName>
    </Author>
    <Author>
        <FirstName>London</FirstName>
        <LastName>Jack</LastName>
    </Author>
</Authors>
You then generate the classes from the XSD. In my example they are in the package com.authors.batch.domain.
Assuming that is done, the beans that read my XML file are defined like this:
HTML Code:
<oxm:jaxb2-marshaller id="authorMarshaller"
    contextPath="com.authors.batch.domain">
</oxm:jaxb2-marshaller>

<bean id="authorReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
    <property name="fragmentRootElementName" value="Author" />
    <property name="resource" value="file:src/main/resources/authors.xml"/>
    <property name="unmarshaller" ref="authorMarshaller" />
</bean>
Then I need a processor to add the suffixes and create new xml records :
AuthorItemProcessor.java
Code:
package com.authors.batch;

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.item.ItemProcessor;

import com.authors.batch.domain.Author;

/** Turns each Author read from the input into one copy per suffix. */
public class AuthorItemProcessor implements ItemProcessor<Author, List<Author>> {

    private static final String[] SUFFIXES = {".1", ".2", ".3"};

    @Override
    public List<Author> process(Author author) throws Exception {
        List<Author> output = new ArrayList<Author>();
        for (String suffix : SUFFIXES) {
            // Copy the record, appending the suffix to the first name only.
            Author newAuthor = new Author();
            newAuthor.setFirstName(author.getFirstName() + suffix);
            newAuthor.setLastName(author.getLastName());
            output.add(newAuthor);
        }
        return output;
    }
}
Just declare this as a bean in your Spring config file:
HTML Code:
<bean id="authorProcessor" class="com.authors.batch.AuthorItemProcessor"/>
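To see the fan-out on its own, here is a minimal plain-Java sketch of the same loop that runs without Spring Batch or JAXB. `DemoAuthor` and `FanOutDemo` are hypothetical stand-ins invented for this illustration; they are not part of the job above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JAXB-generated Author class,
// so that the sketch compiles without any generated code.
class DemoAuthor {
    final String firstName;
    final String lastName;

    DemoAuthor(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class FanOutDemo {
    private static final String[] SUFFIXES = {".1", ".2", ".3"};

    // Mirrors the processor's loop: one input record yields one copy per suffix.
    static List<DemoAuthor> expand(DemoAuthor author) {
        List<DemoAuthor> output = new ArrayList<>();
        for (String suffix : SUFFIXES) {
            output.add(new DemoAuthor(author.firstName + suffix, author.lastName));
        }
        return output;
    }
}
```

Calling `FanOutDemo.expand(new DemoAuthor("Chopin", "Kate"))` returns three records whose first names are Chopin.1, Chopin.2 and Chopin.3, each with the last name Kate.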
Now that we generate a list for each record, we merge the authors those lists contain into a single list and save it in the step's execution context for the next step of the batch flow.
This is done by this class:
SavingItemWriter.java
Code:
package com.authors.batch;

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemWriter;

import com.authors.batch.domain.Author;

/** Accumulates every generated Author under the "itemList" key of the step context. */
public class SavingItemWriter implements ItemWriter<List<Author>> {

    private StepExecution stepExecution;

    @BeforeStep
    public void saveStepExecution(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }

    @Override
    public void write(List<? extends List<Author>> items) throws Exception {
        ExecutionContext stepContext = this.stepExecution.getExecutionContext();

        // Start from whatever earlier chunks stored, if anything.
        @SuppressWarnings("unchecked")
        List<Author> currentItems = (List<Author>) stepContext.get("itemList");
        List<Author> newItems = new ArrayList<Author>();
        if (currentItems != null) {
            newItems.addAll(currentItems);
        }

        // Flatten this chunk's lists into the accumulated list.
        for (List<Author> list : items) {
            newItems.addAll(list);
        }
        stepContext.put("itemList", newItems);
    }
}
Add this as a bean in your Spring config file:
HTML Code:
<bean id="authorsWriter" class="com.authors.batch.SavingItemWriter"/>
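The writer is called once per chunk, so the stored list has to grow across calls. The following plain-Java sketch shows just that accumulation. A `HashMap` is used here as a hypothetical stand-in for the step's ExecutionContext, and `AccumulatorDemo` is an illustrative name, not a Spring Batch class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AccumulatorDemo {
    // Hypothetical stand-in for the step's ExecutionContext: a plain key/value map.
    private final Map<String, Object> context = new HashMap<>();

    // Called once per chunk; appends every element of every list to the stored list.
    @SuppressWarnings("unchecked")
    void write(List<List<String>> items) {
        List<String> stored = (List<String>) context.get("itemList");
        List<String> merged = new ArrayList<>();
        if (stored != null) {
            merged.addAll(stored);
        }
        for (List<String> list : items) {
            merged.addAll(list);
        }
        context.put("itemList", merged);
    }

    // Returns the accumulated list, or null if nothing has been written yet.
    @SuppressWarnings("unchecked")
    List<String> stored() {
        return (List<String>) context.get("itemList");
    }
}
```

With a commit interval of 1, as in the job configuration of this post, each call receives a single list, so two input records lead to two calls and six stored items.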
It's time for STEP 2 now!
The class in charge of retrieving the previously saved list of authors is as follows. I just mixed the code of org.springframework.batch.item.support.ListItemReader<T> with the code from chapter 11.8 of the documentation (link above):
RetrievingItemReader.java
Code:
package com.authors.batch;

import java.util.ArrayList;
import java.util.List;

import org.springframework.aop.support.AopUtils;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemReader;

/** Reads back, one item at a time, the list stored under "itemList" in the job context. */
public class RetrievingItemReader<T> implements ItemReader<T> {

    private List<T> list;

    public RetrievingItemReader(List<T> list) {
        // If it is a proxy we assume it knows how to deal with its own state.
        // (It's probably transaction aware.)
        if (AopUtils.isAopProxy(list)) {
            this.list = list;
        }
        else {
            this.list = new ArrayList<T>(list);
        }
    }

    /** Returns the next item, or null when the list is exhausted (which ends the step). */
    public T read() {
        if (!list.isEmpty()) {
            return list.remove(0);
        }
        return null;
    }

    /** Picks up the list that the promotion listener copied into the job context. */
    @SuppressWarnings("unchecked")
    @BeforeStep
    public void retrieveInterstepData(StepExecution stepExecution) {
        JobExecution jobExecution = stepExecution.getJobExecution();
        ExecutionContext jobContext = jobExecution.getExecutionContext();
        this.list = (List<T>) jobContext.get("itemList");
    }
}
I wire this to the spring configuration like this :
HTML Code:
<bean id="authorsReader" class="com.authors.batch.RetrievingItemReader" scope="step">
    <constructor-arg name="list" value="#{jobExecutionContext['itemList']}"/>
</bean>
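The reader relies on a simple contract: read() is called repeatedly and must return null once there is nothing left. Below is a minimal plain-Java sketch of that contract. `DrainingReaderDemo` is a hypothetical class for illustration only; it walks a private copy with an index instead of removing elements, which is one possible variation and not what the reader in this post does.

```java
import java.util.ArrayList;
import java.util.List;

class DrainingReaderDemo<T> {
    private final List<T> items;
    private int next = 0;

    DrainingReaderDemo(List<T> source) {
        // Copy defensively so that reading never mutates the caller's list.
        this.items = new ArrayList<>(source);
    }

    // Returns the next item, or null once the input is exhausted.
    T read() {
        return next < items.size() ? items.get(next++) : null;
    }
}
```

A caller would loop with something like `while ((item = reader.read()) != null) { ... }`, which is roughly what a chunk-oriented step does on your behalf.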
Then I am able to write that to XML using a StaxEventItemWriter, configured this way:
HTML Code:
<bean id="authorWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
    <property name="rootTagName" value="{http://com.authors/batch}:Authors"/>
    <property name="overwriteOutput" value="true"/>
    <property name="resource" value="file:./output-authors.xml"/>
    <property name="marshaller" ref="authorMarshaller"/>
</bean>
The rest of the spring configuration is like this :
HTML Code:
<bean id="promotionListener" class="org.springframework.batch.core.listener.ExecutionContextPromotionListener">
    <property name="keys" value="itemList"/>
</bean>

<batch:job id="job1">
    <batch:step id="step1" next="step2">
        <batch:tasklet transaction-manager="transactionManager" start-limit="1">
            <batch:chunk reader="authorReader" processor="authorProcessor" writer="authorsWriter" commit-interval="1" />
        </batch:tasklet>
        <batch:listeners>
            <batch:listener ref="promotionListener"/>
        </batch:listeners>
    </batch:step>
    <batch:step id="step2">
        <batch:tasklet>
            <batch:chunk reader="authorsReader" writer="authorWriter" commit-interval="1"/>
        </batch:tasklet>
    </batch:step>
</batch:job>
And when I run this, I get an xml file containing this :
HTML Code:
<?xml version="1.0" encoding="UTF-8"?>
<Authors xmlns="http://com.authors/batch">
    <Author>
        <FirstName>Chopin.1</FirstName>
        <LastName>Kate</LastName>
    </Author>
    <Author>
        <FirstName>Chopin.2</FirstName>
        <LastName>Kate</LastName>
    </Author>
    <Author>
        <FirstName>Chopin.3</FirstName>
        <LastName>Kate</LastName>
    </Author>
    <Author>
        <FirstName>London.1</FirstName>
        <LastName>Jack</LastName>
    </Author>
    <Author>
        <FirstName>London.2</FirstName>
        <LastName>Jack</LastName>
    </Author>
    <Author>
        <FirstName>London.3</FirstName>
        <LastName>Jack</LastName>
    </Author>
</Authors>
That's it...
It's probably not the simplest solution, nor the cleanest code (I'm not a JAXB pro, but I guess the same can be achieved using some XSD tricks), but at least I didn't touch the initial XSD.
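One possible simplification, offered as an untested sketch rather than something verified against this job: the two-step flow keeps every generated record in the execution context, so a single step might be enough if the writer itself unpacked each list and handed the flat items to the real XML writer. The plain-Java sketch below shows only that unpack-and-delegate idea. `Sink` is a hypothetical stand-in for a chunk-oriented writer interface and `FlatteningSink` is an illustrative name; neither is a Spring Batch type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a chunk-oriented writer interface.
interface Sink<T> {
    void write(List<? extends T> items);
}

// Accepts lists of items, flattens them, and hands the flat list to a delegate.
class FlatteningSink<T> implements Sink<List<T>> {
    private final Sink<T> delegate;

    FlatteningSink(Sink<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(List<? extends List<T>> lists) {
        List<T> flat = new ArrayList<>();
        for (List<T> list : lists) {
            flat.addAll(list);
        }
        // One call to the delegate per chunk, preserving the original order.
        delegate.write(flat);
    }
}
```

In a real job the delegate would presumably be the StaxEventItemWriter, and since the step would no longer reference it directly, it would also need to be registered as a stream on the step so that it gets opened and closed.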
I hope it helps.
P.S.: Spring Batch should definitely make things like this easier. If there is an easier solution, I'd be very interested in seeing what it looks like.