May 4th, 2011, 11:29 AM
Coding an ItemReader
I am a newbie to Spring Batch, and I am working on an ETL solution at work. Basically, I need to process files from different clients. Each client sends a file with a defined structure. However, I have noticed that the files are not 100% compliant with that structure (some lines of the file are corrupted).
According to business needs, I must always process as much of the file as possible, so I load and process all the lines that comply with the structure. Finally, I must report the lines from the file which are not valid according to the structure.
I have not found a way to make a FlatFileItemReader flexible and dynamic enough to:
- read a file which has N columns, where N is a parameter I pass in
- always read exactly N columns, ignoring any additional columns or information
- stop and fail the process if the file has fewer than N columns
Can anybody suggest an approach or alternative? I think I have to program my own ItemReader implementation. Does anybody have an example of a custom ItemReader I can use as a starting point?
Thanks in advance
May 7th, 2011, 07:57 PM
The lifecycle of the ItemReader is that Spring Batch instantiates it and calls it repeatedly over the course of a job, so you have to build a stateful class that can read your data, exhaust its input until it is 'empty', and then stay in the empty state. It took me a while to figure out that the key is that it is only 'empty' when the job is completing: when the job is fully complete, that ItemReader gets garbage collected, and a subsequent execution gets a completely new instance.
So what you need to do is implement this stateful contract: read() returns an element on each call until no more are left, then returns null.
A simple approach: on the first call, check whether your iterator (now a class variable) is null. If it is, open your file and start reading, keeping the iterator as a class variable. All subsequent calls just get the next item from the iterator. If the iterator is not null but has no more elements, your item reader returns null.
If memory isn't an issue, your custom item reader could read everything into a list on construction, then on each call to read(), pop an element off the list; when the list is empty, return null.
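A minimal sketch of the stateful, list-backed approach described above. Note the `ItemReader` interface here is a simplified stand-in for Spring Batch's `org.springframework.batch.item.ItemReader` (whose read() also declares checked exceptions), and the class name `LineItemReader` is made up for illustration:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for org.springframework.batch.item.ItemReader<T>.
interface ItemReader<T> {
    T read();
}

// Stateful reader: lazily initializes an iterator on the first read(),
// hands out one item per call, and returns null once exhausted --
// which is how Spring Batch knows there is no more input.
class LineItemReader implements ItemReader<String> {
    private final List<String> lines;
    private Iterator<String> iterator;

    LineItemReader(List<String> lines) {
        this.lines = lines;
    }

    @Override
    public String read() {
        if (iterator == null) {
            iterator = lines.iterator();  // first call: open the source
        }
        return iterator.hasNext() ? iterator.next() : null;
    }
}

public class ReaderDemo {
    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        data.add("row1");
        data.add("row2");
        ItemReader<String> reader = new LineItemReader(data);
        System.out.println(reader.read()); // row1
        System.out.println(reader.read()); // row2
        System.out.println(reader.read()); // null -> reader is exhausted
    }
}
```

Once the iterator is drained the reader keeps returning null, matching the "stay in the empty state" behavior described above.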
It all depends really on what type of thing you are reading from, and how you can build a stateful reader out of that - trading off performance, memory usage, etc.
Note that Spring Batch does have a flat file reader (FlatFileItemReader) with lots of bells and whistles on it; you may want to take a look at that.
May 9th, 2011, 01:27 AM
As I understand it, you want a FlatFileItemReader which reads a file where each row has a variable length. I had the same problem (each row of my input file had a different number of delimiter-separated values, and from each line I wanted to get the values at fixed positions irrespective of the row length). If your question is the same, the solution below may help you.
Configure your application context with
<beans:property name="delimiter" value="," />
under the DefaultLineMapper bean. There is no need to specify the bean properties into which the data should be stored, e.g.
<beans:property name="names" value="ID,lastName,firstName,position,debutYear,finalYear" />
With that tag you can get each field value by name, but then the process fails if the row length varies. Instead, in your custom FieldSetMapper you can get the token values by their index positions, irrespective of the length of the row.
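The index-based idea above can be sketched in plain Java as follows. This is not Spring Batch code: in a real job the equivalent logic would live inside your custom FieldSetMapper's mapFieldSet method, reading fields by index from the FieldSet. The class and method names here are made up for illustration:

```java
import java.util.Arrays;

// Sketch of reading the first N delimited fields of a row by index,
// tolerating extra trailing columns and failing fast on short rows --
// the same behavior the original question asked for.
public class IndexedFieldDemo {

    // Returns the first n tokens of a delimited line.
    // Throws if the row has fewer than n columns.
    static String[] firstN(String line, String delimiter, int n) {
        // limit -1 keeps trailing empty fields instead of dropping them
        String[] tokens = line.split(delimiter, -1);
        if (tokens.length < n) {
            throw new IllegalArgumentException(
                "Row has " + tokens.length + " columns, expected at least " + n);
        }
        return Arrays.copyOfRange(tokens, 0, n);
    }

    public static void main(String[] args) {
        // Extra trailing columns are ignored; only the first 3 are kept.
        String[] fields = firstN("1,Smith,John,extra,junk", ",", 3);
        System.out.println(Arrays.toString(fields)); // [1, Smith, John]
    }
}
```

Rows shorter than N throw immediately, which a step can translate into a failure, while the invalid lines can be collected and reported afterwards, as the business requirement in the question demands.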