Grouping consecutive lines of a flat file by key so they can be processed together
I'm looking for a way to handle the following situation without using a database. Say I have a flat file (delimited or fixed-width doesn't matter, though mine is delimited). Each line (record) has a key and the file is sorted by that key, but the key is not unique in the file. In other words, the file could look like this:
key,f1,f2,f3,f4
111,a,b,c,d
111,e,f,c,g
222,a,x,y,z
333,h,i,c,k
333,m,n,o,a
333,a,b,e,k
What I need to do is read the file and "gather up" all the lines with the same key, then process them as a group and write one result per key, for example the number of times a particular value occurs in a particular column. Say the value is 'c' and the column is 'f3'; for the example above, the output would be:
111,2
222,0
333,1
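To make the expected behavior concrete, here is a minimal plain-Java sketch of the grouping and counting I have in mind. It is not Spring Batch code: the class name `KeyGroupCounter` and the method `countPerKey` are made up for illustration, and it assumes the header line has already been skipped, that the input is sorted by key, and that fields are split on plain commas with no quoting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, only meant to illustrate the expected result.
final class KeyGroupCounter {

    /**
     * Walks key-sorted, comma-delimited lines (header already removed) and
     * emits one "key,count" line per key, where count is the number of
     * records in that key's group whose field at columnIndex equals target.
     */
    static List<String> countPerKey(List<String> lines, int columnIndex, String target) {
        List<String> out = new ArrayList<>();
        String currentKey = null;
        int count = 0;
        for (String line : lines) {
            String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
            String key = fields[0];
            // A key change closes the previous group, because the input is sorted.
            if (currentKey != null && !key.equals(currentKey)) {
                out.add(currentKey + "," + count);
                count = 0;
            }
            currentKey = key;
            if (columnIndex < fields.length && fields[columnIndex].equals(target)) {
                count++;
            }
        }
        // Flush the last group, if there was any input at all.
        if (currentKey != null) {
            out.add(currentKey + "," + count);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "111,a,b,c,d", "111,e,f,c,g", "222,a,x,y,z",
            "333,h,i,c,k", "333,m,n,o,a", "333,a,b,e,k");
        // The key is field 0, so column 'f3' is field index 3.
        countPerKey(lines, 3, "c").forEach(System.out::println);
    }
}
```

Because the file is sorted by key, only the current group's running count has to be held in memory, which is the property I would like to keep in whatever reader-based solution I end up with.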
Remember, no database. I already have a db solution. :-)
I looked at using some kind of ChunkProvider, or maybe a RecordSeparatorPolicy, but neither seems quite right. I could write a custom reader, but I was hoping there is a way to leverage the existing FlatFileItemReader and its existing extension points.
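For reference, the custom reader I would rather avoid writing would look roughly like the sketch below. It is deliberately framework-free: `GroupingLineReader` is a hypothetical name, a plain `Iterator<String>` stands in for whatever reader actually supplies the lines, and returning `null` at end of input is just a convention borrowed from `ItemReader.read()`. It also assumes the key is everything before the first comma.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical pull-style reader: each call to read() returns all consecutive
 * lines that share the same key, or null once the input is exhausted.
 */
final class GroupingLineReader {
    private final Iterator<String> delegate; // stand-in for the underlying line reader
    private String lookahead;                // first line of the next group, if already consumed

    GroupingLineReader(Iterator<String> delegate) {
        this.delegate = delegate;
    }

    // Assumption: the key is the text before the first comma.
    private static String keyOf(String line) {
        int comma = line.indexOf(',');
        return comma < 0 ? line : line.substring(0, comma);
    }

    List<String> read() {
        // Start from the held-back line, or pull a fresh one from the delegate.
        String first = lookahead != null
            ? lookahead
            : (delegate.hasNext() ? delegate.next() : null);
        lookahead = null;
        if (first == null) {
            return null; // no more groups
        }
        List<String> group = new ArrayList<>();
        group.add(first);
        String key = keyOf(first);
        while (delegate.hasNext()) {
            String next = delegate.next();
            if (!keyOf(next).equals(key)) {
                lookahead = next; // belongs to the next group, so hold it back
                break;
            }
            group.add(next);
        }
        return group;
    }
}
```

The one-line lookahead is the awkward part: the reader only knows a group has ended after it has already consumed the first line of the next one, which is why I was hoping an existing extension point could handle it for me.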