Thread: selection dependent multi file reader

  1. #1
    Join Date
    Dec 2005
    Posts
    5

    selection dependent multi file reader

    Hi,
    I'm new to Spring Batch and haven't been able to figure out how to handle this situation.
    A database SELECT returns a list of our merchants that have updated their catalog files.
    For each merchant id there is a similarly named directory that contains one or more flat CSV files that need to be processed.
    At first sight this seems to require one job that reads the selected merchant ids and then, for each of them, spawns a new job to handle the files using something like a MultiResourceItemReader. Is this a valid way to handle the problem?
    I haven't yet figured out how to spawn a secondary job correctly, but maybe there is a simpler way to do this?

    Any advice is welcome.
    Gualo

  2. #2
    Join Date
    Jun 2005
    Posts
    4,231

    You could put your merchant SELECT in a StepExecutionSplitter. There's a sample that is close (partitionFileJob).
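
    To make the partitioning idea concrete, here is a minimal sketch in plain Java of one partition per merchant id. It does not use Spring Batch types: in Spring Batch the `Partitioner` interface returns a `Map<String, ExecutionContext>` from `partition(int gridSize)`, and the plain maps below only stand in for those contexts. The class name, the keys `merchantId` and `filePattern`, and the base directory are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MerchantPartitions {

    /**
     * Builds one partition per merchant id. Each partition carries the
     * attributes a worker step would need to locate that merchant's files.
     * The key names are hypothetical, not Spring Batch conventions.
     */
    public static Map<String, Map<String, String>> partition(List<String> merchantIds, String baseDir) {
        Map<String, Map<String, String>> partitions = new LinkedHashMap<>();
        for (String id : merchantIds) {
            Map<String, String> context = new LinkedHashMap<>();
            context.put("merchantId", id);
            // Wildcard specification for the merchant's CSV files
            context.put("filePattern", "file:" + baseDir + "/" + id + "/*.csv");
            partitions.put("merchant-" + id, context);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // In a real job the ids would come from the merchant SELECT
        partition(List.of("1001", "1002"), "/data/catalogs")
                .forEach((name, ctx) -> System.out.println(name + " -> " + ctx));
    }
}
```

    The merchant SELECT would feed `merchantIds`, and each worker step execution would read its own `filePattern` from its context.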

  3. #3
    Join Date
    Dec 2005
    Posts
    5

    Thanks for your help.
    From the documentation and sample code I understood that StepExecutionSplitter is used to partition a data set into chunks so they can be executed remotely or in parallel, mainly for optimization purposes.
    In my case I have two distinct data sets: one is the list of merchants, and the second, derived from it, is the list of catalog file names to process for each merchant.
    Unless I'm missing something obvious, I don't see how I could use the splitter logic.

    Thanks again

  4. #4
    Join Date
    Jun 2005
    Posts
    4,231

    Seems like your merchant id would be an input to the step that processes the related catalogs. Did I miss something?

  5. #5
    Join Date
    Dec 2005
    Posts
    5

    That's exactly it.
    First I determine which merchant catalogs need to be processed, then each selected merchant id should be passed in turn to the second step, which processes the catalog files themselves. For the second step I can use a MultiResourceItemReader [derived tasklet] with a wildcard file specification based on the merchant's id.
    My problem is how to trigger the repeated executions of the second step from the first one, once for each merchant id.
    I'm looking into two alternatives:
    The first one is a repeat policy derived from a CompletionPolicy, as described in this article: http://angelborroy.wordpress.com/200...eat-a-tasklet/. In my case the RepeatPolicy object would iterate over the selected merchants, returning each merchant id as the current resource until all selected merchants have been processed.

    The second one is to develop a tasklet similar to the MultiResourceItemReader, but working on the merchant selection to generate the list of catalog files (resources) on the fly and process them. This is less "elegant" because the two operations are joined in a single step.
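
    As a rough illustration of the second alternative, the sketch below expands a list of selected merchant ids into the CSV files found in each merchant's directory, using only the JDK. The class and method names are hypothetical, and the layout (one directory per merchant id under a common base directory, files ending in `.csv`) is an assumption based on the description above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CatalogFileLister {

    /** Returns the CSV files of every selected merchant, merchant by merchant. */
    public static List<Path> catalogFiles(Path baseDir, List<String> merchantIds) throws IOException {
        List<Path> result = new ArrayList<>();
        for (String id : merchantIds) {
            Path merchantDir = baseDir.resolve(id);
            if (!Files.isDirectory(merchantDir)) {
                continue; // merchant was selected but has no directory
            }
            List<Path> files = new ArrayList<>();
            // The glob is matched against file names inside the merchant directory
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(merchantDir, "*.csv")) {
                for (Path file : stream) {
                    files.add(file);
                }
            }
            Collections.sort(files); // directory order is unspecified, so sort for a stable order
            result.addAll(files);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java CatalogFileLister <baseDir> <merchantId>...
        Path baseDir = Paths.get(args[0]);
        List<String> ids = List.of(args).subList(1, args.length);
        catalogFiles(baseDir, ids).forEach(System.out::println);
    }
}
```

    The resulting list is the kind of input a multi-file reader would then iterate over.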

    Are these valid ways to do this, or am I still missing something?

    Thanks in advance for your help

  6. #6
    Join Date
    Dec 2005
    Posts
    5

    I finally found a [hopefully] elegant and modular way to do it by using a SystemCommandTasklet to launch a separate job process for each selected merchant.
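
    For readers wondering what such a per-merchant command might look like, here is a minimal sketch that builds one command line per merchant id. It assumes the child job is started through Spring Batch's `CommandLineJobRunner`, which takes a job configuration path, a job name and `key=value` parameters. The classpath, the file name `catalog-job.xml`, the job name `catalogJob` and the parameter name `merchantId` are hypothetical and not taken from the actual setup.

```java
import java.util.ArrayList;
import java.util.List;

public class MerchantJobCommands {

    private static final String RUNNER =
            "org.springframework.batch.core.launch.support.CommandLineJobRunner";

    /** Builds the argument list that starts the child job for one merchant. */
    public static List<String> commandFor(String classpath, String jobPath,
                                          String jobName, String merchantId) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classpath);
        command.add(RUNNER);
        command.add(jobPath);                     // job configuration file
        command.add(jobName);                     // job to run
        command.add("merchantId=" + merchantId);  // job parameter (hypothetical name)
        return command;
    }

    public static void main(String[] args) {
        for (String id : List.of("1001", "1002")) {
            // A single string form, e.g. for a tasklet configured with one command line
            System.out.println(String.join(" ",
                    commandFor("lib/*", "catalog-job.xml", "catalogJob", id)));
        }
    }
}
```

    Passing the merchant id as a job parameter keeps each child run a distinct job instance, which is what makes the approach modular.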

    Thanks again for your help

  7. #7

    Hi,

    There is a JobStep in version 2.1.0-M3 (not released yet).

    http://jira.springframework.org/browse/BATCH-1443

  8. #8
    Join Date
    Dec 2005
    Posts
    5

    Hi,
    Thanks a lot, that's very good news indeed, as it addresses my issue exactly.
    I'll stick with a simple job-spawning system for now, but I'm eagerly waiting for version 2.1.0 to be released.

  9. #9
    Join Date
    Jan 2010
    Posts
    1

    I have a similar problem to solve using Spring Batch.

    Step1
    -----

    Database select (returns a list of ids).

    For each id returned by step 1, step 2 and step 3 have to be executed.


    Step2
    -----
    Set of database selects.

    The id value from step 1 has to be passed to the select statements.

    Step3
    -----
    The rows from the step 2 database selects have to be written to a file.
    The file should be named with the id value and the current timestamp.

    Any suggestions on how to solve this?
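
    For the file naming part of step 3, a minimal sketch of combining an id with a timestamp is shown below. The class name, the `yyyyMMdd-HHmmss` pattern, the underscore separator and the `.csv` extension are all assumptions, since the post only asks for the id value and the current timestamp.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class OutputFileNames {

    // Timestamp layout is an assumption; any sortable pattern would do
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    /** Builds "<id>_<timestamp>.csv" for the given id and point in time. */
    public static String fileNameFor(String id, LocalDateTime timestamp) {
        return id + "_" + STAMP.format(timestamp) + ".csv";
    }

    public static void main(String[] args) {
        // One output file name per id returned by step 1
        for (String id : new String[] {"17", "42"}) {
            System.out.println(fileNameFor(id, LocalDateTime.now()));
        }
    }
}
```

    Taking the timestamp as a parameter instead of calling `LocalDateTime.now()` inside the method keeps the naming rule easy to test.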
