
Thread: Pre-step and post-step method

  1. #1
    Join Date
    Dec 2006
    Posts
    2

    Default Pre-step and post-step method

    Hello,

    we need to do some initialization at the beginning of a step
    (e.g. mark all records in the target database table with some flag).

    In the same sense, we would need to execute some cleanup at the end of the task (when all lines of the input file have been processed).

    The "pre-step" and "post-step" method should belong to the step's transaction.

    Thank you in advance

    Michal Palicka

  2. #2
    Join Date
    Jun 2005
    Posts
    4,232

    Default

    What do you mean by 'the step's transaction?' Normally there are many transactions per step, unless you write your own Tasklet that returns ExitStatus.FINISHED, or have an extremely large commit interval. If you are writing your own Tasklet then the answer would be obvious, so I must be missing something?

  3. #3

    Default

    You can implement the initialization and cleanup as separate steps. There is no single 'step transaction', so you should get the same result you would get with the proposed "pre-step" and "post-step" hooks.

  4. #4
    Join Date
    Dec 2006
    Posts
    2

    Default

    Thank you for your replies.

    I shall try to explain what I'd like to achieve.

    (1) start tx
    (2) reset a status flag for all records in a DB table (pre-step)
    (3) read a CSV file and insert/update data in the DB table (the flag gets modified for some of the records).
    (4) store log info (post-step)
    (5) commit tx

    The whole file should be read in a single (potentially long) transaction.

    What is the "correct" way to realize this scenario?

    Thank you in advance
    Michal.

  5. #5

    Default

    As Dave indicated, you can implement a Tasklet whose execute body does (2); (3); (4); and then returns ExitStatus.FINISHED.
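
    A minimal sketch of that all-in-one execute body, in plain Java rather than the Spring Batch API. The class name SingleTxImport, the "id,flag" line format, and the in-memory map standing in for the DB table are all assumptions made for illustration; the single transaction is imitated by working on a copy and publishing it only if every line succeeds.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tasklet; plain Java, not the Spring Batch API.
class SingleTxImport {
    // "Committed" table contents: record id -> status flag.
    final Map<String, String> table = new LinkedHashMap<>();
    final List<String> log = new ArrayList<>();

    // Runs (2);(3);(4) against a working copy and publishes it only on success,
    // which imitates one transaction spanning the whole file.
    boolean execute(List<String> csvLines) {
        Map<String, String> work = new LinkedHashMap<>(table);
        List<String> workLog = new ArrayList<>(log);
        try {
            // (2) pre-step: reset the flag on every existing record
            work.replaceAll((id, flag) -> "STALE");
            // (3) insert/update one record per "id,flag" line (assumed format)
            for (String line : csvLines) {
                String[] parts = line.split(",");
                if (parts.length != 2) {
                    throw new IllegalArgumentException("bad line: " + line);
                }
                work.put(parts[0].trim(), parts[1].trim());
            }
            // (4) post-step: store log info
            workLog.add("processed " + csvLines.size() + " lines");
        } catch (RuntimeException e) {
            return false; // rollback: the working copy is simply discarded
        }
        // (5) commit: publish the working copy
        table.clear();
        table.putAll(work);
        log.clear();
        log.addAll(workLog);
        return true;
    }
}
```

    With a real database the working copy would be replaced by one long-running transaction, so a single bad line near the end of the file rolls back everything, including the flag reset.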

    However, with a single transaction you get almost no benefit from using the batch framework. All I can think of is the automatic deletion of the files created in (4) if the transaction rolls back (assuming you actually want to store the logs as files and use FlatFileItemWriter).

    The "correct" way to realize your scenario may very well be "just write a Ruby/Groovy/... script", but that depends mostly on the context of what you are doing.
    Last edited by robert.kasanicky; Jan 31st, 2008 at 04:33 AM. Reason: spelling

  6. #6

    Default

    I would say that you might want to use a multi-stage operation for this -- e.g.

    STEP1
    Reset status flag for all records in a DB table (1 transaction, or M / commitFrequency transactions, where M is the number of records in the table, if you update one record at a time. The second strategy might be better if your table holds a large amount of data or if the DBMS in your environment is prone to blocking.)

    STEP2
    For each line in CSV file, insert / update data in the DB in a holding table (N / commitFrequency transactions where N is the number of lines in file)

    STEP3
    Migrate data from holding table to production table (1 transaction)


    In my experience this avoids a lot of problems, since you are not updating production data at the same time as you are interacting with the error-prone part of your processing (i.e. reading from the file).

    This can easily be represented in Spring Batch if you do it this way.
    a) the transaction strategy used in the recommended "simple" configuration would encapsulate these needs seamlessly
    b) STEP2 can be almost fully created by configuring pre-packaged classes
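
    As a rough illustration of the three steps above, here is a plain-Java sketch that models the tables as in-memory maps and counts "commits". It is not Spring Batch code: the class name StagedImport, the "id,flag" line format, and the commits counter are assumptions made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the three steps; not Spring Batch code.
class StagedImport {
    final Map<String, String> production = new LinkedHashMap<>(); // id -> status flag
    final Map<String, String> holding = new LinkedHashMap<>();    // staging area
    int commits = 0;                                               // physical transactions so far

    // STEP1: reset the flag on every production record in one transaction.
    void resetFlags() {
        production.replaceAll((id, flag) -> "STALE");
        commits++;
    }

    // STEP2: load "id,flag" lines (assumed format) into the holding table,
    // committing once per chunk of commitFrequency lines. A bad line aborts
    // the current chunk only; earlier chunks stay committed in holding.
    void loadIntoHolding(List<String> lines, int commitFrequency) {
        for (int start = 0; start < lines.size(); start += commitFrequency) {
            int end = Math.min(start + commitFrequency, lines.size());
            Map<String, String> chunk = new LinkedHashMap<>();
            for (String line : lines.subList(start, end)) {
                String[] parts = line.split(",");
                if (parts.length != 2) {
                    throw new IllegalArgumentException("bad line: " + line);
                }
                chunk.put(parts[0].trim(), parts[1].trim());
            }
            holding.putAll(chunk); // chunk commit
            commits++;
        }
    }

    // STEP3: move everything from holding to production in one transaction.
    void migrate() {
        production.putAll(holding);
        holding.clear();
        commits++;
    }
}
```

    In this model a file of 5 lines with a commitFrequency of 2 takes 3 chunk commits in STEP2, and a parse error there never touches the production map, because production only changes in STEP1 and STEP3.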



  7. #7

    Default

    Just some further context - the operations you're talking about aren't really clean-up operations - they have real meaning to you, even if it only means "these records haven't been updated yet today." There are mechanisms in place for setup and clean-up (e.g. opening and closing files, connections, etc.) but they wouldn't serve your purposes because they occur outside of the transactions created by Spring Batch. That's why I recommended that, instead of thinking of these tasks as "maintenance" tasks, you consider them to be atomic steps that must occur in a certain order.

    Doing all your processing in a holding table first and then moving the data over in a single step lets you perform one logical transaction over several physical transactions by "committing" everything (i.e. updating production) at the end.
