Thread: Writing to 2 JDBC datasources in one step vs. 2 steps each with 1 datasource

  1. #1

    Guys,

    I'd like some advice on the scenario below to understand the pros and cons of one combined step versus multiple dedicated steps. I'm worried about the cost of incurring distributed transactions in Option 1 versus the cost of re-reading in Option 2, plus any other issues people immediately see with these approaches.

    If there are other worries or options, or my scenario wasn't detailed enough, please let me know.

    Thanks

    Danny

    General Comments:
    • Datasources A, B & C would be separate schemas/connections, and could be separate databases.
    • The record set to process could run to many millions of rows.
    • Spring Batch runs inside a J2EE container, launched by Quartz.

    OPTION 1
    Step 1
    - Read from input datasource A
    - Call an API that accepts a chunk of records and writes to datasource B
    - Call an API that accepts a chunk of records and writes to datasource C
    Assumption: We'd have to combine the two API calls into a single writer implementation.

    OR

    OPTION 2
    Step 1
    - Read from input datasource A
    - Call an API that accepts a chunk of records and writes to datasource B
    Step 2
    - Read from input datasource A
    - Call an API that accepts a chunk of records and writes to datasource C
    Assumption: Step 2 would have to re-read the same records that Step 1 already read.
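
    To make the "single writer" assumption in Option 1 concrete, here is a minimal plain-Java sketch of a writer that hands the same chunk to two delegates in turn. The names ChunkWriter and CombinedWriter are hypothetical stand-ins for the two chunk-accepting APIs, not Spring Batch types (Spring Batch ships a CompositeItemWriter that plays a similar role). Note that the sketch does nothing about atomicity: if the second delegate fails after the first has succeeded, B and C diverge unless both writes share one transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an API that accepts a chunk of records.
interface ChunkWriter<T> {
    void write(List<? extends T> chunk);
}

// Passes each chunk to every delegate, in the order they were supplied.
final class CombinedWriter<T> implements ChunkWriter<T> {
    private final List<ChunkWriter<T>> delegates;

    CombinedWriter(List<ChunkWriter<T>> delegates) {
        // Defensive copy so later changes to the caller's list have no effect.
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    public void write(List<? extends T> chunk) {
        for (ChunkWriter<T> delegate : delegates) {
            // An exception here stops the loop; earlier delegates are not undone.
            delegate.write(chunk);
        }
    }
}
```

    In a real step, the two delegates would wrap the B and C APIs, and the transaction manager around the chunk decides whether both writes commit together.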

  2. #2
    Join Date
    Sep 2008
    Location
    Bangalore, India
    Posts
    9

    Hi Danny,

    Given that two-phase commit is a blocking protocol, I would implement the use case with Option 2.

    As for the extra cost of the re-read, you might want to implement it differently depending on the use case. If the two resulting datasets are symmetric (but why would they be?), I might do it this way:
    1. Read from A --> perform processing --> write to B (Spring Batch)
    2. Copy from B --> C using database features
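
    One way to read "database features" in step 2 is a single INSERT ... SELECT run by the database itself, so the rows never travel back through the batch job. The sketch below is only an illustration under a strong assumption: B and C must be reachable from one connection (schemas in the same database, or a database link). The class name SchemaCopy and the table and column names in the usage note are hypothetical, and the identifiers are concatenated into the SQL, so they must come from trusted configuration, never from user input.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class SchemaCopy {
    // Builds "INSERT INTO target (cols) SELECT cols FROM source".
    static String buildCopySql(String sourceTable, String targetTable, List<String> columns) {
        String cols = String.join(", ", columns);
        return "INSERT INTO " + targetTable + " (" + cols + ") SELECT " + cols
                + " FROM " + sourceTable;
    }

    // Runs the copy on the given connection and returns the number of rows inserted.
    // Commit or rollback is left to the caller (or the surrounding transaction manager).
    static int copy(Connection conn, String sourceTable, String targetTable,
                    List<String> columns) throws SQLException {
        try (Statement st = conn.createStatement()) {
            return st.executeUpdate(buildCopySql(sourceTable, targetTable, columns));
        }
    }
}
```

    For example, SchemaCopy.copy(conn, "schema_b.records", "schema_c.records", List.of("id", "payload")) would copy those two columns in one statement. If B and C live in unrelated databases, this shortcut does not apply and the re-read in Option 2 remains.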

  3. #3
    Join Date
    Jun 2005
    Posts
    4,230

    Spring Batch has its own data in one of those databases (or maybe a different one altogether), so neither of your options avoids the distributed transaction. You probably have to bite the XA bullet. Are you sure it is a problem?

    However, an interesting alternative that I never thought of before is to use a different datasource (and transaction manager) for each step in the job. Might just about work.
