Writing to 2 JDBC datasources in one step vs. 2 steps each with 1 datasource
Guys,
I'd like some advice on the scenario below to understand the pros and cons of one combined step versus multiple dedicated steps. I'm worried about the cost of incurring distributed transactions in Option 1, versus the cost of re-reading in Option 2 - and any other issues people immediately see with these approaches.
If there are other worries/options, or my scenario wasn't detailed enough, please let me know.
Thanks
Danny
General Comments:
- Datasources A, B & C would be separate schemas/connections, and could be separate databases.
- The record set to process could be in the many millions
- Spring Batch running inside a J2EE container, spawned by Quartz
OPTION 1
Step 1
- Read from input datasource A
- Call API that accepts a chunk of records and writes to datasource B
- Call API that accepts a chunk of records and writes to datasource C
Assumption: We'd have to combine the two api calls into a single writer implementation.
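As a rough illustration of what a combined writer for Option 1 might look like, here is a plain-Java sketch. It is not Spring Batch code: CombinedChunkWriter and the in-memory lists standing in for datasources B and C are hypothetical names made up for this example, and the two APIs are modelled as simple Consumer callbacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class CombinedWriterSketch {

    /** Hypothetical writer that hands each chunk to every delegate API, in order. */
    static class CombinedChunkWriter<T> {
        private final List<Consumer<List<T>>> delegates;

        CombinedChunkWriter(List<Consumer<List<T>>> delegates) {
            this.delegates = delegates;
        }

        void write(List<T> chunk) {
            // Both API calls happen for the same chunk. In a real job they would
            // sit inside one chunk transaction, which is where the distributed
            // (XA/JTA) transaction concern comes from. No transaction is modelled here.
            for (Consumer<List<T>> delegate : delegates) {
                delegate.accept(chunk);
            }
        }
    }

    public static void main(String[] args) {
        List<String> datasourceB = new ArrayList<>(); // stand-in for datasource B
        List<String> datasourceC = new ArrayList<>(); // stand-in for datasource C

        List<Consumer<List<String>>> apis = new ArrayList<>();
        apis.add(datasourceB::addAll); // stand-in for the API writing to B
        apis.add(datasourceC::addAll); // stand-in for the API writing to C

        CombinedChunkWriter<String> writer = new CombinedChunkWriter<>(apis);
        writer.write(List.of("r1", "r2")); // first chunk read from A
        writer.write(List.of("r3"));       // second chunk read from A

        System.out.println(datasourceB); // [r1, r2, r3]
        System.out.println(datasourceC); // [r1, r2, r3]
    }
}
```

Each record is read once and written twice. Spring Batch also ships a CompositeItemWriter that delegates to a list of writers in a similar way, which may avoid hand-writing the combined class.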
OR
OPTION 2
Step 1
- Read from input datasource A
- Call API that accepts a chunk of records and writes to datasource B
Step 2
- Read from input datasource A
- Call API that accepts a chunk of records and writes to datasource C
Assumption: We would have to re-read the same records for Step 2 that we did for Step 1.
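To make the re-read cost in Option 2 concrete, here is a similar plain-Java sketch. Again this is not Spring Batch code: runStep, the chunk size of 2, and the in-memory lists are hypothetical stand-ins. It assumes each step only needs a transaction on its single write target.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class TwoStepSketch {

    /**
     * Hypothetical step: reads the source in chunks and passes each chunk
     * to a single API. Returns the number of records read from the source.
     */
    static int runStep(List<String> source, int chunkSize, Consumer<List<String>> api) {
        int recordsRead = 0;
        for (int start = 0; start < source.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, source.size());
            List<String> chunk = new ArrayList<>(source.subList(start, end));
            recordsRead += chunk.size();
            // Only one write target per step, so (by assumption) a local
            // transaction on that target is enough. Not modelled here.
            api.accept(chunk);
        }
        return recordsRead;
    }

    public static void main(String[] args) {
        List<String> datasourceA = List.of("r1", "r2", "r3"); // stand-in for input A
        List<String> datasourceB = new ArrayList<>();          // stand-in for B
        List<String> datasourceC = new ArrayList<>();          // stand-in for C

        int readInStep1 = runStep(datasourceA, 2, datasourceB::addAll); // Step 1: A -> B
        int readInStep2 = runStep(datasourceA, 2, datasourceC::addAll); // Step 2: A -> C

        // Every record in A is read twice, once per step.
        System.out.println(readInStep1 + readInStep2); // 6
    }
}
```

With many millions of input records, the total read volume from A doubles compared with Option 1, in exchange for each step touching only one write target.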