Thread: BatchUpdate in MySql - Performance

  1. #1
    Join Date
    Apr 2008
    Posts
    174

    BatchUpdate in MySql - Performance

    Hello All,

    I am reading from a file and writing to a database. As I am building a prototype with Spring Batch, I am using the tradeDAO sample that ships with the Spring Batch samples.

    I am using MySQL and have about 50,000 records in my file. Just reading all the data in the file takes about 10-12 minutes, which I believe is long.

    Reading the records and inserting them into the MySQL database takes 40 minutes.

    This is just sample data; eventually we will get millions of records.

    My question: is this the normal performance I can expect from Spring Batch, or is there a fine-tuning mechanism available? If so, how much improvement could I expect?

    Currently the whole batch process on the mainframe completes in 1 hour 30 minutes (including the file read and other business logic processing for millions of records). I am a little concerned about the performance here.

    Any help or input on improving performance would be great.

    Thanks!

  2. #2

    Make sure you set the commit interval property on the step to some reasonable value (it is 1 by default, which means metadata is saved after processing each item).

    Next, you can consider writing to the database using batch updates (see batchUpdateJob in the samples).
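
    To make the effect of the commit interval concrete, here is a minimal sketch in plain Java and JDBC. It is not Spring Batch's own implementation; the class `CommitIntervalSketch`, the `trade` table and its `isin` column are assumptions made up for illustration. It groups items into chunks of `commitInterval`, sends each chunk as one JDBC batch, and commits once per chunk.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class CommitIntervalSketch {

    /** Splits items into consecutive chunks of at most commitInterval items. */
    public static <T> List<List<T>> chunk(List<T> items, int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += commitInterval) {
            int to = Math.min(from + commitInterval, items.size());
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    /** Number of commits needed for itemCount items: one per chunk (rounded up). */
    public static int commitCount(int itemCount, int commitInterval) {
        return (itemCount + commitInterval - 1) / commitInterval;
    }

    /**
     * Inserts values using one JDBC batch and one commit per chunk.
     * The table "trade" and column "isin" are hypothetical.
     */
    public static void insertInChunks(Connection con, List<String> isins, int commitInterval)
            throws SQLException {
        con.setAutoCommit(false); // commit explicitly, once per chunk
        String sql = "INSERT INTO trade (isin) VALUES (?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (List<String> oneChunk : chunk(isins, commitInterval)) {
                for (String isin : oneChunk) {
                    ps.setString(1, isin);
                    ps.addBatch();   // queue the row instead of sending it right away
                }
                ps.executeBatch();   // send the whole chunk to the database
                con.commit();        // one transaction per chunk
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(commitCount(50_000, 1));   // commits with the default interval
        System.out.println(commitCount(50_000, 100)); // commits with an interval of 100
    }
}
```

    With 50,000 items, `commitCount(50_000, 1)` gives 50,000 commits and `commitCount(50_000, 100)` gives 500, which is the kind of reduction a larger commit interval buys.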

  3. #3
    Join Date
    Apr 2008
    Posts
    174

    worked

    Thanks for the input! I set the commitInterval to 100 and it improved the performance dramatically: the run came down from 40 minutes to between 5 and 6 minutes. (This is without BatchUpdate, though.)

    Thanks!!

  4. #4
    Join Date
    Mar 2008
    Posts
    15

    Some performance tests I recorded with Spring Batch 1.0

    Test:
    Transfer data from a table (with 23 columns) in one database to another database. No business logic is applied.

    Repository database: Apache Derby
    Number of rows processed: 110,662

    [Attachment: spring_batch_1.0_tests.jpg]

    The Sybase bcp tool was used to run the same test case from one Sybase database to another. 110,662 rows were transferred in 15.68 seconds.
    Obviously bcp was very fast at transferring data between two Sybase databases. What Spring Batch provides is object interaction; it is not a bulk copy tool. Business processes can be easily integrated with Spring Batch using simple Java objects.

  5. #5
    Join Date
    Dec 2006
    Posts
    1,061

    I would say that's a pretty fair assessment. Spring Batch isn't trying to replace a bulk data-load tool. As you said, it allows you to apply business logic in Java, rather than in some scripting language with an ETL tool, or not at all without one. One caveat I would add is that the status tables provide a nice advantage if you have a lot of batch processes running. They give you a consistent place to look (one table instead of many tables and/or many log files) for the status of a process. Even if you use a simple data-load tool such as bcp or SQL*Loader, you might see an advantage from calling the tool from a Tasklet, so that you can still see when it was kicked off, whether it completed successfully, and so on, in a consistent way across multiple processes.
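
    The idea of wrapping an external loader so that its start time and outcome are recorded in one place can be sketched in plain Java. This is an illustration only, not the Spring Batch Tasklet API: `ExternalToolStep`, `StepStatus` and the `-1` failure code are hypothetical names and conventions, and in Spring Batch the framework's own status tables would play the role of `StepStatus`.

```java
import java.time.Instant;
import java.util.List;
import java.util.concurrent.Callable;

public class ExternalToolStep {

    /** Hypothetical status record, standing in for a row in a shared status table. */
    public static final class StepStatus {
        public final String stepName;
        public final Instant startTime;
        public final Instant endTime;
        public final int exitCode;

        StepStatus(String stepName, Instant startTime, Instant endTime, int exitCode) {
            this.stepName = stepName;
            this.startTime = startTime;
            this.endTime = endTime;
            this.exitCode = exitCode;
        }

        /** By convention here, exit code 0 means success. */
        public boolean succeeded() {
            return exitCode == 0;
        }

        @Override
        public String toString() {
            return stepName + " started=" + startTime + " ended=" + endTime
                    + " exitCode=" + exitCode;
        }
    }

    /** Runs the work and records its start time, end time and exit code. */
    public static StepStatus run(String stepName, Callable<Integer> work) {
        Instant start = Instant.now();
        int exitCode;
        try {
            exitCode = work.call();
        } catch (Exception e) {
            exitCode = -1; // assumption: any exception is recorded as a failure
        }
        return new StepStatus(stepName, start, Instant.now(), exitCode);
    }

    /** Wraps an external command line as a unit of work that returns its exit code. */
    public static Callable<Integer> externalCommand(List<String> command) {
        return () -> new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

    A call might look like `ExternalToolStep.run("nightly-load", ExternalToolStep.externalCommand(Arrays.asList("./load.sh")))`, where `load.sh` is a placeholder script that invokes the bulk loader.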

  6. #6
    Join Date
    Mar 2008
    Posts
    15

    Thanks Lucas,
    I will emphasize these points too in the presentation I will be giving in the next few days.

    Everyone on the Spring Batch team,
    Thanks for the great documentation.
