Thread: Help needed to improve Job performance

  1. #1
    Join Date
    Jul 2010
    Location
    USA
    Posts
    43

    Hi,

    I have developed a batch job with a chunk-oriented step that does file-to-file processing. The step runs on multiple threads using SimpleAsyncTaskExecutor. It does not use a database to save state; I am using MapJobRepositoryFactoryBean with ResourcelessTransactionManager.

    Processing 7 million records takes 9 minutes with throttle-limit=10 and commit-interval=1000.
    Is there a way to improve the performance further?

    I am stuck on this issue.
    Any suggestions are highly appreciated. Thanks.
    Last edited by anish555; Feb 15th, 2011 at 02:41 PM. Reason: Added more info

  2. #2
    Join Date
    Jun 2005
    Posts
    4,232

    How big is the file in MB? What platform / OS are you using?

    Does it run quicker single-threaded?

  3. #3
    Join Date
    Jul 2010
    Location
    USA
    Posts
    43

    Hi Dave, thanks for the reply.

    How big is the file in MB? What platform / OS are you using?
    The input is a 1 GB binary zip file. The 99 output files together total 2.5 GB of formatted text.
    On Windows the job takes 9 minutes to run; on Unix, 8 minutes.

    Does it run quicker single-threaded?
    No, it takes twice as long single-threaded.

    I changed to ThreadPoolTaskExecutor, but the run time is still the same.

  4. #4
    Join Date
    Jun 2005
    Posts
    4,232

    Are you using a high-end multi-core machine with a fast disk or just a crummy laptop? Did you try it single threaded? How long does it take to unzip the input, as opposed to the processing?
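
    One way to separate the unzip cost from the processing cost is to time a pass over the archive that only decompresses and discards the bytes. The following is a minimal sketch using only java.util.zip, not the poster's custom reader. The class name ZipReadTimer and the method drain are hypothetical, and the archive path is taken from the command line.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipInputStream;

// Hypothetical helper: measures raw decompression time with no record processing.
final class ZipReadTimer {

    /** Streams every entry of the zip and returns the number of uncompressed bytes read. */
    static long drain(Path zip) throws IOException {
        long total = 0;
        byte[] buf = new byte[64 * 1024];
        try (InputStream raw = new BufferedInputStream(Files.newInputStream(zip));
             ZipInputStream in = new ZipInputStream(raw)) {
            // getNextEntry() returns null once the archive is exhausted.
            while (in.getNextEntry() != null) {
                int n;
                // read() returns -1 at the end of the current entry.
                while ((n = in.read(buf)) != -1) {
                    total += n;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ZipReadTimer <archive.zip>");
            return;
        }
        long start = System.nanoTime();
        long bytes = drain(Paths.get(args[0]));
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d uncompressed bytes in %.1f s%n", bytes, seconds);
    }
}
```

    If this pass alone takes close to the 2 minutes reported for reading, the remaining time is going to processing and writing rather than to decompression.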

  5. #5
    Join Date
    Jul 2010
    Location
    USA
    Posts
    43

    Yes, I am using a high-end disk.

    I have tried single-threaded; it takes almost twice as long as multi-threaded.

    I read directly from the zip file without unzipping it first, using a custom reader.
    Reading alone takes about 2 minutes for 7.5 million binary records, including converting them to text strings.

    Do you think writing to multiple files using multiple threads is cost-effective?

  6. #6
    Join Date
    Jun 2005
    Posts
    4,232

    2 minutes to read 2.5GB of data sounds not unreasonable, but I've seen a commodity laptop read and write 2.5GB in that time. You should look at your IO stats and see if there's anything about your hardware / environment that can be tuned.

    No, I don't think a straight-through file copy job will benefit much from multi-threading. (The fact that you got even a factor of two improvement is encouraging, but not all that exciting.)
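
    To compare the job against measured IO stats, it helps to turn the figures quoted in this thread into rates. The following is a rough sketch of that arithmetic only. It assumes 1 GB = 1000 MB, and the class name ThroughputEstimate and method perSecond are hypothetical.

```java
// Hypothetical helper: converts the thread's totals into per-second rates.
final class ThroughputEstimate {

    /** Returns amount per second, given a total amount and a duration in minutes. */
    static double perSecond(double amount, double minutes) {
        return amount / (minutes * 60.0);
    }

    public static void main(String[] args) {
        // Figures quoted in the thread: 7 million records, 9 minutes overall,
        // 2.5 GB of text output, 1 GB zipped input, about 2 minutes of reading.
        System.out.printf("records/s over the whole job: %.0f%n", perSecond(7_000_000, 9));
        System.out.printf("output MB/s over the whole job: %.1f%n", perSecond(2500, 9));
        System.out.printf("zipped input MB/s during the read: %.1f%n", perSecond(1000, 2));
        System.out.printf("uncompressed MB/s during the read: %.1f%n", perSecond(2500, 2));
    }
}
```

    On these numbers the job sustains roughly 13,000 records per second and under 5 MB/s of output, which can be compared with what the disk reports under load.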
