Thread: Parallel processing of XML files

  1. #1

    Hi,

    I am new to the Spring Framework and to Spring Batch as well. I am working on a project where I need to develop a design using Spring Batch that can do the following:

    - Collect XML files from one server onto another server
    - Parse each XML file and load its data into a database

    We have multiple XML files, and the above steps need to be done for each XML file in parallel.

    Can someone point me to a sample program that demonstrates similar functionality? Mainly I am looking for sample code that demonstrates running multiple tasks in parallel threads.

    Thanks in advance for your help.

    Regards,
    Nemish

  2. #2

    Sorry about my original post; I intended it as a reply to another thread. Look here for an example of running chunk processing in parallel. It does not fit your needs exactly, since you want to process files in parallel rather than records. But do you really need this? Imagine that one file is much bigger than the others: the thread that processes it may keep running inefficiently on one CPU after the rest have finished, whereas with parallel chunks all CPUs stay busy (if you set the number of threads equal to the number of CPUs).
    Last edited by dma_k; Sep 8th, 2011 at 10:43 AM.

  3. #3

    Hi dma_k,

    Thanks for your reply. Yes, I really do need to process multiple files in multiple processes, since the system is going to receive multiple files from an upstream application and I see no benefit in processing them sequentially. Do you believe Spring can provide that functionality, or would it be better to develop a custom batch framework using a thread pool and JMS?

    Also, I am not planning to process all files at once. If the system receives ~50 files, I plan to have ~10 threads running, and as soon as any thread is free it will start processing another file. Another reason for this flexibility is that the system will not receive all the files together from the upstream application; they will arrive in random order.

    Regards,
    Nemish
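
The plan described above (about 50 files, about 10 threads, a free thread picking up the next file) can be sketched with a plain `java.util.concurrent` thread pool. This is only a minimal illustration, not Spring Batch code: the class name `ParallelFileLoader`, the method `parseAndLoad`, and the file names are hypothetical placeholders, and the real parsing and database loading are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelFileLoader {

    // Hypothetical placeholder: the real version would parse one XML file
    // and load its data into the database.
    static String parseAndLoad(String fileName) {
        return "loaded " + fileName;
    }

    // Submits every file to a fixed-size pool. Files beyond the pool size wait
    // in the pool's queue, and a worker takes the next one as soon as it is free.
    static List<String> processAll(List<String> fileNames, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (String name : fileNames) {
                pending.add(CompletableFuture.supplyAsync(() -> parseAndLoad(name), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : pending) {
                results.add(f.join()); // waits for that file; rethrows its failure, if any
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 1; i <= 50; i++) {
            files.add("file-" + i + ".xml"); // made-up file names
        }
        System.out.println(processAll(files, 10).size() + " files processed");
    }
}
```

Because the files arrive at different times, a long-running service would probably keep the pool alive and submit each file on arrival instead of collecting a list first; the queueing behaviour stays the same.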

  4. #4

    In this case I would define parallel steps, each referring to the same XMLFileProvider class, which hands out a new XML file on request. Each step should acquire / open / read / close XML files (this breaks the reader API, but is doable) and exit when XMLFileProvider signals that there are no more files. Perhaps somebody can advise a better solution.
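
A rough sketch of the shared-provider idea in plain Java, outside Spring Batch: only the name `XMLFileProvider` comes from the post above, while `nextFile`, `ParallelSteps`, the use of raw threads for the "steps", and the choice of `null` as the "no more files" signal are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;

class ParallelSteps {

    // Shared by all steps; hands each file to exactly one caller.
    static class XMLFileProvider {
        private final Queue<String> files;

        XMLFileProvider(Collection<String> fileNames) {
            this.files = new ConcurrentLinkedQueue<>(fileNames);
        }

        // Returns the next file, or null to signal that there are no more files.
        String nextFile() {
            return files.poll();
        }
    }

    // Starts stepCount workers that keep acquiring files until the provider is empty.
    static List<String> run(XMLFileProvider provider, int stepCount) {
        List<String> processed = new CopyOnWriteArrayList<>();
        Thread[] steps = new Thread[stepCount];
        for (int i = 0; i < stepCount; i++) {
            steps[i] = new Thread(() -> {
                String file;
                while ((file = provider.nextFile()) != null) {
                    // The real open / read / close of the XML file would go here.
                    processed.add(file);
                }
            });
            steps[i].start();
        }
        for (Thread step : steps) {
            try {
                step.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for steps", e);
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        XMLFileProvider provider =
                new XMLFileProvider(Arrays.asList("a.xml", "b.xml", "c.xml", "d.xml"));
        System.out.println(run(provider, 2).size() + " files processed");
    }
}
```

The concurrent queue is what guarantees that no file is handed to two steps at once; the order in which files finish is not deterministic.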

  5. #5

    You can also look at partitioning. If you need to process 10 files, you could create 10 partitions.

    http://static.springsource.org/sprin...l#partitioning
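
To make the "one partition per file" idea concrete, here is a small stand-alone sketch in plain Java. It only mirrors the concept and does not use the Spring Batch partitioning classes: `FilePartitioning`, `partition`, `runAll`, the `partitionN` names, and the `fileName` key are all illustrative assumptions, and each partition's context is modelled as a simple map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FilePartitioning {

    // Creates one partition per file; each partition carries its own small context.
    static Map<String, Map<String, String>> partition(List<String> fileNames) {
        Map<String, Map<String, String>> partitions = new LinkedHashMap<>();
        int index = 0;
        for (String name : fileNames) {
            Map<String, String> context = new LinkedHashMap<>();
            context.put("fileName", name);
            partitions.put("partition" + index++, context);
        }
        return partitions;
    }

    // Runs the same worker logic once per partition, at most gridSize at a time.
    static Map<String, String> runAll(Map<String, Map<String, String>> partitions, int gridSize) {
        ExecutorService pool = Executors.newFixedThreadPool(gridSize);
        try {
            Map<String, CompletableFuture<String>> running = new LinkedHashMap<>();
            partitions.forEach((name, context) -> running.put(name,
                    // Placeholder work: the real step would parse and load the file.
                    CompletableFuture.supplyAsync(() -> "loaded " + context.get("fileName"), pool)));
            Map<String, String> statuses = new LinkedHashMap<>();
            running.forEach((name, future) -> statuses.put(name, future.join()));
            return statuses;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> partitions =
                partition(Arrays.asList("a.xml", "b.xml", "c.xml"));
        runAll(partitions, 3).forEach((name, status) -> System.out.println(name + ": " + status));
    }
}
```

In Spring Batch itself, the partition contexts would be handed to a worker step by the framework rather than to a hand-written pool; see the partitioning documentation linked above for the actual configuration.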
