
Thread: Job performing poorly

  1. #1

    Job performing poorly

    Hi,

    We are trying to build a Spring Batch process that uses partitioning and JdbcDaoSupport for our database update and insert operations. However, it performs badly compared with the existing legacy batch process written in COBOL. Even a standalone plain Java implementation with the same business logic using straight JDBC performs several times better than the Spring Batch implementation on the same data files, which total half a million rows. The process reads a control file that specifies the fixed-length data files to be processed. Basically, it reads the data, does minor validation, performs a couple of inserts and updates on DB2 database tables, and writes to output files. We also tried OpenJPA instead of Spring's JdbcDaoSupport. The result was similar.
    The plain Java implementation takes about 7 seconds, whereas Spring Batch with JdbcDaoSupport takes more than 30 seconds.

    Is there anything we are doing wrong? Any suggestion would be highly appreciated.

    Here are some of the configuration files:

    iffr-context.xml:
    Code:
    ...
    <context:annotation-config/>
    
        <import resource="classpath:/job-launcher-context.xml" />
        <import resource="classpath:/iffr-job-context.xml" />
        <import resource="classpath:/iffr-jdbc-context.xml" />
    ...
    iffr-job-context.xml:
    Code:
    <job id="iffrJob" xmlns="http://www.springframework.org/schema/batch">
        <step id="step">
          <partition step="validation" partitioner="partitioner">
            <handler grid-size="2" task-executor="taskExecutor" />
          </partition>
        </step>
        <listeners>
          <listener ref="jobListener"/>
        </listeners>
      </job>
    
      <bean id="partitioner" scope="step" class="org.myproject.batch.item.ControlFilePartitioner">
        <property name="resource" value="#{jobParameters['inputResource']}" />
        <property name="inputDir" value="#{jobParameters['inputDir']}" />
      </bean>
    
      <bean id="taskExecutor" class="org.springframework.core.task.SyncTaskExecutor" />
    
      <bean id="jobListener" class="org.myproject.batch.item.IFFRJobExecutionListener" />
    
      <step id="validation" xmlns="http://www.springframework.org/schema/batch">
        <tasklet>
          <chunk reader="flatFileItemReader" processor="awrProcessor" writer="awrWriter" commit-interval="10"/>
        </tasklet>
      </step>
    
      <!-- Readers -->
      <bean id="flatFileItemReader" scope="step" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="#{stepExecutionContext[fileName]}" />
        <property name="lineMapper" ref="flatFileLineMapper"/>
        <property name="recordSeparatorPolicy">
          <bean class="org.myproject.batch.item.BlankLineSeparatorPolicy"/>
        </property>
      </bean>
    
      <bean id="flatFileLineMapper"
            class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
        <property name="tokenizers">
          <map>
            <entry key="R*" value-ref="tokenizerBase" />
          </map>
        </property>
        <property name="fieldSetMappers">
          <map>
            <entry key="R*" value-ref="fieldSetMapperParent" />
          </map>
        </property>
      </bean>
    
      <bean id="tokenizerBase" class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
        <property name="names" value="RecordType,Data" />
        <property name="columns" value="1-2,1-512" />
      </bean>
    
      <bean id="fieldSetMapperParent" class="org.myproject.batch.item.EFW2FieldSetMapper" />
    
      <!--  Writers  -->
      <bean id="awrWriter" scope="step" class="org.myproject.batch.item.AWRWriter" >
        <property name="transDir" value="#{jobParameters['transDir']}" />
        <property name="invalDir" value="#{jobParameters['invalDir']}" />
        <property name="resource" value="#{stepExecutionContext[fileName]}" />
        <!-- property name="awrRepository" ref="persistence layer is referenced here" / -->
      </bean>
    
      <!--  Processors -->
      <bean id="awrProcessor" scope="step" class="org.myproject.batch.item.AWRItemProcessor">
        <property name="name" value="AWRItemProcessor" />
      </bean>
    
    </beans>
    iffr-jdbc-context.xml:
    Code:
    <!-- Database configuration would go here -->
    	<bean id="transactionManager"
    		  class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
    		<property name="dataSource" ref="dataSource" />
    	</bean>
    
    	<bean id="persistWageFileService" class="org.myproject.service.impl.PersistWageFileServiceJDBCImpl" >
    		<property name="errorLogDAO" ref="errorLogDAO" />
    		<property name="errorSummaryDAO" ref="errorSummaryDAO" />
    		<property name="submissionErrorDAO" ref="submissionErrorDAO" />
    		<property name="submissionDAO" ref="submissionDAO" />
    		<property name="wageFileInfoDAO" ref="wageFileInfoDAO" />
    		<property name="submissionStatusDAO" ref="submissionStatusDAO" />
    	</bean>
    	
    	<bean id="wageFileInfoDAO" class="org.myproject.dao.impl.JDBCWageFileInfoDAOImpl">
    		<property name="dataSource" ref="dataSource" />
    		<property name="messageSource" ref="messageSource" />
    	</bean>
    	
    	<bean id="submissionDAO" class="org.myproject.dao.impl.JDBCSubmissionDAOImpl">
    		<property name="dataSource" ref="dataSource" />
    		<property name="messageSource" ref="messageSource" />
    	</bean>
    	
        <bean id="submissionStatusDAO" class="org.myproject.dao.impl.JDBCSubmissionStatusDAOImpl">
            <property name="dataSource" ref="dataSource" />
            <property name="messageSource" ref="messageSource" />
        </bean>
        
    	<bean id="submissionErrorDAO" class="org.myproject.dao.impl.JDBCSubmissionErrorDAOImpl">
    		<property name="dataSource" ref="dataSource" />
    		<property name="messageSource" ref="messageSource" />
    	</bean>
    	
    	<bean id="errorSummaryDAO" class="org.myproject.dao.impl.JDBCErrorSummaryDAOImpl">
    		<property name="dataSource" ref="dataSource" />
    		<property name="messageSource" ref="messageSource" />
    	</bean>
    	
        <bean id="errorLogDAO" class="org.myproject.dao.impl.JDBCErrorLogDAOImpl">
            <property name="dataSource" ref="dataSource" />
            <property name="messageSource" ref="messageSource" />
        </bean>
    
    	<bean id="messageSource"
    		class="org.springframework.context.support.ResourceBundleMessageSource">
    		<property name="basename">
    			<value>sql</value>
    		</property>
    	</bean>
    
    </beans>
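
A side note on the `commit-interval="10"` setting in the `validation` step above: in a chunk-oriented step, each chunk is written and committed in its own transaction, so the interval determines how many commits the job performs. The following is only a back-of-the-envelope sketch in Java. `ChunkMath` and `chunkCount` are hypothetical names, and the row count used in the note below is the half million quoted in the post. Whether this explains the gap against the plain JDBC version is not established anywhere in the thread.

```java
/** Hypothetical helper: estimates how many chunk commits a step performs. */
final class ChunkMath {

    private ChunkMath() {
        // static helper only
    }

    /**
     * Returns the number of chunks (and therefore commits) needed to process
     * totalItems when each chunk holds at most commitInterval items.
     */
    static long chunkCount(long totalItems, int commitInterval) {
        if (totalItems < 0 || commitInterval <= 0) {
            throw new IllegalArgumentException(
                    "totalItems must be >= 0 and commitInterval must be > 0");
        }
        // Ceiling division: a final partial chunk still costs one commit.
        return (totalItems + commitInterval - 1) / commitInterval;
    }
}
```

For 500,000 rows, `ChunkMath.chunkCount(500_000, 10)` gives 50,000 commits, while an interval of 1,000 gives 500. Because the rows are split across five files, the real count can be slightly higher, since each file may end with a partial chunk.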

  2. #2

    So far identified the FlatFileItemReader is not performing well in reading large files (5 files of different size containing a total of 1/2 million records). FlatFileItemReader is using the default BufferedReader for reading as we didn't want to extend our custom class with the existing FlatFileItemReader. As a test we tried with our barebone custom File Item Writer using BufferedReader with configurable buffer size which sped up the process a little bit faster trying out different buffer sizes. I think the FlatFileItemWriter is using NIOs FileChannel for faster writing.
    Question, what would be the best approach to solve our problem of slow reading of large data files?

    Thanks

  3. #3

    Sorry for the typos in the earlier post. Reposting:

    So far we have identified that FlatFileItemReader does not perform well when reading large files (5 files of different sizes containing a total of half a million records). FlatFileItemReader uses the default BufferedReader for reading. Since we didn't want our custom class to extend the existing FlatFileItemReader, we wrote our own bare-bones custom file item reader using a BufferedReader with a configurable buffer size, which sped up the process a little as we tried different buffer sizes. I think FlatFileItemWriter uses NIO's FileChannel for faster writing.
    Question: what would be the best approach to solve our problem of slow reading of large data files?

    Thanks
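
The custom reader described in this post is not shown in the thread. Purely as an illustration, a bare-bones line reader with a configurable buffer size might look like the sketch below. `BufferedLineSource` and `forEachLine` are hypothetical names, and the ISO-8859-1 charset is an assumption (a single-byte encoding is common for fixed-length records, but the thread does not say which one is used).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

/** Hypothetical sketch of a plain file reader with a tunable buffer size. */
final class BufferedLineSource {

    private BufferedLineSource() {
        // static helper only
    }

    /**
     * Reads the file line by line through a BufferedReader of the given
     * buffer size (in chars), passes each line to the handler, and returns
     * the number of lines read.
     */
    static long forEachLine(Path file, int bufferSize, Consumer<String> handler)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        long count = 0;
        // Assumption: records are encoded in ISO-8859-1.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file),
                        StandardCharsets.ISO_8859_1),
                bufferSize)) {
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(line);
                count++;
            }
        }
        return count;
    }
}
```

Calling `BufferedLineSource.forEachLine(path, 64 * 1024, line -> { ... })` with a few different buffer sizes on the same file would be one way to reproduce the comparison described above.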

  4. #4

    If you have a raw homemade implementation that takes 7 seconds and your job takes 30 seconds, it would help to identify the differences between the two. I can see you are processing 2 files in parallel. Are you doing the same thing with your simple example?

  5. #5

    The business logic code is almost identical. Both process the same 5 files with a total of half a million records. The homemade one is single-threaded, while the Spring Batch one uses partitioning for parallel processing of the 5 files.

    Thanks
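
One detail worth checking against the configuration in post #1: the partition handler there is wired to a `SyncTaskExecutor`, which runs each task synchronously in the calling thread, so the partitions would presumably execute one after another rather than in parallel. For a like-for-like comparison with the homemade program, a per-file parallel run in plain Java could be sketched as follows. `ParallelFileRunner` and `countLines` are hypothetical names, line counting stands in for the real business logic, and the ISO-8859-1 charset is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: one task per input file on a fixed-size thread pool. */
final class ParallelFileRunner {

    private ParallelFileRunner() {
        // static helper only
    }

    /** Processes every file on its own pool thread and returns line counts per file. */
    static Map<Path, Long> countLines(List<Path> files, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all files first so they can run concurrently.
            Map<Path, Future<Long>> pending = new LinkedHashMap<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLinesIn(file);
                pending.put(file, pool.submit(task));
            }
            // Then wait for each result, preserving the input order.
            Map<Path, Long> result = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<Long>> entry : pending.entrySet()) {
                result.put(entry.getKey(), entry.getValue().get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    // Placeholder for the real per-file work (read, validate, write).
    private static long countLinesIn(Path file) throws IOException {
        // Assumption: records are encoded in ISO-8859-1.
        try (BufferedReader reader =
                Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }
}
```

Timing this with `threads` set to 1 and then to the number of files gives a rough idea of how much parallelism helps on a given machine before any framework is involved.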

  6. #6

    Can you remove the partitioning to see if that makes any difference? It could actually be an I/O burst on your system.

  7. #7

    I'm planning to do just that. Eventually, I guess I will need some sort of multi-threaded batch process in production to handle 30 million records, with a goal of 100 million records.

    Thanks

  8. #8

    By the way, do you have experience handling such large volumes of records? If so, what approach did you take?

    Thanks

  9. #9

    It probably won't fit with what you are doing, but this might help:
    http://forum.springsource.org/showth...FileItemReader

  10. #10

    I read your thread. Just curious: what did you eventually end up doing? Single- or multi-threaded? It looks like there isn't much difference. Did you see any other performance differences, such as CPU usage?

    Thanks
