Thread: Grid Size in Spring batch partition Handler

  1. #1

    Grid Size in Spring batch partition Handler

    Hi,

    I have a batch job which reads data from bulk files, processes it, and inserts it into the DB.

    I'm using Spring Batch's partitioning feature to read and process the files with the default partition handler.

    Code:
     <bean class="org.spr...TaskExecutorPartitionHandler">
              <property name="taskExecutor" ref="taskExecutor"/>
              <property name="step" ref="readFromFile" />
              <property name="gridSize" value="10" />
        </bean>
    What is the significance of gridSize here, and what is the optimum size? I have configured it to be equal to the concurrency of the taskExecutor.

  2. #2
    Join Date
    May 2011
    Location
    New Delhi, India
    Posts
    157

    Grid size determines the number of partitions that will be created by Spring Batch. The right size depends on the type of processing you are doing in the job, the SLAs on processing time for the job, the hardware/resources available for the job, etc.
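
    To make the relationship between gridSize and the executor concrete, here is a minimal sketch that is not taken from the poster's project. It assumes a ThreadPoolTaskExecutor with a hypothetical pool size of 4 and reuses the readFromFile step name from post #1. gridSize is handed to the Partitioner as the requested number of partitions, while the task executor decides how many of those partitions run at the same time.

    ```xml
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.springframework.org/schema/beans
               http://www.springframework.org/schema/beans/spring-beans.xsd">

        <!-- Hypothetical pool: at most 4 partitions execute concurrently -->
        <bean id="taskExecutor"
              class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
            <property name="corePoolSize" value="4"/>
            <property name="maxPoolSize" value="4"/>
        </bean>

        <!-- gridSize is the number of partitions requested from the Partitioner;
             "readFromFile" is the worker step, assumed to be defined elsewhere -->
        <bean id="partitionHandler"
              class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
            <property name="taskExecutor" ref="taskExecutor"/>
            <property name="step" ref="readFromFile"/>
            <property name="gridSize" value="10"/>
        </bean>
    </beans>
    ```

    With these values, a partitioner that honours gridSize produces 10 partitions and at most 4 of them run at once; the rest wait in the executor's queue. Setting gridSize equal to the pool size, as in post #1, simply means every partition can start immediately.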

  3. #3

    Duplicate post by mistake..
    Last edited by rishishehrawat; Oct 14th, 2011 at 07:45 AM.

  4. #4

    @rishishehrawat

    Hi,
    Thanks for the reply.

    So the grid size determines the number of partitions. And if the grid size is determined by factors other than the size of the input to be processed, the number of splits will be the same for a job irrespective of the size of the input (size of the file, number of rows in the DB, whatever it is). Is it possible to configure the grid size according to the size of the input at hand?

  5. #5
    Join Date
    Oct 2011
    Posts
    1

    I too need to know how to populate the grid size from a previous step. I am able to set a value from a tasklet, and the next step is called based on this value, but when I try to set that value as the partition handler's gridSize, I get the exception below:


    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:472)
    ... 22 more
    Caused by: java.lang.NumberFormatException: For input string: "{stepExecutionContext[GridSize]}"
    at java.lang.NumberFormatException.forInputString(Unknown Source)
    at java.lang.Integer.parseInt(Unknown Source)
    at java.lang.Integer.valueOf(Unknown Source)
    at java.lang.Integer.decode(Unknown Source)
    at org.springframework.util.NumberUtils.parseNumber(NumberUtils.java:157)
    at org.springframework.beans.propertyeditors.CustomNumberEditor.setAsText(CustomNumberEditor.java:114)
    at org.springframework.beans.TypeConverterDelegate.doConvertTextValue(TypeConverterDelegate.java:382)
    at org.springframework.beans.TypeConverterDelegate.doConvertValue(TypeConverterDelegate.java:358)
    at org.springframework.beans.TypeConverterDelegate.convertIfNecessary(TypeConverterDelegate.java:173)
    at org.springframework.beans.TypeConverterDelegate.convertIfNecessary(TypeConverterDelegate.java:138)
    at org.springframework.beans.BeanWrapperImpl.convertForProperty(BeanWrapperImpl.java:386)
    ... 26 more
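
    The trace suggests the expression reached the Integer converter as a literal string instead of being evaluated. (Integer.decode treats a leading '#' as a hexadecimal prefix, which is why the message shows the expression without it.) Late binding of #{...} expressions against the step or job context is only resolved for beans declared with scope="step", and a value written by a tasklet in an earlier step stays in that step's execution context unless it is promoted to the job execution context. A sketch under those assumptions follows; the bean ids, the workerStep name, and placing the partition handler itself in step scope are illustrative and not confirmed against the poster's setup or Spring Batch version.

    ```xml
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.springframework.org/schema/beans
               http://www.springframework.org/schema/beans/spring-beans.xsd">

        <!-- Register as a listener on the step whose tasklet writes "GridSize";
             when that step completes, the key is copied to the job ExecutionContext -->
        <bean id="gridSizePromotionListener"
              class="org.springframework.batch.core.listener.ExecutionContextPromotionListener">
            <property name="keys" value="GridSize"/>
        </bean>

        <!-- scope="step" defers evaluation of #{...} until the partitioned step runs.
             "taskExecutor" and "workerStep" are assumed to be defined elsewhere. -->
        <bean id="partitionHandler"
              class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler"
              scope="step">
            <property name="taskExecutor" ref="taskExecutor"/>
            <property name="step" ref="workerStep"/>
            <property name="gridSize" value="#{jobExecutionContext['GridSize']}"/>
        </bean>
    </beans>
    ```

    The tasklet should store the value as a number (or a numeric string) so that it converts to the int gridSize property. If the size is already known when the job is launched, #{jobParameters['gridSize']} could be bound in the same way.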

  6. #6

    Performance issue while reading data from more than 500 files with syncTaskExecutor

    Hi Michel,

    I need help from you.

    I am using a partition handler to read multiple files.

    I am reading multiple files (10 files) and writing them to the DB successfully, and there is no issue with that.
    All operations (reading and writing to the DB) complete within 1 minute.

    When I try the same with more than 500 files, it takes more than 10 minutes to insert into the DB.
    It gets stuck for 4 to 5 minutes while reading the files; after that it starts reading the files and writing to the DB.
    I don't know why this is happening. I even set the commit interval to 30000, but the result is the same.
    One more thing: I am using syncTaskExecutor for the inserts.
    The code is below.
    Code:
    <job id="file_partition_Job" job-repository="jobRepository">
        <step id="fileProcessStep" parent="step1:master" />
    </job>

    <!-- Master step: the partitioner creates one partition per matching input file -->
    <beans:bean name="step1:master" class="org.springframework.batch.core.partition.support.PartitionStep">
        <beans:property name="jobRepository" ref="jobRepository" />
        <beans:property name="stepExecutionSplitter">
            <beans:bean class="org.springframework.batch.core.partition.support.SimpleStepExecutionSplitter">
                <beans:constructor-arg ref="jobRepository" />
                <beans:constructor-arg ref="step1" />
                <beans:constructor-arg>
                    <beans:bean class="org.springframework.batch.core.partition.support.MultiResourcePartitioner">
                        <beans:property name="resources" value="file:c:/data/input/splitfiles/*.SHD" />
                    </beans:bean>
                </beans:constructor-arg>
            </beans:bean>
        </beans:property>
        <beans:property name="partitionHandler">
            <beans:bean class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
                <beans:property name="taskExecutor" ref="syncTaskExecutor" />
                <beans:property name="step" ref="step1" />
            </beans:bean>
        </beans:property>
    </beans:bean>

    <!-- Runs each partition in the calling thread; prefixed like the other beans in this file -->
    <beans:bean id="syncTaskExecutor" class="org.springframework.core.task.SyncTaskExecutor" />

    <!-- Worker step executed once per partition (file) -->
    <step id="step1">
        <tasklet job-repository="jobRepository" transaction-manager="jobRepository-transactionManager">
            <chunk reader="playerFileItemReader" writer="playerFileItemWriter" commit-interval="5000" />
        </tasklet>
    </step>

    <!-- Step-scoped so that the file name is bound from each partition's execution context -->
    <beans:bean id="playerFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
        <beans:property name="resource" value="#{stepExecutionContext[fileName]}" />
        <beans:property name="strict" value="false" />
        <beans:property name="lineMapper">
            <beans:bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <beans:property name="lineTokenizer">
                    <beans:bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <beans:property name="delimiter" value="=" />
                        <beans:property name="names" value="FieldKey,FieldValue" />
                    </beans:bean>
                </beans:property>
                <beans:property name="fieldSetMapper">
                    <beans:bean class="com.hsbc.readers.PlayerFieldSetMapper" />
                </beans:property>
            </beans:bean>
        </beans:property>
    </beans:bean>
    Please advise. It is very urgent to resolve this issue because of my project deadline.

    Thanks

  7. #7

    It might be a memory issue. You should try increasing the heap size, as the memory required to process 500 files will be much higher than that for 10 files. You can also enable GC logging, which will indicate whether it is a memory issue, and take a thread dump to see what is being processed when the job is stuck.

  8. #8

    Quote Originally Posted by Prabhabati Moharana (post #6 above)

    It gets stuck for 4 or 5 minutes probably because you have an I/O burst at the beginning. Since your processing is partitioned, you have X threads reading from X files. You may want to decrease the number of concurrent partitions (i.e. the grid size).
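
    One caveat about the configuration in post #6, offered as a sketch rather than a diagnosis: SyncTaskExecutor runs each partition in the calling thread, one after another, and MultiResourcePartitioner creates one partition per matching file regardless of gridSize. In that setup the number of concurrent partitions is governed by the task executor, so a bounded thread pool is one way to control it. The bean id pooledTaskExecutor and the pool size of 4 are hypothetical values, and step1 is assumed to be the worker step from post #6.

    ```xml
    <beans:beans xmlns="http://www.springframework.org/schema/batch"
                 xmlns:beans="http://www.springframework.org/schema/beans"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.springframework.org/schema/beans
                     http://www.springframework.org/schema/beans/spring-beans.xsd
                     http://www.springframework.org/schema/batch
                     http://www.springframework.org/schema/batch/spring-batch.xsd">

        <!-- Hypothetical bounded pool: at most 4 files are read at the same time -->
        <beans:bean id="pooledTaskExecutor"
                    class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
            <beans:property name="corePoolSize" value="4" />
            <beans:property name="maxPoolSize" value="4" />
        </beans:bean>

        <!-- Would take the place of the inner partitionHandler bean of step1:master -->
        <beans:bean id="partitionHandler"
                    class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
            <beans:property name="taskExecutor" ref="pooledTaskExecutor" />
            <beans:property name="step" ref="step1" />
        </beans:bean>
    </beans:beans>
    ```

    Raising or lowering the pool size then changes how many files are open at once, which makes it easier to test whether the initial stall is related to I/O contention.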
