We have developed an extract tool using Spring Batch. When deployed on a US-based server (i386, Linux, jre 1.6.0_07_32), the tool performs well compared with the same tool deployed on a server based in Tokyo (amd64, Linux, jdk 1.6.0_20). The source database from which the data is extracted resides in Tokyo. The application metadata that the tool uses to connect to the database, execute the SQL, and write to a destination file resides in the US. The job spec in applicationContext.xml is as follows.
The FetchSize is set to 2000.
Code:
<batch:job id="dataExtractjob" incrementer="lastRunSetter">
    <batch:listeners>
        <batch:listener ref="appJobExecutionListener" />
    </batch:listeners>
    <batch:step id="step1" next="transferFile">
        <batch:tasklet>
            <batch:chunk reader="itemReaderFactory" writer="dataExtractItemWriter" commit-interval="1000" />
            <batch:listeners>
                <batch:listener ref="itemFailureLoggerListener" />
                <batch:listener ref="headerCallback" />
                <batch:listener ref="footerCallback" />
                <batch:listener ref="promotionListener" />
            </batch:listeners>
        </batch:tasklet>
    </batch:step>
    <batch:step id="transferFile" next="archiveFile">
        <batch:tasklet ref="transferTasklet" />
    </batch:step>
    <batch:step id="archiveFile">
        <batch:tasklet ref="archiveFileTasklet" />
    </batch:step>
</batch:job>
Now, if I look at the log files, they read as follows.
From the US Server
2011-07-11 21:11:38,219 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before Read() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,219 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before mappingRow() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,219 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after mappingRow() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,219 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after Read() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,220 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before Read() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,220 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before mappingRow() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,221 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after mappingRow() : Mon Jul 11 21:11:38 EDT 2011
2011-07-11 21:11:38,221 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after Read() : Mon Jul 11 21:11:38 EDT 2011
This is from the Tokyo Server
2011-07-12 20:54:06,820 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before Read() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,821 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before mappingRow() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,823 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after mappingRow() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,823 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after Read() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,995 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before Read() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,995 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time before mappingRow() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,996 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after mappingRow() : Tue Jul 12 20:54:06 JST 2011
2011-07-12 20:54:06,996 [main] INFO com.nomura.gfa.clet.util.LogCreation - Time after Read() : Tue Jul 12 20:54:06 JST 2011
Compare the gap between the end of one read cycle ("Time after Read()") and the start of the next ("Time before Read()"). On the US server it is barely 1 millisecond, whereas on the Tokyo server it is around 170 ms. Multiply that by 500,000 records and you can see the significant delay on the Tokyo server that I am talking about.
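To put numbers on that gap instead of reading it off by eye, here is a minimal sketch that measures the time between each "Time after Read()" line and the following "Time before Read()" line, and projects the total wait. The class name ReadGapAnalyzer and both of its methods are hypothetical and not part of our tool. The sample lines are abbreviated copies of the log lines above, and the projection assumes the same gap occurs on every record, which I have not verified.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the extract tool.
class ReadGapAnalyzer {
    // Matches the leading log timestamp, e.g. "2011-07-12 20:54:06,823" (23 characters).
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Returns the gap in ms between each "Time after Read()" line
    // and the next "Time before Read()" line.
    static List<Long> gapsBetweenReads(List<String> lines) {
        List<Long> gaps = new ArrayList<>();
        LocalDateTime lastAfterRead = null;
        for (String line : lines) {
            boolean after = line.contains("Time after Read()");
            boolean before = line.contains("Time before Read()");
            if ((!after && !before) || line.length() < 23) {
                continue; // not a read marker, or no timestamp to parse
            }
            LocalDateTime ts = LocalDateTime.parse(line.substring(0, 23), TS);
            if (after) {
                lastAfterRead = ts;
            } else if (lastAfterRead != null) {
                gaps.add(Duration.between(lastAfterRead, ts).toMillis());
                lastAfterRead = null;
            }
        }
        return gaps;
    }

    // Total wait in hours if the same gap occurs once per record (an assumption).
    static double projectedHours(long gapMillis, long records) {
        return gapMillis * (double) records / 3_600_000.0;
    }

    public static void main(String[] args) {
        // Abbreviated log lines: logger name and message tail trimmed.
        List<String> us = Arrays.asList(
                "2011-07-11 21:11:38,219 [main] INFO - Time after Read()",
                "2011-07-11 21:11:38,220 [main] INFO - Time before Read()");
        List<String> tokyo = Arrays.asList(
                "2011-07-12 20:54:06,823 [main] INFO - Time after Read()",
                "2011-07-12 20:54:06,995 [main] INFO - Time before Read()");
        System.out.println("US gaps (ms): " + gapsBetweenReads(us));
        System.out.println("Tokyo gaps (ms): " + gapsBetweenReads(tokyo));
        System.out.printf("Projected wait: %.1f hours%n", projectedHours(170, 500000));
    }
}
```

On the excerpts above this gives a gap of 1 ms for the US server and 172 ms for the Tokyo server. At 170 ms per record, 500,000 records would add up to 85,000 seconds, or about 23.6 hours, if the gap really does occur on every record.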
Any ideas what is causing this delay?

