Hi all,

We are experiencing some problems with our Spring Batch JobRepository. Normally our batch jobs run without any issue: the job finishes and the steps are logged in the JobRepository.

The problem occurs with long-running jobs, i.e. when a tasklet commits a large set of data into our production database, which can sometimes take more than 5 hours. We get exceptions like the one below:

ERROR JobRepository failure forcing exit with unknown status
org.springframework.dao.RecoverableDataAccessException: PreparedStatementCallback; SQL [UPDATE BATCH_STEP_EXECUTION_CONTEXT SET SHORT_CONTEXT = ?, SERIALIZED_CONTEXT = ? WHERE STEP_EXECUTION_ID = ?]; The last packet successfully received from the server was 51849 seconds ago. The last packet sent successfully to the server was 51849 seconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.; nested exception is com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 51849 seconds ago. The last packet sent successfully to the server was 51849 seconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
at org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:98) ~[spring-jdbc-3.1.0.RELEASE.jar:3.1.0.RELEASE]

We have tried four different connection pool implementations and, working with our DBA, we are fairly sure that the pools are working properly: all the connections in the pool get tested periodically. However, we have noticed that one connection to the JobRepository is always open and doing nothing.
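
For illustration only, the kind of validity check meant here can be sketched in plain JDBC as below. This is not our actual pool configuration: the class and method names are made up for the example, and the 28800-second value is simply MySQL's default wait_timeout, used as an assumed stand-in for the server setting. The only real API call is the standard JDBC 4 method Connection.isValid(int).

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper, not part of Spring Batch or of any connection pool.
final class ConnectionValidityCheck {

    // Returns true only if the connection is open and still answers the
    // driver's validity check within timeoutSeconds.
    static boolean isUsable(Connection connection, int timeoutSeconds) {
        if (connection == null) {
            return false;
        }
        try {
            return !connection.isClosed() && connection.isValid(timeoutSeconds);
        } catch (SQLException e) {
            // Treat any failure while probing as "not usable".
            return false;
        }
    }

    // True when a connection has been unused for longer than the server's
    // wait_timeout, in which case the server may already have dropped it.
    static boolean idleTooLong(long idleSeconds, long waitTimeoutSeconds) {
        return idleSeconds > waitTimeoutSeconds;
    }

    public static void main(String[] args) {
        long idleSeconds = 51849L;        // idle time reported in the exception
        long assumedWaitTimeout = 28800L; // ASSUMPTION: MySQL default wait_timeout
        System.out.println(idleTooLong(idleSeconds, assumedWaitTimeout));
    }
}
```

A check like this only helps if it runs on the connection immediately before that connection is used. A periodic test of the idle connections in a pool would not necessarily cover a connection that has been handed out and is held for the whole duration of a long step.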

Has anyone come across this problem before? Any suggestions are welcome.

Many thanks,
Rui