Lock wait timeout exceeded?程式碼該優化了

  • 2019 年 10 月 22 日
  • 筆記

背景

最近在排查問題時發現,偶爾會發生關於資料庫鎖超時的現象,會發生像如下的報錯資訊:

Exception in thread "pool-3-thread-1" org.springframework.dao.CannotAcquireLockException:  ### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction  ### The error may involve com.zr.center.mybatis.auto.mapper.UserMapper.updateByExampleSelective-Inline  ### The error occurred while setting parameters  ### SQL: update user      SET user_name = ?                          WHERE (  user_id = ? )  ### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction  ; ]; Lock wait timeout exceeded; try restarting transaction; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction      at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:262)      at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)      at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:75)      at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:447)      at com.sun.proxy.$Proxy101.update(Unknown Source)      at org.mybatis.spring.SqlSessionTemplate.update(SqlSessionTemplate.java:295)      at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:59)      at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:53)      at com.sun.proxy.$Proxy103.updateByExampleSelective(Unknown Source)      at com.zr.center.framework.web.service.BaseService.updateByExampleSelective(BaseService.java:97)      at com.zr.center.api.test.service.TestService.updateUserName(TestService.java:34)      at com.zr.center.api.test.service.TestService$$FastClassBySpringCGLIB$$bd3aa32.invoke(<generated>)      at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)      at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746)      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)      at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:294)      at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:98)      at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)      at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688)      at com.zr.center.api.test.service.TestService$$EnhancerBySpringCGLIB$$59b19302.updateUserName(<generated>)      at com.zr.center.ApplicationTests.lambda$testTxLockWaiting$0(ApplicationTests.java:32)      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)      at java.lang.Thread.run(Thread.java:745)  Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Lock wait timeout exceeded; try restarting transaction      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)      at java.lang.reflect.Constructor.newInstance(Constructor.java:423)      at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)      at com.mysql.jdbc.Util.getInstance(Util.java:408)      at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:952)      at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3976)      at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3912)      at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2530)      at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683)      at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2486)

排查

經過排查,DBA給出的日誌中並未有死鎖,死鎖的原因排除,查詢業務日誌發現在高並發的時期有時會有重複請求過來,也有一個服務在處理某個邏輯時會發一條mq消息,而同時會消費這條消息,此時也會導致鎖超時。超時原因就是因為一個事務中處理的邏輯過多,有調外部服務(超時),有更新其它多張表的操作,這樣就會導致後面事務請求超時,報以上錯誤。

重現步驟

  • 資料庫配置

    關於配置資訊查看,可以看到事務隔離是RR,事務鎖等待時長為默認的50s

  • 事務程式碼

/**   * @description: 用戶服務超時測試   * @author: chong guo   * @create: 2018-12-10 14:44   */  @Service  @Slf4j  public class TestService extends BaseService<User, UserExample, Long> {        /**       * 更新用戶名稱       *       * @param userId       * @param name       */      @Transactional(rollbackFor = Exception.class)      public void updateUserName(Long userId, String name) throws InterruptedException {          log.info("開始更新用戶名【{}】,用戶ID為【{}】", name, userId);          UserExample userExample = new UserExample();          userExample.createCriteria().andUserIdEqualTo(userId);            User user = new User();          user.setUserName(name);            super.updateByExampleSelective(user, userExample);            // 模擬業務超時,有可能調用外部遠程服務超時,也有可能處理其它邏輯          Thread.sleep(55000);          log.info("結束更新用戶名【{}】,用戶ID為【{}】", name, userId);        }    }  
  • 模擬並發

/**   * @author  Chong Guo   */  @RunWith(SpringRunner.class)  @SpringBootTest  @Slf4j  public class ApplicationTests {      @Resource      TestService testService;        private final int threadCount = 5;        @Test      public void testTxLockWaiting() throws InterruptedException {          CountDownLatch countDownLatch = new CountDownLatch(threadCount);            ExecutorService threadPool = Executors.newFixedThreadPool(300);          for (int i = 0; i < threadCount; i++) {              threadPool.execute(() -> {                  try {                      testService.updateUserName(611526166943105024L, "chongguo");                  } catch (InterruptedException e) {                      // TODO Auto-generated catch block                      e.printStackTrace();                  } finally {                      countDownLatch.countDown();                  }                });          }          countDownLatch.await();          threadPool.shutdown();          log.info("Test tx lock is over");          Thread.sleep(100000);      }  }

運行程式碼後會現三次失敗,二次成功,失敗原因都是鎖超時

總結

  1. 程式碼優化,將大事務中無關的邏輯拆出來,非同步處理
  2. 業務提供方涉及到外部調用的介面需要把超時時長設置短一些
  3. 如果調用方有重試機制,可以針對業務去掉重試,將connetion time設置大一些