Common Retry Techniques: How to Retry Gracefully

October 3, 2019
Notes

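Background

In a distributed environment, retrying is one part of building for high availability. When calling an RPC interface or sending an MQ message, we usually retry to cope with network jitter and request timeouts. Hand-written retry logic is rarely elegant, and although many libraries and frameworks now provide retry support, each has its own strengths and weaknesses. This article surveys them in search of a more graceful approach.

Retry behavior has to be designed around the use case. Read interfaces are well suited to retrying; write interfaces must be idempotent before they can be retried safely. Too many retries also multiply the request volume and put more pressure on the backend, so choosing a sensible retry policy is the key.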
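The point about write interfaces can be made concrete with a small sketch. It assumes the caller attaches a unique request ID to each logical write; the class and method names are hypothetical and do not come from any framework. A retried request with the same ID is applied only once:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical idempotent write handler: a given request ID is applied at most once. */
final class IdempotentWriter {
    // Remembers the result of every request ID that has already been processed.
    private final Map<String, Integer> processed = new ConcurrentHashMap<>();
    private final AtomicInteger balance = new AtomicInteger();

    /** Adds amount to the balance once per requestId and returns the balance after that write. */
    int deposit(String requestId, int amount) {
        // The write runs only for the first call with this ID;
        // a retry with the same ID gets the stored result back.
        return processed.computeIfAbsent(requestId, id -> balance.addAndGet(amount));
    }

    public static void main(String[] args) {
        IdempotentWriter writer = new IdempotentWriter();
        System.out.println(writer.deposit("req-1", 100)); // first attempt
        System.out.println(writer.deposit("req-1", 100)); // retry of the same request
        System.out.println(writer.deposit("req-2", 50));  // a different request
    }
}
```

In a real service the processed-ID record would normally live in the database next to the data it protects, so that it survives restarts.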

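Retry implementations

This article covers the more common retry implementations:
1. The Spring Retry framework;
2. The Guava Retry framework;
3. Spring Cloud retry configuration.

Each is described in turn below.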

1. Spring Retry framework

Spring Retry can be used in two ways.

Annotation style
This is the simplest way:

@Retryable(value = RuntimeException.class,maxAttempts = 3, backoff = @Backoff(delay = 5000L, multiplier = 2))

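Setting the exception to retry on, the retry policy, and the fallback (circuit-breaking) handler is enough to cover the whole path from retry to fallback; this standard usage is well documented online.
What follows is a case where the fallback is handled by hand: no @Recover method acts as a safety net, and the exception keeps propagating outward. The code looks roughly like this.
The Service retries the method: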

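@Override
@Transactional
@Retryable(value = ZcSupplyAccessException.class, maxAttempts = 3,
        backoff = @Backoff(delay = 2000, multiplier = 1.5))
public OutputParamsDto doZcSupplyAccess(InputParamsDto inputDto) throws ZcSupplyAccessException {
    // 1. Validate
    ....
    // 2. Convert the data
    ....
    // 3. Store
    try {
        doSaveDB(ioBusIcsRtnDatList);
        log.info("3.XXX - ingested data stored");
    } catch (Exception e) {
        // Pass the exception as the last argument so the stack trace is logged
        log.error("3.XXX - failed to store ingested data", e);
        // Throwing the retryable exception type is what triggers the next attempt
        throw new ZcSupplyAccessException("XXX failed to store ingested data");
    }
    return new OutputParamsDto(true, "XXX processed successfully");
}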

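The Controller catches the exceptions and handles them. Note that different exceptions need different handling here, so the handling cannot live in @Recover; otherwise the outer layer would never see the distinct exceptions.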

@PostMapping("/accessInfo")
public OutputParamsDto accessInfo(@RequestBody InputParamsDto inputDto) {

    log.info("Incoming message: " + JSONUtil.serialize(inputDto));
    OutputParamsDto output = validIdentity(inputDto);
    if (output == null || output.getSuccess() == false) {
        return output;
    }
    log.info("Pre.1. Security check passed");
    IAccessService accessService = null;
    try {
        ....
        accessService = (IAccessService) ApplicationContextBeansHolder.getBean(param.getParmVal());
        // Forward first (a failure here must be reported to the caller)
        output = accessService.doZcSupplyTranfer(inputDto);
        // Then store (a failure here is only logged)
        accessService.doZcSupplyAccess(inputDto);
    } catch (ZcSupplyTransferException e) {
        log.error("Forwarding to the downstream MQ failed after 3 attempts; check whether the MQ service is available");
        return new OutputParamsDto(false, "Forwarding to the downstream MQ failed after 3 attempts; check whether the MQ service is available");
    } catch (ZcSupplyAccessException e) {
        log.error("Storing the ingested data failed after 3 attempts; check whether the database is available");
    } catch (Exception e) {
        log.error("Error while looking up the bean by name and invoking it", e);
        return new OutputParamsDto(false, "Error while looking up the bean by name and invoking it");
    }
    ...

    return output;
}

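Notes:
1. Do not throw an exception again from @Recover; in the author's experience this fails with an error saying the exception cannot be recognized.
2. When a method is retried through the annotation, the retry logic runs synchronously, and a "failure" means a Throwable. To decide whether to retry based on some state in the return value, the only option is probably to check the return value yourself and throw an exception explicitly.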
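The workaround in note 2 can be sketched in plain Java, without Spring. The retry helper below retries only on exceptions, as the annotation does; `requireOk` and `RetryableResultException` are hypothetical names introduced here to show how a bad return value is turned into an exception so that the retry is triggered:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class ResultBasedRetry {
    /** Hypothetical exception meaning "the returned status says we should retry". */
    static final class RetryableResultException extends RuntimeException {
        RetryableResultException(String message) { super(message); }
    }

    /** Calls the task up to maxAttempts times, retrying only when it throws. */
    static <T> T retryOnException(Supplier<T> task, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed
    }

    /** Turns a bad status into an exception so that the exception-only retry applies. */
    static String requireOk(String status) {
        if (!"OK".equals(status)) {
            throw new RetryableResultException("status was " + status);
        }
        return status;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated remote call: reports BUSY twice, then OK.
        String result = retryOnException(
                () -> requireOk(calls.incrementAndGet() < 3 ? "BUSY" : "OK"), 3);
        System.out.println(result + " after " + calls.get() + " calls");
    }
}
```

With the annotation style, the equivalent of `requireOk` would sit inside the @Retryable method and throw the exception type listed in the annotation.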

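Programmatic style
The annotation style is convenient but has limits: a retry is triggered only by a thrown exception, not by a returned object, and when several Recover methods are defined it is hard to control which one is used, especially in structured designs where overriding between parent and child classes needs care. Spring Retry also offers a programmatic style that is more flexible and lets you return your own object for further processing.

In the code below the RecoveryCallback throws an exception; it could return an object instead, which is where this style is friendlier than the annotation style.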

import com.alibaba.fastjson.JSONObject;
import com.alibaba.fastjson.serializer.SerializerFeature;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.cloud.context.config.annotation.RefreshScope;
import org.springframework.retry.RecoveryCallback;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.backoff.ExponentialBackOffPolicy;
import org.springframework.retry.backoff.FixedBackOffPolicy;
import org.springframework.retry.policy.CircuitBreakerRetryPolicy;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.stereotype.Component;

import java.time.LocalTime;
import java.util.Collections;
import java.util.Map;

/**
 * Retry handler for the remote sync call.
 * Created on 2019/9/10 16:12
 */
@Slf4j
@Component
@RefreshScope
public class ZcSupplySynRemoteRetryHandler {

    @Autowired
    RestTemplateFactory restTemplateFactory;

    final RetryTemplate retryTemplate = new RetryTemplate();

    // Simple retry policy: at most 3 attempts, only for ZcSupplySynRemoteException
    final SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy(3,
            Collections.<Class<? extends Throwable>, Boolean>singletonMap(ZcSupplySynRemoteException.class, true));

    @Value("${retry.initialInterval}")
    private String initialInterval;

    @Value("${retry.multiplier}")
    private String multiplier;

    /**
     * Calls the remote interface with retry.
     *
     * @param reqMap request body
     * @param url    target URL
     * @return the response as a map
     * @throws ZcSupplySynRemoteException when all attempts fail
     */
    public Map<String, Object> doSyncWithRetry(Map<String, Object> reqMap, String url) throws ZcSupplySynRemoteException {
        // Circuit-breaker retry policy (shown for reference; not applied to the template below)
        CircuitBreakerRetryPolicy cbRetryPolicy = new CircuitBreakerRetryPolicy(new SimpleRetryPolicy(3));
        cbRetryPolicy.setOpenTimeout(3000);
        cbRetryPolicy.setResetTimeout(10000);

        // Fixed back-off policy (shown for reference; not applied to the template below)
        FixedBackOffPolicy fixedBackOffPolicy = new FixedBackOffPolicy();
        fixedBackOffPolicy.setBackOffPeriod(100);

        // Exponential back-off policy
        ExponentialBackOffPolicy exponentialBackOffPolicy = new ExponentialBackOffPolicy();
        exponentialBackOffPolicy.setInitialInterval(Long.parseLong(initialInterval));
        exponentialBackOffPolicy.setMultiplier(Double.parseDouble(multiplier));

        // Apply the policies
        retryTemplate.setRetryPolicy(retryPolicy);
        retryTemplate.setBackOffPolicy(exponentialBackOffPolicy);

        // Retry callback: the operation to retry
        RetryCallback<Map<String, Object>, ZcSupplySynRemoteException> retryCallback = new RetryCallback<Map<String, Object>, ZcSupplySynRemoteException>() {
            @Override
            public Map<String, Object> doWithRetry(RetryContext context) throws ZcSupplySynRemoteException {
                try {
                    log.info(String.valueOf(LocalTime.now()));
                    Map<String, Object> rtnMap = (Map<String, Object>) restTemplateFactory.callRestService(url,
                            JSONObject.toJSONString(reqMap, SerializerFeature.WriteMapNullValue));
                    context.setAttribute("rtnMap", rtnMap);
                    return rtnMap;
                } catch (Exception e) {
                    throw new ZcSupplySynRemoteException("Error calling the sync interface; preparing to retry");
                }
            }
        };

        // Recovery callback: runs once the attempts are exhausted
        RecoveryCallback<Map<String, Object>> recoveryCallback = new RecoveryCallback<Map<String, Object>>() {
            @Override
            public Map<String, Object> recover(RetryContext context) throws ZcSupplySynRemoteException {
                Map<String, Object> rtnMap = (Map<String, Object>) context.getAttribute("rtnMap");
                log.info("xxx failed after 3 attempts; check whether the remote service is available. Call result: {}",
                        JSONObject.toJSONString(rtnMap, SerializerFeature.WriteMapNullValue));

                // Note: an exception can be thrown here, which the annotation style does not allow;
                // use this style when the outer layer needs to handle the failure
                throw new ZcSupplySynRemoteException("xxx failed after 3 attempts; check whether the remote service is available.");
            }
        };

        return retryTemplate.execute(retryCallback, recoveryCallback);
    }
}

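Core classes
RetryCallback: wraps the business logic to be retried.

RecoveryCallback: wraps the logic to run after all retries have failed.

RetryContext: the context of a retry operation; it can carry parameters or state between attempts, or between the retry and the recovery.

RetryOperations: defines the basic retry template; it requires a RetryCallback and optionally accepts a RecoveryCallback.

RetryListener: a classic listener, notified at the different stages of a retry.

RetryPolicy: the policy or condition for retrying; it can simply retry a fixed number of times, or keep retrying within a given timeout.

BackOffPolicy: the back-off policy applied when the business logic throws. If a retry is needed, we may want to wait for a while first (the server may be overloaded, and retrying without a pause could bring it down). The wait can be zero, fixed, or random (compare the back-off strategies in TCP congestion control). In other words, the back-off policy is the wait between two attempts.

RetryTemplate: the concrete implementation of RetryOperations; it combines a RetryListener[], a BackOffPolicy, and a RetryPolicy.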
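How these roles fit together can be shown with a toy model in plain Java. This is not Spring Retry's source code or API; `MiniRetryTemplate`, `Policy`, and `BackOff` are hypothetical names that stand in for RetryTemplate, RetryPolicy, and BackOffPolicy:

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Toy model of the roles listed above; NOT Spring Retry's actual classes. */
final class MiniRetryTemplate {
    interface Policy { boolean canRetry(int attemptsSoFar); }  // role of RetryPolicy
    interface BackOff { void pause(int attemptsSoFar); }       // role of BackOffPolicy

    private final Policy policy;
    private final BackOff backOff;

    MiniRetryTemplate(Policy policy, BackOff backOff) {
        this.policy = policy;
        this.backOff = backOff;
    }

    /** Runs callback until it succeeds or the policy says stop, then falls back to recovery. */
    <T> T execute(Supplier<T> callback, Function<RuntimeException, T> recovery) {
        int attempts = 0;
        RuntimeException last = null;
        while (policy.canRetry(attempts)) {
            try {
                return callback.get();           // role of RetryCallback
            } catch (RuntimeException e) {
                last = e;
                attempts++;
                if (policy.canRetry(attempts)) {
                    backOff.pause(attempts);     // wait only if another attempt follows
                }
            }
        }
        return recovery.apply(last);             // role of RecoveryCallback
    }

    public static void main(String[] args) {
        MiniRetryTemplate template = new MiniRetryTemplate(
                attempts -> attempts < 3,        // at most 3 attempts
                attempts -> { });                // no wait in this demo
        String result = template.execute(
                () -> { throw new IllegalStateException("remote call failed"); },
                lastError -> "fallback after: " + lastError.getMessage());
        System.out.println(result);
    }
}
```

The recovery function may return a fallback value, as here, or rethrow, which is the choice the RecoveryCallback in the handler above makes.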

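Retry policies
NeverRetryPolicy: allows the RetryCallback to be called once only; no retries.

AlwaysRetryPolicy: allows unlimited retries until success; careless use leads to an infinite loop.

SimpleRetryPolicy: retries a fixed number of times; the default maximum is 3 attempts. This is the default policy of RetryTemplate.

TimeoutRetryPolicy: allows retries within a timeout; the default timeout is 1 second.

ExceptionClassifierRetryPolicy: assigns a different retry policy to each exception type; similar to the composite policy, except that it distinguishes only by exception.

CircuitBreakerRetryPolicy: a retry policy with circuit breaking; it takes three parameters: openTimeout, resetTimeout, and delegate.

CompositeRetryPolicy: combines several policies in one of two modes. In optimistic mode a retry is allowed as long as any one policy allows it; in pessimistic mode a retry is refused as soon as any one policy refuses it. In either mode, every policy in the composition is consulted.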
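The difference between the optimistic and pessimistic modes can be sketched as follows. This is a plain-Java illustration of the decision rule only, assuming each policy is reduced to a predicate over the number of attempts made so far; `CompositePolicySketch` is a hypothetical name, not a Spring Retry class:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

final class CompositePolicySketch {
    /**
     * Decides whether another attempt is allowed.
     * Optimistic: any policy allowing it is enough. Pessimistic: all policies must allow it.
     */
    static boolean canRetry(List<IntPredicate> policies, int attemptsSoFar, boolean optimistic) {
        boolean any = false;
        boolean all = true;
        for (IntPredicate policy : policies) {   // every policy is consulted, no short-circuit
            boolean allowed = policy.test(attemptsSoFar);
            any |= allowed;
            all &= allowed;
        }
        return optimistic ? any : all;
    }

    public static void main(String[] args) {
        IntPredicate upToThree = attempts -> attempts < 3;
        IntPredicate upToFive = attempts -> attempts < 5;
        List<IntPredicate> policies = Arrays.asList(upToThree, upToFive);

        // After 4 attempts only the second policy still allows a retry.
        System.out.println("optimistic:  " + canRetry(policies, 4, true));
        System.out.println("pessimistic: " + canRetry(policies, 4, false));
    }
}
```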

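Back-off policies
A back-off policy decides whether each retry happens immediately or after a wait.

By default retries happen immediately; to wait between retries, set a BackOffPolicy.

NoBackOffPolicy: no back-off; every retry is immediate.

FixedBackOffPolicy: waits a fixed time. Parameters: sleeper and backOffPeriod. sleeper is the waiting strategy, Thread.sleep by default; backOffPeriod is the wait time, 1 second by default.

UniformRandomBackOffPolicy: waits a random time. Parameters: sleeper, minBackOffPeriod, and maxBackOffPeriod. The wait is drawn at random from [minBackOffPeriod, maxBackOffPeriod]; the defaults are 500 ms and 1500 ms.

ExponentialBackOffPolicy: exponential back-off. Parameters: sleeper, initialInterval, maxInterval, and multiplier. initialInterval is the first wait (default 100 ms), maxInterval is the longest wait (default 30 seconds), and multiplier is the factor, so the next wait is the current wait * multiplier.

ExponentialRandomBackOffPolicy: randomized exponential back-off; a random multiplier adds jitter to the exponential sequence.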
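To see what these parameters produce, the sketch below computes the sequence of waits for an exponential back-off and draws one uniformly random wait. It is plain arithmetic based on the parameter descriptions above, not a call into Spring Retry; `BackOffSchedule` is a hypothetical helper:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class BackOffSchedule {
    /** Waits of initial, initial*m, initial*m^2, ... in milliseconds, each capped at maxInterval. */
    static List<Long> exponential(long initialInterval, double multiplier, long maxInterval, int waits) {
        List<Long> result = new ArrayList<>();
        double next = initialInterval;
        for (int i = 0; i < waits; i++) {
            result.add(Math.min((long) next, maxInterval));
            next = next * multiplier;
        }
        return result;
    }

    /** One uniformly random wait in [min, max] milliseconds. */
    static long uniformRandom(long min, long max) {
        return ThreadLocalRandom.current().nextLong(min, max + 1);
    }

    public static void main(String[] args) {
        // Defaults quoted above: 100 ms initial, 30 s cap; a multiplier of 2 is assumed here.
        System.out.println(exponential(100, 2.0, 30_000, 10));
        // delay = 2000, multiplier = 1.5, maxAttempts = 3 means two waits between three attempts.
        System.out.println(exponential(2000, 1.5, 30_000, 2));
        System.out.println(uniformRandom(500, 1500));
    }
}
```

The first line printed is [100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 30000]: the tenth wait would be 51200 ms but is capped at maxInterval. The second is [2000, 3000].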

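2. Guava Retry framework

The Guava retryer is similar to spring-retry: both wrap the normal logic in a retryer role. The Guava retryer, however, has a richer policy definition. On top of controlling the number and frequency of retries, it can trigger retries on multiple exception types or on custom result objects, which makes retrying more flexible.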
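The idea of retrying on a result as well as on an exception can be sketched in plain Java. This toy class only illustrates the concept; it is not the Guava retryer API, and `ResultAwareRetryer` and its constructor parameters are hypothetical:

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Toy retryer that retries on a bad result as well as on exceptions; not the Guava retryer API. */
final class ResultAwareRetryer<T> {
    private final Predicate<T> retryIfResult;
    private final Predicate<RuntimeException> retryIfException;
    private final int maxAttempts;

    ResultAwareRetryer(Predicate<T> retryIfResult,
                       Predicate<RuntimeException> retryIfException,
                       int maxAttempts) {
        this.retryIfResult = retryIfResult;
        this.retryIfException = retryIfException;
        this.maxAttempts = maxAttempts;
    }

    /** Calls the task until a result is accepted, a non-retryable error occurs, or attempts run out. */
    T call(Supplier<T> task) {
        for (int attempt = 1; ; attempt++) {
            boolean lastAttempt = attempt >= maxAttempts;
            try {
                T result = task.get();
                if (lastAttempt || !retryIfResult.test(result)) {
                    return result;   // accepted, or nothing left to try
                }
            } catch (RuntimeException e) {
                if (lastAttempt || !retryIfException.test(e)) {
                    throw e;         // not retryable, or attempts exhausted
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        ResultAwareRetryer<Integer> retryer = new ResultAwareRetryer<Integer>(
                status -> status == 503,                     // retry while the result is 503
                e -> e instanceof IllegalStateException,     // retry on this exception type
                5);
        // Simulated call: 503 twice, then 200.
        Integer status = retryer.call(() -> ++calls[0] < 3 ? 503 : 200);
        System.out.println(status + " after " + calls[0] + " calls");
    }
}
```

No exception has to be thrown for a bad result here, which is the flexibility the paragraph above attributes to the Guava retryer.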

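3. Spring Cloud retry configuration

Spring Cloud Netflix offers several ways to make HTTP requests.
You can use a load-balanced RestTemplate, Ribbon, or Feign.
Whichever you choose, a request may fail.
When a request fails, you may want it to be retried automatically.
To do this with Spring Cloud Netflix, add Spring Retry to the application's classpath.
When Spring Retry is present, load-balanced RestTemplates, Feign, and Zuul automatically retry failed requests.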

Global settings for RestTemplate + Ribbon:

spring:
  cloud:
    loadbalancer:
      retry:
        enabled: true
ribbon:
  ReadTimeout: 6000
  ConnectTimeout: 6000
  MaxAutoRetries: 1
  MaxAutoRetriesNextServer: 2
  OkToRetryOnAllOperations: true

Settings for a specific service, service1:

service1:
  ribbon:
    MaxAutoRetries: 1
    MaxAutoRetriesNextServer: 2
    ConnectTimeout: 5000
    ReadTimeout: 2000
    OkToRetryOnAllOperations: true

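Setting	Description
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds	Circuit-breaker timeout; it must be greater than the Ribbon timeout, otherwise retries are never triggered.
hello-service.ribbon.ConnectTimeout	Connection timeout
hello-service.ribbon.ReadTimeout	Request processing (read) timeout
hello-service.ribbon.OkToRetryOnAllOperations	Whether to retry all operations (all request types)
hello-service.ribbon.MaxAutoRetriesNextServer	Maximum number of other instances to fail over to, not counting the first server
hello-service.ribbon.MaxAutoRetries	Maximum number of retries on the same instance, not counting the first call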
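The requirement that the circuit-breaker timeout exceed the Ribbon timeout is easier to check with numbers. The sketch below computes a rough upper bound, assuming every attempt may use up both the connect and the read timeout; actual behavior depends on the Ribbon and Hystrix versions in use, and `RibbonRetryBudget` is a hypothetical helper:

```java
final class RibbonRetryBudget {
    /** Total tries = (retries on the same instance + 1) * (other instances tried + 1). */
    static int maxAttempts(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
    }

    /** Worst case in milliseconds: every try exhausts both the connect and the read timeout. */
    static long worstCaseMillis(long connectTimeout, long readTimeout,
                                int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (connectTimeout + readTimeout) * maxAttempts(maxAutoRetries, maxAutoRetriesNextServer);
    }

    public static void main(String[] args) {
        // Values from the service1 settings: ConnectTimeout 5000, ReadTimeout 2000,
        // MaxAutoRetries 1, MaxAutoRetriesNextServer 2.
        System.out.println("attempts: " + maxAttempts(1, 2));
        System.out.println("worst case ms: " + worstCaseMillis(5000, 2000, 1, 2));
    }
}
```

For the service1 settings this gives 6 attempts and 42000 ms, so a Hystrix timeout below that figure may cut the call off before all retries have run.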

Complete Feign retry configuration (yml)

eureka:
  client:
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/
server:
  port: 7001
spring:
  application:
    name: feign-service

feign:
  hystrix:
    enabled: true

client1:
  ribbon:
    # Retry once on the first server
    MaxAutoRetries: 1
    # Try up to two other servers
    MaxAutoRetriesNextServer: 2
    # Connection timeout
    ConnectTimeout: 500
    # Request processing timeout
    ReadTimeout: 2000
    # Enable retries for every operation
    OkToRetryOnAllOperations: true

# Circuit-breaker timeout; the default is 1000 (1 second)
hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 2001

