Halo Open Source Project Study (7): The Caching Mechanism

April 29, 2022
Notes
halo, JAVA, backend, open source project study

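Introduction

Frequent database access degrades server performance, so data that is read or updated often is usually kept in a cache. Halo also has a caching mechanism, with several implementations to choose from: a custom in-memory cache, Redis, LevelDB, and so on. This post walks through how that mechanism is implemented.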

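Custom cache

1. Cache configuration

Cached data is stored as key-value pairs, and the store and read operations of the different cache backends are much alike, so this post only covers the project's default, the custom cache. "Custom cache" here means a cache written by the Halo author that uses a ConcurrentHashMap as its container and keeps the data in the server's memory. Before looking at it, here is the class hierarchy of Halo's cache: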

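The Halo 1.4.13 release I am using does not include the Redis cache; the diagram above is from version 1.5.2.

As the diagram shows, the design defines the common operations in the upper-level abstract classes and interfaces, while each implementation class defines its own container and the concrete store and read methods. To switch the cache type, you only need to change the value of the cache field in the configuration class HaloProperties: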

@Bean
@ConditionalOnMissingBean
AbstractStringCacheStore stringCacheStore() {
    AbstractStringCacheStore stringCacheStore;
    // Choose the concrete cache type according to the value of the cache field
    switch (haloProperties.getCache()) {
        case "level":
            stringCacheStore = new LevelCacheStore(this.haloProperties);
            break;
        case "redis":
            stringCacheStore = new RedisCacheStore(stringRedisTemplate);
            break;
        case "memory":
        default:
            stringCacheStore = new InMemoryCacheStore();
            break;
    }
    log.info("Halo cache store load impl : [{}]", stringCacheStore.getClass());
    return stringCacheStore;
}

The code above is from version 1.5.2.

The cache field defaults to "memory", so the cache implementation is InMemoryCacheStore (the custom cache):

public class InMemoryCacheStore extends AbstractStringCacheStore {

    /**
     * Cleaner schedule period. (ms)
     */
    private static final long PERIOD = 60 * 1000;

    /**
     * Cache container.
     */
    public static final ConcurrentHashMap<String, CacheWrapper<String>> CACHE_CONTAINER =
        new ConcurrentHashMap<>();

    private final Timer timer;

    /**
     * Lock.
     */
    private final Lock lock = new ReentrantLock();

    public InMemoryCacheStore() {
        // Run a cache store cleaner
        timer = new Timer();
        // Remove expired keys every 60 s
        timer.scheduleAtFixedRate(new CacheExpiryCleaner(), 0, PERIOD);
    }
    // Part of the code omitted
}

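The member variables of InMemoryCacheStore have the following meanings:

CACHE_CONTAINER is the cache container of InMemoryCacheStore, a ConcurrentHashMap. It is used for thread safety: the cache holds cache-lock data (described below), and whenever users call backend services, new entries are written to the cache, possibly from different threads, so CACHE_CONTAINER must tolerate concurrent access from multiple threads.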

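timer runs a periodic task every PERIOD milliseconds (one minute by default); the task removes keys that have already expired from the cache.
lock is an exclusive lock of type ReentrantLock and is related to the cache lock.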
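
To make the thread-safety point concrete, here is a minimal sketch (not Halo code) in which several threads write to one ConcurrentHashMap at the same time. The class name ConcurrentContainerSketch, the CONTAINER field, and the key format are all made up for illustration; only ConcurrentHashMap itself comes from the article.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

final class ConcurrentContainerSketch {

    // Hypothetical container standing in for CACHE_CONTAINER.
    static final ConcurrentHashMap<String, String> CONTAINER = new ConcurrentHashMap<>();

    /** Starts `threads` writers that each put `perThread` distinct keys, then returns the map size. */
    static int fillConcurrently(int threads, int perThread) {
        CONTAINER.clear();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    // Concurrent puts are safe on a ConcurrentHashMap.
                    CONTAINER.put("key-" + id + "-" + i, "value");
                }
                done.countDown();
            }).start();
        }
        try {
            // Wait until every writer has finished.
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return CONTAINER.size();
    }
}
```

Because every writer uses distinct keys, no entry should be lost: fillConcurrently(8, 1000) is expected to leave 8000 entries in the map. A plain HashMap gives no such guarantee under concurrent writes.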

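2. Data in the cache

The cache stores:

Option values from the system settings, that is, the data stored in the options table.
The token of the logged-in user (the blog owner).
The sessionId of clients that have been granted access to a post.
Cache-lock data.

Earlier posts already covered how tokens and sessionIds are stored and retrieved, so that part is not repeated here; see Halo Open Source Project Study (3): Registration and Login and Halo Open Source Project Study (4): Publishing Posts and Pages. The cache lock is covered in the next section; this section looks at how Halo stores the options data.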

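First, when is the options data put into the cache? After startup the application publishes an ApplicationStartedEvent. The project defines a listener for this event, StartedListener (in the listener package), which calls the initThemes method once the event is published. Part of initThemes is shown below: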

private void initThemes() {
    // Whether the blog has initialized
    Boolean isInstalled = optionService
        .getByPropertyOrDefault(PrimaryProperties.IS_INSTALLED, Boolean.class, false);
    // Part of the code omitted
}

This method calls getByPropertyOrDefault to look up the blog's installation status from the cache. Following the call chain down from getByPropertyOrDefault leads to the getByKey method in the OptionProvideService interface:

default Optional<Object> getByKey(@NonNull String key) {
    Assert.hasText(key, "Option key must not be blank");
    // If val = listOptions().get(key) is not null, return an Optional holding val; otherwise return an empty Optional
    return Optional.ofNullable(listOptions().get(key));
}

The key part is the listOptions method, which is defined in the OptionServiceImpl class:

public Map<String, Object> listOptions() {
    // Get options from cache
    // Get the data for the "options" key from the cache CACHE_CONTAINER and convert it to a Map
    return cacheStore.getAny(OPTIONS_KEY, Map.class).orElseGet(() -> {
        // On the first call, all Option objects must be loaded from the options table
        List<Option> options = listAll();
        // The set of keys of all Option objects
        Set<String> keys = ServiceUtils.fetchProperty(options, Option::getKey);

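        /*
            * The records in the options table are the options defined by the user; the table is updated
            * automatically when the user changes the blog settings. Halo assigns a fixed type to the value
            * of some options. For example, in EmailProperties, HOST is a String while SSL_PORT is an
            * Integer. Because value in the Option class is always a String, some values have to be
            * converted to the specified type.
            */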
        Map<String, Object> userDefinedOptionMap =
            ServiceUtils.convertToMap(options, Option::getKey, option -> {
                String key = option.getKey();

                PropertyEnum propertyEnum = propertyEnumMap.get(key);

                if (propertyEnum == null) {
                    return option.getValue();
                }
                // Convert the value to the target type
                return PropertyEnum.convertTo(option.getValue(), propertyEnum);
            });

        Map<String, Object> result = new HashMap<>(userDefinedOptionMap);

        // Add default property
        /*
            * Some options have defaults set by Halo, such as SSL_PORT in EmailProperties: when the user has
            * not set it, it defaults to 465. The default "465" must likewise be converted to the Integer 465.
            */
        propertyEnumMap.keySet()
            .stream()
            .filter(key -> !keys.contains(key))
            .forEach(key -> {
                PropertyEnum propertyEnum = propertyEnumMap.get(key);

                if (StringUtils.isBlank(propertyEnum.defaultValue())) {
                    return;
                }
                // Convert the value and put it into result
                result.put(key,
                    PropertyEnum.convertTo(propertyEnum.defaultValue(), propertyEnum));
            });

        // Cache the result
        // Put all options into the cache
        cacheStore.putAny(OPTIONS_KEY, result);

        return result;
    });
}

The server first reads the value for the "options" key from CACHE_CONTAINER and converts it into a Map. On the first lookup CACHE_CONTAINER has no value for "options", so it has to be initialized:

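First, all Option objects are loaded from the options table and put into a Map whose keys and values are the key and value of each Option. The value still needs a type conversion, because the Option class declares value as a String. For example, the value for "is_installed" is "true"; to use it properly, the string "true" has to be converted to the Boolean true. From the surrounding code, the program determines the target type Boolean from the enum constant IS_INSTALLED("is_installed", Boolean.class, "false") defined in PrimaryProperties (an enum that implements PropertyEnum).
The options table holds the options the user has set. Halo also defines default options, all declared in implementations of PropertyEnum, for example SSL_PORT("email_ssl_port", Integer.class, "465") in EmailProperties, whose key is "email_ssl_port" and whose value is "465". For every such key that is missing from the table and has a non-blank default, the server puts the key-value pair into the Map as well, converting the value to its declared type.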
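
The two steps above can be sketched in plain Java. This is a simplified illustration, not Halo's implementation: DemoProperty is a hypothetical stand-in for the PropertyEnum implementations, convert is a hypothetical replacement for PropertyEnum.convertTo that only handles Boolean, Integer, and String, and the stored options are passed in as a plain Map instead of being read from the options table.

```java
import java.util.HashMap;
import java.util.Map;

final class OptionConversionSketch {

    /** Hypothetical stand-in for a PropertyEnum: key, target type, default value. */
    enum DemoProperty {
        IS_INSTALLED("is_installed", Boolean.class, "false"),
        EMAIL_SSL_PORT("email_ssl_port", Integer.class, "465"),
        EMAIL_HOST("email_host", String.class, "");

        final String key;
        final Class<?> type;
        final String defaultValue;

        DemoProperty(String key, Class<?> type, String defaultValue) {
            this.key = key;
            this.type = type;
            this.defaultValue = defaultValue;
        }
    }

    /** Returns the property registered for the key, or null if the key is unknown. */
    static DemoProperty find(String key) {
        for (DemoProperty property : DemoProperty.values()) {
            if (property.key.equals(key)) {
                return property;
            }
        }
        return null;
    }

    /** Converts the raw string to the target type; anything else stays a String. */
    static Object convert(String raw, Class<?> type) {
        if (type == Boolean.class) {
            return Boolean.valueOf(raw);
        }
        if (type == Integer.class) {
            return Integer.valueOf(raw);
        }
        return raw;
    }

    /** Builds the typed option map from the stored (string-valued) options. */
    static Map<String, Object> buildOptions(Map<String, String> stored) {
        Map<String, Object> result = new HashMap<>();
        // Step 1: user-defined options, converted when the key has a known type.
        for (Map.Entry<String, String> entry : stored.entrySet()) {
            DemoProperty property = find(entry.getKey());
            result.put(entry.getKey(),
                property == null ? entry.getValue() : convert(entry.getValue(), property.type));
        }
        // Step 2: defaults for keys the user has not set, skipping blank defaults.
        for (DemoProperty property : DemoProperty.values()) {
            if (!stored.containsKey(property.key) && !property.defaultValue.trim().isEmpty()) {
                result.put(property.key, convert(property.defaultValue, property.type));
            }
        }
        return result;
    }
}
```

With a stored map containing only "is_installed" = "true", the result holds the Boolean true for "is_installed" and the Integer 465 for "email_ssl_port", while "email_host" is absent because its default is blank.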

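That is the logic of listOptions. Back in getByKey, once the Map returned by listOptions is available, the server can look up the value (such as true) for a given key (such as "is_installed"). When the user changes the blog's system settings in the admin console, the server updates the options table according to the user's configuration and publishes an OptionUpdatedEvent; the listener handling that event then deletes "options" from the cache, and the next lookup runs the initialization described above again (see the onOptionUpdate method in FreemarkerConfigAwareListener).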
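
The load-on-miss and delete-on-update pattern described here can be sketched as follows. This is an assumption-laden illustration rather than Halo's code: OptionsCacheSketch, its loader (which stands in for reading the options table), the loadCount counter, and the onOptionUpdated method (which stands in for the event listener) are all hypothetical names.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class OptionsCacheSketch {

    static final String OPTIONS_KEY = "options";

    private final Map<String, Map<String, Object>> cache = new ConcurrentHashMap<>();
    // Hypothetical loader standing in for "read everything from the options table".
    private final Supplier<Map<String, Object>> loader;
    // Counts how often the loader ran, only to make the behaviour observable.
    int loadCount = 0;

    OptionsCacheSketch(Supplier<Map<String, Object>> loader) {
        this.loader = loader;
    }

    /** Returns the cached options, loading and caching them on a miss. */
    Map<String, Object> listOptions() {
        return Optional.ofNullable(cache.get(OPTIONS_KEY)).orElseGet(() -> {
            loadCount++;
            Map<String, Object> loaded = loader.get();
            cache.put(OPTIONS_KEY, loaded);
            return loaded;
        });
    }

    /** What a listener would do after an update: drop the cached entry. */
    void onOptionUpdated() {
        cache.remove(OPTIONS_KEY);
    }
}
```

Until onOptionUpdated is called, listOptions keeps returning the cached snapshot even if the backing data has changed; after the call, the next listOptions reloads it. That is why the update path has to publish the event.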

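3. Cache expiration

Expiration handling is an important part of any cache: once data has expired, it usually has to be removed. Tracing the cacheStore.putAny(OPTIONS_KEY, result) call above shows that, before storing data in the cache, the server wraps it in a CacheWrapper object: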

class CacheWrapper<V> implements Serializable {

    /**
     * Cache data
     */
    private V data;

    /**
     * Expired time.
     */
    private Date expireAt;

    /**
     * Create time.
     */
    private Date createAt;
}

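Here data is the value to store, and createAt and expireAt are its creation and expiration times. In Halo, "options" has no expiration time; the listener removes the old data only when the data is updated. Tokens and sessionIds, by contrast, do expire, and the project handles keys with an expiration time as well. Take the token: when the interceptor intercepts a request, it verifies the user's identity by checking whether the cache holds a user id for the token. Underneath, that lookup calls the get method (defined in the AbstractCacheStore class):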

public Optional<V> get(K key) {
    Assert.notNull(key, "Cache key must not be blank");

    return getInternal(key).map(cacheWrapper -> {
        // Check expiration
        // Expired
        if (cacheWrapper.getExpireAt() != null
            && cacheWrapper.getExpireAt().before(run.halo.app.utils.DateUtils.now())) {
            // Expired then delete it
            log.warn("Cache key: [{}] has been expired", key);

            // Delete the key
            delete(key);

            // Return null
            return null;
        }
        // Not expired: return the cached data
        return cacheWrapper.getData();
    });
}

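After fetching the CacheWrapper for the key, the server checks its expiration time. If the data has expired, it deletes the entry and the mapper returns null, so get returns an empty Optional. In addition, as mentioned above, the periodic task of timer (a member of InMemoryCacheStore) also removes expired data. The task runs the following method: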

private class CacheExpiryCleaner extends TimerTask {

    @Override
    public void run() {
        CACHE_CONTAINER.keySet().forEach(key -> {
            if (!InMemoryCacheStore.this.get(key).isPresent()) {
                log.debug("Deleted the cache: [{}] for expiration", key);
            }
        });
    }
}

As shown, the periodic task also removes expired data by calling get.
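
Both removal paths, lazy deletion on read and the periodic sweep, can be sketched together. This is a simplified illustration under stated assumptions, not Halo's implementation: ExpiringCacheSketch and Wrapper are hypothetical names, expiry is stored as epoch milliseconds instead of a Date, and the current time is passed in as a parameter (instead of reading the system clock) so that the behaviour is deterministic. The Timer that would call sweep once a minute is left out.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class ExpiringCacheSketch {

    /** Value plus an optional absolute expiry time in milliseconds (null = never expires). */
    static final class Wrapper {
        final String data;
        final Long expireAtMillis;

        Wrapper(String data, Long expireAtMillis) {
            this.data = data;
            this.expireAtMillis = expireAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Wrapper> container = new ConcurrentHashMap<>();

    /** Stores the value; a null ttlMillis means the entry never expires. */
    void put(String key, String value, Long ttlMillis, long nowMillis) {
        Long expireAt = null;
        if (ttlMillis != null) {
            expireAt = nowMillis + ttlMillis;
        }
        container.put(key, new Wrapper(value, expireAt));
    }

    /** Lazy deletion: an expired entry is removed when it is read. */
    Optional<String> get(String key, long nowMillis) {
        Wrapper wrapper = container.get(key);
        if (wrapper == null) {
            return Optional.empty();
        }
        if (wrapper.expireAtMillis != null && wrapper.expireAtMillis < nowMillis) {
            container.remove(key);
            return Optional.empty();
        }
        return Optional.of(wrapper.data);
    }

    /** Periodic sweep: reading every key triggers the lazy deletion above. */
    void sweep(long nowMillis) {
        container.keySet().forEach(key -> get(key, nowMillis));
    }

    int size() {
        return container.size();
    }
}
```

Lazy deletion alone would leave entries that are never read again in memory indefinitely; the sweep bounds how long such entries can linger.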

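Cache lock

The cache lock is another interesting module in Halo. It limits how often a user can call a given feature, which can be seen as locking the requested method. It is implemented mainly with the custom annotation @CacheLock and AOP. @CacheLock is defined as follows: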

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@Documented
@Inherited
public @interface CacheLock {

    @AliasFor("value")
    String prefix() default "";


    @AliasFor("prefix")
    String value() default "";


    long expired() default 5;


    TimeUnit timeUnit() default TimeUnit.SECONDS;


    String delimiter() default ":";


    boolean autoDelete() default true;


    boolean traceRequest() default false;
}

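The members mean:

prefix: the prefix used to build cacheLockKey (a string).
value: same as prefix.
expired: how long the cache lock lasts.
timeUnit: the unit of that duration.
delimiter: the separator used when building cacheLockKey.
autoDelete: whether to delete the cache lock automatically.
traceRequest: whether to track the request's IP; if so, the user's IP is appended when building cacheLockKey.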
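
For readers less familiar with custom annotations, here is a small sketch of how such members can be declared and read back through reflection. DemoLock and DemoController are hypothetical names, and the annotation is a cut-down imitation of @CacheLock: it omits the Spring-specific @AliasFor pairing of prefix and value, as well as delimiter and traceRequest.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.concurrent.TimeUnit;

final class AnnotationReadSketch {

    /** Hypothetical, reduced imitation of @CacheLock. */
    @Target(ElementType.METHOD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface DemoLock {
        String prefix() default "";

        long expired() default 5;

        TimeUnit timeUnit() default TimeUnit.SECONDS;

        boolean autoDelete() default true;
    }

    /** Hypothetical controller with one annotated method. */
    static class DemoController {
        @DemoLock(prefix = "like", expired = 2, autoDelete = false)
        public void like() {
            // Nothing to do; only the annotation matters here.
        }
    }

    /** Reads the annotation on a public no-arg method and summarizes its members. */
    static String describe(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            DemoLock lock = method.getAnnotation(DemoLock.class);
            if (lock == null) {
                return "none";
            }
            long millis = lock.timeUnit().toMillis(lock.expired());
            return lock.prefix() + ":" + millis + "ms:" + lock.autoDelete();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

RetentionPolicy.RUNTIME is what makes getAnnotation work here; with a shorter retention the annotation would not be visible to reflection, and therefore not to an aspect either.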

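To use the cache lock, add the @CacheLock annotation to the method that needs locking; Spring AOP then locks the method before it runs and releases the lock after it finishes. The aspect class in the project is CacheLockInterceptor, and its lock/unlock logic is as follows: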

@Around("@annotation(run.halo.app.cache.lock.CacheLock)")
public Object interceptCacheLock(ProceedingJoinPoint joinPoint) throws Throwable {
    // Get method signature
    MethodSignature methodSignature = (MethodSignature) joinPoint.getSignature();

    log.debug("Starting locking: [{}]", methodSignature.toString());

    // Get the CacheLock annotation on the method
    CacheLock cacheLock = methodSignature.getMethod().getAnnotation(CacheLock.class);
    // Build cache lock key
    String cacheLockKey = buildCacheLockKey(cacheLock, joinPoint);
    log.debug("Built lock key: [{}]", cacheLockKey);

    try {
        // Get from cache
        Boolean cacheResult = cacheStore
            .putIfAbsent(cacheLockKey, CACHE_LOCK_VALUE, cacheLock.expired(),
                cacheLock.timeUnit());

        if (cacheResult == null) {
            throw new ServiceException("Unknown reason of cache " + cacheLockKey)
                .setErrorData(cacheLockKey);
        }

        if (!cacheResult) {
            throw new FrequentAccessException("訪問過於頻繁，請稍後再試！").setErrorData(cacheLockKey);
        }
        // Proceed with the annotated method
        return joinPoint.proceed();
    } finally {
        // After the method finishes, delete the cache lock if autoDelete is set
        if (cacheLock.autoDelete()) {
            cacheStore.delete(cacheLockKey);
            log.debug("Deleted the cache lock: [{}]", cacheLock);
        }
    }
}

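@Around("@annotation(run.halo.app.cache.lock.CacheLock)") means that when a requested method is annotated with @CacheLock, the call is routed through interceptCacheLock, which decides whether and when the original method runs:

Get the CacheLock annotation on the method and build cacheLockKey.
Check whether cacheLockKey exists in the cache. If it does, throw an exception telling the user that requests are too frequent. If it does not, store cacheLockKey in the cache (valid for expired) and run the requested method.
If autoDelete in the CacheLock annotation is true, delete cacheLockKey as soon as the method finishes.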
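
The three steps above can be sketched without Spring. This is a rough illustration under several assumptions, not the interceptor itself: CacheLockSketch, tryLock, and withLock are hypothetical names, the current time is passed in explicitly so the example is deterministic, IllegalStateException stands in for FrequentAccessException, and the lock table is a ConcurrentHashMap whose compute method makes the "check, then store with expiry" step atomic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class CacheLockSketch {

    /** Lock key -> absolute expiry time in milliseconds. */
    private final ConcurrentHashMap<String, Long> locks = new ConcurrentHashMap<>();

    /** Takes the lock unless an unexpired one already exists. */
    boolean tryLock(String key, long ttlMillis, long nowMillis) {
        boolean[] acquired = {false};
        // compute runs atomically for the key, so check and store cannot interleave.
        locks.compute(key, (k, expireAt) -> {
            if (expireAt != null && expireAt > nowMillis) {
                return expireAt; // still locked: keep the existing expiry
            }
            acquired[0] = true;
            return nowMillis + ttlMillis;
        });
        return acquired[0];
    }

    /** Runs the action under the lock; rejects the call if the lock is taken. */
    <T> T withLock(String key, long ttlMillis, boolean autoDelete, long nowMillis,
        Supplier<T> action) {
        if (!tryLock(key, ttlMillis, nowMillis)) {
            throw new IllegalStateException("Too many requests: " + key);
        }
        try {
            return action.get();
        } finally {
            // With autoDelete the lock only lasts as long as the call itself.
            if (autoDelete) {
                locks.remove(key);
            }
        }
    }
}
```

One deliberate difference from the listing above: the interceptor calls putIfAbsent inside its try block, so its finally clause also runs for a rejected call. The sketch acquires the lock before entering try, so a rejected call leaves the existing lock in place.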

The cache lock works much like Redis's setnx + expire: if the key already exists, it cannot be added again. The logic for building cacheLockKey is shown below:

private String buildCacheLockKey(@NonNull CacheLock cacheLock,
    @NonNull ProceedingJoinPoint joinPoint) {
    Assert.notNull(cacheLock, "Cache lock must not be null");
    Assert.notNull(joinPoint, "Proceeding join point must not be null");

    // Get the method
    MethodSignature methodSignature = (MethodSignature) joinPoint.getSignature();

    // Build the cache lock key, starting with the key prefix
    StringBuilder cacheKeyBuilder = new StringBuilder(CACHE_LOCK_PREFIX);
    // Delimiter
    String delimiter = cacheLock.delimiter();
    // If CacheLock sets a prefix, use it; otherwise use the method signature
    if (StringUtils.isNotBlank(cacheLock.prefix())) {
        cacheKeyBuilder.append(cacheLock.prefix());
    } else {
        cacheKeyBuilder.append(methodSignature.getMethod().toString());
    }
    // Handle cache lock key building: collect the values of parameters annotated with CacheParam
    Annotation[][] parameterAnnotations = methodSignature.getMethod().getParameterAnnotations();

    for (int i = 0; i < parameterAnnotations.length; i++) {
        log.debug("Parameter annotation[{}] = {}", i, parameterAnnotations[i]);

        for (int j = 0; j < parameterAnnotations[i].length; j++) {
            Annotation annotation = parameterAnnotations[i][j];
            log.debug("Parameter annotation[{}][{}]: {}", i, j, annotation);
            if (annotation instanceof CacheParam) {
                // Get current argument
                Object arg = joinPoint.getArgs()[i];
                log.debug("Cache param args: [{}]", arg);

                // Append to the cache key
                cacheKeyBuilder.append(delimiter).append(arg.toString());
            }
        }
    }
    // Whether to append the request IP
    if (cacheLock.traceRequest()) {
        // Append http request info
        cacheKeyBuilder.append(delimiter).append(ServletUtils.getRequestIp());
    }
    return cacheKeyBuilder.toString();
}

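So cacheLockKey has the structure cache_lock_ + the prefix set in the CacheLock annotation or the method signature + delimiter + the values of the parameters annotated with CacheParam + delimiter + the request IP, for example: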

cache_lock_public void run.halo.app.controller.content.api.PostController.like(java.lang.Integer):1:127.0.0.1
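
The key structure can be reproduced with a few lines of plain Java. This sketch is only an approximation of buildCacheLockKey: LockKeySketch and buildKey are hypothetical names, the method signature and the CacheParam values are passed in as ready-made arguments instead of being read from the join point, and a null requestIp plays the role of traceRequest = false.

```java
final class LockKeySketch {

    // Prefix taken from the example key above.
    static final String PREFIX = "cache_lock_";

    /**
     * Joins the parts of a lock key.
     *
     * @param prefix          prefix from the annotation; blank means "use the signature"
     * @param methodSignature string form of the locked method
     * @param delimiter       separator between the parts
     * @param cacheParams     values of the parameters that narrow the lock
     * @param requestIp       caller IP, or null when the IP is not tracked
     */
    static String buildKey(String prefix, String methodSignature, String delimiter,
        Object[] cacheParams, String requestIp) {
        StringBuilder builder = new StringBuilder(PREFIX);
        if (prefix != null && !prefix.trim().isEmpty()) {
            builder.append(prefix);
        } else {
            builder.append(methodSignature);
        }
        for (Object param : cacheParams) {
            builder.append(delimiter).append(param);
        }
        if (requestIp != null) {
            builder.append(delimiter).append(requestIp);
        }
        return builder.toString();
    }
}
```

Calling buildKey with a blank prefix, the signature of PostController.like, ":" as the delimiter, the single parameter 1, and the IP 127.0.0.1 yields exactly the example key shown above.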

Like CacheLock, CacheParam is an annotation defined for the cache lock. It narrows the lock's granularity down to a specific entity, as in the like request:

@PostMapping("{postId:\\d+}/likes")
@ApiOperation("Likes a post")
@CacheLock(autoDelete = false, traceRequest = true)
public void like(@PathVariable("postId") @CacheParam Integer postId) {
    postService.increaseLike(postId);
}

The postId parameter is annotated with CacheParam, so by the logic of buildCacheLockKey it becomes part of cacheLockKey. What gets locked is therefore "liking the post whose id equals postId", not the like method as a whole.

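The traceRequest parameter of CacheLock matters too. If traceRequest is true, the request IP is added to cacheLockKey, so the cache lock only limits how often the same IP can call a method, and different IPs do not affect each other. If traceRequest is false, the key contains no IP and the lock is shared by all clients (a global lock rather than a per-IP one): different IPs cannot use the same feature at the same time. For example, after one user likes a post, other users cannot like that post for a short time.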

Finally, let's look at the putIfAbsent method (called in interceptCacheLock), which works like Redis's setnx. Its logic can be traced to the putInternalIfAbsent method in the InMemoryCacheStore class:

Boolean putInternalIfAbsent(@NonNull String key, @NonNull CacheWrapper<String> cacheWrapper) {
    Assert.hasText(key, "Cache key must not be blank");
    Assert.notNull(cacheWrapper, "Cache wrapper must not be null");

    log.debug("Preparing to put key: [{}], value: [{}]", key, cacheWrapper);
    // Acquire the lock
    lock.lock();
    try {
        // Get the current value for the key
        Optional<String> valueOptional = get(key);
        // If a value is present, return false
        if (valueOptional.isPresent()) {
            log.warn("Failed to put the cache, because the key: [{}] has been present already",
                key);
            return false;
        }
        // Put the cache wrapper and return true
        putInternal(key, cacheWrapper);
        log.debug("Put successfully");
        return true;
    } finally {
        // Release the lock
        lock.unlock();
    }
}

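As mentioned in the previous section, the custom cache InMemoryCacheStore has a member lock of type ReentrantLock. Its job is to make putInternalIfAbsent thread-safe, because multiple threads may add cacheLockKey entries to the container concurrently. Without lock, when several threads operate on the same cacheLockKey at once, each may find that the cache does not contain it, so putInternalIfAbsent returns true for all of them and they all go on to run the method at the same time. Holding lock prevents this.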
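
The effect of the lock can be checked with a small experiment: many threads race to add the same key, and exactly one of them should win. The sketch below is a simplified stand-in, not Halo's store: PutIfAbsentSketch and countWinners are hypothetical names, values are plain strings, and the expiry check that Halo's get performs is reduced to a containsKey call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class PutIfAbsentSketch {

    private final Map<String, String> container = new ConcurrentHashMap<>();
    private final Lock lock = new ReentrantLock();

    /** Check-then-put under one lock, so the two steps cannot interleave. */
    boolean putIfAbsent(String key, String value) {
        lock.lock();
        try {
            if (container.containsKey(key)) {
                return false;
            }
            container.put(key, value);
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Lets `threads` threads race for the same key and counts how many succeed. */
    static int countWinners(int threads) {
        PutIfAbsentSketch store = new PutIfAbsentSketch();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                try {
                    // Hold all threads here so they start racing together.
                    start.await();
                    if (store.putIfAbsent("cache_lock_demo", "locked")) {
                        winners.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return winners.get();
    }
}
```

For a bare presence check, ConcurrentHashMap.putIfAbsent would already be atomic. A separate lock makes sense in Halo's version because its check goes through get, which also evaluates expiry and may delete the entry, so the check and the put are two operations that have to be made atomic together.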

Conclusion

That wraps up this look at Halo's caching mechanism. If I have misunderstood anything, corrections are welcome ( ̳• ◡ • ̳).
