PostgreSQL VACUUM 之深入淺出 (三)

VACUUM 相關參數

對 VACUUM 有了一定的了解之後,下面系統介紹下 VACUUM 相關參數。

VACUUM 相關參數主要分為三大類。

第一類 與資源相關參數

#-----------------------------
# RESOURCE USAGE (except WAL)
#-----------------------------
# - Memory -
#maintenance_work_mem = 64MB            # min 1MB
#autovacuum_work_mem = -1               # min 1MB, or -1 to use maintenance_work_mem
# - Cost-Based Vacuum Delay -
#vacuum_cost_delay = 0                  # 0-100 milliseconds (0 disables)
#vacuum_cost_page_hit = 1               # 0-10000 credits
#vacuum_cost_page_miss = 10             # 0-10000 credits
#vacuum_cost_page_dirty = 20            # 0-10000 credits
#vacuum_cost_limit = 200                # 1-10000 credits

這裡有兩部分。

第一部分是記憶體相關。主要是 autovacuum_work_mem,默認值為 -1,即同 maintenance_work_memmaintenance_work_mem 默認值為 64MB。

第二部分是 Cost-Based Vacuum Delay。

當 VACUUM 工作超出一定量之後,會 sleep 一段時間。

一定量是多少呢?是 vacuum_cost_limit。默認值為 200。

sleep 多長時間呢?是 vacuum_cost_delay 。默認值是 0,即不 sleep。

工作量又是怎麼算出來的?根據要 VACUUM 的 page 的不同,其 cost 是不一樣的。

以下是三種不同 page 的 cost,默認值分別為 1、10、20,基本不用調整。

vacuum_cost_page_hit – The estimated cost for vacuuming a buffer found in the shared buffer cache.

vacuum_cost_page_miss – The estimated cost for vacuuming a buffer that has to be read from disk.

vacuum_cost_page_dirty – The estimated cost charged when vacuum modifies a block that was previously clean.

日常工作中手動 VACUUM 時主要調整 vacuum_cost_limitvacuum_cost_delay 。如調整為:

vacuum_cost_delay = 2
vacuum_cost_limit = 2000

即當 VACUUM 工作量超出 2000 之後,sleep 2ms。

需要注意,手動 VACUUM 和 AUTOVACUUM 的參數是不一樣的。當 AUTOVACUUM 參數為 -1 時,則同手動 VACUUM 參數。

手動 VACUUM 對應的參數是 maintenance_work_memvacuum_cost_delayvacuum_cost_limit

AUTOVACUUM 對應的參數是 autovacuum_work_memautovacuum_vacuum_cost_delayautovacuum_vacuum_cost_limit

可以從下面 AUTOVACUUM 參數中可以看到, autovacuum_vacuum_cost_delay 默認值為 20ms,這樣的話,AUTOVACUUM 運行時其對資料庫影響較小。PostgreSQL 12 開始,其默認值調整為了 2ms。

#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for
                                        # autovacuum, in milliseconds;
                                        # -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
                                        # autovacuum, -1 means use
                                        # vacuum_cost_limit

第二類 AUTOVACUUM 相關參數

#------------------------------------------------------------------------------
# AUTOVACUUM
#------------------------------------------------------------------------------

#autovacuum = on                        # Enable autovacuum subprocess?  'on'
                                        # requires track_counts to also be on.
#log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and
                                        # their durations, > 0 logs only
                                        # actions running at least this number
                                        # of milliseconds.
#autovacuum_max_workers = 3             # max number of autovacuum subprocesses
                                        # (change requires restart)
#autovacuum_naptime = 1min              # time between autovacuum runs
#autovacuum_vacuum_threshold = 50       # min number of row updates before
                                        # vacuum
#autovacuum_analyze_threshold = 50      # min number of row updates before
                                        # analyze
#autovacuum_vacuum_scale_factor = 0.2   # fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum
                                        # (change requires restart)
#autovacuum_multixact_freeze_max_age = 400000000        # maximum multixact age
                                        # before forced vacuum
                                        # (change requires restart)
#autovacuum_vacuum_cost_delay = 20ms    # default vacuum cost delay for
                                        # autovacuum, in milliseconds;
                                        # -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1      # default vacuum cost limit for
                                        # autovacuum, -1 means use
                                        # vacuum_cost_limit

以下參數前面已有提到,一般調整為通用配置後基本不調整,調整的話多是調整表級的參數,即根據不同的表設置不同的參數

#autovacuum = on
#log_autovacuum_min_duration = -1
#autovacuum_naptime = 1min
#autovacuum_vacuum_threshold = 50
#autovacuum_analyze_threshold = 50
#autovacuum_vacuum_scale_factor = 0.2
#autovacuum_analyze_scale_factor = 0.1
#autovacuum_vacuum_cost_delay = 20ms
#autovacuum_vacuum_cost_limit = -1

下面兩個參數是當某個表的 age 達到一定閾值後,AUTOVACUUM 會對整個資料庫實例進行 aggressive vacuum 以避免 wraparound,即使表沒有 dead tuple。資料庫運行良好的話,很少會觸發。

#autovacuum_freeze_max_age = 200000000
#autovacuum_multixact_freeze_max_age = 400000000

當資料庫中表比較多,甚至一個實例中資料庫也比較多的情況,可適當增大 autovacuum_max_workers

#autovacuum_max_workers = 3             # max number of autovacuum subprocesses
                                        # (change requires restart)

問題來了,增大 autovacuum_max_workers 後,一定會提高 AUTOVACUUM 速度嗎?

這裡需要注意,autovacuum_vacuum_cost_limit 是所有 autovacuum worker 所用 cost 之和達到 limit 之後 sleep,增大 autovacuum_max_workers 之後,每個 worker 平均的 cost limit 就小了,即就相對更容易達到 limit,這樣做同樣的工作,就會 sleep 更多的時間,反而就更慢了。

所以,在增大 autovacuum_max_workers 之後,可以相應比例增大 autovacuum_vacuum_cost_limit

第三類 FREEZE 相關參數

以下是 FREEZE 相關參數,以後將系統介紹 FREEZE,本文不再展開討論。

#------------------------------------------
# CLIENT CONNECTION DEFAULTS
#------------------------------------------
#vacuum_freeze_min_age = 50000000
#vacuum_freeze_table_age = 150000000
#vacuum_multixact_freeze_min_age = 5000000
#vacuum_multixact_freeze_table_age = 150000000

公眾號

關注 DBA Daily 公眾號,第一時間收到文章的更新。
通過一線 DBA 的日常工作,學習實用資料庫技術乾貨!

公眾號優質文章推薦

PostgreSQL VACUUM 之深入淺出

華山論劍之 PostgreSQL sequence

[PG Upgrade Series] Extract Epoch Trap

[PG Upgrade Series] Toast Dump Error

GitLab supports only PostgreSQL now

MySQL or PostgreSQL?

PostgreSQL hstore Insight

ReIndex 失敗原因調查

PG 數據導入 Hive 亂碼問題調查

PostGIS 擴展創建失敗原因調查