Image Caching in Edge Cluster Scenarios
1. Problem Background
When a huge number of edge nodes pull the same image from the private registry in the central cloud, they put considerable pressure on the cloud servers. If each image were pulled from the cloud only once and cached on the edge side, subsequent pulls by edge nodes could be served straight from the cache, greatly reducing the load on the cloud servers, somewhat like a CDN for images.
This experiment needs at least three nodes: one cloud node to host our private image registry, and two edge nodes, one of which caches images while the other is used for the pull tests.
2. Environment Setup
2.1. Setting Up the Harbor Private Registry in the Cloud
2.1.1. Install the Docker Environment
# Install from the official repository
https://docs.docker.com/install/linux/docker-ce/centos
# Install from static binaries
https://docs.docker.com/install/linux/docker-ce/binaries
https://download.docker.com/linux/static/stable  # download the binaries
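For reference, a minimal sketch of the repository-based install on a CentOS host (standard upstream package names and repo URL; adjust for your distribution):
# sketch: install Docker CE from the official yum repository on CentOS
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io
# start Docker and enable it on boot
systemctl enable --now docker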
2.1.2. Install docker-compose
curl -L https://github.com/docker/compose/releases/download/1.27.4/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
docker-compose --version
2.1.3. Install Harbor
wget https://github.com/goharbor/harbor/releases/download/v2.1.1/harbor-offline-installer-v2.1.1.tgz
# extract into the home directory
tar xvf harbor-offline-installer-v2.1.1.tgz -C /home/xing && cd /home/xing/harbor/
Edit the harbor.yml file; the directories referenced in the configuration need to be created in advance.
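Based on the paths used in the configuration below, the required directories can be created up front, for example (adjust if your paths differ):
# certificate dir, data volume and log dir referenced in harbor.yml
mkdir -p /home/xing/harbor/certs /home/xing/harbor/data /var/log/harbor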
# Configuration file of Harbor
# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: harbor.xing.com
# http related config
#http:
# port for http, default is 80. If https enabled, this port will redirect to https port
# port: 80
# https related config
https:
  # https port for harbor, default is 443
  port: 443
  # The path of cert and key files for nginx
  certificate: /home/xing/harbor/certs/harbor.crt
  private_key: /home/xing/harbor/certs/harbor.key
# # Uncomment following will enable tls communication between all harbor components
# internal_tls:
# # set enabled to true means internal tls is enabled
# enabled: true
# # put your cert and key files on dir
# dir: /etc/harbor/tls/internal
# Uncomment external_url if you want to enable external proxy
# And when it enabled the hostname will no longer used
# external_url: https://reg.mydomain.com:8433
# The initial password of Harbor admin
# It only works in first time to install harbor
# Remember Change the admin password from UI after launching Harbor.
harbor_admin_password: Harbor12345
# Harbor DB configuration
database:
  # The password for the root user of Harbor DB. Change this before any production use.
  password: root123
  # The maximum number of connections in the idle connection pool. If it <=0, no idle connections are retained.
  max_idle_conns: 50
  # The maximum number of open connections to the database. If it <= 0, then there is no limit on the number of open connections.
  # Note: the default number of connections is 1024 for postgres of harbor.
  max_open_conns: 1000
# The default data volume
data_volume: /home/xing/harbor/data
# Harbor Storage settings by default is using /data dir on local filesystem
# Uncomment storage_service setting If you want to using external storage
# storage_service:
# # ca_bundle is the path to the custom root ca certificate, which will be injected into the truststore
# # of registry's and chart repository's containers. This is usually needed when the user hosts a internal storage with self signed certificate.
# ca_bundle:
# # storage backend, default is filesystem, options include filesystem, azure, gcs, s3, swift and oss
# # for more info about this configuration please refer https://docs.docker.com/registry/configuration/
# filesystem:
# maxthreads: 100
# # set disable to true when you want to disable registry redirect
# redirect:
# disabled: false
# Trivy configuration
#
# Trivy DB contains vulnerability information from NVD, Red Hat, and many other upstream vulnerability databases.
# It is downloaded by Trivy from the GitHub release page https://github.com/aquasecurity/trivy-db/releases and cached
# in the local file system. In addition, the database contains the update timestamp so Trivy can detect whether it
# should download a newer version from the Internet or use the cached one. Currently, the database is updated every
# 12 hours and published as a new release to GitHub.
trivy:
  # ignoreUnfixed The flag to display only fixed vulnerabilities
  ignore_unfixed: false
  # skipUpdate The flag to enable or disable Trivy DB downloads from GitHub
  #
  # You might want to enable this flag in test or CI/CD environments to avoid GitHub rate limiting issues.
  # If the flag is enabled you have to download the `trivy-offline.tar.gz` archive manually, extract `trivy.db` and
  # `metadata.json` files and mount them in the `/home/scanner/.cache/trivy/db` path.
  skip_update: false
  #
  # insecure The flag to skip verifying registry certificate
  insecure: false
  # github_token The GitHub access token to download Trivy DB
  #
  # Anonymous downloads from GitHub are subject to the limit of 60 requests per hour. Normally such rate limit is enough
  # for production operations. If, for any reason, it's not enough, you could increase the rate limit to 5000
  # requests per hour by specifying the GitHub access token. For more details on GitHub rate limiting please consult
  # https://developer.github.com/v3/#rate-limiting
  #
  # You can create a GitHub token by following the instructions in
  # https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line
  #
  # github_token: xxx
jobservice:
  # Maximum number of job workers in job service
  max_job_workers: 10
notification:
  # Maximum retry count for webhook job
  webhook_job_max_retry: 10
chart:
  # Change the value of absolute_url to enabled can enable absolute url in chart
  absolute_url: disabled
# Log configurations
log:
  # options are debug, info, warning, error, fatal
  level: info
  # configs for logs in local storage
  local:
    # Log files are rotated log_rotate_count times before being removed. If count is 0, old versions are removed rather than rotated.
    rotate_count: 50
    # Log files are rotated only if they grow bigger than log_rotate_size bytes. If size is followed by k, the size is assumed to be in kilobytes.
    # If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G
    # are all valid.
    rotate_size: 200M
    # The directory on your host that store log
    location: /var/log/harbor
# Uncomment following lines to enable external syslog endpoint.
# external_endpoint:
# # protocol used to transmit log to external endpoint, options is tcp or udp
# protocol: tcp
# # The host of external endpoint
# host: localhost
# # Port of external endpoint
# port: 5140
#This attribute is for migrator to detect the version of the .cfg file, DO NOT MODIFY!
_version: 2.2.0
# Uncomment external_database if using external database.
# external_database:
# harbor:
# host: harbor_db_host
# port: harbor_db_port
# db_name: harbor_db_name
# username: harbor_db_username
# password: harbor_db_password
# ssl_mode: disable
# max_idle_conns: 2
# max_open_conns: 0
# notary_signer:
# host: notary_signer_db_host
# port: notary_signer_db_port
# db_name: notary_signer_db_name
# username: notary_signer_db_username
# password: notary_signer_db_password
# ssl_mode: disable
# notary_server:
# host: notary_server_db_host
# port: notary_server_db_port
# db_name: notary_server_db_name
# username: notary_server_db_username
# password: notary_server_db_password
# ssl_mode: disable
# Uncomment external_redis if using external Redis server
# external_redis:
# # support redis, redis+sentinel
# # host for redis: <host_redis>:<port_redis>
# # host for redis+sentinel:
# # <host_sentinel1>:<port_sentinel1>,<host_sentinel2>:<port_sentinel2>,<host_sentinel3>:<port_sentinel3>
# host: redis:6379
# password:
# # sentinel_master_set must be set to support redis+sentinel
# #sentinel_master_set:
# # db_index 0 is for core, it's unchangeable
# registry_db_index: 1
# jobservice_db_index: 2
# chartmuseum_db_index: 3
# trivy_db_index: 5
# idle_timeout_seconds: 30
# Uncomment uaa for trusting the certificate of uaa instance that is hosted via self-signed cert.
# uaa:
# ca_file: /path/to/ca
# Global proxy
# Config http proxy for components, e.g. http://my.proxy.com:3128
# Components doesn't need to connect to each others via http proxy.
# Remove component from `components` array if want disable proxy
# for it. If you want use proxy for replication, MUST enable proxy
# for core and jobservice, and set `http_proxy` and `https_proxy`.
# Add domain to the `no_proxy` field, when you want disable proxy
# for some special registry.
proxy:
  http_proxy:
  https_proxy:
  no_proxy:
  components:
    - core
    - jobservice
    - trivy
# metric:
# enabled: false
# port: 9090
# path: /metrics
2.1.4. Generate a Self-Signed Certificate with openssl
# 1. Generate the certificate and save it to /home/xing/harbor/certs
openssl req -newkey rsa:4096 -nodes -sha256 -keyout /home/xing/harbor/certs/harbor.key -x509 -out /home/xing/harbor/certs/harbor.crt -subj /C=CN/ST=BJ/L=BJ/O=DEVOPS/CN=harbor.xing.com -days 3650
req          command for generating a certificate signing request
-newkey      generate a new private key
rsa:4096     key length in bits
-nodes       do not encrypt the private key
-sha256      use the SHA-256 hash algorithm
-keyout      file to write the newly created private key to
-x509        output a self-signed X.509 certificate instead of a request; X.509 is the most widely used certificate format
-out         output file to write the certificate to
-subj        set the subject (owner) information
-days        validity period (3650 means ten years)
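To sanity-check the generated certificate, its subject and validity period can be inspected, for example:
# optional check of the self-signed certificate
openssl x509 -in /home/xing/harbor/certs/harbor.crt -noout -subject -dates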
2.1.5. Start the Harbor Service
./install.sh
Edit the hosts file and add the domain name. Alternatively, you can log in directly at http://<host-ip>:80; the default credentials are admin / Harbor12345 (set in the configuration file).
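For example, assuming 172.16.9.3 is the Harbor host (the same address used in the secret example later), the hosts entry on a client machine would be:
# map the Harbor domain to the Harbor host IP (replace with your own IP)
echo "172.16.9.3 harbor.xing.com" >> /etc/hosts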
2.2. Pushing Images to the Private Registry
2.2.1. Add the Private Registry as a Trusted Registry
# 1. Add the registry address
vi /etc/docker/daemon.json
{
"registry-mirrors": ["//k1ktap5m.mirror.aliyuncs.com"],
"insecure-registries": ["172.16.9.3","harbor.xing.com"]
}
# 2. Restart the docker service
systemctl daemon-reload
systemctl restart docker
# Add the certificate
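One common way to let the Docker daemon trust the self-signed certificate (instead of, or in addition to, insecure-registries) is to drop harbor.crt into Docker's per-registry certificate directory; a sketch:
# make the Docker daemon trust the self-signed certificate for harbor.xing.com
mkdir -p /etc/docker/certs.d/harbor.xing.com
cp /home/xing/harbor/certs/harbor.crt /etc/docker/certs.d/harbor.xing.com/ca.crt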
2.2.2. Log in to the Private Registry
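Logging in is a single docker login against the Harbor domain, using the admin credentials from harbor.yml (test credentials only):
docker login harbor.xing.com -u admin -p Harbor12345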
2.2.3. Push/Pull Images to/from the Harbor Registry
# 1. Tag the local image for the private registry
# Format: docker tag <local-image>:<tag> <harbor-host[:port]>/<project>/<image>:<tag>
docker tag nginx:latest harbor.xing.com/xing/mynginx:v1
# 2. Push the image
docker push harbor.xing.com/xing/mynginx:v1
# 3. Pull the image
docker pull harbor.xing.com/xing/mynginx:v1
2.3. Using Harbor in a Kubernetes Cluster
By default, Kubernetes accesses Harbor over HTTPS; to access it over HTTP, the /etc/docker/daemon.json file has to be modified on every node in the cluster.
Because Harbor uses username/password authentication, a secret must be configured for image pulls.
# Create a secret for the Docker registry
kubectl create secret docker-registry registry-secret --namespace=default \
--docker-server=172.16.9.3 \
--docker-username=admin \
--docker-password=Harbor12345
# List the secrets
[root@master demo]# kubectl get secret
NAME TYPE DATA AGE
default-token-gdwgn kubernetes.io/service-account-token 3 2d18h
registry-secret kubernetes.io/dockerconfigjson 1 116s
# Delete the secret
kubectl delete secret registry-secret
From here on, you only need to point the image field of your containers at the Harbor registry image address, as in the sketch below.
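A minimal Pod sketch pulling the mynginx image pushed earlier and referencing the registry-secret created above (the Pod name is illustrative):
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: harbor-pull-test            # illustrative name
spec:
  imagePullSecrets:
  - name: registry-secret           # the secret created above
  containers:
  - name: mynginx
    image: harbor.xing.com/xing/mynginx:v1
EOF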
3. Deploying a Registry Cache at the Edge
3.1. Running the Service
The Docker registry cache service can be started either from the official image or from a binary.
- Start the service
Binary method:
git clone https://github.com/distribution/distribution.git
cd distribution
make binaries
# then go into the bin directory and start the service (the configuration file must already be in place)
./registry serve /etc/docker/registry/config.yml
Docker method: mount the corresponding storage volumes into the container. The port has to be exposed externally; the port itself is defined in the configuration file.
docker run -itd -p 5000:5000 \
  -v /var/lib/registry:/var/lib/registry \
  -v /etc/docker/registry/config.yml:/etc/docker/registry/config.yml \
  --name registry registry:2
- Write the distribution configuration file, which by default lives at /etc/docker/registry/config.yml
version: 0.1
log:
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  filesystem:
    rootdirectory: /var/lib/registry
http:
  addr: :5000                         # service port
  headers:
    X-Content-Type-Options: [nosniff]
proxy:
  remoteurl: https://harbor.xing.com  # private registry address
  username: ***                       # username
  password: ***                       # password
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
- Modify the hosts file
vi /etc/hosts
# add the domain name; the IP is the address of the Harbor registry
10.10.102.190 harbor.xing.com
If the service is exposed on port 80, the port can be omitted from docker pull; if the default port 5000 is exposed, the port must be included when pulling:
docker pull localhost:5000/xing/imagecache:v2
To push an image, first retag it with the cache's address. Note, however, that this only pushes it into the cache; the remote registry is not updated.
Cached images are stored under /var/lib/registry/docker/registry/v2/repositories/<project>/<repository>.
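To confirm that a pull has actually populated the cache, the repository directory can be listed on the cache node, for example (matching the pull example above):
ls /var/lib/registry/docker/registry/v2/repositories/xing/imagecache/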
For other nodes to use the cached images, all that is needed is to point the domain name at the cache node's address:
vi /etc/hosts
192.168.123.160 harbor.xing.com  # this is the IP of the cache node, NOT the Harbor registry!
# When this command is run on another node (the port can be omitted if it is 80), the pull request is routed to the cache registry. If the cache already has the image, it is returned directly; if not, the cache registry fetches it from the remoteurl configured in config.yml, i.e. our private Harbor registry, caches it, and returns it to the requesting node (the cache-aside strategy).
docker pull harbor.xing.com:5000/xing/imagecache:v2
3.2. Issues Encountered
- When the service was started with Docker, image pulls failed with an error: the harbor.xing.com domain could not be resolved. I initially assumed Docker could not resolve the domain from inside the container, so I tried editing the hosts file inside the container, which did not help; switching the container to host network mode did not work either. In the end I compiled the source code into a binary and ran it directly, after which caching worked without errors.
- Cross-platform port issue: I deployed the service on an x86_64 host, where it ran normally. However, since some of our edge nodes are arm64, I compiled the binary for the arm architecture and ran it on an edge node; there, docker pull must always include the port. Even port 80 cannot be omitted, otherwise it fails with
invalid character '<' looking for beginning of value
and the cause is unknown.
3.3. Notes
- docker pull uses the HTTPS protocol by default. Because pulls from the private address may go over HTTP, the daemon.json file needs to be modified: whenever a requested domain reports an HTTPS error, simply add that domain to the insecure-registries field.
- Certificate issue
The harbor.crt certificate generated earlier needs to be imported into the trust store of the cache node; one common approach is sketched below. If, after updating the certificates, you run into the error
certificate relies on legacy Common Name field
either downgrade the Go version or set the GODEBUG=x509ignoreCN=0 environment variable at runtime.
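A minimal sketch of importing harbor.crt into the system trust store on a CentOS-based cache node (after copying the file over; other distributions use different paths and commands):
# add the self-signed certificate to the CentOS trust store
cp harbor.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust extract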
4. Caching Strategies (Supplementary)
4.1. Cache-Aside
In this strategy, the application talks to both the cache and the data source, and it checks the cache before hitting the data source: it first tries to read the data from the cache, and on a miss it fetches the data from the data source and then stores it in the cache.
Pros
- Well suited to read-heavy workloads
- Offers some resilience against cache failures: if the cache service goes down, the system can still read data directly from the data source
Cons
- Consistency between the data store and the cache is not guaranteed
- The first request for a piece of data always misses the cache (this can be mitigated by manually triggering queries to pre-warm the data)
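To make the read path concrete, here is a tiny illustrative shell sketch of Cache-Aside, with a local directory standing in for the cache and a hypothetical upstream URL standing in for the data source:
CACHE_DIR=/tmp/blob-cache            # hypothetical cache
UPSTREAM=https://example.com/data    # hypothetical data source
get_item() {
  local name="$1"
  mkdir -p "$CACHE_DIR"
  if [ -f "$CACHE_DIR/$name" ]; then
    cat "$CACHE_DIR/$name"           # cache hit: serve from the cache
  else
    # cache miss: fetch from the data source, populate the cache, then serve
    curl -fsSL "$UPSTREAM/$name" -o "$CACHE_DIR/$name" && cat "$CACHE_DIR/$name"
  fi
}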
4.2. Read-Through
In this strategy, the application does not manage the data source and the cache itself; it delegates synchronization with the data source to the cache provider. All data access goes through an abstract cache layer.
Read-Through suits scenarios where the same data is requested many times.
Pros
- Reduces the load on the data source under heavy read traffic
- Also offers some resilience against cache service failures
Cons
- The first request for a piece of data still results in a cache miss, which can again be addressed by cache pre-warming
Compared with Cache-Aside, the actual operations on the cache and the data source are carried out by the cache provider.
4.3. Write-Through
In this strategy, when data is updated, the cache provider is responsible for updating both the data source and the cache. The cache stays consistent with the data source, and writes always reach the data source through the abstract cache layer.
Because data has to be written synchronously to both the cache and the data source, writes are slower. However, when combined with Read-Through, we get all the benefits of Read-Through plus a data-consistency guarantee.
4.4. Write-Behind
If strong consistency is not required, cache update requests can simply be queued and periodically flushed to the data store.
With Write-Behind, an update is written to the cache only.
Pros
- Fast writes, suitable for write-heavy workloads
- Combined with Read-Through, it works well for mixed workloads: the most recently updated and accessed data is always available in the cache
- Resilient to data source failures and can tolerate some data source downtime
Cons
- If updated data in the cache has not yet been written back to the data source (for example after a power failure), it is lost and cannot be recovered