"Monitoring with Prometheus", Chapter 12: Monitoring Kubernetes

  • December 19, 2019
  • Notes

Chapter 12: Monitoring Kubernetes

  • We will deploy Tornado and Prometheus into a Kubernetes cluster (https://github.com/turnbullpress/tornado-api)

12.1 The Kubernetes Cluster

  • The Kubernetes cluster is named tornado.quicknuke.com and was built with kops (https://github.com/kubernetes/kops); the cluster configuration can be found on GitHub. It has 3 master nodes and 6 worker nodes, spread across three availability zones
  • Listing: kops command to create the cluster
kops create cluster \
  --node-count 6 \
  --zones us-east-2a,us-east-2b,us-east-2c \
  --master-zones us-east-2a,us-east-2b,us-east-2c \
  --node-size t2.micro \
  --topology private \
  --networking kopeio-vxlan \
  --api-loadbalancer-type=public \
  --bastion \
  tornado.quicknuke.com

12.2 Running Prometheus on Kubernetes

  • There are several ways to deploy Prometheus on Kubernetes, and which one fits best depends largely on your environment. For example, you can build your own Deployment and expose Prometheus via a Service; there are existing configurations you can reference (https://github.com/kayrus/prometheus-kubernetes). Alternatively, you can use CoreOS's Prometheus Operator (https://github.com/coreos/prometheus-operator)
  • We chose to create the Deployment and Service manually, as shown below
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
    component: core
  annotations:
    prometheus.io/scrape: 'true'
    external-dns.alpha.kubernetes.io/hostname: prometheus.quicknuke.com.
spec:
  type: LoadBalancer
  ports:
    - port: 9090
      targetPort: webui
      name: webui
  selector:
    app: prometheus
    component: core
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-core
  namespace: monitoring
  labels:
    app: prometheus
    component: core
spec:
  replicas: 1
  template:
    metadata:
      name: prometheus-main
      labels:
        app: prometheus
        component: core
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.1.0
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
            - "--web.enable-lifecycle"
          ports:
            - containerPort: 9090
              name: webui
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus
            - name: prometheus-rules-volume
              mountPath: /var/prometheus/rules
            - name: prometheus-storage-volume
              mountPath: /prometheus/
        - name: watch
          image: weaveworks/watch:master-5b2a6e5
          imagePullPolicy: IfNotPresent
          args: ["-v", "-t", "-p=/etc/prometheus", "-p=/var/prometheus", "curl", "-X", "POST", "--fail", "-o", "-", "-sS", "http://localhost:9090/-/reload"]
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus
            - name: prometheus-rules-volume
              mountPath: /var/prometheus/rules
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-rules-volume
          configMap:
            name: prometheus-rules-conf
        - name: prometheus-storage-volume
          emptyDir: {}
  • We use ConfigMaps to configure the Prometheus server and to manage rules, mounting them as volumes in the Deployment
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval:     15s
      evaluation_interval: 15s
    alerting:
      alertmanagers:
      - kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_name]
          regex: alertmanager
          action: keep
        - source_labels: [__meta_kubernetes_namespace]
          regex: monitoring
          action: keep
        - source_labels: [__meta_kubernetes_endpoint_port_name]
          action: keep
          regex: cluster
    rule_files:
    - "/var/prometheus/rules/*_rules.yml"
    - "/var/prometheus/rules/*_alerts.yml"
    scrape_configs:
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name
    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        insecure_skip_verify: true
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics
    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        insecure_skip_verify: true
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiservers'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
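One way to create or update this ConfigMap from a local file, rather than applying the full manifest, is sketched below; the local file name ./prometheus.yml is an assumption about your layout. After an update, the watch sidecar in the Deployment POSTs to /-/reload so Prometheus picks up the change:

```shell
# Generate the prometheus-server-conf ConfigMap from a local prometheus.yml
# (hypothetical path) and apply it; piping through `kubectl apply` makes the
# command idempotent, so it works for both creation and later updates.
kubectl create configmap prometheus-server-conf \
  --from-file=prometheus.yml=./prometheus.yml \
  --namespace=monitoring \
  --dry-run -o yaml | kubectl apply -f -
```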
  • We deploy a cluster of 3 Alertmanager nodes on Kubernetes, and place all of these resources in a namespace called monitoring
apiVersion: v1
kind: Service
metadata:
  name: alertmanager-webui
  namespace: monitoring
  labels:
    app: alertmanager
    component: core
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9093"
    external-dns.alpha.kubernetes.io/hostname: alertmanager.quicknuke.com.
spec:
  type: LoadBalancer
  ports:
    - port: 9093
      name: metrics
  selector:
    app: alertmanager
    component: core
---
apiVersion: v1
kind: Service
metadata:
  name: alertmanager
  namespace: monitoring
  labels:
    app: alertmanager
    component: core
spec:
  ports:
  - port: 9093
    name: cluster
  type: ClusterIP
  clusterIP: None
  selector:
    app: alertmanager
    component: core
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: alertmanager
  namespace: monitoring
  labels:
    app: alertmanager
    component: core
spec:
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  selector:
    matchLabels:
      app: alertmanager
      component: core
  serviceName: alertmanager
  template:
    metadata:
      labels:
        app: alertmanager
        component: core
    spec:
      containers:
      - name: alertmanager
        image: quay.io/prometheus/alertmanager:master
        imagePullPolicy: IfNotPresent
        command:
        - "sh"
        - "-c"
        args:
        - /bin/alertmanager
            --config.file=/etc/alertmanager/config.yml
            --web.listen-address=0.0.0.0:9093
            --cluster.listen-address=0.0.0.0:8001
            --storage.path=/alertmanager
            --cluster.peer="alertmanager-0.alertmanager.monitoring.svc:8001"
            --cluster.peer="alertmanager-1.alertmanager.monitoring.svc:8001"
            --cluster.peer="alertmanager-2.alertmanager.monitoring.svc:8001"
            --log.level=debug
        ports:
        - containerPort: 9093
          name: web
          protocol: TCP
        - containerPort: 8001
          name: cluster
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /api/v1/status
            port: web
            scheme: HTTP
          failureThreshold: 10
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /api/v1/status
            port: web
            scheme: HTTP
          initialDelaySeconds: 3
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 3
        volumeMounts:
        - name: alertmanager-config-volume
          mountPath: /etc/alertmanager/
        - name: alertmanager-data-volume
          mountPath: /alertmanager/
      volumes:
      - name: alertmanager-config-volume
        configMap:
          name: alertmanager-server-conf
      - name: alertmanager-data-volume
        emptyDir: {}

12.3 Monitoring Kubernetes

  • We use Prometheus's built-in service discovery mechanisms for Kubernetes (https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config)
# A scrape configuration for running Prometheus on a Kubernetes cluster.
# This uses separate scrape configs for cluster components (i.e. API server, node)
# and services to allow each to use different authentication configs.
#
# Kubernetes labels will be added as Prometheus labels on metrics via the
# `labelmap` relabeling action.
#
# If you are using Kubernetes 1.7.2 or earlier, please take note of the comments
# for the kubernetes-cadvisor job; you will need to edit or remove this job.

# Scrape config for API servers.
#
# Kubernetes exposes API servers as endpoints to the default/kubernetes
# service so this uses `endpoints` role and uses relabelling to only keep
# the endpoints associated with the default/kubernetes service using the
# default named port `https`. This works for single API server deployments as
# well as HA API server deployments.
scrape_configs:
- job_name: 'kubernetes-apiservers'

  kubernetes_sd_configs:
  - role: endpoints

  # Default to scraping over https. If required, just disable this or change to
  # `http`.
  scheme: https

  # This TLS & bearer token file config is used to connect to the actual scrape
  # endpoints for cluster components. This is separate to discovery auth
  # configuration because discovery & scraping are two separate concerns in
  # Prometheus. The discovery auth config is automatic if Prometheus runs inside
  # the cluster. Otherwise, more config options have to be provided within the
  # <kubernetes_sd_config>.
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    # If your node certificates are self-signed or use a different CA to the
    # master CA, then disable certificate verification below. Note that
    # certificate verification is an integral part of a secure infrastructure
    # so this should only be disabled in a controlled environment. You can
    # disable certificate verification by uncommenting the line below.
    #
    # insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

  # Keep only the default/kubernetes service endpoints for the https port. This
  # will add targets for each API server which Kubernetes adds an endpoint to
  # the default/kubernetes service.
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https

# Scrape config for nodes (kubelet).
#
# Rather than connecting directly to the node, the scrape is proxied though the
# Kubernetes apiserver.  This means it will work if Prometheus is running out of
# cluster, or can't connect to nodes for some other reason (e.g. because of
# firewalling).
- job_name: 'kubernetes-nodes'

  # Default to scraping over https. If required, just disable this or change to
  # `http`.
  scheme: https

  # This TLS & bearer token file config is used to connect to the actual scrape
  # endpoints for cluster components. This is separate to discovery auth
  # configuration because discovery & scraping are two separate concerns in
  # Prometheus. The discovery auth config is automatic if Prometheus runs inside
  # the cluster. Otherwise, more config options have to be provided within the
  # <kubernetes_sd_config>.
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

  kubernetes_sd_configs:
  - role: node

  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics

# Scrape config for Kubelet cAdvisor.
#
# This is required for Kubernetes 1.7.3 and later, where cAdvisor metrics
# (those whose names begin with 'container_') have been removed from the
# Kubelet metrics endpoint.  This job scrapes the cAdvisor endpoint to
# retrieve those metrics.
#
# In Kubernetes 1.7.0-1.7.2, these metrics are only exposed on the cAdvisor
# HTTP endpoint; use "replacement: /api/v1/nodes/${1}:4194/proxy/metrics"
# in that case (and ensure cAdvisor's HTTP server hasn't been disabled with
# the --cadvisor-port=0 Kubelet flag).
#
# This job is not necessary and should be removed in Kubernetes 1.6 and
# earlier versions, or it will cause the metrics to be scraped twice.
- job_name: 'kubernetes-cadvisor'

  # Default to scraping over https. If required, just disable this or change to
  # `http`.
  scheme: https

  # This TLS & bearer token file config is used to connect to the actual scrape
  # endpoints for cluster components. This is separate to discovery auth
  # configuration because discovery & scraping are two separate concerns in
  # Prometheus. The discovery auth config is automatic if Prometheus runs inside
  # the cluster. Otherwise, more config options have to be provided within the
  # <kubernetes_sd_config>.
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

  kubernetes_sd_configs:
  - role: node

  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

# Example scrape config for service endpoints.
#
# The relabeling allows the actual service scrape endpoint to be configured
# for all or only some endpoints.
- job_name: 'kubernetes-service-endpoints'

  kubernetes_sd_configs:
  - role: endpoints

  relabel_configs:
  # Example relabel to scrape only endpoints that have
  # "example.io/should_be_scraped = true" annotation.
  #  - source_labels: [__meta_kubernetes_service_annotation_example_io_should_be_scraped]
  #    action: keep
  #    regex: true
  #
  # Example relabel to customize metric path based on endpoints
  # "example.io/metric_path = <metric path>" annotation.
  #  - source_labels: [__meta_kubernetes_service_annotation_example_io_metric_path]
  #    action: replace
  #    target_label: __metrics_path__
  #    regex: (.+)
  #
  # Example relabel to scrape only single, desired port for the service based
  # on endpoints "example.io/scrape_port = <port>" annotation.
  #  - source_labels: [__address__, __meta_kubernetes_service_annotation_example_io_scrape_port]
  #    action: replace
  #    regex: ([^:]+)(?::\d+)?;(\d+)
  #    replacement: $1:$2
  #    target_label: __address__
  #
  # Example relabel to configure scrape scheme for all service scrape targets
  # based on endpoints "example.io/scrape_scheme = <scheme>" annotation.
  #  - source_labels: [__meta_kubernetes_service_annotation_example_io_scrape_scheme]
  #    action: replace
  #    target_label: __scheme__
  #    regex: (https?)
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name

# Example scrape config for probing services via the Blackbox Exporter.
#
# The relabeling allows the actual service scrape endpoint to be configured
# for all or only some services.
- job_name: 'kubernetes-services'

  metrics_path: /probe
  params:
    module: [http_2xx]

  kubernetes_sd_configs:
  - role: service

  relabel_configs:
  # Example relabel to probe only some services that have "example.io/should_be_probed = true" annotation
  #  - source_labels: [__meta_kubernetes_service_annotation_example_io_should_be_probed]
  #    action: keep
  #    regex: true
  - source_labels: [__address__]
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox-exporter.example.com:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    target_label: kubernetes_name

# Example scrape config for probing ingresses via the Blackbox Exporter.
#
# The relabeling allows the actual ingress scrape endpoint to be configured
# for all or only some services.
- job_name: 'kubernetes-ingresses'

  metrics_path: /probe
  params:
    module: [http_2xx]

  kubernetes_sd_configs:
  - role: ingress

  relabel_configs:
  # Example relabel to probe only some ingresses that have "example.io/should_be_probed = true" annotation
  #  - source_labels: [__meta_kubernetes_ingress_annotation_example_io_should_be_probed]
  #    action: keep
  #    regex: true
  - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
    regex: (.+);(.+);(.+)
    replacement: ${1}://${2}${3}
    target_label: __param_target
  - target_label: __address__
    replacement: blackbox-exporter.example.com:9115
  - source_labels: [__param_target]
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_ingress_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_ingress_name]
    target_label: kubernetes_name

# Example scrape config for pods
#
# The relabeling allows the actual pod scrape to be configured
# for all the declared ports (or port-free target if none is declared)
# or only some ports.
- job_name: 'kubernetes-pods'

  kubernetes_sd_configs:
  - role: pod

  relabel_configs:
  # Example relabel to scrape only pods that have
  # "example.io/should_be_scraped = true" annotation.
  #  - source_labels: [__meta_kubernetes_pod_annotation_example_io_should_be_scraped]
  #    action: keep
  #    regex: true
  #
  # Example relabel to customize metric path based on pod
  # "example.io/metric_path = <metric path>" annotation.
  #  - source_labels: [__meta_kubernetes_pod_annotation_example_io_metric_path]
  #    action: replace
  #    target_label: __metrics_path__
  #    regex: (.+)
  #
  # Example relabel to scrape only single, desired port for the pod
  # based on pod "example.io/scrape_port = <port>" annotation.
  # Note that __address__ is modified here, so if pod containers' ports
  # are declared, they all will be ignored.
  #  - source_labels: [__address__, __meta_kubernetes_pod_annotation_example_io_scrape_port]
  #    action: replace
  #    regex: ([^:]+)(?::\d+)?;(\d+)
  #    replacement: $1:$2
  #    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name

12.4 Monitoring Kubernetes Nodes

  • We can use the Kubernetes DaemonSet controller to automatically deploy a pod on every node in the cluster. This approach is useful when you don't control the instances, for example if you use a managed Kubernetes offering (https://kubernetes.io/zh/docs/concepts/workloads/controllers/daemonset/)
  • Warning: this approach needs care! The Node Exporter accesses many root-level resources. Running it in a Docker container requires mounting those resources into the container, and the systemd collector requires running the container as root. This is a potential security risk; if that risk is unacceptable, install the Node Exporter directly onto the instance instead

12.4.1 Node Exporter DaemonSet

  • The DaemonSet uses a toleration to ensure the pod runs on all nodes, potentially including the masters. It is well suited to things like monitoring and logging agents
  • Note: the full Node Exporter configuration can be found on GitHub (https://github.com/turnbullpress/prometheusbook-code/blob/master/12-13/node-exporter.yml)
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  labels:
    app: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  clusterIP: None
  ports:
  - name: scrape
    port: 9100
    protocol: TCP
  selector:
    app: node-exporter
  type: ClusterIP
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  template:
    metadata:
      labels:
        app: node-exporter
      name: node-exporter
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      hostNetwork: true
      hostPID: true
      hostIPC: true
      securityContext:
        runAsUser: 0
      containers:
      - image: prom/node-exporter:latest
        name: node-exporter
        volumeMounts:
          - mountPath: /run/systemd/private
            name: systemd-socket
            readOnly: true
        args:
          - "--collector.systemd"
          - "--collector.systemd.unit-whitelist=(docker|ssh|rsyslog|kubelet).service"
        ports:
          - containerPort: 9100
            hostPort: 9100
            name: scrape
        livenessProbe:
          httpGet:
            path: /metrics
            port: 9100
          initialDelaySeconds: 30
          timeoutSeconds: 10
          periodSeconds: 1
        readinessProbe:
          failureThreshold: 5
          httpGet:
            path: /metrics
            port: 9100
          initialDelaySeconds: 10
          timeoutSeconds: 10
          periodSeconds: 2
      volumes:
        - hostPath:
            path: /run/systemd/private
          name: systemd-socket
  • Listing: Node Exporter DaemonSet toleration
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  template:
    metadata:
      labels:
        app: node-exporter
      name: node-exporter
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      hostNetwork: true
      hostPID: true
      hostIPC: true
      securityContext:
        runAsUser: 0
...
  • The toleration ensures the pod is also scheduled onto the Kubernetes masters, not just the workers
  • We run the pod as user 0, or root (which allows access to systemd), and also enable hostNetwork, hostPID, and hostIPC, which make the instance's network, process, and IPC namespaces available inside the container. These are all potential security risks, and you must decide whether you can accept them. If not, baking the Node Exporter into the instance's image may be a better approach
  • Listing: Node Exporter DaemonSet containers
...
      containers:
      - image: prom/node-exporter:latest
        name: node-exporter
        volumeMounts:
          - mountPath: /run/systemd/private
            name: systemd-socket
            readOnly: true
        args:
          - "--collector.systemd"
          - "--collector.systemd.unit-whitelist=(docker|ssh|rsyslog|kubelet).service"
        ports:
          - containerPort: 9100
            hostPort: 9100
            name: scrape
...
  • We use the prom/node-exporter image from Docker Hub (https://hub.docker.com/r/prom/node-exporter) and pull the latest version. We also mount a volume for the instance's /run/systemd/private directory, which allows the Node Exporter to access systemd and collect the state of the services systemd manages on the instance
  • To help keep the Node Exporter pods healthy and improve their uptime, we also add liveness and readiness probes to the Node Exporter container (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/). The liveness probe checks the state of the application inside the container
...
        livenessProbe:
          httpGet:
            path: /metrics
            port: 9100
          initialDelaySeconds: 30
          timeoutSeconds: 10
          periodSeconds: 1
        readinessProbe:
          failureThreshold: 5
          httpGet:
            path: /metrics
            port: 9100
          initialDelaySeconds: 10
          timeoutSeconds: 10
          periodSeconds: 2
...
  • The readiness probe confirms the application is up and running: an HTTP GET must be able to connect to the /metrics path on port 9100 before the container is marked available and sent traffic. The remaining settings control the probe's behavior: it waits 10 seconds (initialDelaySeconds) before the first readiness check, then checks every 2 seconds (periodSeconds). If the probe times out after 10 seconds (timeoutSeconds) more than 5 times (failureThreshold), the container is marked Unready

12.4.2 Node Exporter Service

  • We create a service to expose the Node Exporter for metrics scraping
  • Listing: Node Exporter service
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  labels:
    app: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  clusterIP: None
  ports:
  - name: scrape
    port: 9100
    protocol: TCP
  selector:
    app: node-exporter
  type: ClusterIP
  • We add an annotation, prometheus.io/scrape: 'true' (https://kubernetes.io/zh/docs/concepts/overview/working-with-objects/annotations/), to the service's metadata; it tells Prometheus that this service should be scraped
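The annotation is visible to relabeling because Kubernetes service discovery exposes each service annotation as a __meta_ label, with characters that are invalid in label names rewritten to underscores. A minimal Python sketch of that mapping (the function name is ours; this mimics rather than reuses Prometheus's code):

```python
import re

def meta_label_for_annotation(name: str) -> str:
    """Sketch of how Prometheus's Kubernetes SD exposes a service annotation:
    characters invalid in a label name become '_', and the result is
    prefixed with __meta_kubernetes_service_annotation_."""
    sanitized = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    return "__meta_kubernetes_service_annotation_" + sanitized

print(meta_label_for_annotation("prometheus.io/scrape"))
# __meta_kubernetes_service_annotation_prometheus_io_scrape
```

This is why the keep rule in the scrape config matches on __meta_kubernetes_service_annotation_prometheus_io_scrape rather than on the annotation name itself.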

12.4.3 Deploying the Node Exporter

  • Deploy the Node Exporter DaemonSet and service
kubectl create -f ./node-exporter.yml -n monitoring
daemonset "node-exporter" created
service "node-exporter" created
  • Listing: setting the default namespace
kubectl config set-context $(kubectl config current-context) --namespace=monitoring  
  • Listing: checking that the Node Exporter pods are running
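The command for this listing was not captured in these notes; a plausible equivalent (output depends on your cluster) is:

```shell
# List the pods in the monitoring namespace; with the DaemonSet in place
# there should be one node-exporter pod per node (9 in this cluster).
kubectl get pods -n monitoring
```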
  • One pod per instance in the cluster: 3 masters and 6 workers. We can also see the pods for the Prometheus server and the Alertmanager, prometheus-core and alertmanager respectively
  • Listing: checking the Node Exporter pod logs to confirm they are running correctly
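Again the command itself was not captured; a plausible equivalent, using a label selector so no individual pod has to be named, is:

```shell
# Show the logs of the node-exporter pods selected by the app label.
kubectl logs -l app=node-exporter -n monitoring
```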
  • Listing: checking the Node Exporter service
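A plausible command for this listing (the service is headless, exposing port 9100):

```shell
# Inspect the node-exporter service and its endpoints.
kubectl get service node-exporter -n monitoring
kubectl describe service node-exporter -n monitoring
```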

12.4.4 Node Exporter Job

  • We also restrict Prometheus to scraping only those endpoints whose prometheus.io/scrape annotation is set to true. We then use the built-in Kubernetes service discovery to find endpoints and return them to Prometheus as potential targets
  • Note: all of this is based on the example Kubernetes jobs that ship with Prometheus
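The port-rewriting relabel rule in the kubernetes-service-endpoints job can be illustrated outside Prometheus. Below is a rough Python sketch of that single rule (the regex and replacement come from the config; the function name is ours, and this mimics rather than reuses Prometheus's relabeling code, which anchors the regex against the ';'-joined source labels):

```python
import re

# Regex and replacement from the kubernetes-service-endpoints job:
# join __address__ and the prometheus.io/port annotation with ';',
# strip any existing port, and append the annotated one ($1:$2).
ADDR_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")

def rewrite_address(address: str, port_annotation: str) -> str:
    joined = address + ";" + port_annotation  # source_labels joined by ';'
    m = ADDR_RE.fullmatch(joined)             # relabel regexes are anchored
    if m is None:                             # no port annotation: unchanged
        return address
    return f"{m.group(1)}:{m.group(2)}"       # replacement: $1:$2

print(rewrite_address("10.32.0.7:443", "9100"))  # 10.32.0.7:9100
print(rewrite_address("10.32.0.7", "9100"))      # 10.32.0.7:9100
```

When the annotation is absent the regex fails to match and __address__ is left as discovered, which is exactly how the replace action behaves in Prometheus.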
  • The example scrape configuration is the same one shown in full in section 12.3 above
- job_name: 'kubernetes-services'        metrics_path: /probe      params:        module: [http_2xx]        kubernetes_sd_configs:        - role: service        relabel_configs:        # Example relabel to probe only some services that have "example.io/should_be_probed = true" annotation        #  - source_labels: [__meta_kubernetes_service_annotation_example_io_should_be_probed]        #    action: keep        #    regex: true        - source_labels: [__address__]          target_label: __param_target        - target_label: __address__          replacement: blackbox-exporter.example.com:9115        - source_labels: [__param_target]          target_label: instance        - action: labelmap          regex: __meta_kubernetes_service_label_(.+)        - source_labels: [__meta_kubernetes_namespace]          target_label: kubernetes_namespace        - source_labels: [__meta_kubernetes_service_name]          target_label: kubernetes_name      # Example scrape config for probing ingresses via the Blackbox Exporter.    #    # The relabeling allows the actual ingress scrape endpoint to be configured    # for all or only some services.    
- job_name: 'kubernetes-ingresses'        metrics_path: /probe      params:        module: [http_2xx]        kubernetes_sd_configs:        - role: ingress        relabel_configs:        # Example relabel to probe only some ingresses that have "example.io/should_be_probed = true" annotation        #  - source_labels: [__meta_kubernetes_ingress_annotation_example_io_should_be_probed]        #    action: keep        #    regex: true        - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]          regex: (.+);(.+);(.+)          replacement: ${1}://${2}${3}          target_label: __param_target        - target_label: __address__          replacement: blackbox-exporter.example.com:9115        - source_labels: [__param_target]          target_label: instance        - action: labelmap          regex: __meta_kubernetes_ingress_label_(.+)        - source_labels: [__meta_kubernetes_namespace]          target_label: kubernetes_namespace        - source_labels: [__meta_kubernetes_ingress_name]          target_label: kubernetes_name      # Example scrape config for pods    #    # The relabeling allows the actual pod scrape to be configured    # for all the declared ports (or port-free target if none is declared)    # or only some ports.    - job_name: 'kubernetes-pods'        kubernetes_sd_configs:        - role: pod        relabel_configs:        # Example relabel to scrape only pods that have        # "example.io/should_be_scraped = true" annotation.        #  - source_labels: [__meta_kubernetes_pod_annotation_example_io_should_be_scraped]        #    action: keep        #    regex: true        #        # Example relabel to customize metric path based on pod        # "example.io/metric_path = <metric path>" annotation.        
#  - source_labels: [__meta_kubernetes_pod_annotation_example_io_metric_path]        #    action: replace        #    target_label: __metrics_path__        #    regex: (.+)        #        # Example relabel to scrape only single, desired port for the pod        # based on pod "example.io/scrape_port = <port>" annotation.        # Note that __address__ is modified here, so if pod containers' ports        # are declared, they all will be ignored.        #  - source_labels: [__address__, __meta_kubernetes_pod_annotation_example_io_scrape_port]        #    action: replace        #    regex: ([^:]+)(?::d+)?;(d+)        #    replacement: $1:$2        #    target_label: __address__        - action: labelmap          regex: __meta_kubernetes_pod_label_(.+)        - source_labels: [__meta_kubernetes_namespace]          action: replace          target_label: kubernetes_namespace        - source_labels: [__meta_kubernetes_pod_name]          action: replace          target_label: kubernetes_pod_name    
  • Listing: the Kubernetes service endpoints job
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
  • We name this job kubernetes-service-endpoints and specify kubernetes_sd_configs for service discovery, the built-in discovery mechanism designed for monitoring Kubernetes. It queries the Kubernetes API for targets matching specific search criteria
  • Because the Prometheus server runs inside Kubernetes, it can automatically retrieve Kubernetes targets matching a given role with minimal configuration. Nodes, pods, services, and ingresses each have a role, specified with the role parameter, and here we ask service discovery to return all Kubernetes endpoints. The endpoints role returns targets for all listed endpoints of a service: each port of each endpoint address becomes a target
  • Listing: the endpoints role
endpoints

The endpoints role discovers targets from listed endpoints of a service. For each
endpoint address one target is discovered per port. If the endpoint is backed by a
pod, all additional container ports of the pod, not bound to an endpoint port, are
discovered as targets as well.

Available meta labels:

__meta_kubernetes_namespace: The namespace of the endpoints object.
__meta_kubernetes_endpoints_name: The names of the endpoints object.

For all targets discovered directly from the endpoints list (those not additionally
inferred from underlying pods), the following labels are attached:

__meta_kubernetes_endpoint_hostname: Hostname of the endpoint.
__meta_kubernetes_endpoint_node_name: Name of the node hosting the endpoint.
__meta_kubernetes_endpoint_ready: Set to true or false for the endpoint's ready state.
__meta_kubernetes_endpoint_port_name: Name of the endpoint port.
__meta_kubernetes_endpoint_port_protocol: Protocol of the endpoint port.
__meta_kubernetes_endpoint_address_target_kind: Kind of the endpoint address target.
__meta_kubernetes_endpoint_address_target_name: Name of the endpoint address target.

If the endpoints belong to a service, all labels of the role: service discovery are attached.
For all targets backed by a pod, all labels of the role: pod discovery are attached.
  • Service discovery also populates a variety of metadata
  • The first rule checks for the prometheus.io/scrape: 'true' annotation set on the Node Exporter service. During discovery the prometheus.io/scrape annotation is converted to prometheus_io_scrape to produce a valid label name, because dots and slashes are not legal characters in Prometheus label names. Since this is an annotation on a Kubernetes service, Prometheus also prefixes the label with __meta_kubernetes_service_annotation_
  • The job keeps only targets whose __meta_kubernetes_service_annotation_prometheus_io_scrape metadata label is set to true. All other targets are dropped, so you only scrape the endpoints you want
  • The next rules use the prometheus.io/scheme, prometheus.io/path, and prometheus.io/port annotations. If present, their values are used as the scheme, path, and port to scrape. This gives us precise control over what is scraped from a service endpoint and makes the job much more flexible
  • The __meta_kubernetes_service_label_app metadata label is mapped to a plain app label. The next rules copy the __meta_kubernetes_namespace label to kubernetes_namespace and the __meta_kubernetes_service_name metadata label to kubernetes_name
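The annotation-to-label conversion described above can be sketched in Python (the helper name is ours, not Prometheus code; it mimics the documented behavior of replacing characters that are illegal in label names with underscores and prefixing the service-annotation namespace):

```python
import re

PREFIX = "__meta_kubernetes_service_annotation_"

def annotation_to_meta_label(annotation: str) -> str:
    """Mimic Prometheus's label-name sanitization: any character that is
    not [a-zA-Z0-9_] (such as '.' and '/') becomes an underscore."""
    return PREFIX + re.sub(r"[^a-zA-Z0-9_]", "_", annotation)

print(annotation_to_meta_label("prometheus.io/scrape"))
# __meta_kubernetes_service_annotation_prometheus_io_scrape
```

This is why the keep rule in the job matches on __meta_kubernetes_service_annotation_prometheus_io_scrape rather than on the raw annotation name.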
  • Listing: adding the job to the ConfigMap used for the Prometheus server configuration
kubectl replace -f ./prom-config-map-v1.yml -n monitoring  
  • Delete the Prometheus pod so that it is recreated and loads the new configuration
  • Listing: monitoring services
  • This job is valuable because we only need to define it once: every future Kubernetes service endpoint will be discovered and monitored automatically
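For example, any service annotated like this hypothetical one (the name, port, and selector are illustrative, not from the book) would be picked up and scraped by the job without any further Prometheus changes:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app            # hypothetical service
  namespace: default
  annotations:
    prometheus.io/scrape: 'true'   # opt in to scraping
    prometheus.io/scheme: 'http'   # optional overrides
    prometheus.io/path: '/metrics'
    prometheus.io/port: '9102'
spec:
  selector:
    app: example-app
  ports:
  - name: metrics
    port: 9102
```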

12.4.5 Node Exporter rules

  • Listing: Kubernetes availability alerting rules
- alert: KubernetesServiceDown
  expr: up{job="kubernetes-service-endpoints"} == 0
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: Pod {{ $labels.instance }} is down!
- alert: KubernetesServicesGone
  expr: absent(up{job="kubernetes-service-endpoints"})
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: No Kubernetes services are reporting!
    description: Werner Heisenberg says - OMG Where are my servicez?
  • The first alert fires when the up metric for the kubernetes-service-endpoints job is 0, indicating that Prometheus failed to scrape the target. The second alert checks whether the services have disappeared entirely, using the absent function to test whether the up metric exists at all
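The division of labor between the two alerts can be sketched as a toy Python evaluator (our own illustration, not PromQL semantics in full; it ignores the `for` delay):

```python
def service_alerts(up_samples):
    """up_samples: current 0/1 values of up{job="kubernetes-service-endpoints"}.
    Returns the names of the alerts that would fire."""
    alerts = []
    if not up_samples:                    # absent(up{...}): no series at all
        alerts.append("KubernetesServicesGone")
    if any(v == 0 for v in up_samples):   # some target failed its scrape
        alerts.append("KubernetesServiceDown")
    return alerts

print(service_alerts([]))         # ['KubernetesServicesGone']
print(service_alerts([1, 0, 1]))  # ['KubernetesServiceDown']
```

The absent check matters because `up == 0` can never fire if the whole job vanishes, since there are no series left to compare.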
  • We also added alerting rules for the services monitored on each node using the node_systemd_unit_state metric, which tracks the state of systemd services
  • Listing: systemd service availability alerting rule
- alert: CriticalServiceDown
  expr: node_systemd_unit_state{state="active"} != 1
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: Service {{ $labels.name }} failed to start.
    description: Service {{ $labels.instance }} failed to (re)start service {{ $labels.name }}.

12.5 Kubernetes

  • There are many ways to monitor Kubernetes itself, including tools from the open source Kubernetes ecosystem such as Heapster (https://github.com/kubernetes-retired/heapster) and kube-state-metrics (https://github.com/kubernetes/kube-state-metrics)

12.5.1 Kube-state-metrics

  • We will install kube-state-metrics in the Kubernetes cluster using a deployment and a service. The deployment uses the kube-state-metrics Docker image and runs on one of the cluster's nodes. The service exposes the metrics on port 8080, and because it is a service we can take advantage of the Prometheus service job created in the previous section: when that job runs, Prometheus will automatically discover the new service endpoint and begin scraping kube-state-metrics
  • Listing: the kube-state-metrics deployment and service
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: monitoring
  labels:
    app: kube-state-metrics
  annotations:
    prometheus.io/scrape: 'true'
spec:
  ports:
  - name: metrics
    port: 8080
    targetPort: metrics
    protocol: TCP
  selector:
    app: kube-state-metrics
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: monitoring
  labels:
    app: kube-state-metrics
spec:
  replicas: 1
  template:
    metadata:
      name: kube-state-metrics-main
      labels:
        app: kube-state-metrics
    spec:
      containers:
      - name: kube-state-metrics
        image: quay.io/coreos/kube-state-metrics:latest
        ports:
        - containerPort: 8080
          name: metrics
  • Listing: the Prometheus service job
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval:     15s
      evaluation_interval: 15s

    alerting:
      alertmanagers:
      - kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_name]
          regex: alertmanager
          action: keep
        - source_labels: [__meta_kubernetes_namespace]
          regex: monitoring
          action: keep
        - source_labels: [__meta_kubernetes_endpoint_port_name]
          action: keep
          regex: cluster

    rule_files:
    - "/var/prometheus/rules/*_rules.yml"
    - "/var/prometheus/rules/*_alerts.yml"

    scrape_configs:
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_name

    - job_name: 'kubernetes-nodes'
      scheme: https
      tls_config:
        insecure_skip_verify: true
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics

    - job_name: 'kubernetes-cadvisor'
      scheme: https
      tls_config:
        insecure_skip_verify: true
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

    - job_name: 'kubernetes-apiservers'
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        insecure_skip_verify: true
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
  • Tip: you can find the full list of metrics generated by kube-state-metrics in its documentation (https://github.com/kubernetes/kube-state-metrics)
  • Listing: kube-state-metrics deployment version alert
- alert: DeploymentGenerationOff
  expr: kube_deployment_status_observed_generation != kube_deployment_metadata_generation
  for: 5m
  labels:
    severity: warning
  annotations:
    description: Deployment generation does not match expected generation {{ $labels.namespace }}/{{ $labels.deployment }}
    summary: Deployment is outdated
  • The first rule detects whether a deployment has succeeded. It compares the deployment's observed generation with the generation in its metadata; if they remain unequal for 5 minutes, an alert fires indicating the deployment failed
  • Listing: kube-state-metrics deployment replicas-not-updated alert
- alert: DeploymentReplicasNotUpdated
  expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
    or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
    unless (kube_deployment_spec_paused == 1)
  for: 5m
  labels:
    severity: warning
  annotations:
    description: Replicas are not updated and available for deployment {{ $labels.namespace }}/{{ $labels.deployment }}
    summary: Deployment replicas are outdated
  • Provided the deployment is not paused, this confirms that the updated and available replica counts match the replica count in the deployment spec
  • Listing: kube-state-metrics pod restart alert
- alert: PodzFrequentlyRestarting
  expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
  for: 10m
  labels:
    severity: warning
  annotations:
    description: Pod {{ $labels.namespace }}/{{ $labels.pod }} was restarted {{ $value }} times within the last hour
    summary: Pod is restarting frequently
  • We use the increase function to measure pod restarts. The increase function measures the growth of a time series over the range vector, here 1 hour. If the value stays above 5 for 10 minutes, the alert fires

increase

increase(v range-vector) calculates the increase in the time series in the range vector. Breaks in monotonicity (such as counter resets due to target restarts) are automatically adjusted for. The increase is extrapolated to cover the full time range as specified in the range vector selector, so that it is possible to get a non-integer result even if a counter increases only by integer increments. The following example expression returns the number of HTTP requests as measured over the last 5 minutes, per time series in the range vector:

increase(http_requests_total{job="api-server"}[5m])  

increase should only be used with counters. It is syntactic sugar for rate(v) multiplied by the number of seconds under the specified time range window, and should be used primarily for human readability. Use rate in recording rules so that increases are tracked consistently on a per-second basis.
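As the quoted docs note, increase(v[w]) is essentially rate(v[w]) multiplied by the window in seconds. A simplified Python sketch over two counter samples (our own illustration; it ignores extrapolation and counter resets):

```python
def simple_rate(t0, v0, t1, v1):
    """Per-second rate between two counter samples (no reset handling)."""
    return (v1 - v0) / (t1 - t0)

def simple_increase(t0, v0, t1, v1, window_seconds):
    """increase ~= rate * window, per the docs quoted above."""
    return simple_rate(t0, v0, t1, v1) * window_seconds

# A restart counter that grew from 3 to 15 over a full 1h window:
print(simple_increase(0, 3, 3600, 15, 3600))  # 12.0
```

With the alert above, this hypothetical pod (12 restarts in the last hour) would exceed the threshold of 5.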

  • There are many other time series we can use to monitor Kubernetes. For example, kube_node_status_condition can be used to determine the availability of Kubernetes nodes. Some further example alerting rules:
  • Listing: example alerting rules in kubernetes_alerts.yml
groups:
- name: kubernetes_alerts
  rules:
  - alert: DeploymentGenerationOff
    expr: kube_deployment_status_observed_generation != kube_deployment_metadata_generation
    for: 5m
    labels:
      severity: warning
    annotations:
      description: Deployment generation does not match expected generation {{ $labels.namespace }}/{{ $labels.deployment }}
      summary: Deployment is outdated
  - alert: DeploymentReplicasNotUpdated
    expr: ((kube_deployment_status_replicas_updated != kube_deployment_spec_replicas)
      or (kube_deployment_status_replicas_available != kube_deployment_spec_replicas))
      unless (kube_deployment_spec_paused == 1)
    for: 5m
    labels:
      severity: warning
    annotations:
      description: Replicas are not updated and available for deployment {{ $labels.namespace }}/{{ $labels.deployment }}
      summary: Deployment replicas are outdated
  - alert: PodzFrequentlyRestarting
    expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
    for: 10m
    labels:
      severity: warning
    annotations:
      description: Pod {{ $labels.namespace }}/{{ $labels.pod }} was restarted {{ $value }} times within the last hour
      summary: Pod is restarting frequently
  - alert: KubeNodeNotReady
    expr: kube_node_status_condition{condition="Ready",status="true"} == 0
    for: 1h
    labels:
      severity: warning
    annotations:
      description: The Kubelet on {{ $labels.node }} has not checked in with the API,
        or has set itself to NotReady, for more than an hour
      summary: Node status is NotReady
  - alert: KubeManyNodezNotReady
    expr: count(kube_node_status_condition{condition="Ready",status="true"} == 0)
      > 1 and (count(kube_node_status_condition{condition="Ready",status="true"} ==
      0) / count(kube_node_status_condition{condition="Ready",status="true"})) > 0.2
    for: 1m
    labels:
      severity: critical
    annotations:
      description: '{{ $value }}% of Kubernetes nodes are not ready'
  - alert: APIHighLatency
    expr: apiserver_latency_seconds:quantile{quantile="0.99",subresource!="log",verb!~"^(?:WATCH|WATCHLIST|PROXY|CONNECT)$"} > 4
    for: 10m
    labels:
      severity: critical
    annotations:
      description: the API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}
  - alert: APIServerErrorsHigh
    expr: rate(apiserver_request_count{code=~"^(?:5..)$"}[5m]) / rate(apiserver_request_count[5m]) * 100 > 5
    for: 10m
    labels:
      severity: critical
    annotations:
      description: API server returns errors for {{ $value }}% of requests
  - alert: KubernetesAPIServerDown
    expr: up{job="kubernetes-apiservers"} == 0
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: Apiserver {{ $labels.instance }} is down!
  - alert: KubernetesAPIServersGone
    expr: absent(up{job="kubernetes-apiservers"})
    for: 10m
    labels:
      severity: critical
    annotations:
      summary: No Kubernetes apiservers are reporting!
      description: Werner Heisenberg says - OMG Where are my apiserverz?

12.5.2 Kube API

  • We also want to create a job to monitor the Kubernetes API itself. API-related metrics form the core of Kubernetes monitoring, letting us track the cluster's latency, error rates, and availability
  • Listing: the API server job
- job_name: 'kubernetes-apiservers'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https
  • We name the job kubernetes-apiservers, scrape the metrics over https, and specify a certificate authority and a local token file to authenticate to Kubernetes. We again use Kubernetes discovery, this time returning a list of Kubernetes endpoints. We don't want all of those endpoints, so the relabel configuration uses the keep action to retain only the service named kubernetes in the default namespace, which is the Kubernetes masters running the API
  • Listing: API server recording rules
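The keep action joins the source label values with ";" and keeps the target only if the result matches the regex. A toy Python version of that check (our own illustration, not Prometheus code):

```python
import re

def relabel_keep(label_values, regex):
    """Toy Prometheus 'keep' action: join the source label values with ';'
    and keep the target only on a full regex match."""
    return re.fullmatch(regex, ";".join(label_values)) is not None

# The API server endpoint survives; an endpoint in another namespace does not.
print(relabel_keep(["default", "kubernetes", "https"], r"default;kubernetes;https"))  # True
print(relabel_keep(["monitoring", "prometheus", "webui"], r"default;kubernetes;https"))  # False
```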
- record: apiserver_latency_seconds:quantile
  expr: histogram_quantile(0.99, rate(apiserver_request_latencies_bucket[5m])) / 1e+06
  labels:
    quantile: "0.99"
- record: apiserver_latency_seconds:quantile
  expr: histogram_quantile(0.9, rate(apiserver_request_latencies_bucket[5m])) / 1e+06
  labels:
    quantile: "0.9"
- record: apiserver_latency_seconds:quantile
  expr: histogram_quantile(0.5, rate(apiserver_request_latencies_bucket[5m])) / 1e+06
  labels:
    quantile: "0.5"
  • Here we use the apiserver_request_latencies_bucket metric to calculate latency; it has dimensions for the API resource, subresource, and verb, and measures request latency. We create three rules, for the 50th, 90th, and 99th percentiles, setting quantile to the relevant percentile and using the histogram_quantile function to compute that percentile from the metric's buckets. We specify the percentile we want, for example 0.99, take the rate over a 5-minute vector, and divide the result by 1e+06 (1,000,000) to convert the metric's microsecond latencies into seconds

histogram_quantile(φ float, b instant-vector) calculates the φ-quantile (0 ≤ φ ≤ 1) from the buckets b of a histogram. (See histograms and summaries for a detailed explanation of φ-quantiles and the usage of the histogram metric type in general.) The samples in b are the counts of observations in each bucket. Each sample must have a label le where the label value denotes the inclusive upper bound of the bucket. (Samples without such a label are silently ignored.) The histogram metric type automatically provides time series with the _bucket suffix and the appropriate labels. Use the rate() function to specify the time window for the quantile calculation. Example: A histogram metric is called http_request_duration_seconds. To calculate the 90th percentile of request durations over the last 10m, use the following expression:

histogram_quantile(0.9, rate(http_request_duration_seconds_bucket[10m]))  

The quantile is calculated for each label combination in http_request_duration_seconds. To aggregate, use the sum() aggregator around the rate() function. Since the le label is required by histogram_quantile(), it has to be included in the by clause. The following expression aggregates the 90th percentile by job:

histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (job, le))  

To aggregate everything, specify only the le label:

histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[10m])) by (le))  

The histogram_quantile() function interpolates quantile values by assuming a linear distribution within a bucket. The highest bucket must have an upper bound of +Inf. (Otherwise, NaN is returned.) If a quantile is located in the highest bucket, the upper bound of the second highest bucket is returned. A lower limit of the lowest bucket is assumed to be 0 if the upper bound of that bucket is greater than 0. In that case, the usual linear interpolation is applied within that bucket. Otherwise, the upper bound of the lowest bucket is returned for quantiles located in the lowest bucket. If b contains fewer than two buckets, NaN is returned. For φ < 0, -Inf is returned. For φ > 1, +Inf is returned.
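The linear interpolation described above can be sketched in Python (a simplified version for illustration; it assumes sorted cumulative buckets ending at +Inf and omits the edge cases for φ outside [0, 1]):

```python
import math

def histogram_quantile(q, buckets):
    """Interpolate the q-quantile from cumulative histogram buckets.
    buckets: sorted list of (le_upper_bound, cumulative_count),
    with the last upper bound equal to math.inf."""
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0   # lowest bucket's lower bound assumed 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):       # quantile in the +Inf bucket:
                return prev_bound       # return second-highest upper bound
            frac = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * frac
        prev_bound, prev_count = bound, count
    return math.nan

# 90th percentile with buckets le=0.1 (60 obs), le=0.5 (90), le=+Inf (100)
print(histogram_quantile(0.9, [(0.1, 60), (0.5, 90), (math.inf, 100)]))  # 0.5
```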

  • Listing: API high latency alert
- alert: APIHighLatency
  expr: apiserver_latency_seconds:quantile{quantile="0.99",subresource!="log",verb!~"^(?:WATCH|WATCHLIST|PROXY|CONNECT)$"} > 4
  for: 10m
  labels:
    severity: critical
  annotations:
    description: the API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}
  • The alert uses the apiserver_latency_seconds:quantile metric we just created. We use labels to select the 99th percentile, exclude the log subresource, and exclude the WATCH, WATCHLIST, PROXY, and CONNECT verbs. If any of these latencies stays above 4 seconds for 10 minutes, the alert fires
  • Listing: API high error rate alert
- alert: APIServerErrorsHigh
  expr: rate(apiserver_request_count{code=~"^(?:5..)$"}[5m]) / rate(apiserver_request_count[5m]) * 100 > 5
  for: 10m
  labels:
    severity: critical
  annotations:
    description: API server returns errors for {{ $value }}% of requests
  • This alert calculates the error rate of API requests, using a regular expression to match any 5xx response code. If errors exceed 5% of requests over the 5-minute window for 10 minutes, the alert fires
  • Listing: monitoring API server failure or absence
- alert: KubernetesAPIServerDown
  expr: up{job="kubernetes-apiservers"} == 0
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: Apiserver {{ $labels.instance }} is down!
  • We can also monitor the Kubernetes nodes themselves, and the Docker daemon and containers running on them

12.5.3 cAdvisor and nodes

  • Kubernetes exposes cAdvisor and node-specific time series by default. We can create a job to scrape these time series from the Kubernetes API for each node, and use them to monitor the nodes as well as the Docker daemon and containers on each node
  • Listing: adding a job for cAdvisor
- job_name: 'kubernetes-cadvisor'
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
  • Here we name the job kubernetes-cadvisor and use service discovery with the node role to return the list of Kubernetes nodes. We scrape the metrics over https and specify a certificate authority and a local token file to authenticate to Kubernetes
  • We then relabel the time series: labelmap creates labels from the discovered metadata labels, and the __address__ label is replaced with the default DNS name of the Kubernetes API server. We then use one of the metadata labels, the one carrying the node name, to create a new __metrics_path__ label that passes the node name into the path
/api/v1/nodes/${1}/proxy/metrics/cadvisor  
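This replace step can be mimicked in Python: match __meta_kubernetes_node_name against (.+) and expand the ${1} reference in the replacement template (the node name below is hypothetical):

```python
import re

def relabel_replace(value, regex, replacement):
    """Toy Prometheus relabel 'replace' action: on a full regex match,
    expand ${n}-style capture-group references in the replacement."""
    m = re.fullmatch(regex, value)
    if m is None:
        return None  # no match: the target label is left unchanged
    return re.sub(r"\$\{(\d+)\}", lambda g: m.group(int(g.group(1))), replacement)

path = relabel_replace("ip-10-0-1-20.us-east-2.compute.internal",   # hypothetical node
                       r"(.+)",
                       "/api/v1/nodes/${1}/proxy/metrics/cadvisor")
print(path)
# /api/v1/nodes/ip-10-0-1-20.us-east-2.compute.internal/proxy/metrics/cadvisor
```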
  • This scrapes the required time series for every node the job discovers. Our configuration also includes a job that monitors the nodes themselves, which exposes some Kubernetes node-level metrics
  • We can use these metrics to monitor the underlying containers, the Docker daemon, and Kubernetes-level node performance

  • 《Kubernetes:Up and Running》