[Kubernetes Series, Part 5] Using Kubernetes on AWS: EKS
- October 3, 2019
- Notes
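1. Overview
Amazon Elastic Kubernetes Service (Amazon EKS) is a managed service that makes it easy to run Kubernetes on AWS without having to stand up or maintain your own Kubernetes control plane.
Amazon EKS runs Kubernetes control plane instances across multiple Availability Zones to ensure high availability. It automatically detects and replaces unhealthy control plane instances, and provides automated version upgrades and patching for them.
Amazon EKS also integrates with many AWS services to provide scalability and security for your applications, including:
- Amazon ECR for container images
- Elastic Load Balancing for load distribution
- IAM for authentication
- Amazon VPC for isolation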
2. Versions
Kubernetes version | Kubernetes release date | EKS platform version | EKS release notes |
---|---|---|---|
1.13.7 | 2019.6.7 | eks.1 | Initial release of Kubernetes 1.13 for Amazon EKS. For more information, see Kubernetes 1.13. |
1.12.6 | 2019.2.27 | eks.2 | New platform version to support custom DNS names in the Kubelet certificate and improve etcd performance. This fixes a bug that caused worker node Kubelet daemons to request a new certificate every few seconds. |
1.12.6 | 2019.2.27 | eks.1 | Initial release of Kubernetes 1.12 for Amazon EKS. |
1.11.8 | 2019.3.1 | eks.3 | New platform version to support custom DNS names in the Kubelet certificate and improve etcd performance. |
1.11.8 | 2019.3.1 | eks.2 | New platform version updating Amazon EKS Kubernetes 1.11 clusters to patch level 1.11.8 to address CVE-2019-1002100. |
3. Prerequisites
3.1. Working environment
3.1.1 Python
- Required version: >= 2.7.9
- Purpose: installing the AWS CLI
3.1.2 aws cli
- Required version: >= 1.16.156
- Purpose: managing AWS resources
- Installation:
pip install awscli --upgrade --user
3.1.3 eksctl
- Required version: >= 0.1.37
- Purpose: managing AWS EKS resources
- Installation:
curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/latest_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin
eksctl version
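The minimum versions listed so far (Python, the AWS CLI, eksctl) are easy to get wrong with a plain string comparison, because "1.16.20" sorts after "1.16.156" as text. The following is a minimal sketch of a version check; the helper name `version_ge` is made up for illustration, and it assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# version_ge ACTUAL REQUIRED
# Succeeds (exit status 0) when ACTUAL is the same as or newer than REQUIRED.
# Hypothetical helper; assumes `sort -V` is available.
version_ge() {
  local actual="$1" required="$2"
  # Version-sort the two values; if the smaller one is REQUIRED,
  # then ACTUAL meets the minimum.
  [ "$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)" = "$required" ]
}
```

For example, `version_ge "$(eksctl version)" 0.1.37 || echo "eksctl is too old"` would work if `eksctl version` printed a bare version number; in practice you may need to extract the number from the tool's output first.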
3.1.4 kubectl
- Required version: the latest release, or no more than one minor version behind the cluster's Kubernetes version.
- Purpose: managing the Kubernetes cluster
- Installation:
curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
kubectl version
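The kubectl version rule above can be expressed as a small check. This is only a sketch: the function names are hypothetical, and it assumes both versions share the same major number (1) and look like `v1.13.7` or `1.13`.

```shell
# minor_of VERSION: print the minor number of a version such as "v1.13.7" or "1.13".
minor_of() {
  echo "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*/\1/'
}

# kubectl_new_enough CLIENT SERVER
# Succeeds when the kubectl (client) minor version is at most one minor
# version behind the cluster (server) version. Newer clients also pass.
kubectl_new_enough() {
  local client server
  client="$(minor_of "$1")"
  server="$(minor_of "$2")"
  [ "$client" -ge $(( server - 1 )) ]
}
```

With a 1.13 cluster, `kubectl_new_enough v1.12.0 v1.13.7` succeeds, while `kubectl_new_enough v1.11.8 v1.13.7` fails.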
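3.2. Role permissions
References:
- Amazon EKS identity-based policy examples
- https://github.com/weaveworks/eksctl/issues/204#issuecomment-450280786 (the commenter says it took him more than 30 attempts to assemble the full list; it took me nearly 40)
- https://docs.aws.amazon.com/autoscaling/ec2/userguide/control-access-using-iam.html
Note: make sure the account has enough spare quota for gateways, VPCs, and IP addresses; otherwise creation fails with a "maximum limit reached" error.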
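One way to check quota headroom before creating anything is to count what already exists in the Region and compare it with the limit. The sketch below is illustrative only: the function names are hypothetical, the limit you pass in is your own account's quota (a default of 5 VPCs and 5 Elastic IPs per Region is an assumption that may not hold for your account), and the two counting functions call the AWS CLI and therefore need working credentials.

```shell
# has_headroom USED LIMIT NEEDED
# Succeeds when USED + NEEDED does not exceed LIMIT.
has_headroom() {
  [ $(( $1 + $3 )) -le "$2" ]
}

# count_vpcs REGION / count_eips REGION
# Print how many VPCs or Elastic IP addresses already exist in REGION.
count_vpcs() {
  aws ec2 describe-vpcs --region "$1" --query 'length(Vpcs)' --output text
}
count_eips() {
  aws ec2 describe-addresses --region "$1" --query 'length(Addresses)' --output text
}
```

For instance, `has_headroom "$(count_vpcs us-west-2)" 5 1 || echo "no room for another VPC"` checks whether one more VPC fits under a limit of 5.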
3.2.1. CloudFormation full access
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudformation:*"
      ],
      "Resource": "*"
    }
  ]
}
3.2.2. EKS read/write permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "eks:ListClusters",
        "eks:CreateCluster"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "eks:UpdateClusterVersion",
        "eks:ListUpdates",
        "eks:DescribeUpdate",
        "eks:DescribeCluster",
        "eks:ListClusters",
        "eks:CreateCluster"
      ],
      "Resource": "arn:aws:eks:*:*:cluster/*"
    }
  ]
}
3.2.3. EC2 permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:CreateInternetGateway",
        "ec2:CreateVpc",
        "ec2:Describe*",
        "ec2:CreateTags",
        "ec2:ModifyVpcAttribute",
        "ec2:CreateSubnet",
        "ec2:CreateRouteTable",
        "ec2:CreateSecurityGroup",
        "ec2:DeleteSecurityGroup",
        "ec2:AttachInternetGateway",
        "ec2:CreateRoute",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:AuthorizeSecurityGroupEgress",
        "ec2:RevokeSecurityGroupEgress",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:AssociateRouteTable",
        "ec2:CreateNatGateway",
        "ec2:AllocateAddress",
        "ec2:DeleteInternetGateway",
        "ec2:DeleteNatGateway",
        "ec2:DeleteRoute",
        "ec2:DeleteRouteTable",
        "ec2:DeleteSubnet",
        "ec2:DeleteTags",
        "ec2:DeleteVpc",
        "ec2:DescribeInternetGateways",
        "ec2:DescribeNatGateways",
        "ec2:DescribeRouteTables",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeTags",
        "ec2:DescribeVpcAttribute",
        "ec2:DetachInternetGateway",
        "ec2:DisassociateRouteTable",
        "ec2:RunInstances",
        "ec2:ReleaseAddress"
      ],
      "Resource": "*"
    }
  ]
}
3.2.4. CloudWatch permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:Describe*"
      ],
      "Resource": "*"
    }
  ]
}
3.2.5. autoscaling permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:CreateAutoScalingGroup",
        "autoscaling:DeleteAutoScalingGroup",
        "autoscaling:DeleteLaunchConfiguration",
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeScalingActivities",
        "autoscaling:UpdateAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
3.2.6. elasticloadbalancing permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "elasticloadbalancing:Describe*",
      "Resource": "*"
    }
  ]
}
3.2.7. IAM permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "iam:CreateRole",
        "iam:AttachRolePolicy",
        "iam:DetachRolePolicy",
        "iam:GetRole",
        "iam:PassRole",
        "iam:CreateInstanceProfile",
        "iam:AddRoleToInstanceProfile",
        "iam:RemoveRoleFromInstanceProfile",
        "iam:GetInstanceProfile",
        "iam:PutRolePolicy",
        "iam:DeleteRolePolicy",
        "iam:GetRolePolicy",
        "iam:ListInstanceProfiles",
        "iam:CreateServiceLinkedRole",
        "iam:ListInstanceProfilesForRole"
      ],
      "Resource": "*"
    }
  ]
}
3.2.8. LaunchTemplate permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor2",
      "Effect": "Allow",
      "Action": [
        "autoscaling:CreateLaunchConfiguration",
        "ec2:DeleteLaunchTemplate",
        "ec2:ModifyLaunchTemplate",
        "ec2:DeleteLaunchTemplateVersions",
        "ec2:CreateLaunchTemplateVersion"
      ],
      "Resource": [
        "arn:aws:autoscaling:*:*:launchConfiguration:*:launchConfigurationName/*",
        "arn:aws:ec2:*:*:launch-template/*"
      ]
    }
  ]
}
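A single stray comma is enough for IAM to reject a policy document, so it can save a few of those retries to validate each JSON file locally before uploading it. The sketch below assumes `python3` is on the PATH and that each policy has been saved to its own file; the function names and the file name in the usage note are hypothetical. `aws iam create-policy` is the standard AWS CLI call for registering a customer managed policy.

```shell
# is_valid_json FILE
# Succeeds when FILE parses as JSON. Assumes python3 is installed;
# `python3 -m json.tool` exits non-zero on a syntax error.
is_valid_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# create_policy NAME FILE
# Validate FILE first, then register it as a customer managed IAM policy.
# The AWS call needs credentials that allow iam:CreatePolicy.
create_policy() {
  is_valid_json "$2" || { echo "invalid JSON: $2" >&2; return 1; }
  aws iam create-policy --policy-name "$1" --policy-document "file://$2"
}
```

For example, `create_policy eks-cloudformation ./cloudformation.json` would upload the CloudFormation policy only if the file parses cleanly.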
3.3. Installing aws-iam-authenticator
See: https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/install-aws-iam-authenticator.html
curl -o aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.13.7/2019-06-11/bin/linux/amd64/aws-iam-authenticator
chmod +x ./aws-iam-authenticator
mkdir -p $HOME/bin && cp ./aws-iam-authenticator $HOME/bin/aws-iam-authenticator && export PATH=$HOME/bin:$PATH
echo 'export PATH=$HOME/bin:$PATH' >> ~/.bashrc
# Generate an authentication token for the cluster
aws-iam-authenticator token -i <cluster name>
# Show which IAM identity the AWS CLI is calling as
aws sts get-caller-identity
3.4. Creating a kubeconfig
See: https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/create-kubeconfig.html
Generate the kubeconfig automatically with the following commands:
# Generate the kubeconfig
aws eks --region <your region> update-kubeconfig --name <cluster name>
# Inspect the kubeconfig
cat ~/.kube/config
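4. Getting started
4.1. Creating the cluster
Create the cluster with the command below. Under the hood, eksctl calls the CloudFormation API to launch two stacks, one for the EKS cluster and one for the EKS worker nodes, which in turn create the resources the cluster needs (gateways, IP addresses, a VPC, EC2 instances, and so on).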
eksctl create cluster --name prod --version 1.13 --nodegroup-name standard-workers --node-type t3.medium --nodes 3 --nodes-min 1 --nodes-max 4 --node-ami auto
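Because the resources are created through CloudFormation, a stuck or failed creation can be diagnosed by looking at the stacks directly. The sketch below assumes eksctl's usual `eksctl-<cluster>-cluster` and `eksctl-<cluster>-nodegroup-<nodegroup>` stack naming, which is worth confirming in the CloudFormation console; the function names are hypothetical, and `stack_status` needs AWS credentials.

```shell
# eksctl_stack_names CLUSTER NODEGROUP
# Print the two CloudFormation stack names, one per line.
# Assumption: eksctl names its stacks "eksctl-<cluster>-...".
eksctl_stack_names() {
  printf 'eksctl-%s-cluster\n' "$1"
  printf 'eksctl-%s-nodegroup-%s\n' "$1" "$2"
}

# stack_status REGION STACK
# Print the status of one stack, for example CREATE_COMPLETE.
# Calls the AWS CLI; needs cloudformation:DescribeStacks.
stack_status() {
  aws cloudformation describe-stacks --region "$1" --stack-name "$2" \
    --query 'Stacks[0].StackStatus' --output text
}
```

For the cluster created above, `eksctl_stack_names prod standard-workers` prints the two names, and each can be passed to `stack_status <your region> <name>`.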
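Note: if you choose a P2 or P3 instance type together with the Amazon EKS-optimized AMI with GPU support, you must deploy the NVIDIA device plugin for Kubernetes as a DaemonSet on the cluster with the following command.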
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta/nvidia-device-plugin.yml
4.2. Checking cluster status
# Check node status
kubectl get nodes
# Check service status
kubectl get svc
# List events
kubectl get events --all-namespaces
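When scripting around cluster creation, it helps to turn the node listing into a number you can compare against the expected node count. This is a small sketch with a hypothetical function name; it assumes the default column layout of `kubectl get nodes --no-headers`, where the second column is STATUS.

```shell
# count_ready
# Read `kubectl get nodes --no-headers` output on stdin and print how many
# nodes report exactly "Ready" in the STATUS column. Nodes shown as
# "NotReady" or "Ready,SchedulingDisabled" are not counted.
count_ready() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}
```

Usage: `kubectl get nodes --no-headers | count_ready`. For a cluster created with `--nodes 3`, the count should reach 3 once all workers have joined.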
4.3. Deploying the Dashboard
See:
- https://aws.amazon.com/cn/premiumsupport/knowledge-center/eks-cluster-kubernetes-dashboard/
- https://docs.aws.amazon.com/zh_cn/eks/latest/userguide/dashboard-tutorial.html
- https://www.youtube.com/watch?v=JcZJqSa65Yc
# Deploy the Kubernetes Dashboard to the cluster
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

# Deploy heapster to enable container cluster monitoring and performance analysis on the cluster
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/heapster.yaml

# Deploy the influxdb backend for heapster to the cluster
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/influxdb/influxdb.yaml

# Create the heapster cluster role binding for the Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/heapster/master/deploy/kube-config/rbac/heapster-rbac.yaml

# Create a new service account with cluster-admin permissions
cat > eks-admin-service-account.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eks-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: eks-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: eks-admin
  namespace: kube-system
EOF

# Apply the service account and cluster role binding to your cluster
kubectl apply -f eks-admin-service-account.yaml

# Retrieve the authentication token for the eks-admin service account.
# Copy the <authentication_token> value from the output; you use it to sign in to the Dashboard
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')

# Forward requests from local port 6443 on your Amazon EC2 instance to the Kubernetes Dashboard port
kubectl port-forward svc/kubernetes-dashboard -n kube-system 6443:443

# From your local machine, reach that port through an SSH tunnel
ssh -i EC2KeyPair.pem ec2-user@IP -L 6443:127.0.0.1:6443
Open https://127.0.0.1:6443 and enter the token to access the Dashboard.
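Copying the token by hand out of the `kubectl describe secret` output is error-prone, since the value is long and sits among other fields. The filter below is a sketch with a hypothetical function name; it assumes the output contains a line of the form `token:      <value>`, which is how `describe secret` prints service account tokens.

```shell
# extract_token
# Read `kubectl describe secret` output on stdin and print only the
# value of the "token:" field.
extract_token() {
  awk '$1 == "token:" { print $2; exit }'
}
```

Usage: pipe the describe command into it, for example `kubectl -n kube-system describe secret <secret name> | extract_token`, and paste the result into the Dashboard login page.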
4.4. Deleting the cluster
eksctl delete cluster --region=<your region> --name=<cluster name>
4.5. Further operations
See:
- https://kubernetes.io/docs/tutorials/