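Checking the Kubernetes cluster upgrade procedure
Introduction
I have been evaluating Kubernetes, but I am pausing that for now to upgrade the cluster version.
A version upgrade can cause failures, so it is a step I can only take while I am in a position to say "it does not matter if something breaks".
This time I upgrade from v1.17 to v1.18, following the procedure below.
Upgrading kubeadm clusters
Overview of the upgrade
The cluster consists of one Master node and two Worker nodes.
Everything runs in a local environment (virtual machines on a personal computer), and the OS is CentOS 7.
Nodes are upgraded one at a time, so the upgrade completes without stopping the service (a rolling update).
I did not take a backup, but I think it is best to take one beforehand if you need it.
About Kubernetes versions
At the time of writing, the latest Kubernetes version is v1.18, where "1" is the major version and "18" is the minor version.
A minor version is released roughly every three months, and three generations are supported. The currently supported versions are therefore v1.18, v1.17, and v1.16.
Once support ends, vulnerabilities are no longer fixed even when they are found, so a production Kubernetes cluster has to be upgraded continually.
Note
If you want to avoid this, an enterprise Kubernetes product is another option. Red Hat's OpenShift is a typical example; it appears to offer three years of support.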
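As a rough illustration of the three-release support window described above, the following shell sketch lists which minor versions are still supported for a given latest release. The function name supported_minors and the default window of 3 are my own assumptions for illustration; this is not part of any Kubernetes tool.

```shell
# Print the minor releases still supported when $1 (e.g. v1.18) is the newest,
# assuming a support window of $2 releases (default 3, as described above).
# supported_minors is a hypothetical helper, not a Kubernetes command.
supported_minors() {
  latest="${1#v}"        # accept "v1.18" or "1.18"
  window="${2:-3}"
  major="${latest%%.*}"
  minor="${latest#*.}"
  minor="${minor%%.*}"   # drop a patch number if one was given
  out=""
  i=0
  while [ "$i" -lt "$window" ] && [ "$((minor - i))" -ge 0 ]; do
    out="${out}${out:+ }v${major}.$((minor - i))"
    i=$((i + 1))
  done
  echo "$out"
}
```

Calling supported_minors v1.18 prints the same three versions listed above.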
Upgrading the Master node
The upgrade starts with the Master node.
Checking the current environment
Check the versions of kubeadm and kubelet.
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:12:12Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 138d v1.17.3
k8s-worker01 Ready <none> 138d v1.17.3
k8s-worker02 Ready <none> 138d v1.17.3
In addition, one Pod is deployed on each Worker node.
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-dep-5589d85476-4t4zb 1/1 Running 0 18m 192.168.79.127 k8s-worker01 <none> <none>
nginx-dep-5589d85476-p4zgk 1/1 Running 0 18m 192.168.69.227 k8s-worker02 <none> <none>
Confirming the upgrade version
Check the available versions.
$ yum list --showduplicates kubeadm --disableexcludes=kubernetes
読み込んだプラグイン:fastestmirror, langpacks
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Determining fastest mirrors
* base: ftp-srv2.kddilabs.jp
* extras: ftp-srv2.kddilabs.jp
* updates: ftp.yz.yamagata-u.ac.jp
kubernetes 505/505
インストール済みパッケージ
kubeadm.x86_64 1.17.3-0 @kubernetes
利用可能なパッケージ
kubeadm.x86_64 1.6.0-0 kubernetes
kubeadm.x86_64 1.6.1-0 kubernetes
kubeadm.x86_64 1.6.2-0 kubernetes
kubeadm.x86_64 1.6.3-0 kubernetes
・・・
kubeadm.x86_64 1.17.6-0 kubernetes
kubeadm.x86_64 1.18.0-0 kubernetes
kubeadm.x86_64 1.18.1-0 kubernetes
kubeadm.x86_64 1.18.2-0 kubernetes
kubeadm.x86_64 1.18.3-0 kubernetes
This time I use the latest available version, "v1.18.3".
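Picking the newest patch release out of a long yum listing can be scripted. The sketch below reads output shaped like the listing above on standard input and prints the highest patch version for one minor release. latest_patch is a hypothetical helper name, and the parsing assumes the "package version repository" column layout shown above.

```shell
# Read `yum list --showduplicates kubeadm` output on stdin and print the
# highest patch release of the minor version given as $1 (e.g. 1.18).
# latest_patch is a hypothetical helper; it assumes lines such as
# "kubeadm.x86_64 1.18.3-0 kubernetes".
latest_patch() {
  minor="$1"
  awk -v m="$minor" '
    $1 ~ /^kubeadm\./ && index($2, m ".") == 1 {
      split($2, rel, "-")            # "1.18.3-0" -> "1.18.3" and "0"
      n = split(rel[1], part, ".")   # "1.18.3"   -> 1, 18, 3
      if (n == 3 && part[3] + 0 >= best) {
        best = part[3] + 0
        found = rel[1]
      }
    }
    END { if (found != "") print found }'
}
```

A possible use is to pipe the command above into it: yum list --showduplicates kubeadm --disableexcludes=kubernetes | latest_patch 1.18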
Upgrading kubeadm
Upgrade the kubeadm tool.
$ sudo yum install -y kubeadm-1.18.3-0 --disableexcludes=kubernetes
読み込んだプラグイン:fastestmirror, langpacks
Determining fastest mirrors
* base: ftp.iij.ad.jp
* extras: ftp.iij.ad.jp
* updates: ftp.iij.ad.jp
base | 3.6 kB 00:00:00
docker-ce-stable | 3.5 kB 00:00:00
extras | 2.9 kB 00:00:00
kubernetes/signature | 454 B 00:00:00
kubernetes/signature | 1.4 kB 00:00:00 !!!
updates | 2.9 kB 00:00:00
(1/4): extras/7/x86_64/primary_db | 205 kB 00:00:00
(2/4): docker-ce-stable/x86_64/primary_db | 45 kB 00:00:00
(3/4): kubernetes/primary | 73 kB 00:00:01
(4/4): updates/7/x86_64/primary_db | 3.0 MB 00:00:01
kubernetes 533/533
依存性の解決をしています
--> トランザクションの確認を実行しています。
---> パッケージ kubeadm.x86_64 0:1.17.3-0 を 更新
---> パッケージ kubeadm.x86_64 0:1.18.3-0 を アップデート
--> 依存性解決を終了しました。
依存性を解決しました
====================================================================================================================================================================================================================================
Package アーキテクチャー バージョン リポジトリー 容量
====================================================================================================================================================================================================================================
更新します:
kubeadm x86_64 1.18.3-0 kubernetes 8.8 M
トランザクションの要約
====================================================================================================================================================================================================================================
更新 1 パッケージ
総ダウンロード容量: 8.8 M
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
a23839a743e789babb0ce912fa440f6e6ceb15bc5db42dd91aa0838c994b3452-kubeadm-1.18.3-0.x86_64.rpm | 8.8 MB 00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
更新します : kubeadm-1.18.3-0.x86_64 1/2
整理中 : kubeadm-1.17.3-0.x86_64 2/2
検証中 : kubeadm-1.18.3-0.x86_64 1/2
検証中 : kubeadm-1.17.3-0.x86_64 2/2
更新:
kubeadm.x86_64 0:1.18.3-0
完了しました!
Confirm that the version was upgraded.
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:49:29Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
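It is now the specified version (v1.18.3).
Drain (evacuating Pods)
Evict the Pods from the Master node and remove the node from scheduling. The "--ignore-daemonsets" option makes the drain skip Pods managed by a DaemonSet, which stay on the node.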
$ kubectl drain k8s-master --ignore-daemonsets
node/k8s-master cordoned
evicting pod "calico-kube-controllers-77c4b7448-6prr9"
evicting pod "coredns-6955765f44-55wbn"
evicting pod "coredns-6955765f44-bhdvr"
pod/calico-kube-controllers-77c4b7448-6prr9 evicted
pod/coredns-6955765f44-55wbn evicted
pod/coredns-6955765f44-bhdvr evicted
node/k8s-master evicted
Confirming the upgrade plan
Confirm that the cluster can be upgraded.
$ sudo kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.17.3
[upgrade/versions] kubeadm version: v1.18.3
[upgrade/versions] Latest stable version: v1.18.5
[upgrade/versions] Latest stable version: v1.18.5
[upgrade/versions] Latest version in the v1.17 series: v1.17.8
[upgrade/versions] Latest version in the v1.17 series: v1.17.8
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
Kubelet 3 x v1.17.3 v1.17.8
Upgrade to the latest version in the v1.17 series:
COMPONENT CURRENT AVAILABLE
API Server v1.17.3 v1.17.8
Controller Manager v1.17.3 v1.17.8
Scheduler v1.17.3 v1.17.8
Kube Proxy v1.17.3 v1.17.8
CoreDNS 1.6.5 1.6.7
Etcd 3.4.3 3.4.3-0
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.17.8
_____________________________________________________________________
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
Kubelet 3 x v1.17.3 v1.18.5
Upgrade to the latest stable version:
COMPONENT CURRENT AVAILABLE
API Server v1.17.3 v1.18.5
Controller Manager v1.17.3 v1.18.5
Scheduler v1.17.3 v1.18.5
Kube Proxy v1.17.3 v1.18.5
CoreDNS 1.6.5 1.6.7
Etcd 3.4.3 3.4.3-0
You can now apply the upgrade by executing the following command:
kubeadm upgrade apply v1.18.5
Note: Before you can perform this upgrade, you have to update kubeadm to v1.18.5.
_____________________________________________________________________
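The upgrade command is shown at the end. It names "v1.18.5" rather than v1.18.3 because the plan reports the latest stable release it found, and the Note in the output says kubeadm must first be updated to v1.18.5.
Upgrade
I will try upgrading to the "v1.18.5" shown in the upgrade plan anyway.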
$ sudo kubeadm upgrade apply v1.18.5
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.18.5"
[upgrade/versions] Cluster version: v1.17.3
[upgrade/versions] kubeadm version: v1.18.3
[upgrade/version] FATAL: the --version argument is invalid due to these errors:
- Specified version to upgrade to "v1.18.5" is higher than the kubeadm version "v1.18.3". Upgrade kubeadm first using the tool you used to install kubeadm
Can be bypassed if you pass the --force flag
To see the stack trace of this error execute with --v=5 or higher
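It failed. The requested version (v1.18.5) is higher than the version of kubeadm installed with yum (v1.18.3), so kubeadm refused it.
Although the upgrade plan shows v1.18.5, I specify v1.18.3 for the upgrade.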
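The rule behind the failure, as stated in the error message, is that the target version must not be higher than the installed kubeadm version. The sketch below is one way to check that before running the upgrade. version_le is a hypothetical helper written for this article, it only handles plain vMAJOR.MINOR.PATCH strings, and kubeadm applies further checks of its own that are not modeled here.

```shell
# Return 0 if version $1 is lower than or equal to version $2.
# Both are expected as vMAJOR.MINOR.PATCH (the leading "v" is optional).
# version_le is a hypothetical helper, not part of kubeadm.
version_le() {
  a="${1#v}"
  b="${2#v}"
  for n in 1 2 3; do
    x="${a%%.*}"
    y="${b%%.*}"
    if [ "$x" -lt "$y" ]; then return 0; fi
    if [ "$x" -gt "$y" ]; then return 1; fi
    # Drop the field just compared; pad with 0 when none is left.
    case "$a" in *.*) a="${a#*.}" ;; *) a=0 ;; esac
    case "$b" in *.*) b="${b#*.}" ;; *) b=0 ;; esac
  done
  return 0
}
```

With the versions from this article, version_le v1.18.5 v1.18.3 returns a non-zero status (the rejected case), while version_le v1.18.3 v1.18.3 returns 0.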
$ sudo kubeadm upgrade apply v1.18.3
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.18.3"
[upgrade/versions] Cluster version: v1.17.3
[upgrade/versions] kubeadm version: v1.18.3
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-etcd
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.18.3"...
Static pod: kube-apiserver-k8s-master hash: 6b6a0a73255ee4281de8dca07998054a
Static pod: kube-controller-manager-k8s-master hash: 4d17d776eb5d9c61bbec5d1e95adacfb
Static pod: kube-scheduler-k8s-master hash: e3025acd90e7465e66fa19c71b916366
[upgrade/etcd] Upgrading to TLS for etcd
{"level":"warn","ts":"2020-07-13T22:08:19.236+0900","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://10.20.30.10:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upgrade/etcd] Non fatal issue encountered during upgrade: the desired etcd version for this Kubernetes version "v1.18.3" is "3.4.3-0", but the current etcd version is "3.4.3". Won't downgrade etcd, instead just continue
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests623514539"
W0713 22:08:20.478236 29486 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-07-13-22-08-04/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-k8s-master hash: 6b6a0a73255ee4281de8dca07998054a
Static pod: kube-apiserver-k8s-master hash: 83c4ef266c3b5fae801e94624406195e
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-07-13-22-08-04/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-k8s-master hash: 4d17d776eb5d9c61bbec5d1e95adacfb
Static pod: kube-controller-manager-k8s-master hash: c019bf493518b70e6417f6d40acb391a
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-07-13-22-08-04/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-k8s-master hash: e3025acd90e7465e66fa19c71b916366
Static pod: kube-scheduler-k8s-master hash: e3025acd90e7465e66fa19c71b916366
Static pod: kube-scheduler-k8s-master hash: e3025acd90e7465e66fa19c71b916366
Static pod: kube-scheduler-k8s-master hash: fcdf74fa577cf14b27fe39d482d17a2b
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.18.3". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
This time it succeeded!
Uncordon
Run the kubectl uncordon command to make the Master node schedulable again.
$ kubectl uncordon k8s-master
node/k8s-master uncordoned
Upgrading kubelet and kubectl
Upgrade kubelet and kubectl to v1.18.3.
$ sudo yum install -y kubelet-1.18.3-0 kubectl-1.18.3-0 --disableexcludes=kubernetes
読み込んだプラグイン:fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: ftp.iij.ad.jp
* extras: ftp.iij.ad.jp
* updates: ftp.iij.ad.jp
依存性の解決をしています
--> トランザクションの確認を実行しています。
---> パッケージ kubectl.x86_64 0:1.17.3-0 を 更新
---> パッケージ kubectl.x86_64 0:1.18.3-0 を アップデート
---> パッケージ kubelet.x86_64 0:1.17.3-0 を 更新
---> パッケージ kubelet.x86_64 0:1.18.3-0 を アップデート
--> 依存性解決を終了しました。
依存性を解決しました
====================================================================================================================================================================================================================================
Package アーキテクチャー バージョン リポジトリー 容量
====================================================================================================================================================================================================================================
更新します:
kubectl x86_64 1.18.3-0 kubernetes 9.5 M
kubelet x86_64 1.18.3-0 kubernetes 21 M
トランザクションの要約
====================================================================================================================================================================================================================================
更新 2 パッケージ
総ダウンロード容量: 30 M
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
(1/2): cd5d6980c3e1b15de222db08729eff40f7031b7fa56c71ae3e28e420ba9678cd-kubectl-1.18.3-0.x86_64.rpm | 9.5 MB 00:00:05
(2/2): d1a0216cfab2fb28e82be531327ebde9a554bb6d33e3c8313acc9bc728ba59d1-kubelet-1.18.3-0.x86_64.rpm | 21 MB 00:00:09
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
合計 3.3 MB/s | 30 MB 00:00:09
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
更新します : kubectl-1.18.3-0.x86_64 1/4
更新します : kubelet-1.18.3-0.x86_64 2/4
整理中 : kubectl-1.17.3-0.x86_64 3/4
整理中 : kubelet-1.17.3-0.x86_64 4/4
検証中 : kubelet-1.18.3-0.x86_64 1/4
検証中 : kubectl-1.18.3-0.x86_64 2/4
検証中 : kubectl-1.17.3-0.x86_64 3/4
検証中 : kubelet-1.17.3-0.x86_64 4/4
更新:
kubectl.x86_64 0:1.18.3-0 kubelet.x86_64 0:1.18.3-0
完了しました!
Restarting kubelet
Restart kubelet and check the version.
$ sudo systemctl daemon-reload
$ sudo systemctl restart kubelet
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 138d v1.18.3
k8s-worker01 Ready <none> 138d v1.17.3
k8s-worker02 Ready <none> 138d v1.17.3
Only the Master node is at "v1.18.3".
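When there are more nodes than fit on one screen, the same check can be done mechanically. The sketch below reads a node table shaped like the one above and prints the nodes whose VERSION column differs from the target. nodes_not_at is a hypothetical helper, and it assumes the version is the last column, as in the output above.

```shell
# Read `kubectl get node` output on stdin and print the names of nodes whose
# VERSION (last column) is not the target version given as $1.
# nodes_not_at is a hypothetical helper name.
nodes_not_at() {
  target="$1"
  # NR > 1 skips the header line.
  awk -v t="$target" 'NR > 1 && $NF != t { print $1 }'
}
```

It is meant to be fed from the command above, for example: kubectl get node | nodes_not_at v1.18.3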
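Upgrading the Worker nodes (first node)
Next, upgrade the Worker nodes. Commands are now run on both the Worker node and the Master node, so the prompt shows which node each command runs on.
Upgrading kubeadm
Upgrade kubeadm on the Worker node.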
[k8s-worker01]$ sudo yum install -y kubeadm-1.18.3-0 --disableexcludes=kubernetes
読み込んだプラグイン:fastestmirror, langpacks
Determining fastest mirrors
* base: ftp.riken.jp
* extras: ftp.riken.jp
* updates: ftp.riken.jp
base | 3.6 kB 00:00:00
docker-ce-stable | 3.5 kB 00:00:00
extras | 2.9 kB 00:00:00
kubernetes/signature | 454 B 00:00:00
kubernetes/signature | 1.4 kB 00:00:00 !!!
updates | 2.9 kB 00:00:00
(1/6): base/7/x86_64/group_gz | 153 kB 00:00:00
(2/6): docker-ce-stable/x86_64/primary_db | 45 kB 00:00:00
(3/6): extras/7/x86_64/primary_db | 205 kB 00:00:00
(4/6): kubernetes/primary | 73 kB 00:00:01
(5/6): updates/7/x86_64/primary_db | 3.0 MB 00:00:01
(6/6): base/7/x86_64/primary_db | 6.1 MB 00:00:04
kubernetes 533/533
依存性の解決をしています
--> トランザクションの確認を実行しています。
---> パッケージ kubeadm.x86_64 0:1.17.3-0 を 更新
---> パッケージ kubeadm.x86_64 0:1.18.3-0 を アップデート
--> 依存性解決を終了しました。
依存性を解決しました
===================================================================================================================================================================================================================
Package アーキテクチャー バージョン リポジトリー 容量
===================================================================================================================================================================================================================
更新します:
kubeadm x86_64 1.18.3-0 kubernetes 8.8 M
トランザクションの要約
===================================================================================================================================================================================================================
更新 1 パッケージ
総ダウンロード容量: 8.8 M
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
a23839a743e789babb0ce912fa440f6e6ceb15bc5db42dd91aa0838c994b3452-kubeadm-1.18.3-0.x86_64.rpm | 8.8 MB 00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
更新します : kubeadm-1.18.3-0.x86_64 1/2
整理中 : kubeadm-1.17.3-0.x86_64 2/2
検証中 : kubeadm-1.18.3-0.x86_64 1/2
検証中 : kubeadm-1.17.3-0.x86_64 2/2
更新:
kubeadm.x86_64 0:1.18.3-0
完了しました!
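Drain (evacuating Pods)
Evict the Pods deployed on the Worker node and remove the node from scheduling. Pods managed by a DaemonSet are skipped and stay on the node, as the WARNING line in the output shows.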
[k8s-master]$ kubectl drain k8s-worker01 --ignore-daemonsets
node/k8s-worker01 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-cgdgk, kube-system/kube-proxy-m76p2, metallb-system/speaker-89wjx
evicting pod default/nginx-dep-5589d85476-4t4zb
evicting pod kube-system/calico-kube-controllers-77c4b7448-b55bl
evicting pod kube-system/coredns-66bff467f8-pmq95
pod/calico-kube-controllers-77c4b7448-b55bl evicted
pod/nginx-dep-5589d85476-4t4zb evicted
pod/coredns-66bff467f8-pmq95 evicted
node/k8s-worker01 evicted
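Upgrading the kubelet configuration
I ran the following command on the Master node. Note that the official kubeadm procedure runs kubeadm upgrade node on the Worker node being upgraded, where it updates that node's kubelet configuration; run on the Master, as in the output below, it only rechecks the control plane and rewrites the Master's kubelet configuration.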
[k8s-master]$ sudo kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Upgrading your Static Pod-hosted control plane instance to version "v1.18.3"...
Static pod: kube-apiserver-k8s-master hash: 00f5c9af581644d88b6a6824cf93ee01
Static pod: kube-controller-manager-k8s-master hash: b05610bba851d38ee8c93e1d8d8451fc
Static pod: kube-scheduler-k8s-master hash: a8caea92c80c24c844216eb1d68fe417
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/etcd] Non fatal issue encountered during upgrade: the desired etcd version for this Kubernetes version "v1.18.3" is "3.4.3-0", but the current etcd version is "3.4.3". Won't downgrade etcd, instead just continue
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests174442946"
W0713 22:22:45.680253 21786 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Current and new manifests of kube-apiserver are equal, skipping upgrade
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Current and new manifests of kube-controller-manager are equal, skipping upgrade
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Current and new manifests of kube-scheduler are equal, skipping upgrade
[upgrade] The control plane instance for this node was successfully updated!
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.
Upgrading kubelet and kubectl
Upgrade kubelet and kubectl on the Worker node.
[k8s-worker01]$ sudo yum install -y kubelet-1.18.3-0 kubectl-1.18.3-0 --disableexcludes=kubernetes
読み込んだプラグイン:fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: ftp.riken.jp
* extras: ftp.riken.jp
* updates: ftp.riken.jp
依存性の解決をしています
--> トランザクションの確認を実行しています。
---> パッケージ kubectl.x86_64 0:1.17.3-0 を 更新
---> パッケージ kubectl.x86_64 0:1.18.3-0 を アップデート
---> パッケージ kubelet.x86_64 0:1.17.3-0 を 更新
---> パッケージ kubelet.x86_64 0:1.18.3-0 を アップデート
--> 依存性解決を終了しました。
依存性を解決しました
===================================================================================================================================================================================================================
Package アーキテクチャー バージョン リポジトリー 容量
===================================================================================================================================================================================================================
更新します:
kubectl x86_64 1.18.3-0 kubernetes 9.5 M
kubelet x86_64 1.18.3-0 kubernetes 21 M
トランザクションの要約
===================================================================================================================================================================================================================
更新 2 パッケージ
総ダウンロード容量: 30 M
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
(1/2): cd5d6980c3e1b15de222db08729eff40f7031b7fa56c71ae3e28e420ba9678cd-kubectl-1.18.3-0.x86_64.rpm | 9.5 MB 00:00:03
(2/2): d1a0216cfab2fb28e82be531327ebde9a554bb6d33e3c8313acc9bc728ba59d1-kubelet-1.18.3-0.x86_64.rpm | 21 MB 00:00:06
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
合計 4.4 MB/s | 30 MB 00:00:06
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
更新します : kubectl-1.18.3-0.x86_64 1/4
更新します : kubelet-1.18.3-0.x86_64 2/4
整理中 : kubectl-1.17.3-0.x86_64 3/4
整理中 : kubelet-1.17.3-0.x86_64 4/4
検証中 : kubelet-1.18.3-0.x86_64 1/4
検証中 : kubectl-1.18.3-0.x86_64 2/4
検証中 : kubectl-1.17.3-0.x86_64 3/4
検証中 : kubelet-1.17.3-0.x86_64 4/4
更新:
kubectl.x86_64 0:1.18.3-0 kubelet.x86_64 0:1.18.3-0
完了しました!
Restarting kubelet
Restart kubelet.
[k8s-worker01]$ sudo systemctl daemon-reload
[k8s-worker01]$ sudo systemctl restart kubelet
Uncordon the Worker node so that it becomes schedulable again.
[k8s-master]$ kubectl uncordon k8s-worker01
node/k8s-worker01 uncordoned
Check the version.
[k8s-master]$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 138d v1.18.3
k8s-worker01 Ready <none> 138d v1.18.3
k8s-worker02 Ready <none> 138d v1.17.3
Upgrading the Worker nodes (second node)
Upgrade it with the same steps as the first node. Since the steps are identical, I omit the details.
[k8s-master]$ kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 138d v1.18.3
k8s-worker01 Ready <none> 138d v1.18.3
k8s-worker02 Ready <none> 138d v1.18.3
The Kubernetes cluster upgrade is now complete.
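Because the second Worker node repeats the first one step for step, the sequence can be written down once and parameterized. The sketch below only prints the commands in order, using the same prompt convention as this article, and does not execute anything. worker_upgrade_steps is a hypothetical helper, the default Master name is taken from this cluster, and it places kubeadm upgrade node on the Worker node, which is my reading of the official procedure rather than what was run above.

```shell
# Print (not run) the per-Worker upgrade steps for node $1 and package
# version $2 (e.g. 1.18.3). $3 is the Master node name.
# worker_upgrade_steps is a hypothetical helper for illustration only.
worker_upgrade_steps() {
  node="$1"
  version="$2"
  master="${3:-k8s-master}"
  cat <<EOF
[$node]\$ sudo yum install -y kubeadm-$version-0 --disableexcludes=kubernetes
[$master]\$ kubectl drain $node --ignore-daemonsets
[$node]\$ sudo kubeadm upgrade node
[$node]\$ sudo yum install -y kubelet-$version-0 kubectl-$version-0 --disableexcludes=kubernetes
[$node]\$ sudo systemctl daemon-reload
[$node]\$ sudo systemctl restart kubelet
[$master]\$ kubectl uncordon $node
EOF
}
```

For example, worker_upgrade_steps k8s-worker02 1.18.3 prints the seven steps for the second Worker node as a checklist.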
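(Reference 1) Confirming service continuity
This time the three nodes were updated in a rolling fashion, without interrupting the service.
To check whether the service actually stopped, I kept sending requests to the container from an external server through the load balancer at 10-second intervals.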
[client]$ while true; do echo -n "$(date +%T) " ; curl -s http://10.20.30.150:8080 ; sleep 10 ;done
22:01:42 Hello!
22:01:52 Hello!
22:02:02 Hello!
22:02:12 Hello!
22:02:23 Hello!
22:02:33 Hello!
・・・
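I cannot publish the whole log, but I confirmed that every request got a response. The connection may have dropped at some moment within a 10-second gap, but I consider the upgrade to have completed without stopping the service.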
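Checking a long log by eye is error-prone, so a variant of the loop that records failures explicitly may help. The sketch below is an assumption-laden alternative to the one-liner above, not what was actually run: probe writes OK or FAIL for each request (using curl's -f and --max-time options so that HTTP errors and hangs count as failures), and count_failures totals the FAIL lines. Both function names and the 5-second timeout are my own choices.

```shell
# Send one request to the URL in $1 and print "HH:MM:SS OK <body>" or
# "HH:MM:SS FAIL". probe is a hypothetical helper; the 5 s timeout is arbitrary.
probe() {
  url="$1"
  if body="$(curl -sf --max-time 5 "$url")"; then
    echo "$(date +%T) OK $body"
  else
    echo "$(date +%T) FAIL"
  fi
}

# Read a log written by probe on stdin and print how many requests failed.
count_failures() {
  awk '$2 == "FAIL" { n++ } END { print n + 0 }'
}
```

A possible use is to run while true; do probe http://10.20.30.150:8080; sleep 10; done >> probe.log during the upgrade and then run count_failures < probe.log afterwards; probe.log is a hypothetical file name.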
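(Reference 2) A problem
The drain of k8s-worker02 failed. The cause was that the metrics server deployed on k8s-worker02 uses local storage, which blocked the drain.
This time I worked around it by deleting the metrics server first. The details follow.
Symptom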
[k8s-master]$ kubectl drain k8s-worker02 --ignore-daemonsets
node/k8s-worker02 cordoned
error: unable to drain node "k8s-worker02", aborting command...
There are pending nodes to be drained:
k8s-worker02
error: cannot delete Pods with local storage (use --delete-local-data to override): kube-system/metrics-server-fbc46dc5f-nlhvs
The Worker node was removed from scheduling, but the drain failed because the metrics server (metrics-server-fbc46dc5f-nlhvs) uses local storage.
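The error message itself names an override, --delete-local-data, which was not tried in this article. The sketch below extracts the blocking Pods from drain output so that you can decide whether losing their emptyDir data is acceptable before using that flag. blocking_local_storage_pods is a hypothetical helper, and the assumption that several Pods would be listed comma-separated is mine.

```shell
# Read `kubectl drain` output on stdin and print the namespace/pod entries
# that blocked the drain because they use local storage, one per line.
# blocking_local_storage_pods is a hypothetical helper; a comma-separated
# list of Pods in the error line is an assumption.
blocking_local_storage_pods() {
  sed -n 's/^error: cannot delete Pods with local storage ([^)]*): //p' |
    tr ',' '\n' |
    sed 's/^ *//'
}

# If the data in those emptyDir volumes can be discarded, the flag named in
# the error message could be added (not verified on this cluster):
#   kubectl drain k8s-worker02 --ignore-daemonsets --delete-local-data
```

For the failure shown above, this prints kube-system/metrics-server-fbc46dc5f-nlhvs.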
Handling
Look at the details of the metrics server.
[k8s-master]$ kubectl -n kube-system describe pod metrics-server-fbc46dc5f-nlhvs
Name: metrics-server-fbc46dc5f-nlhvs
Namespace: kube-system
Priority: 0
Node: k8s-worker02/10.20.30.30
・・・
Containers:
metrics-server:
・・・
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-5s57t (ro)
・・・
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
・・・
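An EmptyDir volume is assigned to it.
Adding the --force option (a forced drain of k8s-worker02 that ignores DaemonSets) gave the same result, so, on the assumption that the metrics server itself does not affect the service, I decided to delete it first.
Deleting the metrics server
Before deleting it, save a manifest from which it can be recreated.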
[k8s-master]$ kubectl -n kube-system get pod metrics-server-fbc46dc5f-nlhvs -o yaml > metrics.yaml
[k8s-master]$ kubectl -n kube-system delete pod metrics-server-fbc46dc5f-nlhvs
pod "metrics-server-fbc46dc5f-nlhvs" deleted
Drain
Drain again.
[k8s-master]$ kubectl drain k8s-worker02 --ignore-daemonsets
node/k8s-worker02 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-tkcz5, kube-system/kube-proxy-6x2wd, metallb-system/speaker-vdh69
evicting pod default/nginx-dep-5589d85476-qkj8p
evicting pod default/nginx-dep-5589d85476-p4zgk
evicting pod kube-system/calico-kube-controllers-77c4b7448-xqqhv
evicting pod kube-system/coredns-66bff467f8-f4s5z
evicting pod metallb-system/controller-5c9894b5cd-4hnzc
pod/controller-5c9894b5cd-4hnzc evicted
error when evicting pod "nginx-dep-5589d85476-qkj8p" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod default/nginx-dep-5589d85476-qkj8p
error when evicting pod "nginx-dep-5589d85476-qkj8p" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/calico-kube-controllers-77c4b7448-xqqhv evicted
pod/nginx-dep-5589d85476-p4zgk evicted
pod/coredns-66bff467f8-f4s5z evicted
evicting pod default/nginx-dep-5589d85476-qkj8p
pod/nginx-dep-5589d85476-qkj8p evicted
node/k8s-worker02 evicted
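An error was reported because of the PodDisruptionBudget, but the eviction was retried and succeeded, so it is not a problem here.
Reference: [Kubernetes] Checking node maintenance issues
The drain succeeded, so from here the upgrade follows the steps already confirmed above.
Recreating the metrics server
After the upgrade completes, apply the manifest saved earlier to recreate the metrics server.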
[k8s-master]$ kubectl apply -f metrics.yaml
pod/metrics-server-fbc46dc5f-nlhvs created
[k8s-master]$ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master 226m 11% 1659Mi 60%
k8s-worker01 174m 8% 791Mi 28%
k8s-worker02 150m 7% 754Mi 27%
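Summary
Kubernetes supports a given minor version for only about nine months, so upgrading roughly every six months leaves a comfortable margin. This cluster has only three servers, so I upgraded them by hand one at a time, but on a larger cluster manual work becomes hard. Without an automation tool such as Ansible, keeping up with upgrades would be difficult.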