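Upgrading to OpenShift 4.9: Preparing to update
According to the release notes for Red Hat OpenShift Container Platform (OCP) 4.9, upgrading to OCP 4.9 requires specific preparation steps.
Upgrading from OpenShift Container Platform 4.8 to 4.9 requires administrator acknowledgment.
OpenShift Container Platform 4.8.14 introduced a requirement that an administrator must provide a manual acknowledgment before the cluster can be upgraded from OpenShift Container Platform 4.8 to 4.9.
This step relies on functionality implemented as of OCP 4.8.14.
This article walks through the update preparation on an OCP 4.8.39 environment.
Checking the upgrade path to OCP 4.9
Checking the upgrade path on the OCP 4.8 environment shows Upgradeable=False, which means the cluster cannot be upgraded to OCP 4.9 yet.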
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.8.39 True False 2d21h Cluster version is 4.8.39
$ oc get clusterversion -o jsonpath='{.items[].spec.channel}{"\n"}'
stable-4.9
$ oc adm upgrade
Cluster version is 4.8.39
Upgradeable=False
Reason: AdminAckRequired
Message: Kubernetes 1.22 and therefore OpenShift 4.9 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6329921 for details and instructions.
Updates:
VERSION IMAGE
4.9.31 quay.io/openshift-release-dev/ocp-release@sha256:2a28b8ebb53d67dd80594421c39e36d9896b1e65cb54af81fbb86ea9ac3bf2d7
Even if you run oc adm upgrade with a target version in this state, the upgrade does not start.
$ oc adm upgrade --to=4.9.31
Cluster version is 4.8.39
Upgradeable=False
Reason: AdminAckRequired
Message: Kubernetes 1.22 and therefore OpenShift 4.9 remove several APIs which require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6329921 for details and instructions.
Updates:
VERSION IMAGE
4.9.31 quay.io/openshift-release-dev/ocp-release@sha256:2a28b8ebb53d67dd80594421c39e36d9896b1e65cb54af81fbb86ea9ac3bf2d7
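When this pre-check is scripted, the gate can be read from the text of oc adm upgrade. The following is a minimal sketch and not part of the documented procedure. The function name upgrade_gate_reason is hypothetical, and it assumes the output layout shown above, where an Upgradeable=False line is followed by a Reason: line.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc adm upgrade` output on stdin and prints the
# Reason that follows an "Upgradeable=False" line. Exits non-zero when no
# such gate is found in the input.
upgrade_gate_reason() {
  awk '
    /Upgradeable=False/        { blocked = 1; next }
    blocked && $1 == "Reason:" { print $2; found = 1; exit }
    END                        { if (!found) exit 1 }
  '
}

# Sample input copied from the output shown in this article.
sample='Cluster version is 4.8.39
Upgradeable=False
Reason: AdminAckRequired
Updates:
VERSION IMAGE'

if reason=$(printf '%s\n' "$sample" | upgrade_gate_reason); then
  echo "upgrade blocked: $reason"
else
  echo "no upgrade gate reported"
fi
```

Against a live cluster the same function would be fed with `oc adm upgrade | upgrade_gate_reason`. Parsing human-readable output is fragile, so treat this as a convenience check rather than a replacement for reading the message.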
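Preparing to update to OpenShift Container Platform 4.9
Prepare for the update as described in the release notes mentioned earlier. The procedure is documented in at least the two places below; this article follows the official documentation.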
- Official documentation
  - Preparing to update to OpenShift Container Platform 4.9
  - https://docs.openshift.com/container-platform/4.9/updating/updating-cluster-prepare.html#updating-cluster-prepare
- Knowledgebase
  - Preparing to upgrade to OpenShift Container Platform 4.9
  - https://access.redhat.com/articles/6329921
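Reviewing alerts to identify uses of removed APIs
Let's check the APIRemovedInNextReleaseInUse and APIRemovedInNextEUSReleaseInUse alerts mentioned in the documentation. First, list the alerts that are currently firing.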
$ ALERT_MANAGER=$(oc get route alertmanager-main -n openshift-monitoring -o jsonpath='{@.spec.host}')
$ SA=$(oc sa get-token prometheus-k8s -n openshift-monitoring)
$ curl -skH "Authorization: Bearer $SA" https://$ALERT_MANAGER/api/v1/alerts | jq -r '.data[].labels.alertname' | sort | uniq -c
1 AlertmanagerClusterFailedToSendAlerts
3 AlertmanagerFailedToSendAlerts
1 APIRemovedInNextEUSReleaseInUse
1 APIRemovedInNextReleaseInUse
1 ClusterNotUpgradeable
1 ElasticsearchClusterNotHealthy
1 KubeJobCompletion
2 KubeJobFailed
1 TargetDown
1 UpdateAvailable
1 Watchdog
Next, look at the details of APIRemovedInNextReleaseInUse and APIRemovedInNextEUSReleaseInUse.
$ A1=APIRemovedInNextReleaseInUse
$ A2=APIRemovedInNextEUSReleaseInUse
$ curl -skH "Authorization: Bearer $SA" https://$ALERT_MANAGER/api/v1/alerts | jq -r --arg A1 $A1 --arg A2 $A2 '.data[] | .labels.alertname as $L | select($L == $A1 or $L == $A2) | .labels ,.annotations'
{
"alertname": "APIRemovedInNextReleaseInUse",
"group": "networking.k8s.io",
"prometheus": "openshift-monitoring/k8s",
"resource": "ingresses",
"severity": "info",
"version": "v1beta1"
}
{
"message": "Deprecated API that will be removed in the next version is being used. Removing the workload that is using the networking.k8s.io.v1beta1/ingresses API might be necessary for a successful upgrade to the next cluster version. Refer to `oc get apirequestcounts ingresses.v1beta1.networking.k8s.io -o yaml` to identify the workload."
}
{
"alertname": "APIRemovedInNextEUSReleaseInUse",
"group": "networking.k8s.io",
"prometheus": "openshift-monitoring/k8s",
"resource": "ingresses",
"severity": "info",
"version": "v1beta1"
}
{
"message": "Deprecated API that will be removed in the next EUS version is being used. Removing the workload that is using the networking.k8s.io.v1beta1/ingresses API might be necessary for a successful upgrade to the next EUS cluster version. Refer to `oc get apirequestcounts ingresses.v1beta1.networking.k8s.io -o yaml` to identify the workload."
}
Using APIRequestCount to identify uses of removed APIs
Check the API request counts with the oc get apirequestcounts command described in the documentation.
$ oc get apirequestcounts -o jsonpath='{range .items[?(@.status.removedInRelease!="")]}{.status.removedInRelease}{"\t"}{.metadata.name}{"\n"}{end}'
1.22 certificatesigningrequests.v1beta1.certificates.k8s.io
1.22 customresourcedefinitions.v1beta1.apiextensions.k8s.io
1.21 flowschemas.v1alpha1.flowcontrol.apiserver.k8s.io
1.22 ingresses.v1beta1.extensions
1.22 ingresses.v1beta1.networking.k8s.io
1.22 validatingwebhookconfigurations.v1beta1.admissionregistration.k8s.io
$ oc get apirequestcounts | awk '$2 ~ /^1\.2[12]$/ || $2 == "REMOVEDINRELEASE"'
NAME REMOVEDINRELEASE REQUESTSINCURRENTHOUR REQUESTSINLAST24H
certificatesigningrequests.v1beta1.certificates.k8s.io 1.22 0 0
customresourcedefinitions.v1beta1.apiextensions.k8s.io 1.22 0 0
flowschemas.v1alpha1.flowcontrol.apiserver.k8s.io 1.21 0 0
ingresses.v1beta1.extensions 1.22 14 363
ingresses.v1beta1.networking.k8s.io 1.22 6 180
validatingwebhookconfigurations.v1beta1.admissionregistration.k8s.io 1.22 0 0
The output shows that ingresses.v1beta1.extensions and ingresses.v1beta1.networking.k8s.io are in use (their API request counts are not 0).
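The table can also be narrowed to just the removed APIs that still receive requests. The following is a minimal sketch with a hypothetical function name, removed_apis_in_use. It assumes the four-column layout shown above and that rows for APIs without a REMOVEDINRELEASE value have only three whitespace-separated fields.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get apirequestcounts` table output on stdin
# and prints "name requests-in-last-24h" for removed APIs still in use.
# - NF == 4 keeps only rows that have a REMOVEDINRELEASE value (assumption:
#   rows with an empty value collapse to three fields).
# - $4 + 0 > 0 keeps rows with a non-zero 24h count; the header row is
#   skipped because its last field is not a positive number.
removed_apis_in_use() {
  awk 'NF == 4 && $4 + 0 > 0 { print $1, $4 }'
}

# Sample rows copied from the output shown in this article.
removed_apis_in_use <<'EOF'
NAME REMOVEDINRELEASE REQUESTSINCURRENTHOUR REQUESTSINLAST24H
certificatesigningrequests.v1beta1.certificates.k8s.io 1.22 0 0
customresourcedefinitions.v1beta1.apiextensions.k8s.io 1.22 0 0
flowschemas.v1alpha1.flowcontrol.apiserver.k8s.io 1.21 0 0
ingresses.v1beta1.extensions 1.22 14 363
ingresses.v1beta1.networking.k8s.io 1.22 6 180
validatingwebhookconfigurations.v1beta1.admissionregistration.k8s.io 1.22 0 0
EOF
```

For the sample rows this prints the two ingresses entries with counts 363 and 180. On a live cluster the input would come from `oc get apirequestcounts | removed_apis_in_use`.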
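Using APIRequestCount to identify which workloads are using the removed APIs
Using the method given in the documentation, identify who is making requests to the affected APIs.
Note: the options of the column command may differ depending on the Linux distribution.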
$ oc get apirequestcounts ingresses.v1beta1.networking.k8s.io -o jsonpath='{range .status.currentHour..byUser[*]}{..byVerb[*].verb}{","}{.username}{","}{.userAgent}{"\n"}{end}' | sort -k 2 -t, -u | column -t -s,
watch system:serviceaccount:cert-manager:cert-manager controller/v0.0.0
$ oc get apirequestcounts ingresses.v1beta1.extensions -o jsonpath='{range .status.currentHour..byUser[*]}{..byVerb[*].verb}{","}{.username}{","}{.userAgent}{"\n"}{end}' | sort -k 2 -t, -u | column -t -s,
watch system:kube-controller-manager cluster-policy-controller/v0.0.0
watch system:kube-controller-manager kube-controller-manager/v1.21.8+ed4d8fd
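This shows the verb, username, and user agent of each client using ingresses.v1beta1.networking.k8s.io and ingresses.v1beta1.extensions. These clients need to be dealt with before upgrading to OCP 4.9. According to the official documentation, system:kube-controller-manager can be ignored, so only system:serviceaccount:cert-manager:cert-manager needs attention here.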
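If several APIs are checked in a row, the ignorable entries can be filtered out before the results are reviewed. The following is a minimal sketch with a hypothetical function name, actionable_api_users. It expects the comma-separated verb,username,userAgent lines produced by the jsonpath expression above (before column formats them), and it drops only the one username this article treats as ignorable; adjust the condition to match the list in the official documentation.

```shell
#!/bin/sh
# Hypothetical helper: reads "verb,username,userAgent" lines on stdin and
# drops entries whose username is system:kube-controller-manager.
actionable_api_users() {
  awk -F, '$2 != "system:kube-controller-manager"'
}

# Sample lines rebuilt from the two outputs shown in this article.
actionable_api_users <<'EOF'
watch,system:serviceaccount:cert-manager:cert-manager,controller/v0.0.0
watch,system:kube-controller-manager,cluster-policy-controller/v0.0.0
watch,system:kube-controller-manager,kube-controller-manager/v1.21.8+ed4d8fd
EOF
```

For the sample lines only the cert-manager entry remains. In the pipelines above, the filter would sit between sort and column.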
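Providing the administrator acknowledgment
Once preparation for OCP 4.9 is complete, setting the administrator acknowledgment allows the upgrade to OCP 4.9 to start. The acknowledgment is stored in the following ConfigMap, which is empty by default.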
$ oc get cm admin-acks -n openshift-config
NAME DATA AGE
admin-acks 0 3d4h
$ oc -o json get cm admin-acks -n openshift-config | jq -r '.data'
null
Set the administrator acknowledgment with the following command.
$ oc -n openshift-config patch cm admin-acks --patch '{"data":{"ack-4.8-kube-1.22-api-removals-in-4.9":"true"}}' --type=merge
configmap/admin-acks patched
$ oc get cm admin-acks -n openshift-config
NAME DATA AGE
admin-acks 1 3d4h
$ oc -o json get cm admin-acks -n openshift-config | jq -r '.data'
{
"ack-4.8-kube-1.22-api-removals-in-4.9": "true"
}
After the acknowledgment is set, oc adm upgrade no longer reports Upgradeable=False.
$ oc adm upgrade
Cluster version is 4.8.39
Updates:
VERSION IMAGE
4.9.31 quay.io/openshift-release-dev/ocp-release@sha256:2a28b8ebb53d67dd80594421c39e36d9896b1e65cb54af81fbb86ea9ac3bf2d7