Horizontal Autoscaling in Kubernetes

雅, 悟

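This article is based on HPA v1 (v2beta1), which as of December 2021 is no longer the recommended approach. The up-to-date content is available at https://qiita.com/shmurata/items/e6bd8c56f3e4f9a8e384 , so please refer to that article instead.

In this article, I summarize the Horizontal Pod Autoscaler, one of the autoscaling features of Kubernetes.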

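Horizontal Pod Autoscaler

The Horizontal Pod Autoscaler automatically adjusts the number of Pods in a ReplicationController, Deployment, or ReplicaSet. Scaling decisions can be based on CPU usage or on custom metrics.
The Horizontal Pod Autoscaler is implemented as a Kubernetes API object plus a controller. The controller periodically compares the observed metric values with the target configured by the user and adjusts the replica count accordingly.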

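How the Horizontal Pod Autoscaler works

The controller behind the Horizontal Pod Autoscaler (HorizontalPodAutoscalerController) is one of the controllers in the ControllerManager. By default it runs its control loop (collect metric values, then adjust the replica count) every 30 seconds. You can change this interval with the ControllerManager flag --horizontal-pod-autoscaler-sync-period.

The controller handles two kinds of metrics: metrics produced per Pod (for example, a Pod's CPU usage) and metrics that describe a single object as a whole (for example, the request rate of the entire application). For per-Pod metrics, keep in mind that the metrics of some Pods may be unavailable, and this case has to be accounted for.

If no resource request is set, the HorizontalPodAutoscalerController takes no action. See the autoscaling algorithm for details.

The HorizontalPodAutoscalerController can collect metrics in two ways: by accessing Heapster directly, or through a REST client. Direct Heapster access goes through the API server's service proxy subresource (ServiceProxySubResource), and Heapster must be deployed and running in the kube-system namespace. Access through a REST client is covered later in the section on custom metrics.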

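API versions

HorizontalPodAutoscaler is an object in the autoscaling API group. As of May 9, 2018, only autoscaling based on CPU utilization is stable and available in autoscaling/v1. Autoscaling based on memory utilization and on custom metrics is available in autoscaling/v2beta1. No release date for autoscaling/v2 has been set yet.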

Usage

If Heapster is already deployed, all you need to do is create a HorizontalPodAutoscaler object like the following.

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  # Kubernetes object names must be lowercase
  name: web-frontend
  namespace: default
spec:
  # Target to autoscale
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend
  # Minimum number of replicas when scaling down
  minReplicas: 2
  # Maximum number of replicas when scaling up
  maxReplicas: 5
  # Target average CPU utilization across all Pods (a percentage value)
  targetCPUUtilizationPercentage: 70

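The important point here is that the HorizontalPodAutoscalerController collects metrics and adjusts the replica count so that the average across all Pods approaches the configured target CPU utilization (targetCPUUtilizationPercentage). In other words, with the configuration above it scales up when average CPU utilization exceeds 70% and scales down when it is below 70%. The number of Pods is calculated as follows.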

TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / targetCPUUtilizationPercentage)
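
To make the formula concrete, here is a minimal Python sketch. The function name and the sample utilization values are hypothetical, and clamping the result to minReplicas and maxReplicas is my reading of how those two fields bound the outcome; this is not the controller's actual code.

```python
import math

def target_num_of_pods(current_pods_cpu_utilization, target_cpu_utilization_percentage,
                       min_replicas, max_replicas):
    """Apply TargetNumOfPods = ceil(sum(utilization) / target), then clamp.

    current_pods_cpu_utilization: per-Pod CPU utilization in percent (hypothetical input).
    """
    raw = math.ceil(sum(current_pods_cpu_utilization) / target_cpu_utilization_percentage)
    # Assumption: the result is kept within [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, raw))

# Three Pods at 90%, 80% and 100% with a 70% target: ceil(270 / 70) = 4
print(target_num_of_pods([90, 80, 100], 70, min_replicas=2, max_replicas=5))

# Two idle Pods: ceil(50 / 70) = 1, which is raised to minReplicas = 2
print(target_num_of_pods([20, 30], 70, min_replicas=2, max_replicas=5))
```

With the manifest above (target 70, minReplicas 2, maxReplicas 5), three busy Pods lead to a fourth Pod being added, while a lightly loaded Deployment never drops below two Pods.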

Points to note

A ReplicationController that is being rolling-updated is outside the scope of the HorizontalPodAutoscaler.

I suspect hardly anyone combines HorizontalPodAutoscaler with ReplicationController these days, but if you do, be careful. Updating a ReplicationController with the kubectl rolling-update command creates a new object, and that new object is not bound as the target of the HorizontalPodAutoscaler. If you want to use HorizontalPodAutoscaler, use a Deployment.

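If thrashing occurs, tune the ControllerManager parameters.

Because metrics are evaluated dynamically, the replica count may be scaled up and down in rapid succession. This is called thrashing. To mitigate it, the ControllerManager provides two parameters, --horizontal-pod-autoscaler-downscale-delay and --horizontal-pod-autoscaler-upscale-delay. Each one makes the controller wait for a set period after a scale-down or scale-up before performing the same operation again. The defaults are 5 minutes for scaling down and 3 minutes for scaling up. These settings directly determine how long scaling is delayed, so adjust them with care.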
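
The effect of the two delays can be sketched as a simple cooldown check. This is a simplified illustration under my own assumptions, not the controller's implementation: the function name, its arguments, and the timestamps are hypothetical, and only the default values (5 and 3 minutes) come from the text above.

```python
from datetime import datetime, timedelta

# Defaults described above.
DOWNSCALE_DELAY = timedelta(minutes=5)
UPSCALE_DELAY = timedelta(minutes=3)

def may_scale(current_replicas, desired_replicas, last_upscale, last_downscale, now):
    """Return True if the requested change is allowed by the cooldowns.

    A scale-up must wait UPSCALE_DELAY after the previous scale-up, and a
    scale-down must wait DOWNSCALE_DELAY after the previous scale-down.
    """
    if desired_replicas > current_replicas:
        return now - last_upscale >= UPSCALE_DELAY
    if desired_replicas < current_replicas:
        return now - last_downscale >= DOWNSCALE_DELAY
    return False  # nothing to change

t0 = datetime(2018, 5, 9, 12, 0, 0)  # hypothetical time of the last scaling event
print(may_scale(2, 4, t0, t0, t0 + timedelta(minutes=2)))  # False: still inside the 3-minute window
print(may_scale(2, 4, t0, t0, t0 + timedelta(minutes=3)))  # True: upscale delay has elapsed
print(may_scale(4, 2, t0, t0, t0 + timedelta(minutes=4)))  # False: downscale needs 5 minutes
```

Longer delays suppress thrashing at the cost of slower reaction to real load changes, which is why the values need to be chosen with the workload in mind.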

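Features supported from v2

Multiple metrics

Autoscaling can now be driven by multiple metrics. When several metrics are configured, each one is evaluated separately and the largest resulting replica count is chosen. Multiple metrics are configured as follows.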

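kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  # Kubernetes object names must be lowercase
  name: web-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend
  minReplicas: 2
  maxReplicas: 10
  metrics:
  # Target an average CPU utilization of 80% across all Pods
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
  # Target 1k hits per second on the Service object
  - type: Object
    object:
      target:
        kind: Service
        name: frontend
      metricName: hits-per-second
      targetValue: 1k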

See the API definition for details on each value.
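
The "evaluate each metric, then take the largest replica count" rule can be sketched as follows. The per-metric ratio formula used here, ceil(currentReplicas * currentValue / targetValue), is an assumption taken from the general Kubernetes HPA documentation rather than from this article, and the function names and sample values are hypothetical.

```python
import math

def desired_for_metric(current_replicas, current_value, target_value):
    """Replica count proposed by one metric (assumed ratio-based formula)."""
    return math.ceil(current_replicas * current_value / target_value)

def desired_replicas(current_replicas, metrics, min_replicas, max_replicas):
    """Evaluate every (current, target) pair and keep the largest proposal."""
    proposals = [desired_for_metric(current_replicas, current, target)
                 for current, target in metrics]
    # The final value stays within [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, max(proposals)))

# Hypothetical readings with 4 replicas running:
#   CPU: 60% average against an 80% target      -> ceil(4 * 60 / 80)     = 3
#   hits-per-second: 1500 against a 1000 target -> ceil(4 * 1500 / 1000) = 6
print(desired_replicas(4, [(60, 80), (1500, 1000)], min_replicas=2, max_replicas=10))
```

Taking the maximum means the most demanding metric wins, so the Deployment is never sized below what any single metric requires.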

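Custom metrics

Autoscaling can now be driven by custom metrics.
To use this feature, the following configuration is required.

Enable the feature for accessing arbitrary services through the API server
See the linked reference for how to configure it

Configure the ControllerManager

Set --horizontal-pod-autoscaler-use-rest-clients to true

If you use minikube, all of the above is already configured and no special setup is needed. A sample that can serve as a basis for building your own custom metrics server is available at https://github.com/kubernetes-incubator/custom-metrics-apiserver , but this time we will try the configuration with a ready-made sample.

The content is slightly dated, but I found a guide to configuring custom metrics at https://docs.bitnami.com/kubernetes/how-to/configure-autoscaling-custom-metrics/ , so I will follow it and try autoscaling on custom metrics with a sample application.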

# Fetch the repository that contains the sample manifests
$ git clone git@github.com:luxas/kubeadm-workshop.git
$ cd kubeadm-workshop

# Deploy Prometheus
# This example autoscales on Prometheus metrics, so Prometheus is set up first
$ kubectl apply -f demos/monitoring/prometheus-operator.yaml
$ kubectl apply -f demos/monitoring/sample-prometheus-instance.yaml

# Deploy the custom metrics server
$ kubectl apply -f demos/monitoring/custom-metrics.yaml

# Verify
$ kubectl api-versions
# custom.metrics.k8s.io/v1beta1 is registered
$ kubectl get --raw apis/custom.metrics.k8s.io/v1beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"custom.metrics.k8s.io/v1beta1","resources":[{"name":"namespaces/scrape_duration_seconds","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"services/scrape_duration_seconds","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/scrape_samples_post_metric_relabeling","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/up","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/http_requests","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/http_requests","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/scrape_duration_seconds","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"services/scrape_samples_post_metric_relabeling","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"services/scrape_samples_scraped","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/scrape_samples_scraped","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"pods/scrape_samples_scraped","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"pods/http_requests","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"services/http_requests","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"pods/scrape_duration_seconds","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"namespaces/scrape_samples_post_metric_relabeling","singularName":"","namespaced":false,"kind":"MetricValueList","verbs":["get"]},{"name":"pods/scrape_samples_post_metric_relabeling","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/scrape_samples_scraped","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"jobs.batch/up","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"pods/up","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]},{"name":"services/up","singularName":"","namespaced":true,"kind":"MetricValueList","verbs":["get"]}]}

# Deploy the sample application
$ kubectl apply -f demos/monitoring/sample-metrics-app.yaml

# Check the hpa
$ kubectl get hpa
NAME                     REFERENCE                       TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
sample-metrics-app-hpa   Deployment/sample-metrics-app   866m/100   2         10        2          2h

The HPA is configured as follows.

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: sample-metrics-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-metrics-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: sample-metrics-app
      metricName: http_requests
      targetValue: 100

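This configures autoscaling so that the QPS of the sample-metrics-app Service stays at 100 QPS per Pod.
When I actually applied load, autoscaling took place as shown below.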

NAME                     REFERENCE                       TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
sample-metrics-app-hpa   Deployment/sample-metrics-app   603/100    2         10        6          2h
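
When reading the TARGETS column, note that values use Kubernetes quantity notation, where the suffix m means "milli": 866m is 0.866, not 866. The sketch below parses the two cells shown in this article. The helper names are hypothetical, and only the m and k suffixes that appear in this article are handled; real Kubernetes quantities support more suffixes.

```python
from decimal import Decimal

# Assumption: only the suffixes seen in this article are supported here.
_SUFFIXES = {"m": Decimal("0.001"), "k": Decimal("1000")}

def parse_quantity(text):
    """Convert a quantity string such as '866m', '1k' or '603' to a Decimal."""
    if text and text[-1] in _SUFFIXES:
        return Decimal(text[:-1]) * _SUFFIXES[text[-1]]
    return Decimal(text)

def parse_targets(cell):
    """Split a TARGETS cell like '866m/100' into (current, target)."""
    current, target = cell.split("/")
    return parse_quantity(current), parse_quantity(target)

for cell in ("866m/100", "603/100"):
    current, target = parse_targets(cell)
    print(cell, "-> current", current, "target", target, "above target:", current > target)
```

Read this way, the first output (866m/100) shows a metric far below the target, which matches the replica count staying at the minimum of 2, while the second (603/100) is well above the target, matching the scale-up to 6 replicas.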

Reference:

https://docs.bitnami.com/kubernetes/how-to/configure-autoscaling-custom-metrics/