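Trying Kubernetes HPA with custom metrics using RabbitMQ and Prometheus
What I want to do
When messages pile up in the queue, I want to increase the number of consumer Pods to speed up processing; when the queue is empty, I want to reduce the number of Pods to free up resources.
Things I considered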
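- HPA can scale on CPU or memory usage, but I wanted to try scaling directly on the number of messages waiting in the queue.
- Practice with HPA with custom metrics.
- VPA restarts Pods, so for an application that runs long jobs, the work done up to the restart is wasted.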
Components
- Kubernetes Horizontal Pod Autoscaler
- Kubernetes custom-metrics-api
- prometheus-adapter
- RabbitMQ monitoring
Approach
- Collect the RabbitMQ ready-message metric with Prometheus
- Expose the RabbitMQ metric through custom metrics with prometheus-adapter
- Configure the HPA
Code
Steps
Preparation
1. Deploy Prometheus
- Install prometheus-operator:
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/bundle.yaml
- Create the monitoring namespace:
kubectl create ns monitoring
- Deploy Prometheus into the monitoring namespace:
kubectl apply -k ../../../prometheus-operator -n monitoring
- Check the UI at http://localhost:30900
2. Deploy RabbitMQ
- Install the RabbitMQ cluster operator
kubectl apply -f https://github.com/rabbitmq/cluster-operator/releases/latest/download/cluster-operator.yml
- Deploy RabbitMQ
rabbitmq/rabbitmq-cluster.yaml
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmq
kubectl apply -f rabbitmq/rabbitmq-cluster.yaml
- Deploy a PodMonitor (so that Prometheus can scrape RabbitMQ)
rabbitmq/pod-monitor-rabbitmq.yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  namespace: monitoring
  name: rabbitmq
spec:
  podMetricsEndpoints:
    - interval: 15s
      port: prometheus
      path: /metrics
  selector:
    matchLabels:
      app.kubernetes.io/component: rabbitmq
  namespaceSelector:
    any: true
kubectl apply -f rabbitmq/pod-monitor-rabbitmq.yaml
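Once the PodMonitor is applied, one way to check that Prometheus is scraping RabbitMQ is to ask its HTTP API for the queue metric. The following is a minimal sketch, assuming Prometheus is reachable at the http://localhost:30900 address used for the UI; the helper names `query_url` and `instant_query` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the same NodePort address that serves the Prometheus UI.
PROMETHEUS_URL = "http://localhost:30900"


def query_url(expr, base_url=PROMETHEUS_URL):
    """Build the URL of a Prometheus instant query for the given expression."""
    return base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def instant_query(expr, base_url=PROMETHEUS_URL):
    """Run an instant query and return the list of result samples."""
    with urllib.request.urlopen(query_url(expr, base_url), timeout=5) as resp:
        body = json.load(resp)
    # A successful response wraps the samples in data.result.
    return body["data"]["result"]


if __name__ == "__main__":
    # Only prints the URL; call instant_query() when the cluster is running.
    print(query_url("rabbitmq_queue_messages_ready"))
```

If `instant_query("rabbitmq_queue_messages_ready")` returns an empty list, the PodMonitor labels or the scrape port are the first things to check.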
3. Deploy the RabbitMQ producer
A CronJob that sends 20 messages to RabbitMQ every 5 minutes.
rabbitmq-producer-cronjob.yaml
apiVersion: batch/v1  # assumed, this line is missing from the source; use batch/v1beta1 on Kubernetes older than 1.21
kind: CronJob
metadata:
  name: rabbitmq-producer
spec:
  schedule: '*/5 * * * *'
  successfulJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      name: rabbitmq-producer
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - image: nakamasato/rabbitmq-producer
              name: rabbitmq-producer
              env:
                - name: RABBITMQ_USERNAME
                  valueFrom:
                    secretKeyRef:
                      name: rabbitmq-default-user
                      key: username
                - name: RABBITMQ_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: rabbitmq-default-user
                      key: password
                - name: RABBITMQ_HOST
                  value: rabbitmq
                - name: NUM_OF_MESSAGES
                  value: "20"
kubectl apply -f rabbitmq-producer-cronjob.yaml
4. Deploy the RabbitMQ consumer
Consumes RabbitMQ messages one at a time, taking 10 seconds to process each message.
rabbitmq-consumer-deployment.yaml
apiVersion: apps/v1  # assumed, this line is missing from the source
kind: Deployment
metadata:
  labels:
    app: rabbitmq-consumer
  name: rabbitmq-consumer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq-consumer
  template:
    metadata:
      labels:
        app: rabbitmq-consumer
    spec:
      containers:
        - image: nakamasato/rabbitmq-consumer
          name: rabbitmq-consumer
          env:
            - name: RABBITMQ_USERNAME
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-default-user
                  key: username
            - name: RABBITMQ_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-default-user
                  key: password
            - name: RABBITMQ_HOST
              value: rabbitmq
            - name: PROCESS_SECONDS
              value: "10"
kubectl apply -f rabbitmq-consumer-deployment.yaml
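As a rough sanity check on the producer and consumer settings, the sketch below estimates how long a backlog takes to drain for a given number of consumer Pods. It assumes each Pod handles one message at a time at a fixed PROCESS_SECONDS and ignores Pod startup time and newly arriving messages; the function name `drain_seconds` is made up for illustration.

```python
import math


def drain_seconds(backlog, consumers, process_seconds=10.0):
    """Estimate the time to empty a queue when each consumer handles one message at a time."""
    if consumers < 1:
        raise ValueError("at least one consumer is required")
    # The busiest consumer handles ceil(backlog / consumers) messages in sequence.
    rounds = math.ceil(backlog / consumers)
    return rounds * process_seconds


if __name__ == "__main__":
    # One producer batch (20 messages) handled by a single consumer.
    print(drain_seconds(20, 1))
    # The same batch spread over 4 consumers.
    print(drain_seconds(20, 4))
```

Under these assumptions a single consumer needs about 200 seconds for one 20-message batch, which fits inside the 5-minute producer interval, so a noticeable backlog only builds up when messages arrive faster than that.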
5. Grafana
Grafana is used only to view the metrics on a dashboard, so it is optional.
grafana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      name: grafana
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          ports:
            - name: grafana
              containerPort: 3000
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-storage
      volumes:
        - name: grafana-storage
          emptyDir: {}
grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  selector:
    app: grafana
  type: NodePort
  ports:
    - port: 3000
      targetPort: 3000
      nodePort: 32111
- Deploy Grafana
kubectl apply -f grafana-deployment.yaml -f grafana-service.yaml
- Log in at http://localhost:32111 with "admin" as both the username and the password
- Import the RabbitMQ-Overview dashboard (10991)
Main part
1. Deploy the Prometheus adapter
- Clone the `prometheus-adapter` setup (k8s-prom-hpa)
git clone git@github.com:stefanprodan/k8s-prom-hpa.git && cd k8s-prom-hpa
- Prepare the certificates
touch metrics-ca.key metrics-ca.crt metrics-ca-config.json
make certs
- Deploy
kubectl create -f ./custom-metrics-api
2. Confirm that the RabbitMQ metric is available through the custom metrics API
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/rabbitmq_queue_messages_ready" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/rabbitmq_queue_messages_ready"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "rabbitmq-server-0",
        "apiVersion": "/v1"
      },
      "metricName": "rabbitmq_queue_messages_ready",
      "timestamp": "2021-03-27T12:01:15Z",
      "value": "1274"
    }
  ]
}
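To read this response in a script rather than by eye, the value can be pulled out of the MetricValueList with a few lines of Python. This is a sketch only: the function name `ready_messages` is made up, the sample is a trimmed copy of the response above, and the parsing assumes plain integer values such as "1274" (Kubernetes quantities with suffixes like "m" are not handled).

```python
import json

# Trimmed copy of the custom metrics response, used as sample input.
SAMPLE = """
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {"kind": "Pod", "namespace": "default", "name": "rabbitmq-server-0"},
      "metricName": "rabbitmq_queue_messages_ready",
      "timestamp": "2021-03-27T12:01:15Z",
      "value": "1274"
    }
  ]
}
"""


def ready_messages(metric_list_json):
    """Map each described Pod name to its metric value from a MetricValueList document."""
    doc = json.loads(metric_list_json)
    if doc.get("kind") != "MetricValueList":
        raise ValueError("unexpected kind: %r" % doc.get("kind"))
    # Assumption: values are plain integers; suffixed quantities would need extra parsing.
    return {item["describedObject"]["name"]: int(item["value"]) for item in doc["items"]}


if __name__ == "__main__":
    print(ready_messages(SAMPLE))
```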
3. Deploy the HorizontalPodAutoscaler
rabbitmq-consumer-hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: rabbitmq-consumer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rabbitmq-consumer
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Object
      object:
        metric:
          name: rabbitmq_queue_messages_ready
        describedObject:
          kind: Pod
          name: rabbitmq-server-0
          apiVersion: v1
        target:
          type: Value
          averageValue: 1
kubectl apply -f rabbitmq-consumer-hpa.yaml
4. Confirm in Grafana that the number of Pods changes with the number of messages in the RabbitMQ queue.
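The replica counts seen in Grafana can be approximated with the basic formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue). The sketch below is a simplification, not the controller's actual code: it assumes the queue metric is compared directly with a target of 1, applies the default 10% tolerance, clamps to the minReplicas and maxReplicas of the HPA above, and ignores stabilization windows and Pod readiness. The function name and defaults are illustrative.

```python
import math


def desired_replicas(current_replicas, metric_value, target_value=1.0,
                     min_replicas=1, max_replicas=20, tolerance=0.1):
    """Approximate the HPA replica calculation for a directly compared metric."""
    ratio = metric_value / target_value
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is always kept between minReplicas and maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # A backlog of 1274 ready messages with one consumer running.
    print(desired_replicas(1, 1274))
    # An empty queue with five consumers running.
    print(desired_replicas(5, 0))
```

Under these assumptions any sizable backlog pushes the Deployment straight to maxReplicas, and an empty queue brings it back to minReplicas.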
To do
HPAScaleToZero: an alpha feature added in 1.16 that lets HPA scale replicas down to 0 (https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates)