Persisting ICP monitoring data to PVs (via helm upgrade)
Store IBM Cloud Private monitoring data on persistent volumes (PVs).
I previously wrote up a way to persist ICP's monitoring data, but using helm upgrade is probably better than editing the Deployments directly, so this is a memo on that approach.
Upgrade the Helm release
Check the chart version of the current release.
helm ls --tls monitoring
Save the values of the current release to a file.
helm get values monitoring --tls > values.yaml
If you add the -a option to helm get values, you can also see the default values that were not explicitly set, so refer to them when editing values.yaml.
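For example, the computed values including defaults can be dumped to a separate file for reference (the file name here is arbitrary):

helm get values monitoring -a --tls > values-all.yaml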
alertmanager:
  image:
    repository: ibmcom/alertmanager
  service:
    type: ClusterIP
  persistentVolume:             # added
    enabled: true               # added
    size: 1Gi                   # added
    storageClass: alertmanager  # added
certGen:
  image:
    repository: ibmcom/icp-cert-gen
collectdExporter:
  image:
    repository: ibmcom/collectd-exporter
configmapReload:
  image:
    repository: ibmcom/configmap-reload
curl:
  image:
    repository: ibmcom/curl
elasticsearchExporter:
  image:
    repository: ibmcom/elasticsearch-exporter
grafana:
  image:
    repository: ibmcom/grafana
  service:
    type: ClusterIP
  persistentVolume:         # added
    enabled: true           # added
    size: 1Gi               # added
    storageClass: grafana   # added
kubeStateMetrics:
  image:
    repository: ibmcom/kube-state-metrics
mode: managed
nodeExporter:
  image:
    repository: ibmcom/node-exporter
prometheus:
  etcdTarget:
    enabled: true
    etcdAddress:
    - 172.30.1.224
    etcdPort: "4001"
  image:
    repository: ibmcom/prometheus
  service:
    type: ClusterIP
  persistentVolume:           # added
    enabled: true             # added
    size: 10Gi                # added
    storageClass: prometheus  # added
router:
  image:
    repository: ibmcom/icp-router
  subjectAlt: 18.179.83.130
Prepare a chart of the same version as the current release. If the environment cannot reach the internet, download the chart file beforehand (from the ibm-charts repository shown below) and place it locally.
If you can reach the internet, add the ibm-charts repository.
$ helm repo list
NAME URL
stable https://kubernetes-charts.storage.googleapis.com
local http://127.0.0.1:8879/charts
$ helm repo add ibm-charts https://raw.githubusercontent.com/IBM/charts/master/repo/stable/
"ibm-charts" has been added to your repositories
$ helm repo list
NAME URL
stable https://kubernetes-charts.storage.googleapis.com
local http://127.0.0.1:8879/charts
ibm-charts https://raw.githubusercontent.com/IBM/charts/master/repo/stable/
$
When running `helm upgrade`, specify the same version as the currently deployed chart.
helm upgrade monitoring ibm-charts/ibm-icpmonitoring --version 1.1.1 -f values.yaml --tls
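A quick way to confirm the upgrade went through is to list the release again; the revision number should have increased:

helm ls --tls monitoring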
Create the PVs
Check the PersistentVolumeClaims that are stuck in Pending status.
# kubectl get pvc -l release=monitoring -n kube-system
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
monitoring-grafana Pending grafana 4m
monitoring-prometheus Pending prometheus 4m
monitoring-prometheus-alertmanager Pending alertmanager 4m
#
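If a PVC stays Pending, describing it shows which storage class it is waiting on, for example:

kubectl describe pvc monitoring-prometheus -n kube-system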
Since hostPath is used for the PVs this time, create the directories. (The chown on the Grafana directory matches the UID:GID the Grafana container runs as, so that Grafana can write to the volume.)
mkdir -p /export/prometheus
mkdir -p /export/alertmanager
mkdir -p /export/grafana
chown 104:107 /export/grafana
Create a manifest file for each PVC. Note that the access modes may differ depending on the chart version.
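One way to see exactly which access modes the chart requested is to query the pending PVCs with jsonpath (a sketch; adjust the selector as needed):

kubectl get pvc -n kube-system -l release=monitoring -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.accessModes}{"\n"}{end}'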
# prometheus-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /export/prometheus
    type: ""
  storageClassName: prometheus
  persistentVolumeReclaimPolicy: Retain

# alertmanager-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /export/alertmanager
    type: ""
  storageClassName: alertmanager
  persistentVolumeReclaimPolicy: Retain

# grafana-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /export/grafana
    type: ""
  storageClassName: grafana
  persistentVolumeReclaimPolicy: Retain
Create the PVs.
kubectl apply -f prometheus-pv.yaml
kubectl apply -f alertmanager-pv.yaml
kubectl apply -f grafana-pv.yaml
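Optionally, check that the three PVs exist and, once bound, show their claims:

kubectl get pv prometheus-pv alertmanager-pv grafana-pv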
Confirm that the PVCs that were Pending are now Bound.
# kubectl get pvc -n kube-system -l release=monitoring
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
monitoring-grafana Bound grafana-pv 1Gi RWO grafana 2m
monitoring-prometheus Bound prometheus-pv 100Gi RWO prometheus 2m
monitoring-prometheus-alertmanager Bound alertmanager-pv 1Gi RWO alertmanager 2m
#
Confirm that the monitoring Pods are Running.
# kubectl get po -l release=monitoring -n kube-system
NAME READY STATUS RESTARTS AGE
monitoring-exporter-6b88bcd65b-d54lh 1/1 Running 10 14d
monitoring-grafana-688d67b6b-w4zg5 2/2 Running 0 18m
monitoring-prometheus-alertmanager-7468b8ccb8-pd5jd 3/3 Running 0 18m
monitoring-prometheus-b5d998dd8-r5h2l 3/3 Running 0 18m
monitoring-prometheus-elasticsearchexporter-bc66cf47d-zfzqg 1/1 Running 10 14d
monitoring-prometheus-kubestatemetrics-855bcd8dcb-5vwkz 1/1 Running 10 14d
monitoring-prometheus-nodeexporter-k2cj9 1/1 Running 10 14d
#
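As an extra sanity check, on the node that holds the hostPath directories the components should now be writing data under the directories created earlier:

ls /export/prometheus /export/alertmanager /export/grafana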
Register the datasource in Grafana
Because the Grafana Pod was recreated, its datasource configuration has been lost and needs to be registered again.
Create the following Job.
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: monitoring-grafana
    component: setds
  name: monitoring-grafana-ds
  namespace: kube-system
spec:
  activeDeadlineSeconds: 300
  template:
    metadata:
      labels:
        app: monitoring-grafana
        component: setds
    spec:
      tolerations:
      - key: "dedicated"
        operator: "Exists"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
                - ppc64le
              - key: management
                operator: In
                values:
                - "true"
      containers:
      - name: grafana-ds
        image: "ibmcom/curl:3.6"
        command: ["/opt/entry/entrypoint.sh"]
        volumeMounts:
        - mountPath: "/opt/ibm/monitoring/certs"
          name: monitoring-client-certs
        - mountPath: "/opt/ibm/monitoring/ca-certs"
          name: monitoring-ca-cert
        - mountPath: "/opt/entry"
          name: grafana-ds-entry
      volumes:
      - name: monitoring-client-certs
        secret:
          secretName: monitoring-monitoring-client-certs
      - name: monitoring-ca-cert
        secret:
          secretName: monitoring-monitoring-ca-cert
      - name: grafana-ds-entry
        configMap:
          name: monitoring-grafana-ds-entry-config
          defaultMode: 0744
      restartPolicy: OnFailure
Delete the already completed Job, then run the Job you just created.
kubectl delete jobs -n kube-system monitoring-grafana-ds
kubectl apply -f job.yaml -n kube-system
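To confirm the datasource registration ran through, check that the Job completed and look at the log of its pod (selected here by the Job's label):

kubectl get jobs -n kube-system monitoring-grafana-ds
kubectl logs -n kube-system -l component=setds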