Storing monitoring data on a PV in ICP
IBM Cloud Private does not persist monitoring data in a default installation. This is a memo on how to change the configuration so that the data is stored on a PV.
Component version: IBM Cloud Private v2.1.0.3
Parameters at install time
A storageClass can be specified in the config.yaml parameters at install time. However, there is nothing to specify when dynamic provisioning such as NFS is not available, and you may also want to change the setting after installation, so I looked into how to make the change after the fact. The defaults are:
monitoring:
  prometheus:
    scrapeInterval: 1m
    evaluationInterval: 1m
    retention: 24h
    persistentVolume:
      enabled: false
      storageClass: "-"
  alertmanager:
    persistentVolume:
      enabled: false
      storageClass: "-"
  grafana:
    user: "admin"
    password: "admin"
    persistentVolume:
      enabled: false
      storageClass: "-"
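For reference, if a dynamic provisioner is available at install time, persistence can be switched on directly in this block. A minimal sketch, assuming a hypothetical dynamically provisioned StorageClass named nfs-client:

monitoring:
  prometheus:
    persistentVolume:
      enabled: true
      storageClass: "nfs-client"   # hypothetical dynamic-provisioning class
  alertmanager:
    persistentVolume:
      enabled: true
      storageClass: "nfs-client"
  grafana:
    persistentVolume:
      enabled: true
      storageClass: "nfs-client"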
The following approaches can be used to change this after installation:

- Edit the Prometheus/Alertmanager/Grafana deployments directly
- Change the values of the ibm-icpmonitoring helm release and run helm upgrade

This memo walks through the former, but I think the latter is actually the better approach; it is covered here:
Storing monitoring data on a PV in ICP (helm upgrade edition)
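For orientation only, the helm route amounts to flipping the same persistentVolume values shown in the block above. A rough sketch, where the release name monitoring, the chart reference ibm-charts/ibm-icpmonitoring, and the --tls flag are all assumptions, not verified here:

# Sketch only: release name, chart reference, and --tls are assumptions;
# see the linked article for the worked-through version.
helm upgrade monitoring ibm-charts/ibm-icpmonitoring \
  --reuse-values \
  --set prometheus.persistentVolume.enabled=true \
  --set prometheus.persistentVolume.storageClass=prometheus \
  --set alertmanager.persistentVolume.enabled=true \
  --set alertmanager.persistentVolume.storageClass=alertmanager \
  --set grafana.persistentVolume.enabled=true \
  --set grafana.persistentVolume.storageClass=grafana \
  --tls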
Creating the PVs/PVCs
Create the PVs/PVCs that Prometheus/Alertmanager/Grafana will store their data on. hostPath is used here.
Create the directories. The Prometheus and Alertmanager processes run as root, but the Grafana process runs as UID:GID 104:107, so set the ownership to match.
mkdir -p /export/prometheus
mkdir -p /export/alertmanager
mkdir -p /export/grafana
chown 104:107 /export/grafana
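A quick sanity check of the ownership before moving on:

# Numeric listing; /export/grafana should show 104:107
ls -ldn /export/prometheus /export/alertmanager /export/grafana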
Create the manifest files.
# prometheus-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /export/prometheus
    type: ""
  storageClassName: prometheus
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: prometheus
# alertmanager-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: alertmanager-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /export/alertmanager
    type: ""
  storageClassName: alertmanager
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: alertmanager-pvc
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: alertmanager
# grafana-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: grafana-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /export/grafana
    type: ""
  storageClassName: grafana
  persistentVolumeReclaimPolicy: Retain
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: kube-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: grafana
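Optionally, validate the manifests client-side before creating anything:

# Client-side dry run; nothing is persisted to the cluster
kubectl apply --dry-run -f prometheus-pv.yaml -f alertmanager-pv.yaml -f grafana-pv.yaml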
Create the PVs/PVCs.
kubectl apply -f prometheus-pv.yaml
kubectl apply -f alertmanager-pv.yaml
kubectl apply -f grafana-pv.yaml
Check the created PVs/PVCs.
# kubectl get pvc -n kube-system
NAME                            STATUS    VOLUME                          CAPACITY   ACCESS MODES   STORAGECLASS               AGE
alertmanager-pvc                Bound     alertmanager-pv                 1Gi        RWO                                       24s
data-logging-elk-data-0         Bound     logging-datanode-172.30.1.224   20Gi       RWO            logging-storage-datanode   53m
grafana-pvc                     Bound     grafana-pv                      1Gi        RWO                                       56s
helm-repo-pvc                   Bound     helm-repo-pv                    5Gi        RWO            helm-repo-storage          54m
image-manager-image-manager-0   Bound     image-manager-172.30.1.224      20Gi       RWO            image-manager-storage      1h
mongodbdir-icp-mongodb-0        Bound     mongodb-172.30.1.224            20Gi       RWO            mongodb-storage            59m
prometheus-pvc                  Bound     prometheus-pv                   10Gi       RWO                                       1m
# kubectl get pv
NAME                            CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS    CLAIM                                       STORAGECLASS               REASON    AGE
alertmanager-pv                 1Gi        RWO            Retain           Bound     kube-system/alertmanager-pvc                                                     36s
grafana-pv                      1Gi        RWO            Retain           Bound     kube-system/grafana-pvc                                                          1m
helm-repo-pv                    5Gi        RWO            Delete           Bound     kube-system/helm-repo-pvc                   helm-repo-storage                    54m
image-manager-172.30.1.224      20Gi       RWO            Retain           Bound     kube-system/image-manager-image-manager-0   image-manager-storage                1h
logging-datanode-172.30.1.224   20Gi       RWO            Retain           Bound     kube-system/data-logging-elk-data-0         logging-storage-datanode             1h
mongodb-172.30.1.224            20Gi       RWO            Retain           Bound     kube-system/mongodbdir-icp-mongodb-0        mongodb-storage                      1h
prometheus-pv                   10Gi       RWO            Retain           Bound     kube-system/prometheus-pvc                                                       1m
#
Changing the Prometheus configuration
Change Prometheus's storage from emptyDir to the PVC created above.
kubectl edit deploy -n kube-system monitoring-prometheus
Change the following part:
- emptyDir: {}
  name: storage-volume
to the following:
- persistentVolumeClaim:
    claimName: prometheus-pvc
  name: storage-volume
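If you prefer not to edit interactively, the same change can be applied as a strategic merge patch; a sketch (volumes merge by name, and setting emptyDir to null removes it). The same pattern works for the Alertmanager and Grafana deployments below, with their respective claim and volume names:

kubectl patch deploy monitoring-prometheus -n kube-system --patch '
spec:
  template:
    spec:
      volumes:
      - name: storage-volume
        emptyDir: null            # drop the old emptyDir source
        persistentVolumeClaim:
          claimName: prometheus-pvc
'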
Changing the Alertmanager configuration
Change Alertmanager's storage from emptyDir to the PVC created above.
kubectl edit deploy -n kube-system monitoring-prometheus-alertmanager
Change the following part:
- emptyDir: {}
  name: storage-volume
to the following:
- persistentVolumeClaim:
    claimName: alertmanager-pvc
  name: storage-volume
Changing the Grafana configuration
Change Grafana's storage from emptyDir to the PVC created above. Note that the volume name is different here.
kubectl edit deploy -n kube-system monitoring-grafana
Change the following part:
- emptyDir: {}
  name: grafana-storage
to the following:
- persistentVolumeClaim:
    claimName: grafana-pvc
  name: grafana-storage
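After all three edits, a quick way to confirm that each deployment now references its PVC:

# Each deployment should print a persistentVolumeClaim/claimName pair
for d in monitoring-prometheus monitoring-prometheus-alertmanager monitoring-grafana; do
  kubectl get deploy -n kube-system $d -o yaml | grep -A1 persistentVolumeClaim
done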
Registering the Grafana data source again
Because the Pod gets recreated, Grafana's data source configuration is lost and has to be registered again.
Create the following Job.
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: monitoring-grafana
    component: setds
  name: monitoring-grafana-ds
  namespace: kube-system
spec:
  activeDeadlineSeconds: 300
  template:
    metadata:
      labels:
        app: monitoring-grafana
        component: setds
    spec:
      tolerations:
      - key: "dedicated"
        operator: "Exists"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
                - ppc64le
              - key: management
                operator: In
                values:
                - "true"
      containers:
      - name: grafana-ds
        image: "ibmcom/curl:3.6"
        command: ["/opt/entry/entrypoint.sh"]
        volumeMounts:
        - mountPath: "/opt/ibm/monitoring/certs"
          name: monitoring-client-certs
        - mountPath: "/opt/ibm/monitoring/ca-certs"
          name: monitoring-ca-cert
        - mountPath: "/opt/entry"
          name: grafana-ds-entry
      volumes:
      - name: monitoring-client-certs
        secret:
          secretName: monitoring-monitoring-client-certs
      - name: monitoring-ca-cert
        secret:
          secretName: monitoring-monitoring-ca-cert
      - name: grafana-ds-entry
        configMap:
          name: monitoring-grafana-ds-entry-config
          defaultMode: 0744
      restartPolicy: OnFailure
Delete the completed Job, then run the one just created.
kubectl delete jobs -n kube-system monitoring-grafana-ds
kubectl apply -f job.yaml -n kube-system
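The Job should complete within its 300-second activeDeadlineSeconds; one way to check on it:

# Wait for the job to complete, then look at the registration output
kubectl get jobs -n kube-system monitoring-grafana-ds
kubectl logs -n kube-system -l component=setds --tail=20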
Reference links
- Configuring the monitoring service
- IBM® Cloud Private monitoring service