Trying Out Monitoring (Prometheus, Grafana)

3 years ago

清, 宇

2 minutes

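This article is day 18 of the PostgreSQL on Kubernetes Advent Calendar 2018.
Yesterday, under the title "What happens if you use a Deployment on k8s?", we tried running PostgreSQL with a Deployment instead of a StatefulSet.

Today we set up Prometheus and Grafana in order to monitor PostgreSQL on Kubernetes.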

TL;DR

Prometheus and Grafana are up and running, but monitoring the Kubelet produced an error.

Installing Prometheus and Grafana

Prerequisites

We added a node for Prometheus and Grafana to the Kubernetes environment that runs PostgreSQL. The YAML and other files used in this article assume the following.

A node labeled type=node.mon.pro, with a 60 GB HDD

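Reference blog

I followed this article. There are many examples of setting up Prometheus, but this time I chose to try it with Helm.

Preparing Helm

This environment was built with Rancher 2.0, so deploying with Helm from the GUI is also possible, but we chose to start by preparing the command-line environment.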

$ curl https://raw.githubusercontent.com/helm/helm/master/scripts/get > get_helm.sh
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7236  100  7236    0     0  33044      0 --:--:-- --:--:-- --:--:-- 33192

$ chmod +x get_helm.sh
$ ./get_helm.sh
Downloading https://kubernetes-helm.storage.googleapis.com/helm-v2.12.0-linux-amd64.tar.gz
Preparing to install helm and tiller into /usr/local/bin
helm installed into /usr/local/bin/helm
tiller installed into /usr/local/bin/tiller

$ helm init

First create a ServiceAccount, then run the Helm upgrade. The required files are fetched from git. Note: the order of these steps could probably be tidied up.

$ git clone https://github.com/tzkoba/postgres-on-k8s.git
$ cd ./postgres-on-k8s/monitoring/rbac/

$ kubectl apply -f tiller-rbac.yaml
serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created

$ helm init  --upgrade --service-account tiller
$HELM_HOME has been configured at /home/ec2-user/.helm.

Tiller (the Helm server-side component) has been upgraded to the current version.
Happy Helming!
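
The contents of tiller-rbac.yaml are not shown in this article. Judging from the two resources that kubectl apply reports, it probably resembles the manifest below. This is only a sketch based on the commonly used Helm v2 setup; the kube-system namespace, the cluster-admin role, and the file name tiller-rbac-sketch.yaml are assumptions, not taken from the repository.

```shell
# Write a sketch of a Tiller RBAC manifest (assumed contents, not the repository's actual file).
cat > tiller-rbac-sketch.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system        # assumption: Tiller runs in kube-system (the Helm v2 default)
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin           # assumption: very broad rights, acceptable only on a test cluster
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
EOF

# To apply it to a cluster:
#   kubectl apply -f tiller-rbac-sketch.yaml
```

Binding Tiller to cluster-admin is the simplest choice for a throwaway environment; a narrower role would be preferable anywhere else.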

Next, add the required chart repository.

$ helm repo update
$ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/

Installing the Prometheus Operator

Using the Helm setup prepared above, install the Prometheus Operator. Although it is not shown below, create a namespace named "monitoring" beforehand.
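
The namespace step is not shown in the article, so here is a minimal sketch of one way to do it. The file name monitoring-namespace.yaml is hypothetical; only the namespace name "monitoring" comes from the text.

```shell
# Write a manifest for the "monitoring" namespace used by the helm install commands.
cat > monitoring-namespace.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
EOF

# To create the namespace on a cluster, either of these works:
#   kubectl apply -f monitoring-namespace.yaml
#   kubectl create namespace monitoring
```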

This time I create a prometheus-operator-value.yaml file to specify where the operator is deployed, and pass it to helm install with the -f option.
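
The values file itself is in the cloned repository and is not reproduced here. As a rough sketch, pinning the operator to the monitoring node from the prerequisites could look like the following. It assumes the chart exposes a top-level nodeSelector value, and the file name prometheus-operator-value-sketch.yaml is hypothetical.

```shell
# The target node must carry the label first, for example:
#   kubectl label nodes <node-name> type=node.mon.pro

# Write a values file that schedules the operator onto the labeled node.
cat > prometheus-operator-value-sketch.yaml <<'EOF'
# Assumption: the chart accepts a top-level nodeSelector map.
nodeSelector:
  type: node.mon.pro
EOF

# It would then be passed to Helm with -f:
#   helm install coreos/prometheus-operator --name pg-op --namespace monitoring \
#     -f prometheus-operator-value-sketch.yaml
```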

$ cd ../prometheus/
$ helm install coreos/prometheus-operator --name pg-op --namespace monitoring -f prometheus-operator-value.yaml

# the operator is running
$ kubectl get pods --namespace monitoring
NAME                                        READY     STATUS    RESTARTS   AGE
pg-op-prometheus-operator-d8dfc7974-298q9   1/1       Running   0          1m

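Installing Prometheus and Grafana

First, prepare the persistent volumes for Prometheus and Grafana. The reference blog used Azure disks, but here we use Rook block devices. Note: I also tried the local volumes mentioned in this article, but that did not work.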

$ kubectl apply -f sc-rook-prometheus.yaml
cephblockpool.ceph.rook.io/replicapool2 created
storageclass.storage.k8s.io/rook-ceph-prometheus created
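
For readers without the repository at hand, the two resources created above might be defined roughly as follows. This is a sketch assuming a Rook v0.9-style Ceph cluster in the rook-ceph namespace; only the names replicapool2 and rook-ceph-prometheus come from the output above, while the replica count, the filesystem type, and the file name are assumptions.

```shell
# Write a sketch of a Rook block pool plus a StorageClass that provisions from it.
cat > sc-rook-prometheus-sketch.yaml <<'EOF'
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool2
  namespace: rook-ceph            # assumption: namespace of the Rook Ceph cluster
spec:
  replicated:
    size: 3                       # assumption: number of data replicas
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-prometheus
provisioner: ceph.rook.io/block   # Rook's block provisioner in pre-CSI releases
parameters:
  blockPool: replicapool2
  clusterNamespace: rook-ceph     # assumption: must match the cluster namespace
  fstype: xfs                     # assumption
EOF

# To apply it to a cluster:
#   kubectl apply -f sc-rook-prometheus-sketch.yaml
```

Giving monitoring its own pool keeps Prometheus data separate from the pool that backs PostgreSQL, at the cost of one more pool to manage.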

Install Prometheus and Grafana using this StorageClass. See here for the details of kube-prometheus-value.yaml.

$ helm install coreos/kube-prometheus --name pg --namespace monitoring -f kube-prometheus-value.yaml

Add Services for external access and check the ports.

$ kubectl apply -f svc-prometheus.yaml
service/prometheus-svc created
$ kubectl apply -f svc-grafana.yaml
service/grafana-svc created

$ kubectl get svc -n monitoring
NAME                     TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
alertmanager-operated    ClusterIP   None            <none>        9093/TCP,6783/TCP   1h
grafana-svc              NodePort    10.43.114.173   <none>        3000:30760/TCP      11m
pg-alertmanager          ClusterIP   10.43.163.126   <none>        9093/TCP            13m
pg-exporter-kube-state   ClusterIP   10.43.246.137   <none>        80/TCP              13m
pg-exporter-node         ClusterIP   10.43.98.242    <none>        9100/TCP            13m
pg-grafana               ClusterIP   10.43.254.234   <none>        80/TCP              13m
pg-prometheus            ClusterIP   10.43.245.185   <none>        9090/TCP            13m
prometheus-operated      ClusterIP   None            <none>        9090/TCP            1h
prometheus-svc           NodePort    10.43.113.233   <none>        9090:32229/TCP      11m
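
The Service manifests are not listed in the article. A NodePort Service like prometheus-svc above could be sketched as follows. The pod selector label is an assumption and should be checked against the actual pods (kubectl get pods -n monitoring --show-labels); the file name is hypothetical.

```shell
# Write a sketch of a NodePort Service that exposes Prometheus outside the cluster.
cat > svc-prometheus-sketch.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: prometheus-svc
  namespace: monitoring
spec:
  type: NodePort
  selector:
    app: prometheus               # assumption: label carried by the Prometheus pods
  ports:
    - name: web
      port: 9090
      targetPort: 9090
      # nodePort is left out, so Kubernetes assigns one from its NodePort range
EOF

# To apply it to a cluster:
#   kubectl apply -f svc-prometheus-sketch.yaml
```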

Prometheus and Grafana can be reached at the node's global IP on the ports shown above.
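
As a small illustration, the URLs can be assembled from the NodePorts in the listing above. NODE_IP is a placeholder (203.0.113.10 is a documentation-only address) and has to be replaced with the node's real global IP.

```shell
# NODE_IP is a placeholder; override it with the node's global IP.
NODE_IP="${NODE_IP:-203.0.113.10}"

PROMETHEUS_URL="http://${NODE_IP}:32229"   # NodePort of prometheus-svc
GRAFANA_URL="http://${NODE_IP}:30760"      # NodePort of grafana-svc

echo "Prometheus targets: ${PROMETHEUS_URL}/targets"
echo "Grafana login:      ${GRAFANA_URL}/login"

# Against a live cluster, simple health probes would be:
#   curl -fsS "${PROMETHEUS_URL}/-/healthy"
#   curl -fsS "${GRAFANA_URL}/api/health"
```

The node's firewall or security group also has to allow these ports, otherwise the requests will time out.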

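Open issues

Among the targets of the Prometheus instance added above, the Kubelet target shows an error. Following this reference, we tried switching the port used by the kubelet exporter from https to http, but the following error occurred.

I would like to investigate this a little further.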

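Things I want to add

PostgreSQL and Rook can both be monitored with Prometheus, so we will try that in tomorrow's article. The plan is to add exporters to the foundation built today.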

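Summary

Today I set up Prometheus and Grafana, the standard monitoring tools, on Kubernetes. Because I used Helm, though, I ran into quite a bit of trouble. Thanks to that, I have started to understand a little about how to customize the value.yaml files that Helm uses.

Whether the exporters can be added smoothly from tomorrow on, heaven only knows, but I hope to keep going a little longer.

Thank you for reading.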