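Kubernetes Logging with Elasticsearch and Filebeat, Part 1: Deploying Elasticsearch and Kibana
Introduction
This article is the first in a series on production-grade Kubernetes logging, covering both the applications deployed in a cluster and the cluster itself. It uses Elasticsearch as the logging backend. The Elasticsearch setup described here is designed to be highly scalable and fault tolerant.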
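Deployment Architecture
- Elasticsearch data node pods are deployed as a StatefulSet with a headless service that provides stable network identities.
- Elasticsearch master node pods are deployed as a Deployment (ReplicaSet) with a headless service that helps with auto-discovery.
- Elasticsearch client node pods are deployed as a Deployment (ReplicaSet) with an internal service that gives read/write requests access to the data nodes.
- Kibana and ElasticHQ pods are deployed as Deployments (ReplicaSets) with services that are reachable from outside the Kubernetes cluster but remain internal to the subnetwork (they are not exposed publicly unless explicitly required).
- A Horizontal Pod Autoscaler (HPA) is deployed for the client nodes so that they scale automatically under high load.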
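Important things to keep in mind:
- Set the ES_JAVA_OPTS environment variable.
- Set the CLUSTER_NAME environment variable.
- To avoid the split-brain problem, set the NUMBER_OF_MASTERS environment variable in the master node deployment. With 3 master nodes, set it to 2.
- To preserve high availability when a worker node fails, set a suitable pod anti-affinity policy among pods of the same type.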
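The "3 masters, set it to 2" rule is the usual majority quorum, floor(n / 2) + 1. The following is a minimal Python sketch of that arithmetic, assuming NUMBER_OF_MASTERS is what the image passes to the Elasticsearch 6.x minimum master nodes setting; the function names are illustrative and not part of any library.

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Smallest majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible: int) -> int:
    """How many master nodes can be lost while a majority still exists."""
    return master_eligible - minimum_master_nodes(master_eligible)


# Print the quorum and the number of tolerated failures for small clusters.
for n in range(1, 6):
    print(n, minimum_master_nodes(n), tolerated_master_failures(n))
```

Under this formula, 4 masters tolerate no more failures than 3 do, which is why an odd master count is the usual choice.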
Let's start by deploying these services to a GKE cluster.
Deploying the Master Nodes and Headless Service
Deploy the following manifest to create the master nodes and the headless service.
apiVersion: v1
kind: Namespace
metadata:
  name: elasticsearch
---
# apps/v1beta1 was removed in Kubernetes 1.16; apps/v1 needs an explicit selector
apiVersion: apps/v1
kind: Deployment
metadata:
  name: es-master
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: master
spec:
  replicas: 3
  selector:
    matchLabels:
      component: elasticsearch
      role: master
  template:
    metadata:
      labels:
        component: elasticsearch
        role: master
    spec:
      affinity:
        # prefer spreading master pods across different worker nodes
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: role
                  operator: In
                  values:
                  - master
              topologyKey: kubernetes.io/hostname
      initContainers:
      # raise the mmap count limit that Elasticsearch requires
      - name: init-sysctl
        image: busybox:1.27.2
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: es-master
        image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: CLUSTER_NAME
          value: my-es
        - name: NUMBER_OF_MASTERS
          value: "2"
        - name: NODE_MASTER
          value: "true"
        - name: NODE_INGEST
          value: "false"
        - name: NODE_DATA
          value: "false"
        - name: HTTP_ENABLE
          value: "false"
        - name: ES_JAVA_OPTS
          value: -Xms256m -Xmx256m
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        resources:
          limits:
            cpu: 2
        ports:
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: storage
          mountPath: /data
      volumes:
      - emptyDir:
          medium: ""
        name: "storage"
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: master
spec:
  selector:
    component: elasticsearch
    role: master
  ports:
  - name: transport
    port: 9300
    protocol: TCP
  clusterIP: None
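If you follow the logs of any master pod, you can watch the master election, in which the master pods choose a leader for the group. Following the master logs also shows when new data and client nodes are added.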
root$ kubectl -n elasticsearch logs -f po/es-master-594b58b86c-9jkj2 | grep ClusterApplierService
[2018-10-21T07:41:54,958][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-9jkj2] detected_master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300}, added {{es-master-594b58b86c-lfpps}{wZQmXr5fSfWisCpOHBhaMg}{50jGPeKLSpO9RU_HhnVJCA}{10.9.124.81}{10.9.124.81:9300},{es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [3]])
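As the log shows, the es-master pod named es-master-594b58b86c-bj7g7 was elected leader, and the other two pods joined its cluster.
The headless service named elasticsearch-discovery is set by default as an environment variable in the Docker image and is used for discovery among the nodes. It can, of course, be overridden.
Deploying the Data Nodes
Use the following manifest to deploy the StatefulSet and headless service for the data nodes.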
apiVersion: v1
kind: Namespace
metadata:
  name: elasticsearch
---
# storage.k8s.io/v1 replaces the removed v1beta1 API
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  fsType: xfs
allowVolumeExpansion: true
---
# apps/v1beta1 was removed in Kubernetes 1.16; apps/v1 needs an explicit selector
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: data
spec:
  serviceName: elasticsearch-data
  replicas: 3
  selector:
    matchLabels:
      component: elasticsearch
      role: data
  template:
    metadata:
      labels:
        component: elasticsearch
        role: data
    spec:
      affinity:
        # prefer spreading data pods across different worker nodes
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: role
                  operator: In
                  values:
                  - data
              topologyKey: kubernetes.io/hostname
      initContainers:
      - name: init-sysctl
        image: busybox:1.27.2
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: es-data
        image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: CLUSTER_NAME
          value: my-es
        - name: NODE_MASTER
          value: "false"
        - name: NODE_INGEST
          value: "false"
        - name: HTTP_ENABLE
          value: "false"
        - name: ES_JAVA_OPTS
          value: -Xms256m -Xmx256m
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        resources:
          limits:
            cpu: 2
        ports:
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: storage
          mountPath: /data
  # one persistent volume claim per data pod, provisioned from the "fast" class
  volumeClaimTemplates:
  - metadata:
      name: storage
      annotations:
        volume.beta.kubernetes.io/storage-class: "fast"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast
      resources:
        requests:
          storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-data
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: data
spec:
  ports:
  - port: 9300
    name: transport
  clusterIP: None
  selector:
    component: elasticsearch
    role: data
For data nodes, the headless service gives each node a stable network identity and helps with data transfer between the nodes.
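To make "stable network identity" concrete: each StatefulSet pod behind a headless service gets a DNS name of the form pod.service.namespace.svc.cluster-domain. The Python sketch below lists the names the data nodes would receive. It assumes the default cluster domain cluster.local, and the helper name is illustrative only.

```python
def statefulset_pod_dns(statefulset, service, namespace, replicas,
                        cluster_domain="cluster.local"):
    """Return the stable DNS name of every pod in a StatefulSet.

    Pods are named <statefulset>-<ordinal>, and the headless service
    publishes one DNS record per pod under its own name.
    """
    return [
        f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Names taken from the data node manifest: StatefulSet es-data,
# headless service elasticsearch-data, namespace elasticsearch.
for name in statefulset_pod_dns("es-data", "elasticsearch-data", "elasticsearch", 3):
    print(name)
```

These names survive pod restarts and rescheduling, unlike the pod IPs.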
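It is important to format a persistent volume before it is attached to a pod. You can do this by specifying the filesystem type (fsType) when creating the storage class. You can also set a flag that allows the volume to be expanded on the fly. For more details, see the Kubernetes documentation on storage classes.
...
parameters:
  type: pd-ssd
  fsType: xfs
allowVolumeExpansion: true
...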
Deploying the Client Nodes
Use the following manifest to deploy the client nodes and create the external service.
# apps/v1beta1 was removed in Kubernetes 1.16; apps/v1 needs an explicit selector
apiVersion: apps/v1
kind: Deployment
metadata:
  name: es-client
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: client
spec:
  replicas: 2
  selector:
    matchLabels:
      component: elasticsearch
      role: client
  template:
    metadata:
      labels:
        component: elasticsearch
        role: client
    spec:
      affinity:
        # prefer spreading client pods across different worker nodes
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: role
                  operator: In
                  values:
                  - client
              topologyKey: kubernetes.io/hostname
      initContainers:
      - name: init-sysctl
        image: busybox:1.27.2
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: es-client
        image: quay.io/pires/docker-elasticsearch-kubernetes:6.2.4
        env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: CLUSTER_NAME
          value: my-es
        - name: NODE_MASTER
          value: "false"
        - name: NODE_DATA
          value: "false"
        - name: HTTP_ENABLE
          value: "true"
        - name: ES_JAVA_OPTS
          value: -Xms256m -Xmx256m
        - name: NETWORK_HOST
          value: _site_,_lo_
        - name: PROCESSORS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
        resources:
          limits:
            cpu: 1
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: transport
        volumeMounts:
        - name: storage
          mountPath: /data
      volumes:
      - emptyDir:
          medium: ""
        name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elasticsearch
  annotations:
    # keeps the GCP load balancer internal to the subnet (no public IP)
    cloud.google.com/load-balancer-type: "Internal"
  labels:
    component: elasticsearch
    role: client
spec:
  selector:
    component: elasticsearch
    role: client
  ports:
  - name: http
    port: 9200
  type: LoadBalancer
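The service deployed here is meant to make the ES cluster reachable from outside the Kubernetes cluster while still keeping it internal to the subnet. The annotation "cloud.google.com/load-balancer-type: Internal" ensures this.
Applications that read from or write to Elasticsearch and run inside the same Kubernetes cluster can reach the Elasticsearch service at http://elasticsearch.elasticsearch:9200.
Once all the components are deployed, verify the following:
1. Elasticsearch is reachable from inside the Kubernetes cluster, using an Ubuntu container (install curl in the container first if the image does not include it).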
root$ kubectl run my-shell --rm -i --tty --image ubuntu -- bash
root@my-shell-68974bb7f7-pj9x6:/# curl http://elasticsearch.elasticsearch:9200/_cluster/health?pretty
{
  "cluster_name" : "my-es",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 7,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
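2. Elasticsearch is reachable from outside the cluster through the GCP internal load balancer IP (10.9.120.8 in this case). Checking the health with curl http://10.9.120.8:9200/_cluster/health?pretty should give the same output as above.
3. The anti-affinity rules for the ES pods are in effect: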
root$ kubectl -n elasticsearch get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE   IP           NODE
es-client-69b84b46d8-kr7j4   1/1     Running   0          10m   10.8.14.52   gke-cluster1-pool1-d2ef2b34-t6h9
es-client-69b84b46d8-v5pj2   1/1     Running   0          10m   10.8.15.53   gke-cluster1-pool1-42b4fbc4-cncn
es-data-0                    1/1     Running   0          12m   10.8.16.58   gke-cluster1-pool1-4cfd808c-kpx1
es-data-1                    1/1     Running   0          12m   10.8.15.52   gke-cluster1-pool1-42b4fbc4-cncn
es-master-594b58b86c-9jkj2   1/1     Running   0          18m   10.8.15.51   gke-cluster1-pool1-42b4fbc4-cncn
es-master-594b58b86c-bj7g7   1/1     Running   0          18m   10.8.16.57   gke-cluster1-pool1-4cfd808c-kpx1
es-master-594b58b86c-lfpps   1/1     Running   0          18m   10.8.14.51   gke-cluster1-pool1-d2ef2b34-t6h9
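The checks above can also be scripted. The Python sketch below validates a cluster health response and flags pods of the same role that share a worker node. The function names and the expected node counts passed in are illustrative assumptions, and the sample values are copied from the outputs shown above.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_health(base_url, timeout=5.0):
    """GET <base_url>/_cluster/health and decode the JSON body."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def health_problems(health, expected_nodes, expected_data_nodes):
    """Return a list of problems found; an empty list means all checks pass."""
    problems = []
    if health.get("status") != "green":
        problems.append(f"status is {health.get('status')!r}, expected 'green'")
    if health.get("number_of_nodes") != expected_nodes:
        problems.append(f"{health.get('number_of_nodes')} nodes, expected {expected_nodes}")
    if health.get("number_of_data_nodes") != expected_data_nodes:
        problems.append(
            f"{health.get('number_of_data_nodes')} data nodes, expected {expected_data_nodes}"
        )
    return problems


def colocated_pods(pod_to_node, roles=("es-master", "es-data", "es-client")):
    """Return {(role, node): [pods]} for every role with 2+ pods on one node."""
    placement = defaultdict(list)
    for pod, node in pod_to_node.items():
        role = next((r for r in roles if pod.startswith(r + "-")), None)
        if role is not None:
            placement[(role, node)].append(pod)
    return {key: pods for key, pods in placement.items() if len(pods) > 1}


# Sample values copied from the health response and pod listing above.
sample_health = {"status": "green", "number_of_nodes": 7, "number_of_data_nodes": 2}
sample_pods = {
    "es-client-69b84b46d8-kr7j4": "gke-cluster1-pool1-d2ef2b34-t6h9",
    "es-client-69b84b46d8-v5pj2": "gke-cluster1-pool1-42b4fbc4-cncn",
    "es-data-0": "gke-cluster1-pool1-4cfd808c-kpx1",
    "es-data-1": "gke-cluster1-pool1-42b4fbc4-cncn",
    "es-master-594b58b86c-9jkj2": "gke-cluster1-pool1-42b4fbc4-cncn",
    "es-master-594b58b86c-bj7g7": "gke-cluster1-pool1-4cfd808c-kpx1",
    "es-master-594b58b86c-lfpps": "gke-cluster1-pool1-d2ef2b34-t6h9",
}
print(health_problems(sample_health, expected_nodes=7, expected_data_nodes=2))
print(colocated_pods(sample_pods))
```

Against a live cluster, the dictionary returned by fetch_health("http://elasticsearch.elasticsearch:9200") can be passed to health_problems in place of the sample. Because the anti-affinity rule in the manifests is only a preference, a non-empty result from colocated_pods is a warning rather than an error.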
Scaling Considerations
An autoscaler can be deployed for the client nodes based on a CPU threshold. A sample HPA for the client nodes looks like this:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: es-client
  namespace: elasticsearch
spec:
  maxReplicas: 5
  minReplicas: 2
  scaleTargetRef:
    # Deployments are served from apps/v1 (extensions/v1beta1 was removed in Kubernetes 1.16)
    apiVersion: apps/v1
    kind: Deployment
    name: es-client
  targetCPUUtilizationPercentage: 80
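As a rough illustration of how this HPA picks a replica count, the documented core rule is desired = ceil(current replicas × current utilization / target utilization), clamped to the min and max. The Python sketch below applies that rule with the values from the manifest. It is a simplification: the real controller also applies a tolerance band, readiness handling, and stabilization, all of which are omitted here, and the function name is illustrative.

```python
import math


def desired_replicas(current_replicas, current_utilization,
                     target_utilization=80, min_replicas=2, max_replicas=5):
    """Core HPA scaling rule, clamped to the configured replica bounds.

    Utilization values are average CPU percentages across the pods.
    """
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# Two client pods averaging 120% CPU against an 80% target.
print(desired_replicas(2, 120))
```

With two pods averaging 120% CPU the rule asks for three replicas. Low load never drops the count below minReplicas, and high load never pushes it above maxReplicas.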
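Whenever the autoscaler kicks in, you can watch the new client node pods join the cluster by following the logs of any master node pod.
For data node pods, simply increase the number of replicas using the Kubernetes dashboard or the GKE console. A newly created data node is added to the cluster automatically and starts replicating data from the other nodes.
Master node pods only store cluster state information, so they do not need autoscaling. If you change the number of master nodes while growing the cluster, keep the count odd and update the NUMBER_OF_MASTERS environment variable accordingly.
# Check the logs of the es-master leader pod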
root$ kubectl -n elasticsearch logs po/es-master-594b58b86c-bj7g7 | grep ClusterApplierService
[2018-10-21T07:41:53,731][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] new_master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300}, added {{es-master-594b58b86c-lfpps}{wZQmXr5fSfWisCpOHBhaMg}{50jGPeKLSpO9RU_HhnVJCA}{10.9.124.81}{10.9.124.81:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [1] source [zen-disco-elected-as-master ([1] nodes joined)[{es-master-594b58b86c-lfpps}{wZQmXr5fSfWisCpOHBhaMg}{50jGPeKLSpO9RU_HhnVJCA}{10.9.124.81}{10.9.124.81:9300}]]])
[2018-10-21T07:41:55,162][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-master-594b58b86c-9jkj2}{x9Prp1VbTq6_kALQVNwIWg}{7NHUSVpuS0mFDTXzAeKRcg}{10.9.125.81}{10.9.125.81:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [3] source [zen-disco-node-join[{es-master-594b58b86c-9jkj2}{x9Prp1VbTq6_kALQVNwIWg}{7NHUSVpuS0mFDTXzAeKRcg}{10.9.125.81}{10.9.125.81:9300}]]])
[2018-10-21T07:48:02,485][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-data-0}{SAOhUiLiRkazskZ_TC6EBQ}{qirmfVJBTjSBQtHZnz-QZw}{10.9.126.88}{10.9.126.88:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [4] source [zen-disco-node-join[{es-data-0}{SAOhUiLiRkazskZ_TC6EBQ}{qirmfVJBTjSBQtHZnz-QZw}{10.9.126.88}{10.9.126.88:9300}]]])
[2018-10-21T07:48:21,984][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-data-1}{fiv5Wh29TRWGPumm5ypJfA}{EXqKGSzIQquRyWRzxIOWhQ}{10.9.125.82}{10.9.125.82:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [5] source [zen-disco-node-join[{es-data-1}{fiv5Wh29TRWGPumm5ypJfA}{EXqKGSzIQquRyWRzxIOWhQ}{10.9.125.82}{10.9.125.82:9300}]]])
[2018-10-21T07:50:51,245][INFO ][o.e.c.s.ClusterApplierService] [es-master-594b58b86c-bj7g7] added {{es-client-69b84b46d8-v5pj2}{MMjA_tlTS7ux-UW44i0osg}{rOE4nB_jSmaIQVDZCjP8Rg}{10.9.125.83}{10.9.125.83:9300},}, reason: apply cluster state (from master [master {es-master-594b58b86c-bj7g7}{1aFT97hQQ7yiaBc2CYShBA}{Q3QzlaG3QGazOwtUl7N75Q}{10.9.126.87}{10.9.126.87:9300} committed version [6] source [zen-disco-node-join[{es-client-69b84b46d8-v5pj2}{MMjA_tlTS7ux-UW44i0osg}{rOE4nB_jSmaIQVDZCjP8Rg}{10.9.125.83}{10.9.125.83:9300}]]])
The logs of the leading master pod clearly show when each node was added to the cluster, which is very helpful when debugging issues.
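When debugging, it can help to reduce these log lines to a join timeline. The Python sketch below extracts the timestamp and the names of the added nodes from a ClusterApplierService line. The regular expressions are assumptions based only on the line format shown above, and the sample lines are abbreviated copies of those log entries, cut off after "reason: apply cluster state".

```python
import re

_TIMESTAMP = re.compile(r"^\[([^\]]+)\]")
# Everything between "added {" and the closing "}, reason:".
_ADDED = re.compile(r"added \{(.*?)\}, reason:")
# Each node starts with {name}, either first in the list or right after a comma.
_NODE_NAME = re.compile(r"(?:^|,)\{([^{}]+)\}")


def joined_nodes(line):
    """Return (timestamp, [node names]) for an 'added' line, else None."""
    timestamp = _TIMESTAMP.match(line)
    added = _ADDED.search(line)
    if timestamp is None or added is None:
        return None
    return timestamp.group(1), _NODE_NAME.findall(added.group(1))


# Abbreviated copies of the log lines shown above.
SAMPLE_LINES = [
    "[2018-10-21T07:48:02,485][INFO ][o.e.c.s.ClusterApplierService] "
    "[es-master-594b58b86c-bj7g7] added {{es-data-0}{SAOhUiLiRkazskZ_TC6EBQ}"
    "{qirmfVJBTjSBQtHZnz-QZw}{10.9.126.88}{10.9.126.88:9300},}, "
    "reason: apply cluster state",
    "[2018-10-21T07:50:51,245][INFO ][o.e.c.s.ClusterApplierService] "
    "[es-master-594b58b86c-bj7g7] added {{es-client-69b84b46d8-v5pj2}"
    "{MMjA_tlTS7ux-UW44i0osg}{rOE4nB_jSmaIQVDZCjP8Rg}{10.9.125.83}"
    "{10.9.125.83:9300},}, reason: apply cluster state",
]

for sample in SAMPLE_LINES:
    print(joined_nodes(sample))
```

Lines that contain no "added" section, such as unrelated INFO messages, return None and can simply be skipped.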
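Deploying Kibana and ES-HQ
Kibana is a simple tool for visualizing ES data, and ES-HQ helps with administering and monitoring the Elasticsearch cluster. Keep the following in mind for the Kibana and ES-HQ deployments:
- The name of the ES cluster must be provided as an environment variable to the Docker image.
- The services that expose the Kibana and ES-HQ deployments are internal to the organization only, i.e. no public IP is created. A GCP internal load balancer is required.
Kibana Deployment
Use the following manifest to create the Kibana deployment and service.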
apiVersion: apps/v1
kind: Deployment
metadata:
  # the logging namespace must already exist; the earlier manifests only create elasticsearch
  namespace: logging
  name: kibana
  labels:
    component: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      component: kibana
  template:
    metadata:
      labels:
        component: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana-oss:6.2.2
        env:
        - name: CLUSTER_NAME
          value: my-es
        # <service>.<namespace> form, because Kibana runs in a different namespace
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch.elasticsearch:9200
        resources:
          limits:
            cpu: 200m
          requests:
            cpu: 100m
        ports:
        - containerPort: 5601
          name: http
---
apiVersion: v1
kind: Service
metadata:
  namespace: logging
  name: kibana
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
  labels:
    component: kibana
spec:
  selector:
    component: kibana
  ports:
  - name: http
    port: 5601
  type: LoadBalancer
ES-HQ Deployment
Use the following manifest to create the ES-HQ deployment and service.
# apps/v1beta1 was removed in Kubernetes 1.16; apps/v1 needs an explicit selector
apiVersion: apps/v1
kind: Deployment
metadata:
  name: es-hq
  namespace: elasticsearch
  labels:
    component: elasticsearch
    role: hq
spec:
  replicas: 1
  selector:
    matchLabels:
      component: elasticsearch
      role: hq
  template:
    metadata:
      labels:
        component: elasticsearch
        role: hq
    spec:
      containers:
      - name: es-hq
        image: elastichq/elasticsearch-hq:release-v3.4.0
        env:
        - name: HQ_DEFAULT_URL
          value: http://elasticsearch:9200
        resources:
          limits:
            cpu: 0.5
        ports:
        - containerPort: 5000
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: hq
  namespace: elasticsearch
  annotations:
    # keeps the GCP load balancer internal to the subnet (no public IP)
    cloud.google.com/load-balancer-type: "Internal"
  labels:
    component: elasticsearch
    role: hq
spec:
  selector:
    component: elasticsearch
    role: hq
  ports:
  - name: http
    port: 5000
  type: LoadBalancer
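Both services can be accessed through the newly created internal load balancers.
Kibana is available at http://<Kibana load balancer IP>:5601/app/kibana#/home?_g=().
[Figure: Kibana dashboard]
The ES-HQ view of the my-es cluster is available at http://<ES-HQ load balancer IP>:5000/#!/clusters/my-es.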
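Conclusion
This completes the deployment of the ES backend for logging. The Elasticsearch cluster we deployed can also be used by other applications. The client nodes should scale automatically under high load, and data nodes can be added by increasing the replica count of the StatefulSet. A few environment variables will need tuning, but that is fairly straightforward. Part 2 of this series covers how to deploy a Filebeat DaemonSet to ship logs to the Elasticsearch backend and explains the Filebeat configuration in detail.
Book a demo to talk with us directly about the monitoring solution that fits your needs. Elasticsearch is mostly used in the log monitoring space, while MetricFire focuses on time-series monitoring. Stay tuned for Part 2 :)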