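Sending Logs from OpenShift with Fluent Bit

Introduction

I often see information about shipping logs with Fluentd in Kubernetes environments, but there seems to be little to refer to on using Fluent Bit, which is lighter than Fluentd, or on configuring it for OpenShift. I am therefore recording the results of what I actually tried. For details on Fluent Bit, see the official website.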

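Configuration

Overall architecture

[Figure: overall architecture]

I deployed Fluent Bit Pods as a DaemonSet on Red Hat OpenShift on IBM Cloud and verified forwarding the logs of the entire OpenShift cluster to a Fluent Bit instance installed on a virtual server.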

Detailed configuration

OpenShift (Red Hat OpenShift on IBM Cloud): version 4.5.24_1527
3 worker nodes (the environment described in an earlier article)
Fluent Bit: version 1.6
Virtual server: RHEL 7.9

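Building it

Virtual server side (receiver)

Install Fluent Bit following the steps on the official website and start the service.

The configuration is shown below. Only the input and output sections have been changed from the defaults.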

[SERVICE]
    flush        5
    daemon       Off
    log_level    info
    parsers_file parsers.conf
    plugins_file plugins.conf
    http_server  Off
    http_listen  0.0.0.0
    http_port    2020
    storage.metrics on
[INPUT]
    Name              forward
    Port              24225
    Buffer_Chunk_Size 32MB
    Buffer_Max_Size   64MB
[OUTPUT]
    name  file
    match *
    # Files named after each tag are written under this directory.
    path  /data/log/td-agent-bit

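※ The port number and buffer sizes were not chosen for any particular reason. Note, however, that buffer sizes that are too small cause errors.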
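
Before deploying the sender, it can help to confirm that the forward port on the virtual server is reachable. The following is a minimal sketch, assuming Python 3 is available on a machine that can reach the server. `port_open` is a hypothetical helper, not part of Fluent Bit, and it only checks that a TCP connection can be established; it does not speak the forward protocol.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Usage: port_open("X.X.X.X", 24225), where X.X.X.X is the virtual server
# address and 24225 is the port set in the [INPUT] section above.
```

A result of False most likely points to a firewall rule or to the service not listening, though this sketch cannot tell the two apart.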

OpenShift side (sender)

For this verification I referred to the following GitHub repositories:
fluent/fluent-bit-kubernetes-logging
fluent/fluentd-kubernetes-daemonset

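To deploy the Fluent Bit Pods, the following seven resources need to be created:
– Namespace
– ServiceAccount
– ClusterRole
– ClusterRoleBinding
– SecurityContextConstraints
– ConfigMap
– DaemonSet

The settings and YAML for each are shown below.
– Namespace
I created a namespace named fluentbittest with the oc new-project fluentbittest command.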

– ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: fluentbittest
– ClusterRole
# rbac.authorization.k8s.io/v1beta1 was removed in Kubernetes 1.22, so v1 is used here.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-read
rules:
- apiGroups: [""]
  resources:
  - namespaces
  - pods
  verbs: ["get", "list", "watch"]
– ClusterRoleBinding
# rbac.authorization.k8s.io/v1beta1 was removed in Kubernetes 1.22, so v1 is used here.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluent-bit-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluent-bit-read
subjects:
- kind: ServiceAccount
  name: fluent-bit
  namespace: fluentbittest
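
The ClusterRole grants read access to pods and namespaces because the kubernetes filter asks the API server for the metadata of each Pod whose log it tails. The following is a rough sketch of the kind of request involved, assuming the standard Kubernetes REST path for reading a single Pod. `pod_metadata_request` is a hypothetical helper for illustration and does not reproduce Fluent Bit's internal code.

```python
from urllib.parse import quote

# Same value as Kube_URL in the kubernetes filter settings of the ConfigMap.
KUBE_URL = "https://kubernetes.default.svc:443"


def pod_metadata_request(namespace: str, pod: str, token: str):
    """Build the URL and headers for reading one Pod from the API server."""
    url = f"{KUBE_URL}/api/v1/namespaces/{quote(namespace)}/pods/{quote(pod)}"
    # The token is the ServiceAccount token mounted into the Pod.
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers
```

A GET on such a URL corresponds to the "get" verb on the pods resource in the ClusterRole; without the binding, the API server would be expected to reject the request as forbidden.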
– SecurityContextConstraints

This is a privilege setting specific to OpenShift; it is also covered in the GitHub repositories mentioned above.

kind: SecurityContextConstraints
apiVersion: security.openshift.io/v1
metadata:
  name: fluentbittest
allowPrivilegedContainer: true
allowHostNetwork: true
allowHostDirVolumePlugin: true
priority:
allowedCapabilities: []
allowHostPorts: true
allowHostPID: true
allowHostIPC: true
readOnlyRootFilesystem: false
requiredDropCapabilities: []
defaultAddCapabilities: []
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - hostPath
  - persistentVolumeClaim
  - projected
  - secret
users:
  - system:serviceaccount:fluentbittest:builder
  - system:serviceaccount:fluentbittest:default
  - system:serviceaccount:fluentbittest:deployer
  - system:serviceaccount:fluentbittest:fluent-bit
– ConfigMap

Because the container runtime in OpenShift is CRI-O, specify cri as the Parser in [INPUT]. For the details of the cri definition, see the cri entry under [PARSER].

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: fluentbittest
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-forward.conf

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            cri
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Kube_Tag_Prefix     kube.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off

  output-forward.conf: |
    [OUTPUT]
        Name          forward
        Match         *
        Host          ${FLUENT_FOWARD_HOST}
        Port          ${FLUENT_FOWARD_PORT}
        Retry_Limit     False

  parsers.conf: |
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        # http://rubular.com/r/tjUt3Awgg4
        Name cri
        Format regex
        Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S
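
To see what the cri parser extracts, the same regular expression can be tried outside Fluent Bit. The following is a minimal sketch in Python, which writes named groups as `(?P<name>...)` rather than the `(?<name>...)` form used in parsers.conf. The sample line is hypothetical, written in the CRI log format, and `parse_cri` is a helper name made up for this illustration.

```python
import re

# Same pattern as the cri [PARSER], converted to Python's named-group syntax.
CRI_LINE = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) (?P<message>.*)$"
)


def parse_cri(line: str) -> dict:
    """Split one CRI-formatted log line into time, stream, logtag and message."""
    m = CRI_LINE.match(line)
    if m is None:
        raise ValueError("not a CRI-formatted line")
    return m.groupdict()


# Hypothetical sample line in the CRI log format.
sample = "2021-01-26T02:18:03.220251215+00:00 stderr F Watch channel closed by remote"
print(parse_cri(sample))
```

The stream, logtag and message keys are the ones that show up in the forwarded records; the time field is consumed as the record timestamp through Time_Key.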
– DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: fluentbittest
  labels:
    k8s-app: fluent-bit-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      k8s-app: fluent-bit-logging
  template:
    metadata:
      labels:
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.6
        imagePullPolicy: Always
        ports:
          - containerPort: 2020
        env:
        - name: FLUENT_FOWARD_HOST
          value: "X.X.X.X"
        - name: FLUENT_FOWARD_PORT
          value: "24225"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
        securityContext:
          privileged: true
      terminationGracePeriodSeconds: 10
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      - operator: "Exists"
        effect: "NoExecute"
      - operator: "Exists"
        effect: "NoSchedule"

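※ Docker-related settings still remain in this manifest.
※ Note that unless the following securityContext setting is included in the DaemonSet above, the Fluent Bit Pods cannot read the log files on the node and fail with an error (reference).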

        securityContext:
          privileged: true

Creating the resources in the order listed above with "oc create -f" starts the Pods.

$ oc get pod
NAME               READY   STATUS    RESTARTS   AGE
fluent-bit-7wj72   1/1     Running   0          53s
fluent-bit-97z7r   1/1     Running   0          53s
fluent-bit-w26z5   1/1     Running   0          53s
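
The creation order can be written down as a small script. The following is a sketch only: the file names are hypothetical (the article does not say how the manifests were saved), and the namespace is left out because it was created with oc new-project. The script prints the commands rather than running them.

```python
# Hypothetical file names, one manifest per resource, in creation order.
MANIFESTS = [
    "serviceaccount.yaml",
    "clusterrole.yaml",
    "clusterrolebinding.yaml",
    "scc.yaml",
    "configmap.yaml",
    "daemonset.yaml",
]


def creation_commands(manifests):
    """Return one 'oc create -f <file>' argument list per manifest, in order."""
    return [["oc", "create", "-f", path] for path in manifests]


for cmd in creation_commands(MANIFESTS):
    # To execute instead of print, pass cmd to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Putting the DaemonSet last means the ServiceAccount, the SecurityContextConstraints and the ConfigMap it refers to already exist when the Pods are scheduled.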

Verifying reception

Next, I check whether the virtual server has written the forwarded logs to files.

$ ls -l /data/log/td-agent-bit/
total 80592
-rw-r--r-- 1 root root     1477 Jan 25 20:18 kube.var.log.containers.calico-kube-controllers-6c4d9c955b-k8w6d_calico-system_calico-kube-controllers-355b03ba53e7dea6461944eb616be24ae6817d21acec8133e9ff6a845fc7b954.log
-rw-r--r-- 1 root root   164785 Jan 25 20:19 kube.var.log.containers.calico-node-lgrns_calico-system_calico-node-592a8f869fcf0460fd01c0114ea8c558e6e8167d92c3e0d59a609d0abb72be2e.log
-rw-r--r-- 1 root root   171322 Jan 25 20:19 kube.var.log.containers.calico-node-wjsbn_calico-system_calico-node-a8cfa55b9431bd668bcf744ed125d29747ae3c454d2c6c292556789822b79a5f.log
-rw-r--r-- 1 root root   175063 Jan 25 20:19 kube.var.log.containers.calico-node-xdxjb_calico-system_calico-node-eb24febe3c3d651e39f941fac47f70f45664d83a31371bc774c25f61253347cf.log
-rw-r--r-- 1 root root     1128 Jan 25 20:17 kube.var.log.containers.calico-typha-5c8d96f77d-sfltp_calico-system_calico-typha-d70a3dde704772db3015d654db936e4b9b1902f8312228836dcb99e17d64959e.log
:
(rest omitted)
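
Each file name above is the record's tag: the Kube_Tag_Prefix followed by the container log file name, which encodes the Pod, namespace, container and container ID. The following sketch shows how those parts can be recovered from a tag. The pattern is a simplified assumption for illustration, not the exact expression used inside the kubernetes filter, and `parse_tag` is a hypothetical helper.

```python
import re

KUBE_TAG_PREFIX = "kube.var.log.containers."  # Kube_Tag_Prefix in the filter

# Simplified pattern: <pod>_<namespace>_<container>-<64-char container id>.log
TAG_BODY = re.compile(
    r"^(?P<pod_name>[^_]+)_(?P<namespace_name>[^_]+)_"
    r"(?P<container_name>.+)-(?P<docker_id>[a-z0-9]{64})\.log$"
)


def parse_tag(tag: str) -> dict:
    """Extract Pod, namespace, container and container ID from a tag."""
    if not tag.startswith(KUBE_TAG_PREFIX):
        raise ValueError("unexpected tag prefix")
    m = TAG_BODY.match(tag[len(KUBE_TAG_PREFIX):])
    if m is None:
        raise ValueError("tag does not follow the container log naming")
    return m.groupdict()


# First file name from the listing above.
tag = (
    "kube.var.log.containers.calico-kube-controllers-6c4d9c955b-k8w6d_"
    "calico-system_calico-kube-controllers-"
    "355b03ba53e7dea6461944eb616be24ae6817d21acec8133e9ff6a845fc7b954.log"
)
print(parse_tag(tag))
```

This is why Kube_Tag_Prefix has to match the Tag and Path of the tail input: once the prefix is stripped, what remains must be the plain log file name.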

Check the contents of a log file (here, the first one in the listing).

[user@vsi ~]$ cat /data/log/td-agent-bit/kube.var.log.containers.calico-kube-controllers-6c4d9c955b-k8w6d_calico-system_calico-kube-controllers-355b03ba53e7dea6461944eb616be24ae6817d21acec8133e9ff6a845fc7b954.log
kube.var.log.containers.calico-kube-controllers-6c4d9c955b-k8w6d_calico-system_calico-kube-controllers-355b03ba53e7dea6461944eb616be24ae6817d21acec8133e9ff6a845fc7b954.log: [1611627483.220251215, {"stream":"stderr","logtag":"F","message":"2021-01-26 02:18:03.220 [INFO][1] watchercache.go 96: Watch channel closed by remote - recreate watcher ListRoot=\"/calico/resources/v3/projectcalico.org/nodes\"","kubernetes":{"pod_name":"calico-kube-controllers-6c4d9c955b-k8w6d","namespace_name":"calico-system","pod_id":"a25ffa7a-ecf7-4140-8e65-4e4f174e80f6","labels":{"k8s-app":"calico-kube-controllers","pod-template-hash":"6c4d9c955b"},"annotations":{"cni.projectcalico.org/podIP":"172.17.59.83/32","cni.projectcalico.org/podIPs":"172.17.59.83/32","k8s.v1.cni.cncf.io/network-status":"[{\n    \"name\": \"k8s-pod-network\",\n    \"ips\": [\n        \"172.17.59.83\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]","k8s.v1.cni.cncf.io/networks-status":"[{\n    \"name\": \"k8s-pod-network\",\n    \"ips\": [\n        \"172.17.59.83\"\n    ],\n    \"default\": true,\n    \"dns\": {}\n}]"},"host":"10.240.0.4","container_name":"calico-kube-controllers","docker_id":"355b03ba53e7dea6461944eb616be24ae6817d21acec8133e9ff6a845fc7b954","container_hash":"registry.ng.bluemix.net/armada-master/calico/kube-controllers@sha256:eb456f071b19614a6a4eb149cf1eb2dc924780accc2f9b426305790e03c2403b","container_image":"registry.ng.bluemix.net/armada-master/calico/kube-controllers:v3.16.5"}}]
[user@vsi ~]$
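
Each line written by the file output has the shape "tag: [timestamp, record]", where the record is a JSON object. The following is a small sketch for reading such a line back, assuming that shape holds and that the tag itself contains no ": " sequence. The sample line is a shortened, hypothetical one and `parse_file_output` is a helper name invented for this example.

```python
import json


def parse_file_output(line: str):
    """Split a 'tag: [timestamp, record]' line into its three parts."""
    tag, payload = line.split(": ", 1)
    timestamp, record = json.loads(payload)
    return tag, timestamp, record


# Shortened, hypothetical line in the same shape as the output above.
sample = (
    'kube.var.log.containers.example.log: [1611627483.220251215, '
    '{"stream":"stderr","logtag":"F","message":"hello",'
    '"kubernetes":{"pod_name":"example","namespace_name":"calico-system"}}]'
)
tag, ts, record = parse_file_output(sample)
print(tag, record["kubernetes"]["namespace_name"], record["stream"])
```

The stream, logtag and message keys come from the cri parser, while the kubernetes block is added by the kubernetes filter.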

This confirms that the logs on OpenShift are being forwarded.

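Bonus

Having heard that Fluent Bit is very lightweight, I set it up partly to compare it with Fluentd and see how light it really is.

– Comparison environment
[Figure: comparison environment]

I built Fluentd using fluent/fluentd-kubernetes-daemonset and had it forward the logs on OpenShift to the same virtual server. I then compared the resource usage of the Fluentd Pods with that of the Fluent Bit Pods from the verification above, in the same environment.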

$ oc get pod -n logtest
NAME            READY   STATUS    RESTARTS   AGE
fluentd-2cbqz   1/1     Running   0          5d19h
fluentd-4mh8f   1/1     Running   0          5d19h
fluentd-77h8b   1/1     Running   0          5d19h

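This Fluentd deployment includes some filter processing, so it is not a strictly fair comparison, but it should allow a rough one.
The results are as follows.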

– Result: Fluentd
$ oc adm top po -n logtest
NAME            CPU(cores)   MEMORY(bytes)
fluentd-2cbqz   12m          112Mi
fluentd-4mh8f   15m          112Mi
fluentd-77h8b   15m          119Mi
– Result: Fluent Bit
$ oc adm top po -n fluentbittest
NAME               CPU(cores)   MEMORY(bytes)
fluent-bit-7wj72   8m           9Mi
fluent-bit-97z7r   4m           8Mi
fluent-bit-w26z5   3m           7Mi

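The results show that, compared with Fluentd, Fluent Bit used less than about 1/2 of the CPU and less than about 1/10 of the memory. (This is only a rough statement, since an exact comparison was not possible.)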
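
The rough ratios can be checked from the two oc adm top outputs above. The following sketch averages the three Pods of each collector and divides the means; averaging across Pods is my own simplification, and a single snapshot like this says nothing about behaviour under sustained load.

```python
from statistics import mean

# Values copied from the two "oc adm top po" outputs above.
fluentd = {"cpu_m": [12, 15, 15], "mem_mi": [112, 112, 119]}
fluent_bit = {"cpu_m": [8, 4, 3], "mem_mi": [9, 8, 7]}


def ratio(metric: str) -> float:
    """Mean Fluent Bit usage divided by mean Fluentd usage for one metric."""
    return mean(fluent_bit[metric]) / mean(fluentd[metric])


print(f"CPU ratio:    {ratio('cpu_m'):.2f}")
print(f"Memory ratio: {ratio('mem_mi'):.2f}")
```

With these numbers the CPU ratio comes out at about 0.36 and the memory ratio at about 0.07, consistent with "less than 1/2" and "less than 1/10".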

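As noted above, Fluentd was doing somewhat more processing, but even allowing for that, CPU usage of roughly 1/2 and memory usage of roughly 1/10 make Fluent Bit attractive. In OpenShift/Kubernetes clusters in particular, the volume of logs is very large, so the load on the log collector is heavy as well. This setup simply forwards logs with the forward output, so production use would require further tuning and verification.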
