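I tried running NATS JetStream on Kubernetes
This is a post about NATS JetStream.
Introduction
The NATS JetStream documentation does not specifically explain how to run it on Kubernetes. At most, it mentions that a Helm Chart is available. So I read through the Helm Chart and converted it into plain YAML manifests.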
Why not use the Helm Chart?
- Left as a black box, it offers little room for extension
- Helm turns into black magic
- Helm is not declarative
  It requires procedural CLI operations such as helm install --set or helm install --values
YAML
apiVersion: v1
kind: Namespace
metadata:
  name: nats-ns
  labels:
    app.kubernetes.io/name: nats
    app.kubernetes.io/instance: nats-js
    app.kubernetes.io/version: 2.6.2
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nats-js-config
  namespace: nats-ns
  labels:
    app.kubernetes.io/name: nats
    app.kubernetes.io/instance: nats-js
    app.kubernetes.io/version: 2.6.2
data:
  nats.conf: |
    pid_file: "/var/run/nats/nats.pid"
    http: 8222
    server_name: $POD_NAME
    jetstream {
      store_dir: "/data/jetstream/store"
      max_file_store: 1G
      max_memory_store: 1G
    }
    cluster {
      name: example
      port: 6222
      # Based on 5 replicas
      routes = [
        nats-route://$POD_NAME-0.nats-js.$POD_NAMESPACE.svc.cluster.local:6222,
        nats-route://$POD_NAME-1.nats-js.$POD_NAMESPACE.svc.cluster.local:6222,
        nats-route://$POD_NAME-2.nats-js.$POD_NAMESPACE.svc.cluster.local:6222,
        nats-route://$POD_NAME-3.nats-js.$POD_NAMESPACE.svc.cluster.local:6222,
        nats-route://$POD_NAME-4.nats-js.$POD_NAMESPACE.svc.cluster.local:6222,
      ]
      cluster_advertise: $CLUSTER_ADVERTISE
      connect_retries: 30
    }
    lame_duck_duration: 60s # should match the Pod's terminationGracePeriodSeconds (the Pod default is 30s)
    }
---
apiVersion: v1
kind: Service
metadata:
  name: nats-js
  namespace: nats-ns
  labels:
    app.kubernetes.io/name: nats
    app.kubernetes.io/instance: nats-js
    app.kubernetes.io/version: 2.6.2
spec:
  selector:
    app.kubernetes.io/name: nats
    app.kubernetes.io/instance: nats-js
  clusterIP: None
  ports:
    - name: client
      port: 4222
    - name: cluster
      port: 6222
    - name: monitor
      port: 8222
    - name: metrics
      port: 7777
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats-js
  namespace: nats-ns
  labels:
    app.kubernetes.io/name: nats
    app.kubernetes.io/instance: nats-js
    app.kubernetes.io/version: 2.6.2
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: nats
      app.kubernetes.io/instance: nats-js
  replicas: 5
  serviceName: "nats-js"
  volumeClaimTemplates:
    - metadata:
        name: nats-js-sts-vol
      spec:
        accessModes:
          - ReadWriteOnce
        volumeMode: "Filesystem"
        resources:
          requests:
            storage: 10Gi
  template:
    metadata:
      labels:
        app.kubernetes.io/name: nats
        app.kubernetes.io/instance: nats-js
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "7777"
        prometheus.io/scrape: "true"
        sidecar.istio.io/inject: "false" # should be considered
    spec:
      # Common volumes for the containers
      volumes:
        - name: config-volume
          configMap:
            name: nats-js-config
        - name: pid
          emptyDir: {}
      # Required to be able to HUP signal and apply config reload
      # to the server without restarting the pod.
      shareProcessNamespace: true
      #################
      #               #
      #  NATS Server  #
      #               #
      #################
      terminationGracePeriodSeconds: 60
      containers:
        - name: nats
          image: synadia/nats-server:nightly
          ports:
            - containerPort: 4222
              name: client
            - containerPort: 6222
              name: cluster
            - containerPort: 8222
              name: monitor
            - containerPort: 7777
              name: metrics
          command:
            - "nats-server"
            - "--config"
            - "/etc/nats-config/nats.conf"
          # Required to be able to define an environment variable
          # that refers to other environment variables. This env var
          # is later used as part of the configuration file.
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CLUSTER_ADVERTISE
              value: $(POD_NAME).nats-js.$(POD_NAMESPACE).svc
          volumeMounts:
            - name: config-volume
              mountPath: /etc/nats-config
            - name: pid
              mountPath: /var/run/nats
            - name: nats-js-sts-vol
              mountPath: /data/jetstream
          # Liveness/Readiness probes against the monitoring
          #
          livenessProbe:
            httpGet:
              path: /
              port: 8222
            initialDelaySeconds: 10
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 8222
            initialDelaySeconds: 10
            timeoutSeconds: 5
          resources:
            requests:
              cpu: 300m
              memory: 600Mi
            limits:
              cpu: 300m
              memory: 600Mi
          # Gracefully stop NATS Server on pod deletion or image upgrade.
          #
          lifecycle:
            preStop:
              exec:
                # Using the alpine based NATS image, we add an extra sleep that is
                # the same amount as the terminationGracePeriodSeconds to allow
                # the NATS Server to gracefully terminate the client connections.
                #
                command: ["/bin/sh", "-c", "/nats-server -sl=ldm=/var/run/nats/nats.pid && /bin/sleep 60"]
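The routes list in the ConfigMap is meant to point at the per-pod DNS names that a StatefulSet behind a headless Service receives. Those names follow the pattern <statefulset name>-<ordinal>.<service>.<namespace>.svc.cluster.local. As a rough sketch, the fully expanded form of the five entries for this manifest could be generated as below, which may be easier to reason about than the $POD_NAME-based entries. The function name nats_routes is hypothetical, and cluster.local assumes the default cluster domain.

```shell
#!/bin/sh
# Print one nats-route:// URL per StatefulSet replica.
# Arguments: StatefulSet name, headless Service name, namespace, replica count.
# (nats_routes is a hypothetical helper, not part of NATS or kubectl.)
nats_routes() {
  sts="$1"; svc="$2"; ns="$3"; replicas="$4"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    # Stable per-pod DNS name: <statefulset>-<ordinal>.<service>.<namespace>.svc.cluster.local
    printf 'nats-route://%s-%d.%s.%s.svc.cluster.local:6222\n' "$sts" "$i" "$svc" "$ns"
    i=$((i + 1))
  done
}

# Values taken from the manifest above: StatefulSet nats-js, Service nats-js,
# namespace nats-ns, 5 replicas.
nats_routes nats-js nats-js nats-ns 5
```

The printed lines can be compared against the hostnames that appear in the server log when diagnosing route lookup failures.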
Applying the manifest with "kubectl apply -f nats.yaml" and then following the logs of the first Pod produces the following:
$ kubectl logs -f nats-js-0 -c nats --namespace nats-ns
[8] 2021/12/07 08:03:49.354395 [INF] Starting nats-server
[8] 2021/12/07 08:03:49.355813 [INF] Version: 2.6.6
[8] 2021/12/07 08:03:49.355852 [INF] Git: [893b4154]
[8] 2021/12/07 08:03:49.355907 [INF] Name: nats-js-0
[8] 2021/12/07 08:03:49.356005 [INF] Node: yLCaJyhQ
[8] 2021/12/07 08:03:49.356124 [INF] ID: NAZZFGATIGMG57CFFJXGXNJPE3QBIM5ZARZWFZKPODA7BYGFCAGJBJKV
[8] 2021/12/07 08:03:49.356239 [INF] Using configuration file: /etc/nats-config/nats.conf
[8] 2021/12/07 08:03:49.446026 [INF] Starting JetStream
[8] 2021/12/07 08:03:49.451437 [INF]     _ ___ _____ ___ _____ ___ ___   _   __  __
[8] 2021/12/07 08:03:49.451678 [INF]  _ | | __|_   _/ __|_   _| _ \ __| /_\ |  \/  |
[8] 2021/12/07 08:03:49.451711 [INF] | || | _|  | | \__ \ | | |   / _| / _ \| |\/| |
[8] 2021/12/07 08:03:49.451726 [INF]  \__/|___| |_| |___/ |_| |_|_\___/_/ \_\_|  |_|
[8] 2021/12/07 08:03:49.451896 [INF]
[8] 2021/12/07 08:03:49.452182 [INF] https://docs.nats.io/jetstream
[8] 2021/12/07 08:03:49.452206 [INF]
[8] 2021/12/07 08:03:49.452218 [INF] ---------------- JETSTREAM ----------------
[8] 2021/12/07 08:03:49.455665 [INF] Max Memory: 953.67 MB
[8] 2021/12/07 08:03:49.456175 [INF] Max Storage: 953.67 MB
[8] 2021/12/07 08:03:49.456358 [INF] Store Directory: "/data/jetstream/store/jetstream"
[8] 2021/12/07 08:03:49.456399 [INF] -------------------------------------------
[8] 2021/12/07 08:03:49.459252 [INF] Starting JetStream cluster
[8] 2021/12/07 08:03:49.459362 [INF] Creating JetStream metadata controller
[8] 2021/12/07 08:03:49.545305 [INF] JetStream cluster recovering state
[8] 2021/12/07 08:03:49.656808 [INF] Starting http monitor on 0.0.0.0:8222
[8] 2021/12/07 08:03:49.657967 [INF] Listening for client connections on 0.0.0.0:4222
[8] 2021/12/07 08:03:49.744362 [INF] Server is ready
[8] 2021/12/07 08:03:49.745196 [INF] Cluster name is example
[8] 2021/12/07 08:03:49.745468 [INF] Listening for route connections on 0.0.0.0:6222
[8] 2021/12/07 08:03:49.754961 [ERR] Error trying to connect to route (attempt 1): lookup for host "$POD_NAME-1.nats-js.$POD_NAMESPACE.svc.cluster.local": lookup $POD_NAME-1.nats-js.$POD_NAMESPACE.svc.cluster.local: no such host
[8] 2021/12/07 08:03:49.754984 [ERR] Error trying to connect to route (attempt 1): lookup for host "$POD_NAME-4.nats-js.$POD_NAMESPACE.svc.cluster.local": lookup $POD_NAME-4.nats-js.$POD_NAMESPACE.svc.cluster.local: no such host
[8] 2021/12/07 08:03:49.755098 [ERR] Error trying to connect to route (attempt 1): lookup for host "$POD_NAME-0.nats-js.$POD_NAMESPACE.svc.cluster.local": lookup $POD_NAME-0.nats-js.$POD_NAMESPACE.svc.cluster.local: no such host
[8] 2021/12/07 08:03:49.755142 [ERR] Error trying to connect to route (attempt 1): lookup for host "$POD_NAME-2.nats-js.$POD_NAMESPACE.svc.cluster.local": lookup $POD_NAME-2.nats-js.$POD_NAMESPACE.svc.cluster.local: no such host
[8] 2021/12/07 08:03:49.755243 [ERR] Error trying to connect to route (attempt 1): lookup for host "$POD_NAME-3.nats-js.$POD_NAMESPACE.svc.cluster.local": lookup $POD_NAME-3.nats-js.$POD_NAMESPACE.svc.cluster.local: no such host
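At startup, the server logs errors saying that it cannot find the other cluster members. Note that the hostnames in these errors still contain the literal $POD_NAME and $POD_NAMESPACE, so the variables inside the route URLs do not appear to have been expanded.
Naturally, the other Pods in the cluster also start later.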
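One more detail in the startup log: the configuration sets max_file_store and max_memory_store to 1G, yet the server reports 953.67 MB for both. That is consistent with 1G being read as 10^9 bytes and the log printing in units of 1024 x 1024 bytes. This is an interpretation of the log output rather than something the source states, and the helper name to_mib is hypothetical. The arithmetic can be checked quickly:

```shell
#!/bin/sh
# Convert a byte count into units of 1024*1024 bytes, rounded to two decimals.
# (to_mib is a hypothetical helper used only for this check.)
to_mib() {
  awk -v bytes="$1" 'BEGIN { printf "%.2f\n", bytes / (1024 * 1024) }'
}

# 1G interpreted as 10^9 bytes (assumption; matches the value in the log).
to_mib 1000000000   # prints 953.67
```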
$ kubectl get pods --namespace nats-ns
NAME READY STATUS RESTARTS AGE
nats-js-0 1/1 Running 0 11m
nats-js-1 1/1 Running 0 10m
nats-js-2 1/1 Running 0 10m
nats-js-3 1/1 Running 0 10m
nats-js-4 1/1 Running 0 10m
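Once some time has passed and all the Pods have come up, this is no longer a problem.
Keep this startup behavior in mind when deploying to production.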
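Pods reaching Running does not by itself show that the routes between the servers were established. One way to check, sketched here under assumptions, is the server's monitoring endpoint /routez on port 8222, whose JSON response includes a num_routes field. The helper names num_routes and routes_at_least are hypothetical, and the threshold of replicas minus one assumes one route per peer.

```shell
#!/bin/sh
# Read a /routez JSON document on stdin and print the value of num_routes.
# (Hypothetical helper; uses sed so that no JSON tool is required.)
num_routes() {
  sed -n 's/.*"num_routes"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed when the server reports at least (replicas - 1) routes,
# i.e. at least one route to every other server in the cluster.
routes_at_least() {
  replicas="$1"
  actual="$(num_routes)"
  [ -n "$actual" ] && [ "$actual" -ge $((replicas - 1)) ]
}

# Usage sketch (assumes a port-forward to the monitoring port is running):
#   kubectl port-forward pod/nats-js-0 8222:8222 --namespace nats-ns
#   curl -s http://127.0.0.1:8222/routez | routes_at_least 5 && echo "routes established"
```

If the check fails long after all Pods are Running, the route hostnames in the ConfigMap are the first thing to look at.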
Now I want to confirm that it actually works.
First, set up port forwarding.
$ kubectl port-forward svc/nats-js 4222:4222 --namespace nats-ns
Forwarding from 127.0.0.1:4222 -> 4222
Forwarding from [::1]:4222 -> 4222
Next, subscribe.
$ nats sub sample
17:24:28 Subscribing on sample
[#1] Received on "sample"
hello nats world
Publish.
$ nats pub sample "hello nats world"
17:24:37 Published 16 bytes to "sample"
Check the subscriber: the message has arrived.
$ nats sub sample
17:24:28 Subscribing on sample
[#1] Received on "sample"
hello nats world