Building a two-node Redis cluster with Pacemaker

2 years ago

雅, 悟

4 minutes

Purpose

This is the configuration we are currently running in production.

Architecture

The diagram is very simple; the setup assumed is as follows.

Preparing Pacemaker

First, install and prepare Pacemaker.

Installation

Install pcs, pacemaker, and fence-agents-all.
pcs is the command-line client for Pacemaker.

## Enabling the HighAvailability repository should probably work.
dnf install pcs pacemaker fence-agents-all --enablerepo=HighAvailability

## In some environments the repository is defined as ha instead.
dnf install pcs pacemaker fence-agents-all --enablerepo=ha

Configuration

As a pre-check, confirm whether firewalld is running.

## Pre-check
[root@redis1 ~]# systemctl status firewalld

● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor >
   Active: inactive (dead)
     Docs: man:firewalld(1)

## Start firewalld
[root@redis1 ~]# systemctl start firewalld.service

## Enable start at boot
[root@redis1 ~]# systemctl enable firewalld

Created symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service → /usr/lib/systemd/system/firewalld.service.
Created symlink /etc/systemd/system/multi-user.target.wants/firewalld.service → /usr/lib/systemd/system/firewalld.service.

Run the following on each target machine.

firewall-cmd --permanent --add-service=high-availability
firewall-cmd --add-service=high-availability

If "FirewallD is not running" is displayed, start firewalld first.
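The check-then-open sequence above can be wrapped in a small helper. This is a minimal sketch, assuming it runs as root on each node; the function name open_ha_service is made up for illustration. It relies on firewall-cmd --state, which exits non-zero when firewalld is not running.

```shell
# Hypothetical helper: open the high-availability service in firewalld,
# starting firewalld first if it is not running. Run as root on each node.
open_ha_service() {
    # firewall-cmd --state exits non-zero when the daemon is not running.
    if ! firewall-cmd --state >/dev/null 2>&1; then
        systemctl start firewalld || return 1
    fi
    # Persist the rule, then apply it to the running configuration as well.
    firewall-cmd --permanent --add-service=high-availability || return 1
    firewall-cmd --add-service=high-availability
}
```

Calling open_ha_service on both nodes gives the same result as the two firewall-cmd commands, without tripping over a stopped firewalld.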

Starting the pcsd service

Run the following on each node.

systemctl start pcsd.service
systemctl enable pcsd.service

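Setting the password for the Pacemaker user

When the packages are installed with dnf (yum), the hacluster user is created automatically, so confirm it with cat /etc/passwd or similar.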

passwd hacluster

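Authenticating the cluster nodes

Authenticate as the hacluster user configured above so that the nodes can communicate with each other.

## Cluster server authentication [run on one node only (redis1)]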

[root@redis1 ~]# pcs host auth redis1 addr=192.168.111.45 redis2 addr=192.168.111.46
Username: hacluster
Password:
redis1: Authorized
redis2: Authorized

Creating the cluster

Add the nodes to the cluster. Each node's IP address is needed at this step, so check the addresses beforehand.

[root@redis1 ~]# pcs cluster setup redis_test_cluster redis1 addr=192.168.111.45 redis2 addr=192.168.111.46

Destroying cluster on hosts: 'redis1', 'redis2'...
redis1: Successfully destroyed cluster
redis2: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'redis1', 'redis2'
redis1: successful removal of the file 'pcsd settings'
redis2: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'redis1', 'redis2'
redis1: successful distribution of the file 'corosync authkey'
redis1: successful distribution of the file 'pacemaker authkey'
redis2: successful distribution of the file 'corosync authkey'
redis2: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'redis1', 'redis2'
redis1: successful distribution of the file 'corosync.conf'
redis2: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.

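Starting the cluster

To start the whole cluster, add the --all option, which starts the cluster on every node; without it, only the local node is started.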

[root@redis1 ~]# pcs cluster start
Starting Cluster...

Checking with the pcs config command gives output like the following (for the two-node case).

[root@redis1 ~]# pcs config
Cluster Name: redis_test_cluster
Corosync Nodes:
 redis1 redis2
Pacemaker Nodes:
 redis1 redis2

Resources:

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
Colocation Constraints:
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
  No defaults set
Operations Defaults:
  No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: redis_test_cluster
 dc-version: 2.1.0-8.el8-7c3f660707
 have-watchdog: false

Tags:
 No tags defined

Quorum:
  Options:

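STONITH stands for "Shoot The Other Node In The Head"; it is the node fencing mechanism.

As long as the master is healthy, any node that drops out of the cluster during a failover is meant to be investigated and recovered by hand, so STONITH is disabled for this setup.

Disabling STONITH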

pcs property set stonith-enabled=false

Checking the corosync configuration

Check /etc/corosync/corosync.conf.

[root@redis1 ~]# cat /etc/corosync/corosync.conf
totem {
    version: 2
    cluster_name: redis_test_cluster
    transport: knet
    crypto_cipher: aes256
    crypto_hash: sha256
}

nodelist {
    node {
        ring0_addr: 192.168.111.45
        name: redis1
        nodeid: 1
    }

    node {
        ring0_addr: 192.168.111.46
        name: redis2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    timestamp: on
}
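The two_node: 1 setting in the quorum section is what lets a two-node cluster stay quorate when one node is lost, so it is worth confirming that it is present. The following is a minimal sketch, assuming the file layout shown above; has_two_node is a hypothetical helper name, not part of corosync or pcs.

```shell
# Hypothetical helper: succeed if the quorum section of a corosync.conf
# enables two-node mode (two_node: 1). Defaults to the standard path.
has_two_node() {
    local conf="${1:-/etc/corosync/corosync.conf}"
    awk '
        # Track whether the current line is inside the quorum { ... } block.
        /^[[:space:]]*quorum[[:space:]]*\{/ { in_quorum = 1; next }
        in_quorum && /\}/                   { in_quorum = 0 }
        in_quorum && /^[[:space:]]*two_node:[[:space:]]*1[[:space:]]*$/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$conf"
}
```

For example, has_two_node && echo "two-node mode is enabled" reports on the default configuration file.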

Adding resources to Pacemaker

Next, add the virtual IP and Redis resources.

Adding the virtual IP

Preparation

Check the NICs with ip a.

[n-kashimoto@redis1 ~]$ sudo ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno8303: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master nm-bond state UP group default qlen 1000
    link/ether d0:8e:79:ca:60:99 brd ff:ff:ff:ff:ff:ff permaddr d0:8e:79:ca:60:98
3: eno8403: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master nm-bond state UP group default qlen 1000
    link/ether d0:8e:79:ca:60:99 brd ff:ff:ff:ff:ff:ff
4: nm-bond: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d0:8e:79:ca:60:99 brd ff:ff:ff:ff:ff:ff
    inet 192.168.111.45/24 brd 192.168.111.255 scope global noprefixroute nm-bond
       valid_lft forever preferred_lft forever

Configuration

Assign the virtual IP to the NIC.

[root@redis1 ~]# pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.111.44 cidr_netmask=24 nic=nm-bond op monitor interval=10s

Verification

Confirm the result with pcs resource config.

[root@redis1 ~]# pcs resource config
 Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=24 ip=192.168.111.44 nic=nm-bond
  Operations: monitor interval=10s (VirtualIP-monitor-interval-10s)
              start interval=0s timeout=20s (VirtualIP-start-interval-0s)
              stop interval=0s timeout=20s (VirtualIP-stop-interval-0s)
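Beyond the resource configuration, it can help to check on a node whether the virtual IP is actually bound to the NIC. This is a rough sketch under the assumption that the address and interface are the ones used above (192.168.111.44 on nm-bond); vip_is_local is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given IPv4 address is currently
# configured on the given interface of this node.
vip_is_local() {
    local vip="$1" nic="$2"
    # -o prints one line per address; field 4 is "address/prefix".
    ip -o -4 addr show dev "$nic" | awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit(found ? 0 : 1) }'
}
```

Running vip_is_local 192.168.111.44 nm-bond && echo "VIP is on this node" on each node shows which one currently holds the address.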

Adding the Redis resource

Pacemaker controls starting and stopping Redis, so disable the Redis service in systemd.
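Handing Redis over to Pacemaker might look like the sketch below. The unit name redis.service is an assumption (the source does not name it), and release_redis_from_systemd is a hypothetical helper; adjust both to your installation.

```shell
# Hypothetical helper: stop Redis and keep systemd from starting it at boot,
# so that only Pacemaker manages the process.
# Assumption: the packaged unit is named redis.service.
release_redis_from_systemd() {
    local unit="${1:-redis.service}"
    # --now stops the running service in addition to disabling it at boot.
    systemctl disable --now "$unit"
}
```

Run it on every node before creating the Redis resource.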

Configuration

The resource agent used is ocf:heartbeat:redis.

pcs resource create redis-clone ocf:heartbeat:redis op start timeout=300s op monitor interval=5s op monitor role=Master interval=3s op monitor role=Slave interval=15s promotable

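Adding constraints

If the virtual IP moves before redis-clone has finished being promoted, unintended transactions reach the node, which is a problem.

# Just in case, remove any existing constraint first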
[root@redis1 ~]# pcs constraint colocation remove VirtualIP redis-clone
# Add the constraint from here
[root@redis1 ~]# pcs constraint colocation add redis-clone with VirtualIP INFINITY
# Verify
[root@redis1 ~]# pcs constraint
Location Constraints:
Ordering Constraints:
  start VirtualIP then start redis-clone (kind:Mandatory)
Colocation Constraints:
  redis-clone with VirtualIP (score:INFINITY)
Ticket Constraints:

Change the constraint so that the virtual IP properly moves to the same node as the master.

pcs constraint colocation add VirtualIP with master redis-clone INFINITY

On checking, the output now shows with-rsc-role:Master.

[root@redis1 ~]# pcs constraint
Location Constraints:
Ordering Constraints:
Colocation Constraints:
  VirtualIP with redis-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master)
Ticket Constraints:

Restarting

I rarely use --all; about the only time is when restarting the whole cluster.

pcs cluster stop --all
pcs cluster start --all

Use tools such as redis-benchmark or redis-cli to test against the VirtualIP and confirm that it behaves as expected.
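One simple check with redis-cli is to ask the instance behind the virtual IP for its replication role. The sketch below assumes Redis listens on the default port 6379 and needs no password; redis_role is a hypothetical helper name.

```shell
# Hypothetical helper: print the replication role ("master" or "slave")
# reported by the Redis instance at the given host and port.
redis_role() {
    local host="$1" port="${2:-6379}"
    # INFO output uses CRLF line endings, so strip the carriage returns.
    redis-cli -h "$host" -p "$port" info replication \
        | tr -d '\r' \
        | awk -F: '$1 == "role" { print $2 }'
}
```

For example, [ "$(redis_role 192.168.111.44)" = "master" ] && echo "VirtualIP points at the master" can be run before and after a failover.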
