Installing and Verifying Pacemaker on CentOS


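Contents
Setting up SSH trust between the two nodes (not required)
Authenticating the nodes (run on one of them)
Configuring the two-node cluster
Activating the volume group exclusively
Creating the resource group
pcs status shows a resource start failure
Adding the remaining resources
Verification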

Reference: Cluster Software Installation https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_administration/ch-startup-HAAA#s1-clusterinstall-HAAA

# yum install pcs pacemaker fence-agents-all

In practice I installed corosync explicitly as well:
[root@server2 ~]# yum install -y fence-agents-all corosync pacemaker pcs
Post-install check

[root@server3 ~]# rpm -q pacemaker
pacemaker-1.1.20-5.el7_7.1.x86_64
[root@server3 ~]# grep hacluster /etc/passwd
hacluster:x:189:189:cluster user:/home/hacluster:/sbin/nologin

Set the hostname:
[root@server2 ~]# hostnamectl set-hostname server4.example.com
Edit /etc/hosts (vi /etc/hosts) and add:
192.168.122.143 server3.example.com s3
192.168.122.58 server4.example.com s4
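The pcs commands used later address the nodes by name, so it is worth confirming that each hosts file really contains both names. The following is a minimal sketch; has_host_entry is a hypothetical helper written for this note, not a pcs or system command.

```shell
# has_host_entry FILE NAME
# Succeeds if FILE (in /etc/hosts format) has a non-comment line that
# lists NAME as a hostname or alias. Hypothetical helper for illustration.
has_host_entry() {
    awk -v name="$2" '
        $1 ~ /^#/ { next }     # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }    # exit status 0 only if the name was seen
    ' "$1"
}
```

On a node it could be used as: has_host_entry /etc/hosts server4.example.com || echo "server4.example.com missing".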
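Setting up SSH trust between the two nodes (not required)
[root@server3 ~]# ssh-keygen        (press Enter to accept the defaults)
[root@server3 ~]# ssh-copy-id s4    (copy the public key to the peer)
[root@server3 ~]# ssh s4            (log in to the peer without a password)
Start pcsd on both servers:
# systemctl start pcsd
# systemctl enable pcsd
Authenticating the nodes (run on either node)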

[root@server3 ~]# pcs cluster auth server3.example.com server4.example.com
Username: root
Password: 
Error: s3: Username and/or password is incorrect
Error: Unable to communicate with s4
[root@server3 ~]#
[root@server3 ~]# pcs cluster auth server3.example.com server4.example.com
Username: hacluster
Password: 
Error: Unable to communicate with server4.example.com
server3.example.com: Authorized
[root@server3 ~]#

The root user cannot be used here; authenticate as hacluster. Following the official documentation, add the firewall configuration:
[root@server3 .ssh]# firewall-cmd --permanent --add-service=high-availability
[root@server3 .ssh]# firewall-cmd --add-service=high-availability
The hacluster password also has to be set and the pcsd service restarted, on both machines.

[root@server4 ~]# pcs cluster auth server3.example.com server4.example.com
Username: hacluster
Password: 
server4.example.com: Authorized
server3.example.com: Authorized

Once authentication succeeded, and in line with the official documentation, I removed the login shell that I had earlier added to the hacluster user by hand.

# usermod  -s /sbin/nologin hacluster

Configuring the two-node cluster
Run the following on one node:
# pcs cluster setup --start --name mytest_cluster server3.example.com server4.example.com
Both machines then show the same output:

[root@server3 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: server3.example.com (version 1.1.20-5.el7_7.1-3c4c782f70) - partition with quorum
 Last updated: Wed Sep 25 23:30:26 2019
 Last change: Wed Sep 25 23:28:19 2019 by hacluster via crmd on server3.example.com
 2 nodes configured
 0 resources configured

PCSD Status:
  server3.example.com: Online
  server4.example.com: Online
[root@server3 ~]#

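I powered the machines off for a break. After booting them again, pcs cluster status reported that the cluster was not running. Enable autostart from either node:
[root@server4 ~]# pcs cluster enable --all
server3.example.com: Cluster Enabled
server4.example.com: Cluster Enabled
This time the cluster still has to be started by hand:
[root@server4 ~]# pcs cluster start
Started only on server4, the output shows both nodes as Online. On server3, however, pcs cluster status still reports the cluster as not started. After rebooting server3, pcs status is normal:
PCSD Status:
  server4.example.com: Online
  server3.example.com: Online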
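The PCSD Status section is what was checked by eye above. As a rough sketch, it can also be filtered mechanically; pcsd_not_online is a hypothetical helper. It treats any state other than Online as a problem, and the "Offline" wording in the comment is an assumed sample value.

```shell
# pcsd_not_online
# Reads `pcs cluster status` output on stdin and prints every node whose
# pcsd state in the "PCSD Status:" section is anything other than Online
# (for example "Offline", an assumed sample value).
pcsd_not_online() {
    awk '
        /^PCSD Status:/    { in_pcsd = 1; next }
        in_pcsd && NF == 0 { in_pcsd = 0 }        # a blank line ends the section
        in_pcsd && $2 != "Online" { sub(/:$/, "", $1); print $1 }
    '
}
```

On a node it could be used as: pcs cluster status | pcsd_not_online. Empty output means every listed pcsd is Online. This only covers pcsd; it does not show whether corosync and pacemaker are running on a node.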
Fencing configuration is skipped here; see Chapter 5. Fencing: Configuring STONITH https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_reference/ch-fencing-haar#s1-stonithlist-HAAR

[root@server3 ~]# pcs stonith list|grep -i virt
fence_virt - Fence agent for virtual machines
fence_xvm - Fence agent for virtual machines

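Following this document, create an active/passive web service: Chapter 2. An active/passive Apache HTTP Server in a Red Hat High Availability Cluster
Resources to prepare: one floating IP address and one shared disk.
In virt-manager, click the light-bulb (hardware details) icon for server3 and add a disk. The path of Disk 2 is /var/lib/libvirt/images/hd4ks-clone.raw. Add the same file to server4.
[root@server4 ~]# fdisk -l
Disk /dev/vdb: 209 MB, 209715200 bytes, 409600 sectors
The disk is usable right away.
The guide says to run the next step on any node. In my test, fdisk -l on server3 did not show the vdb disk, so I ran it on server4, wiping the existing ext signature.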

[root@server4 ~]# pvcreate /dev/vdb
WARNING: ext3 signature detected on /dev/vdb at offset 1080. Wipe it? [y/n]: y
  Wiping ext3 signature on /dev/vdb.
  Physical volume "/dev/vdb" successfully created.
[root@server4 ~]#

Even after rebooting s3, vdb was still not visible, so I ticked Shareable for the disk on both machines, powered them off, and started them again.

[root@server3 ~]# vgcreate my_vg /dev/vdb
  Volume group "my_vg" successfully created
[root@server3 ~]# lvcreate -L 200 my_vg -n my_lv
  Volume group "my_vg" has insufficient free space (49 extents): 50 required.
[root@server3 ~]# lvs
[root@server3 ~]# lvcreate -L 190 my_vg -n my_lv
  Rounding up size to full physical extent 192.00 MiB
  Logical volume "my_lv" created.
[root@server3 ~]# mkfs.ext4 /dev/my_vg/my_lv
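The failed and the successful lvcreate above both follow from whole-extent rounding. A short worked sketch, assuming the 4 MiB extent size implied by the output above (192 MiB for 48 extents); extents_needed is a hypothetical helper, not an LVM command.

```shell
# extents_needed SIZE_MIB EXTENT_MIB
# Number of physical extents needed for SIZE_MIB, rounded up to whole extents.
extents_needed() {
    size_mib=$1
    extent_mib=$2
    echo $(( (size_mib + extent_mib - 1) / extent_mib ))
}

extents_needed 200 4   # 50: one more than the 49 free extents, so lvcreate fails
extents_needed 190 4   # 48: 48 * 4 MiB = 192 MiB, the rounded-up size reported
```

To use all free space without doing the arithmetic, lvcreate also accepts an extent-based size such as -l 100%FREE.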

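At this point server4 sees only vdb and the PV; even after vgscan, the volume group does not show up there.
2.2. Web Server Configuration
Install the packages on both machines:
# yum install -y httpd wget
So that the resource agent can check Apache's status, add the following configuration to /etc/httpd/conf/httpd.conf:
<Location /server-status>
    SetHandler server-status
    Require local
</Location>
The resource agent does not use systemd, so the logrotate script has to be changed so that it can still reload Apache. In /etc/logrotate.d/httpd, delete the line
/bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true
and replace it with
/usr/sbin/httpd -f /etc/httpd/conf/httpd.conf -c "PidFile /var/run/httpd.pid" -k graceful > /dev/null 2>/dev/null || true
Run the following on one of the nodes: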

# mount /dev/my_vg/my_lv /var/www/
# mkdir /var/www/html
# mkdir /var/www/cgi-bin
# mkdir /var/www/error
# restorecon -R /var/www
# cat <<-END >/var/www/html/index.html
<html>
<body>Hello</body>
</html>
END
# umount /var/www

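I also tested the here-document without the minus sign before END; the result is the same.
Activating the volume group exclusively
2.3. Exclusive Activation of a Volume Group in a Cluster
The cluster requires that the volume group is never activated by anything other than the cluster software. Volume groups listed in volume_list in /etc/lvm/lvm.conf are activated automatically, so that list must not include the volume group used by the cluster.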

# lvmconf --enable-halvm --services --startstopservices

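This command changes the following parameters and stops lvmetad:
locking_type is set to 1 (already the default)
use_lvmetad is set to 0 (the default configuration is 1)
Check the result:
# grep -E "locking_type|use_lvmetad" /etc/lvm/lvm.conf
Show the VG names (it is surprising how many options this command has):
# vgs --noheadings -o vg_name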
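The grep check also matches commented-out lines in lvm.conf, which makes the output noisy. As a small sketch, the effective value of a single setting can be pulled out like this; lvm_conf_value is a hypothetical helper that assumes the usual "key = value" layout with spaces around the equals sign.

```shell
# lvm_conf_value FILE KEY
# Prints the value of the first uncommented "KEY = value" line in FILE.
# Assumes spaces around "=", as in the stock lvm.conf layout.
lvm_conf_value() {
    awk -v key="$2" '
        $1 ~ /^#/ { next }                               # skip comment lines
        $1 == key && $2 == "=" { print $3; exit }
    ' "$1"
}
```

For example, lvm_conf_value /etc/lvm/lvm.conf use_lvmetad should print 0 after the lvmconf command above has run.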
Rebuild the initramfs so that the boot image cannot access the volume group, then reboot (skipped for now).

# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)

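-H installs only the drivers needed to boot this particular host; -f overwrites the existing file. If the kernel has been updated, reboot first and then run the command above.
Deactivate the volume group by hand on both machines:
[root@server4 ~]# vgchange -a n my_vg
  0 logical volume(s) in volume group "my_vg" now active
Creating the resource group
2.4. Creating the Resources and Resource Group with the pcs Command
Four resources (the volume group LVM, the file system Filesystem, the floating address IPaddr2, and the application apache) make up the resource group apachegroup, which ensures that they all run on the same machine.
Volume group (LVM):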

[root@server3 ~]# pcs resource create my_lvm LVM volgrpname=my_vg \
 exclusive=true --group apachegroup
Assumed agent name 'ocf:heartbeat:LVM' (deduced from 'LVM')

pcs status shows that the resource failed to start:

 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	FAILED (Monitoring)[ server3.example.com server4.example.com ]

Failed Resource Actions:
* my_lvm_monitor_0 on server3.example.com 'unknown error' (1): call=5, status=complete, exitreason='The volume_list filter must be initialized in lvm.conf for exclusive activation without clvmd'

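According to the message, when clvmd is not used the volume_list parameter must be set in lvm.conf. The guide mentions this, but I had overlooked it.
In /etc/lvm/lvm.conf:
volume_list = []
This machine has no other VGs, so the value is empty, but the parameter itself has to be present.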
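Since the stock lvm.conf ships volume_list only as a commented example, a quick check for an uncommented setting can catch this before the resource is created. A minimal sketch; has_volume_list is a hypothetical helper.

```shell
# has_volume_list FILE
# Succeeds if FILE contains an uncommented volume_list assignment,
# even an empty one such as "volume_list = []".
has_volume_list() {
    grep -Eq '^[[:space:]]*volume_list[[:space:]]*=' "$1"
}
```

Run it on both nodes, for example: has_volume_list /etc/lvm/lvm.conf || echo "volume_list is not set".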
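None of the following commands resolved the problem:
pcs resource restart my_lvm     restart the resource
pcs resource disable my_lvm     stop the resource
pcs resource enable my_lvm      start the resource
pcs resource cleanup my_lvm     clear the resource's failure state
pcs cluster stop --all          stop the cluster on all nodes
pcs cluster start --all         start the cluster on all nodes
reboot
pcs resource show               list the resources ("show" can be omitted)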

 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Stopped

Next I backed up and rebuilt the initramfs. Doing so showed that the two machines are still running different kernels:

[root@server3 ~]# cp -p /boot/initramfs-3.10.0-1062.1.1.el7.x86_64.img /boot/initramfs-3.10.0-1062.1.1.el7.x86_64.img.bakok1
[root@server4 ~]# cp -p /boot/initramfs-3.10.0-957.el7.x86_64.img /boot/initramfs-3.10.0-957.el7.x86_64.img.bakok1
[root@server3 ~]# dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
[root@server3 ~]# reboot

Delete the resource and recreate it, this time with exclusive=false:

# pcs resource delete my_lvm
# pcs resource create my_lvm LVM volgrpname=my_vg \
>  exclusive=false --group apachegroup

Check the log:

[root@server3 ~]# journalctl -xe
Sep 26 17:13:54 server3.example.com pengine[3312]:    error: Resource start-up disabled since no STONITH resources have been defined
Sep 26 17:13:54 server3.example.com pengine[3312]:    error: Either configure some or disable STONITH with the stonith-enabled option
Sep 26 17:13:54 server3.example.com pengine[3312]:    error: NOTE: Clusters with shared data need STONITH to ensure data integrity
Sep 26 17:13:54 server3.example.com pengine[3312]:   notice: Removing my_lvm from server3.example.com
Sep 26 17:13:54 server3.example.com pengine[3312]:   notice: Removing my_lvm from server4.example.com
[root@server3 ~]# pcs property show --all
stonith-enabled: true
[root@server3 ~]# pcs property set stonith-enabled=false
[root@server3 ~]# pcs property show --all |grep stonith-enabled
 stonith-enabled: false

After re-adding the resource, it started normally:

[root@server3 ~]# pcs resource create my_lvm LVM volgrpname=my_vg \
>  exclusive=true --group apachegroup
Assumed agent name 'ocf:heartbeat:LVM' (deduced from 'LVM')
[root@server3 ~]# pcs status
[root@server3 ~]# pcs resource
 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Started server3.example.com
lvdisplay now reports:
  LV Status              available

Adding the remaining resources
# pcs resource create my_fs Filesystem device="/dev/my_vg/my_lv" directory="/var/www" fstype="ext4" --group apachegroup
Use df to check the mount.
# pcs resource create VirtualIP IPaddr2 ip=192.168.122.30 cidr_netmask=24 --group apachegroup
Use ip ad to check the floating IP.
# pcs resource create Website apache configfile="/etc/httpd/conf/httpd.conf" statusurl="http://127.0.0.1/server-status" --group apachegroup
Verification
Status check

[root@server3 ~]# pcs status
Cluster name: mytest_cluster
Stack: corosync
Current DC: server3.example.com (version 1.1.20-5.el7_7.1-3c4c782f70) - partition with quorum
Last updated: Thu Sep 26 17:27:24 2019
Last change: Thu Sep 26 17:26:27 2019 by root via cibadmin on server3.example.com

2 nodes configured
4 resources configured

Online: [ server3.example.com server4.example.com ]

Full list of resources:

 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Started server3.example.com
     my_fs	(ocf::heartbeat:Filesystem):	Started server3.example.com
     VirtualIP	(ocf::heartbeat:IPaddr2):	Started server3.example.com
     Website	(ocf::heartbeat:apache):	Started server3.example.com

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@server3 ~]#

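Accessing the application: opening http://192.168.122.30 in Firefox shows Hello.
[xy@xycto ~]$ curl http://192.168.122.30
Hello
At this point systemctl status httpd on both machines shows Active: inactive (dead): the httpd systemd unit is not started.
Failover tests
(1) Reboot
[root@server3 ~]# reboot
The resources switch to server4 within seconds.
(2) Kill the process and let the cluster software monitor it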

[root@server4 ~]# ps -ef|grep httpd
root      7389     1  0 17:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    7391  7389  0 17:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    7392  7389  0 17:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    7393  7389  0 17:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    7394  7389  0 17:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
apache    7395  7389  0 17:36 ?        00:00:00 /sbin/httpd -DSTATUS -f /etc/httpd/conf/httpd.conf -c PidFile /var/run/httpd.pid
root      7642  3468  0 17:36 pts/0    00:00:00 grep --color=auto httpd
[root@server4 ~]# kill -9 7389

The program was restarted automatically. I killed it four times and it never failed over; there is only a single warning:

Failed Resource Actions:
* Website_monitor_10000 on server4.example.com 'not running' (7): call=42, status=complete, exitreason='',
    last-rc-change='Thu Sep 26 17:37:11 2019', queued=0ms, exec=0ms
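When repeating tests like this, it helps to pull just the failed actions out of the pcs status output. The following is a sketch under the assumption that the entries keep the "* <action> on <node> ..." shape shown above; failed_actions is a hypothetical helper.

```shell
# failed_actions
# Reads `pcs status` output on stdin and prints "<action> <node>" for each
# entry under "Failed Resource Actions:". Assumes the entry format
# "* <action> on <node> ..." seen in the output above.
failed_actions() {
    awk '
        /^Failed Resource Actions:/ { in_failed = 1; next }
        in_failed && $1 == "*" && $3 == "on" { print $2, $4 }
    '
}
```

On a node it could be used as: pcs status | failed_actions. Empty output means no failed actions are recorded.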

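The configuration shows a monitor interval of 10 seconds; the restart after a kill took an estimated 3 seconds.
[root@server4 ~]# pcs config
  Operations: monitor interval=10s timeout=20s (Website-monitor-interval-10s)
Next, rename the httpd binary to simulate a damaged program file: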

# mv /sbin/httpd /sbin/httpdbak

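Then I killed the process again. This time the resources failed over immediately, in an estimated 1 second.
The log shows that the software is fairly smart: it did not actually retry 1000000 times.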

Sep 26 17:54:08 server3.example.com apache(Website)[9380]: ERROR: apache httpd program not found
Sep 26 17:54:08 server3.example.com apache(Website)[9396]: ERROR: environment is invalid, resource considered stopped
Sep 26 17:54:08 server3.example.com lrmd[3325]:   notice: Website_monitor_10000:9320:stderr [ ocf-exit-reason:apache httpd program not found ]
Sep 26 17:54:08 server3.example.com lrmd[3325]:   notice: Website_monitor_10000:9320:stderr [ ocf-exit-reason:environment is invalid, resource considered stopped ]
Sep 26 17:54:08 server3.example.com crmd[3332]:   notice: server3.example.com-Website_monitor_10000:41 [ ocf-exit-reason:apache httpd program not found
ocf-exit-reason:environment is invalid, resource considered stopped
 ]
...
Sep 26 17:54:09 server3.example.com pengine[3331]:  warning: Processing failed start of Website on server3.example.com: not installed
Sep 26 17:54:09 server3.example.com pengine[3331]:   notice: Preventing Website from re-starting on server3.example.com: operation start failed 'not installed' (5)
Sep 26 17:54:09 server3.example.com pengine[3331]:  warning: Forcing Website away from server3.example.com after 1000000 failures (max=1000000)
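The numbers in parentheses in these messages are OCF resource agent exit codes. Pacemaker treats 'not installed' (5) as a hard error: instead of retrying on the same node, it sets the fail count to 1000000, its value for INFINITY, which is what the last log line reports. As a small reference sketch, the codes seen in this write-up map to names as follows; ocf_rc_name is a hypothetical helper and covers only these codes.

```shell
# ocf_rc_name CODE
# Maps the OCF exit codes that appear in this write-up to the names
# pcs prints for them. Other codes are reported as "other".
ocf_rc_name() {
    case "$1" in
        0) echo "ok" ;;
        1) echo "unknown error" ;;
        5) echo "not installed" ;;
        7) echo "not running" ;;
        *) echo "other ($1)" ;;
    esac
}
```

The current fail count of a resource can be inspected with pcs resource failcount show Website, and pcs resource cleanup Website resets it.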

After the cluster software on s4 is stopped with pcs cluster stop, recovery may not be possible:

2 nodes configured
4 resources configured

Online: [ server3.example.com ]
OFFLINE: [ server4.example.com ]

Full list of resources:

 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Started server3.example.com
     my_fs	(ocf::heartbeat:Filesystem):	Started server3.example.com
     VirtualIP	(ocf::heartbeat:IPaddr2):	Started server3.example.com
     Website	(ocf::heartbeat:apache):	Stopped

Failed Resource Actions:
* Website_start_0 on server3.example.com 'not installed' (5): call=44, status=complete, exitreason='environment is invalid, resource considered stopped',

Clear the failure state manually:
[root@server3 ~]# pcs resource cleanup Website
The resource still cannot start:

* Website_start_0 on server3.example.com 'unknown error' (1): call=63, status=Timed Out, exitreason='',

After a reboot, all resources are Stopped. I checked the log with journalctl, but it was not much help and I could not find the key entries:

Sep 26 18:13:53 server3.example.com LVM(my_lvm)[3460]: WARNING: LVM Volume my_vg is not available (stopped)
Sep 26 18:13:53 server3.example.com crmd[3321]:   notice: Result of probe operation for my_lvm on server3.example.com: 7 (not running)
Sep 26 18:13:53 server3.example.com crmd[3321]:   notice: Initiating monitor operation my_fs_monitor_0 locally on server3.example.com
Sep 26 18:13:53 server3.example.com Filesystem(my_fs)[3480]: WARNING: Couldn't find device [/dev/my_vg/my_lv]. Expected /dev/??? to exist

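The cluster on server4 has to be started; after that everything recovers automatically:
[root@server4 ~]# pcs cluster start
Summary:
1. After a damaged httpd file is restored, things may still behave abnormally, so it is best to reboot.
2. After the cluster software has been stopped on one machine, do not reboot the other machine.
(3) Delete the floating IP address
[root@server4 ~]# ip addr del 192.168.122.30/24 dev eth0
The address is brought back up automatically, again on this machine.
[root@server4 ~]# ip link set down dev eth0
After this, s4 can only be operated from the KVM / virtual machine console. s4 itself does not show a switchover, but on server3 the failover succeeded and web access is normal.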

Online: [ server3.example.com ]
OFFLINE: [ server4.example.com ]

Full list of resources:

 Resource Group: apachegroup
     my_lvm	(ocf::heartbeat:LVM):	Started server3.example.com
     my_fs	(ocf::heartbeat:Filesystem):	Started server3.example.com
     VirtualIP	(ocf::heartbeat:IPaddr2):	Started server3.example.com
     Website	(ocf::heartbeat:apache):	Started server3.example.com
[xy@xycto ~]$ curl http://192.168.122.30

Hello

After s4's NIC is brought back up (ifconfig eth4 up), the floating address on s4 is removed automatically, s4 rejoins the cluster, and pcs status shows the same output on both machines.
