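Lab reference: http://blog.csdn.net/qq_19175749/article/details/51607210
I. Software overview
1. Heartbeat
Official site: http://linux-ha.org/wiki/Main_Page
Heartbeat quickly moves resources (the VIP address and the services behind it) from a failed server to a healthy one. It is similar to keepalived: it provides failover, but on its own it cannot health-check back-end servers.
Heartbeat vs. keepalived: use cases and differences
A common question is why anyone would use heartbeat, which has not been updated for a long time, instead of keepalived. Their use cases and differences are as follows:
1. For web servers, databases and load balancers (lvs, haproxy, nginx), either heartbeat or keepalived will do.
2. lvs is best paired with keepalived, because keepalived was originally written for lvs. Heartbeat has no built-in health check for real servers (RS), although ldirectord can provide one.
3. MySQL dual-master with multiple slaves and NFS/MFS storage need data synchronization. Heartbeat suits these workloads better because it can drive DRBD through the drbddisk resource script.
Summary: choose keepalived for high availability of applications without data synchronization, and heartbeat for applications that need data synchronization.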
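2. DRBD
Official site: http://www.drbd.org/users-guide-8.4/
DRBD (Distributed Replicated Block Device) is a software-based, shared-nothing storage replication solution that mirrors block-device content between servers over the network. It replicates at the block-device level between two servers, either synchronously (a write completes only after both servers have written it) or asynchronously (a write completes once the local server has written it), which makes it the network equivalent of RAID 1. Because it works on block devices (disks, LVM logical volumes) below the file system, replication is faster than copying files with cp. The official MySQL manual lists DRBD as one of its recommended high-availability solutions.
3. MySQL
Official site: http://www.mysql.com/
MySQL is a lightweight open-source relational database management system, widely used by small and medium-sized websites on the Internet. Its small footprint, speed, low total cost of ownership and, above all, its open-source licensing lead many such sites to choose MySQL in order to reduce their costs.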
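II. Environment preparation
1. Architecture topology
Architecture notes:
One master with multiple slaves is the most common architecture; the slaves can sit behind lvs to load-balance reads.
This setup removes the master as a single point of failure: when the master goes down, the standby node takes over automatically and every slave resynchronizes with the new master, which gives a hot-standby solution for the MySQL master.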
2. System environment
Item | Value
OS | Red Hat Enterprise Linux Server release 6.5
Architecture | x86_64
Kernel version | 2.6.32-431.el6.x86_64
heartbeat | heartbeat-3.0.4
drbd | drbd-8.4.4
mysql | mysql-5.5.32
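3. Deployment environment
Role | IP
VIP | eth0: 192.168.12.1/24 (external service address)
master1 | eth0: 192.168.12.55/24 (internal network); eth2: 10.1.12.55/24 (heartbeat link); eth3: 10.2.12.55/24 (DRBD gigabit data transfer)
master2 | eth0: 192.168.12.56/24 (internal network); eth2: 10.1.12.56/24 (heartbeat link); eth3: 10.2.12.56/24 (DRBD gigabit data transfer)
slave1 | eth0: 192.168.12.55/24 (temporarily hosted on master1)
Note: the slave replicates from the master through the VIP.
Requirements:
1. When master1 goes down, master2 automatically takes over the VIP and all slaves.
2. The takeover by master2 must not disrupt replication on the slaves.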
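4. Data partitions on the master servers
Disk | Size | Partition | Mount point | Purpose
/dev/sdb | 18G | /dev/sdb2 | /data/ | data
/dev/sdb | 2G | /dev/sdb1 | none (metadata partition) | DRBD synchronization state
Notes:
1. Never create a file system on the metadata partition (sdb1 holds the DRBD synchronization state).
2. Do not mount the partitions after creating them.
3. In production, 1-2G is usually enough for the DRBD metadata partition; give the data partition as much as the workload needs.
4. In production, the two disks should be the same size.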
5. Preparing the environment
## Set the hostname and IP addresses, disable the firewall, then reboot once:
# hostname master1
# vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master1
#### Result:
[root@master1 ~]# hostname
master1
[root@master1 ~]# ip add|egrep "eth0|eth2|eth3"|grep inet
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
inet 10.1.12.55/24 brd 10.1.12.255 scope global eth2
inet 10.2.12.55/24 brd 10.2.12.255 scope global eth3
[root@master1 ~]# fdisk -l|grep sdb
Disk /dev/sdb: 21.5 GB, 21474836480 bytes
[root@master1 ~]# chkconfig iptables off
[root@master1 ~]# reboot
[root@master2 ~]# hostname
master2
[root@master2 ~]# ip add|egrep "eth0|eth2|eth3"|grep inet
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
inet 10.1.12.56/24 brd 10.1.12.255 scope global eth2
inet 10.2.12.56/24 brd 10.2.12.255 scope global eth3
[root@master2 ~]# fdisk -l|grep sdb
Disk /dev/sdb: 21.5 GB, 21474836480 bytes
[root@master2 ~]# chkconfig iptables off
[root@master2 ~]# reboot
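III. Installing and deploying heartbeat
1. Configure the heartbeat routes and hosts entries between the servers
======master1
[root@master1 ~]# route add -host 10.1.12.56 dev eth2    # route to the peer over the heartbeat link
[root@master1 ~]# route add -host 10.2.12.56 dev eth3    # route to the peer over the DRBD data link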
[root@master1 ~]# route -n|grep "10\."|sort
10.1.12.0 0.0.0.0 255.255.255.0 U 0 0 0 eth2
10.1.12.56 0.0.0.0 255.255.255.255 UH 0 0 0 eth2
10.2.12.0 0.0.0.0 255.255.255.0 U 0 0 0 eth3
10.2.12.56 0.0.0.0 255.255.255.255 UH 0 0 0 eth3
## The hosts entries use the heartbeat IPs and the DRBD IPs
[root@master1 ~]# echo "10.1.12.55 master1
10.2.12.55 master1
10.1.12.56 master2
10.2.12.56 master2" >> /etc/hosts
[root@master1 ~]# tail -4 /etc/hosts
10.1.12.55 master1
10.2.12.55 master1
10.1.12.56 master2
10.2.12.56 master2
======master2
[root@master2 ~]# route add -host 10.1.12.55 dev eth2
[root@master2 ~]# route add -host 10.2.12.55 dev eth3
[root@master2 ~]# route -n|grep "10\."|sort
10.1.12.0 0.0.0.0 255.255.255.0 U 0 0 0 eth2
10.1.12.55 0.0.0.0 255.255.255.255 UH 0 0 0 eth2
10.2.12.0 0.0.0.0 255.255.255.0 U 0 0 0 eth3
10.2.12.55 0.0.0.0 255.255.255.255 UH 0 0 0 eth3
## The hosts entries use the heartbeat IPs and the DRBD IPs
[root@master2 ~]# echo "10.1.12.55 master1
10.2.12.55 master1
10.1.12.56 master2
10.2.12.56 master2" >> /etc/hosts
[root@master2 ~]# tail -4 /etc/hosts
10.1.12.55 master1
10.2.12.55 master1
10.1.12.56 master2
10.2.12.56 master2
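The `echo ... >> /etc/hosts` step above appends the same four lines again every time it is rerun. A minimal sketch of an idempotent alternative follows; the helper name `add_host_entry` is hypothetical and not part of the original procedure.

```shell
#!/bin/bash
# Append "ip name" to a hosts-style file only when that exact line is missing.
# Usage: add_host_entry <hosts_file> <ip> <name>
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # -F: fixed string, -x: match the whole line
    if ! grep -qxF "${ip} ${name}" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (run as root on both nodes):
#   add_host_entry /etc/hosts 10.1.12.55 master1
#   add_host_entry /etc/hosts 10.2.12.55 master1
#   add_host_entry /etc/hosts 10.1.12.56 master2
#   add_host_entry /etc/hosts 10.2.12.56 master2
```

Running the helper twice with the same arguments leaves a single entry in the file.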
2. Install heartbeat
## If the firewall is enabled, allow UDP port 694. Do the same on both nodes
[root@master2 heartbeat-3.0.4-RPM]# cat setup.sh
#!/bin/bash
# Install heartbeat and its dependencies from the RPMs in this directory
cd "$(dirname "$0")"
pwd
echo install lib64ltdl7
rpm -ivh lib64ltdl7-2.2.6-6.1mdv2009.1.x86_64.rpm
echo install perl-TimeDate
rpm -ivh perl-TimeDate-1.16-13.el6.noarch.rpm
echo install PyXML
rpm -ivh PyXML-0.8.4-19.el6.x86_64.rpm
echo install cluster-glue-libs
rpm -ivh cluster-glue-libs-1.0.5-6.el6.x86_64.rpm
echo install cluster-glue
rpm -ivh cluster-glue-1.0.5-6.el6.x86_64.rpm
echo install resource-agents
rpm -ivh resource-agents-3.9.5-24.el6_7.1.x86_64.rpm
echo install heartbeat
rpm -ivh heartbeat-3.0.4-2.el6.x86_64.rpm heartbeat-libs-3.0.4-2.el6.x86_64.rpm
echo Done
exit 0
[root@master2 heartbeat-3.0.4-RPM]# sh setup.sh
[root@master2 heartbeat-3.0.4-RPM]# rpm -qa|grep heartbeat
heartbeat-libs-3.0.4-2.el6.x86_64
heartbeat-3.0.4-2.el6.x86_64
3. Configure heartbeat
## The ha.cf, authkeys and haresources files must be identical on both nodes
3.1 Configure ha.cf
[root@master1 ~]# vim /etc/ha.d/ha.cf
#log configure
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local1
#options configure
keepalive 2
deadtime 30
warntime 10
initdead 120
#bcast eth2
mcast eth2 225.0.0.55 694 1 0
#node configure
auto_failback on
node master1
node master2
crm no
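The timing directives in ha.cf only make sense in a certain order: the heartbeat interval should be shorter than the warning time, which should be shorter than the dead time, and the usual rule of thumb is that initdead is at least twice deadtime. The sketch below checks that relationship on a ha.cf file. The function names are hypothetical, and it only understands plain integer seconds as used above.

```shell
#!/bin/bash
# Read one numeric directive (for example "deadtime 30") from a ha.cf-style file.
# Commented-out directives are ignored because their first field starts with "#".
ha_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Return 0 when keepalive < warntime < deadtime and initdead >= 2 * deadtime.
check_ha_timers() {
    local file="$1" k w d i
    k=$(ha_value "$file" keepalive)
    w=$(ha_value "$file" warntime)
    d=$(ha_value "$file" deadtime)
    i=$(ha_value "$file" initdead)
    if [ -z "$k" ] || [ -z "$w" ] || [ -z "$d" ] || [ -z "$i" ]; then
        echo "missing timer directive in $file" >&2
        return 2
    fi
    if [ "$k" -lt "$w" ] && [ "$w" -lt "$d" ] && [ "$i" -ge $((2 * d)) ]; then
        echo "timers OK: keepalive=$k warntime=$w deadtime=$d initdead=$i"
    else
        echo "timers inconsistent: keepalive=$k warntime=$w deadtime=$d initdead=$i" >&2
        return 1
    fi
}

# Example: check_ha_timers /etc/ha.d/ha.cf
```

With the values used in this setup (2, 10, 30, 120) the check passes.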
3.2 Configure authkeys
[root@master1 ~]# vi /etc/ha.d/authkeys
auth 1
1 sha1 47e9336850f1db6fa58bc470bc9b7810eb397f06
[root@master1 ~]# chmod 600 /etc/ha.d/authkeys
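The sha1 key in authkeys is just a shared secret, so each cluster should use its own value rather than copying the one shown here. One possible way to generate such a file is sketched below. The function names are hypothetical, and hashing random bytes with `sha1sum` is only one of several reasonable ways to obtain a 40-character key.

```shell
#!/bin/bash
# Print a 40-character hex string derived from 512 random bytes.
gen_auth_key() {
    dd if=/dev/urandom bs=512 count=1 2>/dev/null | sha1sum | awk '{ print $1 }'
}

# Write an authkeys file containing a single sha1 key, readable by its owner only.
# Usage: write_authkeys <path>
write_authkeys() {
    local path="$1"
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$(gen_auth_key)" > "$path" )
    chmod 600 "$path"
}

# Example: write_authkeys /etc/ha.d/authkeys
# Then copy the resulting file to the other node so both sides share the key.
```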
3.3 Configure haresources
[root@master1 ~]# vi /etc/ha.d/haresources
master1 IPaddr::192.168.12.1/24/eth0
#master1 IPaddr::192.168.12.1/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext4 mysqld
Notes:
drbddisk::data <== activates the DRBD resource named data; equivalent to running /etc/ha.d/resource.d/drbddisk data stop/start
Filesystem::/dev/drbd1::/data::ext4 <== mounts the DRBD device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 stop/start, which amounts to running mount /dev/drbd1 /data on the system
mysqld <== the MySQL start/stop script; equivalent to running /etc/ha.d/resource.d/mysqld stop/start
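To make the mapping from a haresources line to resource-script calls concrete, the sketch below expands one line into the commands it stands for. The function name is hypothetical, and it assumes every script lives in /etc/ha.d/resource.d/, which matches this setup. Resources are listed left to right for start and right to left for stop, which is the order heartbeat uses when acquiring and releasing them.

```shell
#!/bin/bash
# Print the resource-script call behind each entry of one haresources line.
# Usage: expand_haresources "<line>" <start|stop>
expand_haresources() {
    local line="$1" action="$2"
    local -a fields
    local n i idx r script args
    # fields[0] is the preferred node; the rest are resources.
    read -r -a fields <<< "$line"
    n=${#fields[@]}
    if [ "$action" = "stop" ]; then
        idx=$(seq $((n - 1)) -1 1)    # release right to left
    else
        idx=$(seq 1 $((n - 1)))       # acquire left to right
    fi
    for i in $idx; do
        r="${fields[$i]}"
        script="${r%%::*}"            # text before the first "::"
        args=""
        if [ "$script" != "$r" ]; then
            args="${r#*::}"           # text after the first "::"
            args="${args//::/ }"      # remaining "::" separate the arguments
        fi
        echo "/etc/ha.d/resource.d/${script}${args:+ $args} ${action}"
    done
}

# Example:
#   expand_haresources "master1 IPaddr::192.168.12.1/24/eth0 drbddisk::data" start
```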
4. Start heartbeat and test
4.1 Start master1 and check the VIP
[root@master1 ~]# /etc/init.d/heartbeat start
## Watch the logs
[root@master1 ~]# tail -200f /var/log/ha-log
[root@master1 ~]# tail -200f /var/log/ha-debug
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master1 ~]# chkconfig --list heartbeat
heartbeat 0:off 1:off 2:on 3:on 4:on 5:on 6:off
4.2 Start master2 and test failover
[root@master2 ~]# /etc/init.d/heartbeat start
[root@master2 ~]# ip add|grep 192
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
## Simulate a failure
[root@master1 ~]# /etc/init.d/heartbeat stop
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
[root@master2 ~]# ip add|grep 192
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
Note: after master1 goes down, the VIP moves to master2 and master2 becomes the active node
## master1 takes the VIP back
[root@master1 ~]# /etc/init.d/heartbeat start
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master2 ~]# ip add|grep 192
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
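Note: once master1 is started again, the VIP moves back to master1 and master1 becomes the active node again. This happens because auto_failback is set to on in /etc/ha.d/ha.cf. Setting it to off is recommended, so that a recovered node does not trigger a second switchover.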
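The failover test above is verified by reading the `ip add|grep 192` output by eye. The same check can be scripted; the sketch below reads `ip addr` output on stdin and reports whether a given address is configured. The function name `has_vip` is hypothetical.

```shell
#!/bin/bash
# Succeed when the given address appears as an "inet" entry in `ip addr`
# output supplied on stdin.
# Usage: ip addr | has_vip 192.168.12.1
has_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example:
#   if ip addr | has_vip 192.168.12.1; then echo "this node holds the VIP"; fi
```

Comparing the address without the prefix length avoids false matches such as 192.168.12.1 against 192.168.12.100.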
IV. Installing and deploying DRBD
4.1 Partition the newly added disk
## The disk is already partitioned as follows (2G for the DRBD state information, 18G for data)
[root@master1 ~]# fdisk -l|grep sdb
Disk /dev/sdb: 21.5 GB, 21474836480 bytes
/dev/sdb1 1 262 2104483+ 83 Linux
/dev/sdb2 263 2610 18860310 83 Linux
[root@master2 ~]# fdisk -l|grep sdb
Disk /dev/sdb: 21.5 GB, 21474836480 bytes
/dev/sdb1 1 262 2104483+ 83 Linux
/dev/sdb2 263 2610 18860310 83 Linux
## Format sdb2 only. sdb1 is the metadata partition and must not be formatted
[root@master1 ~]# mkfs.ext4 /dev/sdb2
[root@master2 ~]# mkfs.ext4 /dev/sdb2
4.2 Build and install DRBD
## Perform the same steps on both nodes
# yum -y install kernel-devel kernel-headers flex
# tar zxvf drbd-8.4.4.tar.gz
# cd drbd-8.4.4
# ./configure --prefix=/usr/local/drbd --with-km
# make KDIR=/usr/src/kernels/2.6.32-431.el6.x86_64
# make install
# mkdir -p /usr/local/drbd/var/run/drbd
# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/
# chkconfig --add drbd
# chkconfig --list drbd
drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
4.3 Install the DRBD kernel module
Return to the directory where drbd was unpacked, then:
# cd drbd
# make clean
rm -rf .tmp_versions Module.markers Module.symvers modules.order
rm -f *.[oas] *.ko .*.cmd .*.d .*.tmp *.mod.c .*.flags .depend .kernel*
rm -f compat/*.[oas] compat/.*.cmd
# make KDIR=/usr/src/kernels/2.6.32-431.el6.x86_64
# cp drbd.ko /lib/modules/2.6.32-431.el6.x86_64/kernel/lib/
# depmod
# modprobe drbd
## Check that the module is loaded
# lsmod |grep drbd
drbd 340519 0
libcrc32c 1246 1 drbd
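## Load the module automatically at boot (the file must be executable to be run at boot)
# echo "modprobe drbd >/dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules
# chmod 755 /etc/sysconfig/modules/drbd.modules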
[root@master1 ~]# more /etc/sysconfig/modules/drbd.modules
modprobe drbd >/dev/null 2>&1
4.4 Configure DRBD
## The configuration file must be identical on both nodes
[root@master1 ~]# vi /usr/local/drbd/etc/drbd.conf
global {
# minor-count 64;
# dialog-refresh 5; # 5 seconds
# disable-ip-verification;
usage-count no;
}
common {
protocol C;
disk {
on-io-error detach;
#size 454G;
no-disk-flushes;
no-md-flushes;
}
net {
sndbuf-size 512k;
# timeout 60; # 6 seconds (unit = 0.1 seconds)
# connect-int 10; # 10 seconds (unit = 1 second)
# ping-int 10; # 10 seconds (unit = 1 second)
# ping-timeout 5; # 500 ms (unit = 0.1 seconds)
max-buffers 8000;
unplug-watermark 1024;
max-epoch-size 8000;
# ko-count 4;
# allow-two-primaries;
cram-hmac-alg "sha1";
shared-secret "hdhwXes23sYEhart8t";
after-sb-0pri disconnect;
after-sb-1pri disconnect;
after-sb-2pri disconnect;
rr-conflict disconnect;
# data-integrity-alg "md5";
# no-tcp-cork;
}
syncer {
rate 120M;
al-extents 517;
}
}
resource data {
on master1 {
device /dev/drbd1;
disk /dev/sdb2;
address 10.2.12.55:7788;
meta-disk /dev/sdb1 [0];
}
on master2 {
device /dev/drbd1;
disk /dev/sdb2;
address 10.2.12.56:7788;
meta-disk /dev/sdb1 [0];
}
}
4.5 Initialize the metadata partition
## Run on both nodes
[root@master1 ~]# drbdadm create-md data
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
4.6 Start DRBD
## Required on both nodes
[root@master1 ~]# drbdadm up all
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:18860312
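4.7 Initial device synchronization (promote the primary and overwrite the peer so that both sides hold identical data)
## Run on the primary node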
[root@master1 ~]# drbdadm -- --overwrite-data-of-peer primary data
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:10354688 nr:0 dw:0 dr:10358428 al:0 bm:632 lo:1 pe:0 ua:4 ap:0 ep:1 wo:d oos:8505624
[==========>.........] sync'ed: 55.0% (8304/18416)M
finish: 0:01:25 speed: 99,360 (96,772) K/sec
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
ns:0 nr:15407104 dw:15406080 dr:0 al:0 bm:940 lo:2 pe:7 ua:1 ap:0 ep:1 wo:d oos:3454232
[===============>....] sync'ed: 81.7% (3372/18416)M
finish: 0:00:35 speed: 98,360 (96,288) want: 102,400 K/sec
[root@master1 ~]# chkconfig --list drbd
drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
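The three fields that matter in /proc/drbd are the connection state (cs), the roles (ro) and the disk states (ds). A small sketch for extracting them, for instance to wait until the initial sync has finished, could look like the following. The function names are hypothetical, and the parsing assumes the 8.4-style /proc/drbd layout shown above.

```shell
#!/bin/bash
# Print "cs:... ro:... ds:..." for one minor number from /proc/drbd-style
# text on stdin. Usage: drbd_state 1 < /proc/drbd
drbd_state() {
    local minor="$1"
    awk -v m="${minor}:" '
        $1 == m {
            sep = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^(cs|ro|ds):/) { printf "%s%s", sep, $i; sep = " " }
            }
            print ""
            exit
        }
    '
}

# Succeed only when the resource is connected and both disks are UpToDate.
# Usage: drbd_in_sync 1 < /proc/drbd
drbd_in_sync() {
    case "$(drbd_state "$1")" in
        "cs:Connected "*" ds:UpToDate/UpToDate") return 0 ;;
        *) return 1 ;;
    esac
}

# Example: poll until the initial synchronization has completed
#   until drbd_in_sync 1 < /proc/drbd; do sleep 5; done
```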
4.8 Mount the DRBD device on the data directory
[root@master1 ~]# drbdadm primary all
[root@master1 ~]# mkdir /data
[root@master1 ~]# mount /dev/drbd1 /data/
[root@master1 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 172M 17G 1% /data
4.9 Test DRBD
## Normal state
[root@master1 ~]# cp drbd-8.4.4.tar.gz /data/
[root@master1 ~]# echo "test drbd switch .....">/data/test.txt
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:18861122 nr:0 dw:812 dr:18861341 al:4 bm:1152 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:18861146 dw:18861146 dr:0 al:0 bm:1152 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
Note: master1 is the primary node and master2 is the standby node
## Simulate a master1 outage
[root@master1 ~]# umount /dev/drbd1
[root@master1 ~]# drbdadm down all
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:18861178 dw:18861178 dr:0 al:0 bm:1152 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# drbdadm primary all
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:18861178 dw:18861178 dr:668 al:0 bm:1152 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# mkdir /data
[root@master2 ~]# mount /dev/drbd1 /data/
[root@master2 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 173M 17G 2% /data
[root@master2 ~]# ll -lrht /data/
total 716K
drwx------ 2 root root 16K Mar 16 2018 lost+found
-rw-r--r-- 1 root root 693K Mar 16 2018 drbd-8.4.4.tar.gz
-rw-r--r-- 1 root root 23 Mar 16 2018 test.txt
Note: after master1 goes down, master2 can be promoted to primary and mount the DRBD device to keep serving
#### Switch DRBD back to master1
[root@master1 ~]# /etc/init.d/drbd start
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:4 dw:4 dr:0 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:18861178 dw:18861182 dr:1027 al:1 bm:1153 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# /etc/init.d/drbd stop
Stopping all DRBD resources:
.
[root@master2 ~]# cat /proc/drbd
cat: /proc/drbd: No such file or directory
[root@master2 ~]# df -hT|grep drbd
[root@master2 ~]# /etc/init.d/drbd start
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master1 ~]# drbdadm primary all
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:8 dw:8 dr:668 al:0 bm:1 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master1 ~]# mount /dev/drbd1 /data/
[root@master1 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 173M 17G 2% /data
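The manual switch above boils down to two steps on the node giving up the device and two steps on the node taking it over. The sketch below only prints those commands so they can be reviewed before being run. The function name is hypothetical, and it uses `drbdadm secondary` for a planned handover, which differs from the `drbdadm down` and service restart used in the test above.

```shell
#!/bin/bash
# Print (without running) the commands for moving a DRBD-backed mount.
# Usage: drbd_switch_plan <release|takeover> <resource> <device> <mountpoint>
drbd_switch_plan() {
    local mode="$1" res="$2" dev="$3" mnt="$4"
    case "$mode" in
        release)
            # On the current primary: unmount first, then demote.
            echo "umount $mnt"
            echo "drbdadm secondary $res"
            ;;
        takeover)
            # On the peer: promote first, then mount.
            echo "drbdadm primary $res"
            echo "mount $dev $mnt"
            ;;
        *)
            echo "usage: drbd_switch_plan <release|takeover> <resource> <device> <mountpoint>" >&2
            return 2
            ;;
    esac
}

# Example:
#   drbd_switch_plan release  data /dev/drbd1 /data   # review, then run on the old primary
#   drbd_switch_plan takeover data /dev/drbd1 /data   # review, then run on the new primary
```

The order matters: a device cannot be demoted while it is still mounted, and it cannot be mounted on the peer before that peer has been promoted.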
4.10 Let heartbeat start DRBD through haresources
[root@master1 ~]# chkconfig --list drbd
drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@master1 ~]# chkconfig drbd off
[root@master1 ~]# vi /etc/ha.d/haresources
master1 IPaddr::192.168.12.1/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext4
[root@master1 ~]# cp /root/drbd-8.4.4/scripts/drbddisk /etc/ha.d/resource.d/
[root@master1 ~]# /etc/init.d/heartbeat stop
[root@master2 ~]# chkconfig drbd off
[root@master2 ~]# vi /etc/ha.d/haresources
master1 IPaddr::192.168.12.1/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext4
[root@master2 ~]# cp /root/drbd-8.4.4/scripts/drbddisk /etc/ha.d/resource.d/
[root@master2 ~]# /etc/init.d/heartbeat stop
[root@master1 ~]# /etc/init.d/heartbeat start
[root@master2 ~]# /etc/init.d/heartbeat start
## Test the switchover
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:0 dw:4 dr:693 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master1 ~]# df -hT|grep /drbd
/dev/drbd1 ext4 18G 173M 17G 2% /data
[root@master1 ~]# /etc/init.d/heartbeat stop
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:8 nr:4 dw:12 dr:1027 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
[root@master2 ~]# ip add|grep 192
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:8 dw:12 dr:693 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 173M 17G 2% /data
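V. Installing and deploying MySQL
## Install MySQL completely on master1. On master2, stop just before the initialization step. The slave will also run on master1, but it is not configured at this stage
1. Install cmake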
# ls -lrht cmake-2.8.8.tar.gz
-rw-r--r-- 1 root root 5.5M Oct 20 04:50 cmake-2.8.8.tar.gz
# tar xf cmake-2.8.8.tar.gz
# cd cmake-2.8.8
# ./configure
# gmake
# gmake install
# which cmake
/usr/local/bin/cmake
2. Install the ncurses-devel dependency
# yum -y install ncurses-devel
# yum -y install bison
3. Install MySQL
3.1 Create the user
## useradd options: -u sets the UID; -s sets the login shell; -M skips creating the home directory; -g sets the primary group; -G sets supplementary groups
# groupadd -g 1200 mysql
# useradd mysql -s /sbin/nologin -M -g mysql -u 1200
## hosts entries (nothing to add here, because MySQL is later served through the VIP)
# tail -4 /etc/hosts
10.1.12.55 master1
10.2.12.55 master1
10.1.12.56 master2
10.2.12.56 master2
3.2 Unpack and configure the build
# tar zxf mysql-5.5.32.tar.gz
# cd mysql-5.5.32
# cmake . -DCMAKE_INSTALL_PREFIX=/app/mysql-5.5.32 \
-DMYSQL_DATADIR=/app/mysql-5.5.32/data \
-DMYSQL_UNIX_ADDR=/app/mysql-5.5.32/tmp/mysql.sock \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DEXTRA_CHARSETS=gbk,gb2312,utf8,ascii \
-DENABLED_LOCAL_INFILE=ON \
-DWITH_INNOBASE_STORAGE_ENGINE=1 \
-DWITH_FEDERATED_STORAGE_ENGINE=1 \
-DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
-DWITHOUT_EXAMPLE_STORAGE_ENGINE=1 \
-DWITHOUT_PARTITION_STORAGE_ENGINE=1 \
-DWITH_FAST_MUTEXES=1 \
-DWITH_ZLIB=bundled \
-DENABLED_LOCAL_INFILE=1 \
-DWITH_READLINE=1 \
-DWITH_EMBEDDED_SERVER=1 \
-DWITH_DEBUG=0
3.3 Build and install MySQL
# make && make install
3.4 Create a symbolic link
# ln -s /app/mysql-5.5.32/ /app/mysql
# ll /app/
total 4
lrwxrwxrwx 1 root root 18 Oct 30 05:06 mysql -> /app/mysql-5.5.32/
drwxr-xr-x 13 root root 4096 Oct 30 05:05 mysql-5.5.32
3.5 Configure environment variables
# echo 'export PATH=/app/mysql/bin:$PATH' >> /etc/profile
# tail -5 /etc/profile
# source /etc/profile
# echo $PATH
3.6 Configure my.cnf
## Keep the file on the DRBD-backed storage
# mkdir -p /data/master
# vi /data/master/my.cnf
[mysqld]
socket = /data/master/mysql.sock
port = 3306
pid-file = /data/master/mysql.pid
datadir = /data/master/data
basedir = /app/mysql
user = mysql
server-id=1
[client]
port = 3306
socket = /data/master/mysql.sock
[mysql]
no-auto-rehash
4. Initialize and start MySQL on master1
## Note: the database data directory must live on the DRBD partition
## Run on master1
[root@master1 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 173M 17G 2% /data
[root@master1 ~]# cd /app/mysql/scripts/
# ./mysql_install_db --basedir=/app/mysql --datadir=/data/master/data --user=mysql
# chown -R mysql. /data/master/
## Start MySQL
# mysqld_safe --defaults-file=/data/master/my.cnf &
# netstat -nltpd |grep mysql
## Set the MySQL root password
# /app/mysql/bin/mysqladmin -u root password '111111' -S /data/master/mysql.sock
## Test the login
# mysql -uroot -p111111 -S /data/master/mysql.sock
5. Create the MySQL start/stop script
## Note: add the script on both master1 and master2, in the heartbeat resource directory
# vi /etc/ha.d/resource.d/mysqld
#!/bin/bash
port=3306
mysql_user="root"
mysql_pwd="111111"
cmdpath="/app/mysql/bin"
mysql_sock="/data/master/mysql.sock"
# startup function: the socket file is used as the "running" indicator
function_start_mysql()
{
    if [ ! -e "${mysql_sock}" ];then
        printf "Starting MySQL...\n"
        # redirect stdout first, then stderr, so both end up in /dev/null
        /bin/sh ${cmdpath}/mysqld_safe --defaults-file=/data/master/my.cnf >/dev/null 2>&1 &
    else
        printf "MySQL is running...\n"
        return 0
    fi
}
# stop function
function_stop_mysql()
{
    if [ ! -e "${mysql_sock}" ];then
        printf "MySQL is stopped...\n"
        # return instead of exit, so that restart can still go on to start MySQL
        return 0
    else
        printf "Stopping MySQL...\n"
        # echo "${cmdpath}/mysqladmin -u${mysql_user} -p${mysql_pwd} -S /mysqldata/${port}/mysql${port}.sock shutdown"
        ${cmdpath}/mysqladmin -u${mysql_user} -p${mysql_pwd} -S "${mysql_sock}" shutdown
    fi
}
# restart function
function_restart_mysql()
{
    printf "Restarting MySQL...\n"
    function_stop_mysql
    sleep 2
    function_start_mysql
}
case $1 in
start)
    function_start_mysql
    ;;
stop)
    function_stop_mysql
    ;;
restart)
    function_restart_mysql
    ;;
*)
    printf "Usage: $0 {start|stop|restart}\n"
esac
## Test start and stop
# chmod a+x /etc/ha.d/resource.d/mysqld
# /etc/ha.d/resource.d/mysqld start
# netstat -nltpd |grep mysql
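The start/stop script treats the socket file as the sign that MySQL is running and relies on a fixed `sleep 2` during restart. A hedged alternative is to poll for the socket to appear or disappear with a timeout. The helper below is a sketch with a hypothetical name and is not part of the original script.

```shell
#!/bin/bash
# Poll once per second until a path exists ("up") or is gone ("down").
# Returns 0 on success and 1 when the timeout expires.
# Usage: wait_for_path <path> <up|down> [timeout_seconds]
wait_for_path() {
    local path="$1" want="$2" timeout="${3:-30}" waited=0
    while :; do
        if [ "$want" = "up" ] && [ -e "$path" ]; then return 0; fi
        if [ "$want" = "down" ] && [ ! -e "$path" ]; then return 0; fi
        if [ "$waited" -ge "$timeout" ]; then return 1; fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Example inside a restart function:
#   function_stop_mysql
#   wait_for_path /data/master/mysql.sock down 30 || echo "MySQL did not stop in time" >&2
#   function_start_mysql
#   wait_for_path /data/master/mysql.sock up 30   || echo "MySQL did not start in time" >&2
```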
6. Start MySQL on master2
## First stop the MySQL service on master1
# /etc/ha.d/resource.d/mysqld stop
## Stop heartbeat on master1 so that the DRBD device gets mounted on master2
[root@master1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@master1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
## Check on master2
[root@master2 ~]# ip add|grep 192
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:20 nr:41760 dw:41780 dr:2096 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 202M 17G 2% /data
## Copy the MySQL start/stop script to master2 (run on master1) and adjust it there if needed
# scp /etc/ha.d/resource.d/mysqld master2:/etc/ha.d/resource.d/mysqld
## Test start and stop
[root@master2 ~]# /etc/ha.d/resource.d/mysqld start
Starting MySQL.. [ OK ]
[root@master2 ~]# netstat -nltpd|grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 41571/mysqld
[root@master2 ~]# /etc/ha.d/resource.d/mysqld stop
Shutting down MySQL. [ OK ]
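7. Let heartbeat start MySQL through haresources
=======Run on both master1 and master2
## Stop heartbeat first
# /etc/init.d/heartbeat stop
## Hand the mysqld service over to heartbeat (it must not be started by chkconfig)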
# chkconfig --list mysqld
service mysqld supports chkconfig, but is not referenced in any runlevel (run 'chkconfig --add mysqld')
# ll /etc/ha.d/resource.d/|grep mysql
-rwxr-xr-x 1 root root 1152 Oct 30 11:35 mysqld
## Add mysqld to the resource line
# vi /etc/ha.d/haresources
master1 IPaddr::192.168.12.1/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext4 mysqld
## Start heartbeat
# /etc/init.d/heartbeat start
# tail -200f /var/log/ha-log
# tail -200f /var/log/ha-debug
## Check
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:41952 nr:884 dw:42836 dr:9467 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master1 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 202M 17G 2% /data
[root@master1 ~]# netstat -nltpd|grep mysqld
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 54535/mysqld
## Test the switchover
[root@master1 ~]# /etc/init.d/heartbeat stop
[root@master1 ~]# ip add|grep 192
inet 192.168.12.55/24 brd 192.168.12.255 scope global eth0
[root@master1 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master1, 2018-03-16 11:44:26
1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:42100 nr:996 dw:43096 dr:9801 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master1 ~]# df -hT|grep drbd
[root@master1 ~]# netstat -nltpd|grep mysql
[root@master2 ~]# ip add|grep 192
inet 192.168.12.56/24 brd 192.168.12.255 scope global eth0
inet 192.168.12.1/24 brd 192.168.12.255 scope global secondary eth0
[root@master2 ~]# cat /proc/drbd
version: 8.4.4 (api:1/proto:86-101)
GIT-hash: 74402fecf24da8e5438171ee8c19e28627e1c98a build by root@master2, 2016-10-29 23:25:51
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:1044 nr:42100 dw:43144 dr:13859 al:6 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@master2 ~]# df -hT|grep drbd
/dev/drbd1 ext4 18G 202M 17G 2% /data
[root@master2 ~]# netstat -nltpd|grep mysqld
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 43020/mysqld
VI. Deploying the MySQL slave
1. Prepare the master
## MySQL HA is now configured and the active node is master1
## Basic cleanup
[root@master1 ~]# mysql -uroot -p111111 -S /data/master/mysql.sock
mysql> drop database test;
mysql> select user,host from mysql.user;
mysql> delete from mysql.user where user='';
mysql> delete from mysql.user where host='master1';
mysql> delete from mysql.user where host='::1';
mysql> select user,host from mysql.user;
+------+-----------+
| user | host |
+------+-----------+
| root | 127.0.0.1 |
| root | localhost |
+------+-----------+
2 rows in set (0.00 sec)
mysql> flush privileges;
## Change the master parameters
[root@master1 ~]# vi /data/master/my.cnf
[mysqld]
server-id = 1
log-bin=/data/master/mysql-bin
## Restart MySQL
[root@master1 ~]# /etc/ha.d/resource.d/mysqld restart
[root@master1 ~]# netstat -nltpd |grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 43242/mysqld
[root@master1 ~]# ll /data/master/|grep mysql-bin
-rw-rw---- 1 mysql mysql 126 Mar 19 16:03 mysql-bin.000001
-rw-rw---- 1 mysql mysql 107 Mar 19 16:03 mysql-bin.000002
-rw-rw---- 1 mysql mysql 60 Mar 19 16:03 mysql-bin.index
## Create the replication account rep on the master
# mysql -uroot -p111111 -S /data/master/mysql.sock
mysql> grant replication slave on *.* to 'rep'@'192.168.12.%' identified by '123456';
mysql> flush privileges;
mysql> select user,host from mysql.user;
## Test the connection. The slave replicates through the VIP
# mysql -urep -p123456 -h192.168.12.1 -P3306
2. Initialize the slave
## The slave is installed on the local disk of master1
[root@master1 ~]# mkdir -p /mysql/slave
[root@master1 ~]# cd /app/mysql/scripts/
# ./mysql_install_db --basedir=/app/mysql --datadir=/mysql/slave/data --user=mysql
3. Configure my.cnf for the slave
# vi /mysql/slave/my.cnf
[mysqld]
socket = /mysql/slave/mysql.sock
port = 3307
pid-file = /mysql/slave/mysql.pid
datadir = /mysql/slave/data
basedir = /app/mysql
user = mysql
server-id=2
[client]
port = 3307
socket = /mysql/slave/mysql.sock
[mysql]
no-auto-rehash
# chown -R mysql. /mysql/slave
## Start MySQL
# mysqld_safe --defaults-file=/mysql/slave/my.cnf &
# netstat -nltpd |grep mysql
tcp 0 0 0.0.0.0:3307 0.0.0.0:* LISTEN 62750/mysqld
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 62462/mysqld
4. Run change master on the slave
## The slave connects to the master through the VIP
[root@master1 ~]# mysql -uroot -p -S /mysql/slave/mysql.sock
mysql> change master to
master_host='192.168.12.1',
master_port=3306,
master_user='rep',
master_password='123456';
mysql> start slave;
mysql> show slave status\G
## Test replication
[root@master1 ~]# mysql -uroot -p111111 -S /data/master/mysql.sock
mysql> create database shaw_db;
mysql> create table shaw_db.t_user as select * from mysql.user;
Query OK, 3 rows affected (0.03 sec)
Records: 3 Duplicates: 0 Warnings: 0
## Check the data on the slave
mysql> show databases like 'shaw%';
+------------------+
| Database (shaw%) |
+------------------+
| shaw_db |
+------------------+
mysql> select count(*) from shaw_db.t_user;
+----------+
| count(*) |
+----------+
| 7 |
+----------+
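## Note that the master inserted 3 rows while the slave holds 7. The slave was initialized on its own and then pointed at the master with change master, without loading a copy of the master's data, so its own mysql.user table was never cleaned up. With statement-based replication (the default in MySQL 5.5), the CREATE TABLE ... SELECT statement is re-executed on the slave against the slave's mysql.user table, which returns more rows than on the master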
mysql> select user,host from shaw_db.t_user;
+------+--------------+
| user | host |
+------+--------------+
| root | localhost |
| root | master1 |
| root | 127.0.0.1 |
| root | ::1 |
| | localhost |
| | master1 |
| rep | 192.168.12.% |
+------+--------------+
7 rows in set (0.00 sec)
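5. Fail over the master and check the slave status
## The master currently runs on master1. Switch it over to master2
[root@master1 ~]# /etc/init.d/heartbeat stop
## The master has now moved to master2. Check the current slave status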
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Reconnecting after a failed master event read
Master_Host: 192.168.12.1
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 537
Relay_Log_File: mysql-relay-bin.000003
Relay_Log_Pos: 683
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 537
Relay_Log_Space: 985
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 2003
Last_IO_Error: error reconnecting to master 'rep@192.168.12.1:3306' - retry-time: 60 retries: 86400
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
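## After about 60s (the Connect_Retry interval), the slave reconnects through the VIP and replicates from master2 automatically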
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.12.1
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 107
Relay_Log_File: mysql-relay-bin.000005
Relay_Log_Pos: 253
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 107
Relay_Log_Space: 555
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
## Now create a table on master2 and see whether it reaches the slave
[root@master2 ~]# mysql -uroot -p111111 -S /data/master/mysql.sock
mysql> use shaw_db;
mysql> create table t_zhong (id int,name varchar(20));
mysql> insert into t_zhong values(1,'test'),(2,'test2'),(3,'test3');
## Check the data on the slave
mysql> select * from shaw_db.t_zhong;
+------+-------+
| id | name |
+------+-------+
| 1 | test |
| 2 | test2 |
| 3 | test3 |
+------+-------+
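The two `show slave status\G` outputs in this section differ mainly in Slave_IO_Running (Connecting during the switchover, Yes afterwards). A small sketch that turns this into a scriptable check is shown below. The function name is hypothetical, and it only inspects the two thread-state fields.

```shell
#!/bin/bash
# Read `show slave status\G` text on stdin and succeed only when both
# replication threads report "Yes".
slave_healthy() {
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }     # field names are right-aligned
        $1 == "Slave_IO_Running"  { v = $2; sub(/[ \t\r]+$/, "", v); io = v }
        $1 == "Slave_SQL_Running" { v = $2; sub(/[ \t\r]+$/, "", v); sql = v }
        END { exit ((io == "Yes" && sql == "Yes") ? 0 : 1) }
    '
}

# Example (prompts for the slave's root password):
#   mysql -uroot -p -S /mysql/slave/mysql.sock -e 'show slave status\G' | slave_healthy \
#       && echo "replication OK" || echo "replication NOT OK"
```

During the failover window the check fails because the I/O thread reports Connecting; it passes again once the slave has reconnected to the VIP.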
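VII. Split-brain in high-availability setups and how to deal with it
7.1 Causes of split-brain
1. The heartbeat link between the HA servers fails, so the nodes can no longer see each other's heartbeats.
2. A firewall enabled on the HA servers blocks the heartbeat traffic.
3. Network interface settings such as addresses are wrong on an HA server, so sending heartbeats fails.
4. Other misconfiguration, such as mismatched heartbeat methods, conflicting heartbeat broadcasts, or software bugs.
7.2 Ways to prevent split-brain
1. Add redundant heartbeat links.
2. When split-brain is detected, forcibly shut down one node (for example, power off the primary remotely through a power-control fencing device).
3. Set up monitoring and alerting for split-brain.
4. After an alert, have the standby node wait a fairly long time before taking over, so that operations staff have enough time to intervene manually.
5. Use a disk lock: the node that is currently serving locks the disk, so that during split-brain the other node cannot grab the "shared disk resource".
Problem with disk locks:
Disk locks can deadlock. If the node holding the shared disk does not release the lock, the other node can never obtain the disk. If that node suddenly hangs or crashes, it cannot run the unlock command, so the standby node cannot take over the resources and services. Some HA designs therefore use a smart lock: the serving node locks the disk only when it finds that all heartbeat links are down, and leaves it unlocked otherwise.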
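For the monitoring and alerting item above, one simple symptom to watch for is the VIP being configured on both nodes at the same time. The sketch below counts how many captured `ip addr` outputs contain the VIP. The function name and the idea of collecting each node's output into a file (for example over ssh from a monitoring host) are assumptions and not part of the original setup.

```shell
#!/bin/bash
# Count how many captured `ip addr` outputs list the VIP as an inet address.
# Usage: count_vip_holders <vip> <file>...
count_vip_holders() {
    local vip="$1" count=0 f
    shift
    for f in "$@"; do
        if awk -v vip="$vip" '
               $1 == "inet" { split($2, a, "/"); if (a[1] == vip) hit = 1 }
               END { exit (hit ? 0 : 1) }' "$f"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Example, run from a monitoring host with ssh access to both nodes:
#   ssh master1 ip addr > /tmp/master1.ip
#   ssh master2 ip addr > /tmp/master2.ip
#   n=$(count_vip_holders 192.168.12.1 /tmp/master1.ip /tmp/master2.ip)
#   [ "$n" -gt 1 ] && echo "ALERT: VIP present on $n nodes, possible split-brain"
```

A count of 0 is also worth alerting on, because it means no node is currently serving the VIP.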