High-Availability Clusters with keepalived

Digest   2024-08-20 19:09   Beijing




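Every high-availability (HA) cluster faces the split-brain problem. A system whose nodes are meant to cooperate as a whole loses internal contact, each side concludes that the other has failed, and the cluster splits into two independent systems. The split nodes then compete for resources, which leads to system chaos and corrupted data. For highly available stateful applications in particular, such as databases, split-brain must be strictly prevented.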




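keepalived split-brain

1.1 keepalived split-brain

A keepalived BACKUP host switches to MASTER as soon as it stops receiving advertisements from the MASTER host. If the cause is a fault on the communication link between them, so that neither node can receive the other's multicast advertisements while both are in fact working normally, both nodes become MASTER and bind the virtual IP at the same time, with unpredictable consequences. This is split-brain.
Split-brain is the first problem a high-availability setup has to solve.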

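1.2 Solutions

1) Two keepalived nodes that can reach each other directly
Add more detection mechanisms, such as a redundant heartbeat link (two NICs used for health monitoring) or pinging the peer.
This only reduces the chance of split-brain (it treats the symptom rather than the cause, and merely raises the probability that a failure is detected correctly).
2) Three keepalived nodes
Rely on an algorithmic guarantee, such as a voting mechanism (keepalived does not implement one).
3) Four machines
Set up an arbitration mechanism. When neither side can be trusted, rely on a third party, for example a shared-disk lock or pinging the gateway (each method needs its own analysis).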
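
The gateway-ping form of arbitration can be sketched in shell as follows. This is only a minimal illustration: the function name `arbitrate` and the gateway address 11.11.11.1 in the commented usage are assumptions, not part of the original setup. The probe command is passed in as arguments so that any reachability test can stand in for "can I still see the third party?".

```shell
#!/bin/bash
# arbitrate PROBE_COMMAND...
# Hypothetical helper: runs the probe and prints "keep" if it succeeds
# (the third party is reachable, so this node is probably not isolated)
# or "release" if it fails (this node is probably the isolated side).
arbitrate() {
    if "$@" >/dev/null 2>&1; then
        echo "keep"
    else
        echo "release"
    fi
}

# Example wiring on a keepalived node (gateway address is an assumption):
# if [ "$(arbitrate ping -c 1 -W 1 11.11.11.1)" = "release" ]; then
#     systemctl stop keepalived   # give up the VIP rather than risk two masters
# fi
```

A node that cannot reach the gateway gives up the VIP voluntarily, so at most one side keeps serving even when the heartbeat link between the two nodes is down.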

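Introduction to keepalived
keepalived is service software that keeps a cluster highly available. Its function is similar to heartbeat, and it is used to prevent single points of failure.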

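2.1 The VRRP protocol

keepalived is built on VRRP, the Virtual Router Redundancy Protocol, which can be thought of as a protocol for making routers highly available.

2.2 How it works

N routers that provide the same function form a router group with one master and several backups. The master holds a VIP through which the service is offered (other machines on the router's LAN use this VIP as their default route) and sends multicast advertisements. When the backups stop receiving VRRP packets, they conclude that the master is down, and one of the backups is elected master according to VRRP priority. This keeps the router highly available.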
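
The election and the waiting time before a backup takes over can be sketched in shell. The function names and the `name:priority` argument format are made up for illustration, ties between equal priorities are not handled, and the timing formula is the VRRPv2 one from RFC 3768 (master down interval = 3 × advertisement interval + skew time, with skew time = (256 − priority) / 256 seconds), assuming keepalived runs VRRPv2.

```shell
#!/bin/bash
# elect_master NAME:PRIORITY...
# Prints the name of the node with the highest priority.
elect_master() {
    printf '%s\n' "$@" | sort -t: -k2,2nr | head -n 1 | cut -d: -f1
}

# master_down_interval ADVERT_INT PRIORITY
# Seconds a backup waits without advertisements before declaring the master down.
master_down_interval() {
    local advert_int=$1 priority=$2
    awk -v a="$advert_int" -v p="$priority" \
        'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

elect_master web1:100 web2:99     # prints web1
master_down_interval 1 99         # prints 3.613
```

With a 1-second advertisement interval, a backup at priority 99 would therefore take over a little more than three and a half seconds after the master goes silent.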

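2.3 The three core modules

  • The core module is the heart of keepalived; it starts and maintains the main process and loads and parses the global configuration file;
  • The check module performs health checks and supports the common check methods;
  • The vrrp module implements the VRRP protocol.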


Using keepalived

3.1 Deploying a keepalived + nginx cluster

Environment:
  • web1: 11.11.11.137
  • web2: 11.11.11.138
  • vip: 11.11.11.222
  • client: 11.11.11.140
1) Preparation
Deploy nginx on web1 and web2 first, and disable SELinux and firewalld:
yum -y install nginx
echo "<h1>137<h1>" >/usr/share/nginx/html/index.html   # on web2, write 138 instead
sed -ri '/^SELINUX=/cSELINUX=disabled' /etc/selinux/config && setenforce 0
systemctl stop firewalld && systemctl disable firewalld
2) Deploy keepalived on web1 and web2
yum -y install keepalived

cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak

vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
       notification_email {
               root@localhost
               }
       notification_email_from keepalived@localhost
       smtp_server 127.0.0.1
       smtp_connect_timeout 30
       router_id Director1
}

#vrrp_script chk_nginx {
#        script "/etc/keepalived/ck_ng.sh"
#        interval 2
#        weight -5
#        fall 3
#}
vrrp_instance VI_1 {
       state MASTER
       interface ens33
       mcast_src_ip 11.11.11.137
       virtual_router_id 51
       priority 100
       advert_int 1
       authentication {
               auth_type PASS
               auth_pass 1234
               }
       virtual_ipaddress {
               11.11.11.222/24
               }
#        track_script {
#                chk_nginx
#        }
}
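3) Modify keepalived.conf on web2
scp /etc/keepalived/keepalived.conf root@11.11.11.138:/etc/keepalived/
Changes:
   state MASTER to state BACKUP
   mcast_src_ip 11.11.11.137 to mcast_src_ip 11.11.11.138
   priority 100 to priority 99
   router_id Director1 to router_id Director2 (router_id identifies the machine and should be unique within the cluster)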
4) Test the VIP binding
curl 11.11.11.222   # the VIP
     <h1>137<h1>
Stop the keepalived service on web1, or take web1's network down (to simulate a web1 failure):
curl 11.11.11.222
 <h1>138<h1> # automatically switched to the web2 page
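5) Configure keepalived to guard nginx
If nginx stops for some other reason while keepalived keeps running, the VIP stays on that node and clients can no longer reach the site.
To avoid this, keepalived can run a script that watches over nginx.
cat /etc/keepalived/ck_ng.sh # the guard script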
#!/bin/bash
# Check whether any nginx process exists
counter=$(ps -C nginx --no-heading|wc -l)
if [ "${counter}" = "0" ]; then
   # Try to restart nginx once, wait 5 seconds, then check again
   systemctl restart nginx
   sleep 5
   counter=$(ps -C nginx --no-heading|wc -l)
   if [ "${counter}" = "0" ]; then
       # If nginx still is not running, stop keepalived to trigger a failover
       systemctl stop keepalived
   fi
fi

chmod a+x /etc/keepalived/ck_ng.sh # make the script executable

vim /etc/keepalived/keepalived.conf
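 ! Configuration File for keepalived # conventional header comment, kept on the first line
 # Part 1: global definitions
 global_defs {
         notification_email {
                 root@localhost # address that keepalived e-mails when a switchover happens
                 }
         notification_email_from keepalived@localhost
         smtp_server 127.0.0.1
         smtp_connect_timeout 30
         router_id Director1 # identifier of the machine running keepalived; unique within the cluster
 }
 # Health check
 vrrp_script chk_nginx { # nginx guard script
         script "/etc/keepalived/ck_ng.sh"      # check script, absolute path
         interval 2 # check frequency: once every 2 s
         weight -5 # lower the priority by 5 once the check counts as failed
         fall 3 # consecutive failures required before the check counts as failed
 }
 # Instance configuration
 vrrp_instance VI_1 { # instance VI_1
         state MASTER # this node is the master keepalived
         interface ens33 # interface that carries the VRRP traffic
         mcast_src_ip 11.11.11.137 # source address of the heartbeat: this host's IP
         virtual_router_id 51 # virtual router ID; identical on master and backup
         priority 100 # priority
         advert_int 1 # advertisement (heartbeat) interval in seconds (VRRPv3 also allows sub-second values)
         authentication { # authentication, keeps other devices from joining the group
                 auth_type PASS
                 auth_pass 1234
                 }
         virtual_ipaddress { # VIP
                 11.11.11.222/24
                 }
         track_script { # script that monitors the nginx service
                 chk_nginx # must match the vrrp_script name
         }
 }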
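
How `weight` interacts with `priority` can be sketched as plain arithmetic. The function below is a hypothetical illustration of keepalived's documented rule, not keepalived code: a negative weight is subtracted while the tracked script is failing, and a positive weight is added while it is succeeding. Note that ck_ng.sh as written normally exits with status 0 and forces the failover by stopping keepalived itself, so the weight only matters if the script is changed to return a non-zero status on failure.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT STATE
# STATE is "ok" or "fail": the tracked script's result after fall/rise counting.
effective_priority() {
    local base=$1 weight=$2 state=$3
    if [ "$weight" -lt 0 ] && [ "$state" = "fail" ]; then
        echo $(( base + weight ))     # negative weight: penalty while failing
    elif [ "$weight" -gt 0 ] && [ "$state" = "ok" ]; then
        echo $(( base + weight ))     # positive weight: bonus while healthy
    else
        echo "$base"
    fi
}

effective_priority 100 -5 fail    # web1 with a failing check: prints 95
effective_priority 99 -5 ok       # web2 healthy: prints 99
```

With the priorities used here (100 on web1, 99 on web2), a penalty of 5 is enough to drop web1 below web2, so web2 would win the next election.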

3.2 Deploying a keepalived + LVS cluster

Environment:
  • lvs1: 11.11.11.137
  • lvs2: 11.11.11.138
  • vip: 11.11.11.222
  • web1: 11.11.11.139
  • web2: 11.11.11.140
  • client: 11.11.11.136
1) Configure the two LVS + keepalived hosts (lvs1, lvs2)
yum -y install keepalived ipvsadm # install on both hosts
vim /etc/keepalived/keepalived.conf
 ! Configuration File for keepalived
 global_defs {
         notification_email {
                 root@localhost
                 }
         notification_email_from keepalived@localhost
         smtp_server 127.0.0.1
         smtp_connect_timeout 30
         router_id Director1
 }

 vrrp_instance VI_1 {
         state MASTER
         interface ens33
         virtual_router_id 51
         priority 100      # priority; any number from 1 to 255
         advert_int 1
         authentication {
                 auth_type PASS
                 auth_pass 1111
                 }
         virtual_ipaddress {
                 11.11.11.222/24 dev ens33 # VIP
                 }
 }
 virtual_server 11.11.11.222 80 {
         delay_loop 3       # health-check polling interval, in seconds
         lb_algo rr # scheduling algorithm: round robin
         lb_kind DR # LVS forwarding mode: direct routing
         protocol TCP
         real_server 11.11.11.139 80 {
                 weight 1
                 TCP_CHECK {
                         connect_timeout 3
                         }
                 }
         real_server 11.11.11.140 80 {
                 weight 1
                 TCP_CHECK {
                         connect_timeout 3
                         }
                 }
 }

scp /etc/keepalived/keepalived.conf root@node-3:/etc/keepalived/
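The following needs to change:
  •  state MASTER  to state BACKUP;
  •  priority 100  to priority 80 (any value below 100);
  •  router_id Director1  to router_id Director2 (the identifier should be unique within the cluster).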
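
For orientation, the virtual_server block in the configuration above roughly corresponds to LVS rules that could also be written by hand with ipvsadm. The helper below is a hypothetical sketch that only prints those commands (it does not run them); the function name is made up, while `-A`, `-a`, `-t`, `-s`, `-r`, `-g` (direct routing) and `-w` are standard ipvsadm options.

```shell
#!/bin/bash
# print_ipvs_rules VIP PORT REAL_SERVER...
# Prints ipvsadm commands for a round-robin, direct-routing TCP service.
print_ipvs_rules() {
    local vip=$1 port=$2 rs
    shift 2
    echo "ipvsadm -A -t ${vip}:${port} -s rr"
    for rs in "$@"; do
        echo "ipvsadm -a -t ${vip}:${port} -r ${rs}:${port} -g -w 1"
    done
}

print_ipvs_rules 11.11.11.222 80 11.11.11.139 11.11.11.140
```

keepalived installs and removes the equivalent rules itself, dropping a real server when its TCP_CHECK fails, so these commands are only useful for understanding or debugging what `ipvsadm -L` shows.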
2) Configure the two web servers
yum -y install nginx
echo web1111111111111 >/usr/share/nginx/html/index.html # set a home page that makes the test result obvious

ifconfig lo:0 11.11.11.222/32       # bind the VIP

# Configure ARP
echo 1 >/proc/sys/net/ipv4/conf/all/arp_ignore
echo 1 >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo 2 >/proc/sys/net/ipv4/conf/all/arp_announce
echo 2 >/proc/sys/net/ipv4/conf/lo/arp_announce
Note: the settings above are temporary and are lost after a reboot.
To make them permanent, do the following (on both web1 and web2):
cp /etc/sysconfig/network-scripts/{ifcfg-lo,ifcfg-lo:0}
vim /etc/sysconfig/network-scripts/ifcfg-lo:0
DEVICE=lo:0
IPADDR=11.11.11.222
NETMASK=255.255.255.255
ONBOOT=yes

vim /etc/sysctl.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.lo.arp_ignore = 1
net.ipv4.conf.lo.arp_announce = 2

sysctl -p   # load the settings from /etc/sysctl.conf without rebooting

systemctl start nginx
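
Before testing from the client, it can help to confirm that the ARP parameters are really in effect on each web server. The function below is a small sketch for that; its name is made up, and the optional directory argument exists only so the function can be tried against a scratch directory instead of /proc.

```shell
#!/bin/bash
# check_arp_params [ROOT]
# ROOT defaults to /proc/sys/net/ipv4/conf. Prints one line per wrong value
# and returns non-zero if arp_ignore is not 1 or arp_announce is not 2
# for the "all" and "lo" entries.
check_arp_params() {
    local root=${1:-/proc/sys/net/ipv4/conf} iface status=0
    for iface in all lo; do
        if [ "$(cat "$root/$iface/arp_ignore" 2>/dev/null)" != "1" ]; then
            echo "$iface: arp_ignore is not 1"
            status=1
        fi
        if [ "$(cat "$root/$iface/arp_announce" 2>/dev/null)" != "2" ]; then
            echo "$iface: arp_announce is not 2"
            status=1
        fi
    done
    return $status
}

# Usage on a web server:
#   check_arp_params && echo "ARP settings OK"
```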

3) Test from the client

curl 11.11.11.222

ipvsadm -L # show the LVS forwarding table
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP  node-2:http rr
  -> node-4:http Route 1 0 2
  -> 11.11.11.140:http Route 1 0 3

ip a   # check after stopping keepalived on the master
The VIP 11.11.11.222 sits on the master; if the master goes down, the VIP moves to the backup.

# Stop nginx on web1
On the client: curl 11.11.11.222

3.3 Deploying a dual-master keepalived + LVS cluster

The dual-master cluster is deployed on top of the single-master cluster.
Environment:
  • lvs1: 11.11.11.137
  • lvs2: 11.11.11.138
  • vip: 11.11.11.222
  • vip: 11.11.11.223
  • web1: 11.11.11.139
  • web2: 11.11.11.140
  • web3: 11.11.11.141
  • web4: 11.11.11.142
  • client: 11.11.11.136
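1) Add a second master/backup pair on top of the existing configuration (in the same configuration file)
vip: 11.11.11.222
Instance VI_1: lvs1 is MASTER, lvs2 is BACKUP.
vip: 11.11.11.223
Instance VI_2: lvs1 is BACKUP, lvs2 is MASTER.
2) lvs1 configuration file
vim /etc/keepalived/keepalived.conf
  • Configuration with lvs1 as MASTER and lvs2 as BACKUP: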
! Configuration File for keepalived
 global_defs {
         notification_email {
                 root@localhost
                 }
         notification_email_from keepalived@localhost
         smtp_server 127.0.0.1
         smtp_connect_timeout 30
         router_id Director1
         }

 vrrp_instance VI_1 {
         state MASTER
         interface ens33
         virtual_router_id 51
         priority 100
         advert_int 2
         authentication {
                 auth_type PASS
                 auth_pass 1111
                 }
         virtual_ipaddress {
                 11.11.11.222/24 dev ens33
                 }
         }

 virtual_server 11.11.11.222 80 {
         delay_loop 3
         lb_algo rr
         lb_kind DR
         protocol TCP
         real_server 11.11.11.139 80 {
                 weight 1
                 TCP_CHECK {
                         connect_timeout 3
                         }
                 }
         real_server 11.11.11.140 80 {
                 weight 1
                 TCP_CHECK {
                         connect_timeout 3
                         }
                 }
     }
  • Configuration with lvs1 as BACKUP and lvs2 as MASTER:
vrrp_instance VI_2 {
         state BACKUP
         interface ens33
         virtual_router_id 55
         priority 90
         advert_int 2
         authentication {
                 auth_type PASS
                 auth_pass 1234
                 }
         virtual_ipaddress {
                 11.11.11.223/24 dev ens33
                 }
         }
 virtual_server 11.11.11.223 80 {
         delay_loop 3
         lb_algo rr
         lb_kind DR
         protocol TCP
         real_server 11.11.11.141 80 {
                 weight 1
                 TCP_CHECK {
                         connect_timeout 3
                         }
                 }
         real_server 11.11.11.142 80 {
                 weight 1
                 TCP_CHECK {
                         connect_timeout 3
                         }
                 }
 }
3) lvs2 configuration file
vim /etc/keepalived/keepalived.conf
  • Configuration with lvs1 as MASTER and lvs2 as BACKUP:
! Configuration File for keepalived
global_defs {
       notification_email {
               root@localhost
               }
       notification_email_from keepalived@localhost
       smtp_server 127.0.0.1
       smtp_connect_timeout 30
       router_id Director2
       }

vrrp_instance VI_1 {
       state BACKUP
       interface ens33
       virtual_router_id 51
       priority 90
       advert_int 2
       authentication {
               auth_type PASS
               auth_pass 1111
               }
       virtual_ipaddress {
               11.11.11.222/24 dev ens33
               }
       }

virtual_server 11.11.11.222 80 {
       delay_loop 3
       lb_algo rr
       lb_kind DR
       protocol TCP
       real_server 11.11.11.139 80 {
               weight 1
               TCP_CHECK {
                       connect_timeout 3
                       }
               }
       real_server 11.11.11.140 80 {
               weight 1
               TCP_CHECK {
                       connect_timeout 3
                       }
               }
   }
  • Configuration with lvs1 as BACKUP and lvs2 as MASTER:
vrrp_instance VI_2 {
       state MASTER
       interface ens33
       virtual_router_id 55
       priority 100
       advert_int 2
       authentication {
               auth_type PASS
               auth_pass 1234
               }
       virtual_ipaddress {
               11.11.11.223/24 dev ens33
               }
       }
virtual_server 11.11.11.223 80 {
       delay_loop 3
       lb_algo rr
       lb_kind DR
       protocol TCP
       real_server 11.11.11.141 80 {
               weight 1
               TCP_CHECK {
                       connect_timeout 3
                       }
               }
       real_server 11.11.11.142 80 {
               weight 1
               TCP_CHECK {
                       connect_timeout 3
                       }
               }
}
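4) On the two web hosts, add another loopback alias and bind the new VIP to it
ifconfig lo:1 11.11.11.223/32    # the existing alias lo:0 holds VIP 11.11.11.222

Notes:

  • There must be two instances, VI_1 and VI_2;
  • The two clusters must use different virtual_router_id values, while the master and backup of each pair share the same virtual_router_id;
  • There must be two VIPs;
  • In each pair, the MASTER's priority must be higher than the BACKUP's.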
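
The virtual_router_id rule in the notes above can be checked mechanically. The sketch below, with a made-up function name, reads one keepalived.conf on standard input and prints any virtual_router_id value that appears in more than one instance of that file; it does not compare the two hosts against each other.

```shell
#!/bin/bash
# duplicate_vrids: read a keepalived.conf on stdin and print every
# virtual_router_id value that is used more than once in it.
duplicate_vrids() {
    awk '$1 == "virtual_router_id" { print $2 }' | sort -n | uniq -d
}

# Usage on an LVS host:
#   duplicate_vrids < /etc/keepalived/keepalived.conf
# Empty output means VI_1 and VI_2 use different IDs, as required.
```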

END


Author: Zhao Jianqiang (Shanghai Xinju, Zhongbei team)

Source: the “IT那活儿” WeChat public account
