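A few days ago a reader wrote in asking for help with the following setup: there are only two machines, each must be able to take over the other's services if the other goes down, and both must carry load while they are healthy. I designed the architecture below for it.
Architecture overview
This architecture uses keepalived for two-node high availability and maintains one public VIP and one private VIP. Under normal conditions both VIPs are bound to server1. Web requests go to Nginx on server1. Nginx serves static resources directly from the local machine and load-balances dynamic PHP requests across server1 and server2. SQL requests are sent to the Atlas MySQL middleware. Atlas forwards requests that involve writes to the private VIP and read requests to server2, which gives us read/write splitting.
When the master, server1, goes down, keepalived detects it, immediately binds the public and private VIPs to server2, and promotes MySQL on server2 to master. Because the public VIP has moved to server2, web requests now reach Nginx on server2. Nginx detects that server1 is down and stops forwarding requests to php-fpm on server1. SQL requests still go to the local Atlas, which sends writes to the private VIP and reads to MySQL on server2. Since the private VIP is now bound to server2, MySQL on server2 handles both writes and reads.
When server1 recovers, keepalived does not take the VIPs back from server2, and service continues as before. At that point we can make MySQL on server1 either the master or a slave.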
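The read/write split described above can be sketched in a few lines. This is a deliberately simplified illustration, not Atlas's actual routing logic: the function name route and the rule "only SELECT and SHOW count as reads" are assumptions made for the example. The two backend addresses are the private VIP and server2's private IP from this setup.

```python
# Simplified illustration of read/write splitting; not Atlas's real parser.
WRITE_BACKEND = "192.168.3.150:3306"  # private VIP, follows the current master
READ_BACKEND = "192.168.3.101:3306"   # server2

# Assumption for this sketch: only these statement types are treated as reads.
READ_PREFIXES = ("select", "show")


def route(sql: str) -> str:
    """Return the backend address a statement would be sent to."""
    statement = sql.lstrip().lower()
    if statement.startswith(READ_PREFIXES):
        return READ_BACKEND
    return WRITE_BACKEND


if __name__ == "__main__":
    for sql in ("SELECT * FROM users", "UPDATE users SET name = 'a' WHERE id = 1"):
        print(f"{sql!r} -> {route(sql)}")
```

Because writes always target the VIP rather than a fixed host, the routing rule itself does not change when the VIP moves to server2 after a failover.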
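Requirements
This architecture requires three things:
- the servers can be given private IPs, and those private IPs can reach each other;
- the servers can freely bind the public IPs that the IDC has assigned to us, i.e. the public IPs are not tied to MAC addresses;
- the MySQL servers support GTID, i.e. MySQL 5.6.5 or later.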
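The GTID requirement can be checked against the version string a server reports (for example from SELECT VERSION()). The following is a minimal sketch, assuming an Oracle MySQL style version string such as 5.6.21-log; the helper name supports_gtid is made up for this example.

```python
import re

MIN_GTID_VERSION = (5, 6, 5)  # first version with GTID support, per the requirement above


def supports_gtid(version: str) -> bool:
    """Return True if a version string such as '5.6.21-log' is at least 5.6.5."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups()) >= MIN_GTID_VERSION


if __name__ == "__main__":
    for candidate in ("5.5.40", "5.6.4", "5.6.21-log"):
        print(candidate, supports_gtid(candidate))
```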
Environment
server1
eth0: 10.96.153.110 (public IP)
eth1: 192.168.3.100 (private IP)
server2
eth0: 10.96.153.114 (public IP)
eth1: 192.168.3.101 (private IP)
Both systems run CentOS 6.
Public VIP: 10.96.153.239
Private VIP: 192.168.3.150
hosts configuration
/etc/hosts:
192.168.3.100 server1
192.168.3.101 server2
Installing Nginx, PHP, and MySQL
EZHTTP is the recommended way to install these packages.
Nginx configuration
Server1 configuration
http {
[...]
    upstream php-server {
        server 192.168.3.101:9000;
        server 127.0.0.1:9000;
        keepalive 100;
    }
[...]
    server {
        [...]
        location ~ \.php$ {
            fastcgi_pass php-server;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }
        [...]
    }
[...]
}
Server2 configuration
http {
[...]
    upstream php-server {
        server 192.168.3.100:9000;
        server 127.0.0.1:9000;
        keepalive 100;
    }
[...]
    server {
        [...]
        location ~ \.php$ {
            fastcgi_pass php-server;
            fastcgi_index index.php;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            include fastcgi_params;
        }
        [...]
    }
[...]
}
The main purpose of these two configurations is to load-balance PHP requests across the two php-fpm instances.
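To make the effect of the upstream block concrete, here is a toy round-robin picker that alternates between the two php-fpm peers and skips a peer that is marked as down. It is only a sketch: nginx's real balancer is a weighted round-robin with its own failure tracking, and the names make_balancer and pick are invented for this example.

```python
from itertools import cycle


def make_balancer(peers):
    """Return a function that picks the next peer that is not marked as down."""
    ring = cycle(peers)

    def pick(down=frozenset()):
        # Try each peer at most once per call, in round-robin order.
        for _ in range(len(peers)):
            peer = next(ring)
            if peer not in down:
                return peer
        raise RuntimeError("no upstream peer available")

    return pick


if __name__ == "__main__":
    pick = make_balancer(["192.168.3.101:9000", "127.0.0.1:9000"])
    print([pick() for _ in range(4)])               # alternates between both peers
    print(pick(down={"192.168.3.101:9000"}))         # remote peer down: local only
```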
MySQL configuration
Installing MySQL Utilities
We need the replication tools that ship with MySQL Utilities to perform the master/slave switch.
cd /tmp
wget http://dev.mysql.com/get/Downloads/MySQLGUITools/mysql-utilities-1.5.3.tar.gz
tar xzf mysql-utilities-1.5.3.tar.gz
cd mysql-utilities-1.5.3
python setup.py build
python setup.py install
MySQL my.cnf configuration
server1:
[mysql]
[...]
protocol=tcp
[...]
[mysqld]
[...]
# BINARY LOGGING #
log-bin = /usr/local/mysql/data/mysql-bin
expire-logs-days = 14
sync-binlog = 1
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
server-id=1
report-host=server1
report-port=3306
[...]
server2:
[mysql]
[...]
protocol=tcp
[...]
[mysqld]
[...]
# BINARY LOGGING #
log-bin = /usr/local/mysql/data/mysql-bin
expire-logs-days = 14
sync-binlog = 1
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
master-info-repository=TABLE
relay-log-info-repository=TABLE
sync-master-info=1
server-id=2
report-host=server2
report-port=3306
[...]
These two configurations mainly set up the binlog and enable gtid-mode. Each server must have its own server-id and report-host.
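Since a copy-paste mistake between the two my.cnf files is easy to make, a small check like the following can compare them before replication is set up. It is a sketch under assumptions: it only reads the [mysqld] section, it treats on/true/1 as enabled, and the names load_mysqld and check_pair as well as the sample template are made up for this example. The set of options it checks is the one used in the configurations above.

```python
import configparser

REQUIRED_ON = ("gtid-mode", "enforce-gtid-consistency", "log-slave-updates")
MUST_DIFFER = ("server-id", "report-host")
TRUTHY = {"on", "true", "1"}


def load_mysqld(text):
    """Parse my.cnf text and return the [mysqld] options as a dict."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(text)
    # MySQL treats '-' and '_' in option names as equivalent.
    return {key.replace("_", "-"): value for key, value in parser["mysqld"].items()}


def check_pair(cnf_a, cnf_b):
    """Return a list of problems found in two my.cnf texts (empty if none)."""
    a, b = load_mysqld(cnf_a), load_mysqld(cnf_b)
    problems = []
    for name, options in (("first", a), ("second", b)):
        if "log-bin" not in options:
            problems.append(f"{name}: log-bin is not set")
        for key in REQUIRED_ON:
            if (options.get(key) or "").lower() not in TRUTHY:
                problems.append(f"{name}: {key} is not enabled")
    for key in MUST_DIFFER:
        if a.get(key) == b.get(key):
            problems.append(f"{key} must differ between the two servers")
    return problems


# Hypothetical sample based on the settings shown above.
TEMPLATE = """
[mysqld]
# BINARY LOGGING #
log-bin = /usr/local/mysql/data/mysql-bin
binlog-format=ROW
log-slave-updates=true
gtid-mode=on
enforce-gtid-consistency=true
server-id={server_id}
report-host={host}
"""

if __name__ == "__main__":
    server1 = TEMPLATE.format(server_id=1, host="server1")
    server2 = TEMPLATE.format(server_id=2, host="server2")
    print(check_pair(server1, server2))  # no problems expected for this pair
    print(check_pair(server1, server1))  # identical server-id and report-host
```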
Granting remote access to the root account:
We need to allow remote access for the root account on both MySQL servers.
mysql> grant all on *.* to 'root'@'192.168.3.%' identified by 'Xp29at5F37' with grant option;
mysql> grant all on *.* to 'root'@'server1' identified by 'Xp29at5F37' with grant option;
mysql> grant all on *.* to 'root'@'server2' identified by 'Xp29at5F37' with grant option;
mysql> flush privileges;
Setting up MySQL replication
Run the following command on either machine:
mysqlreplicate --master=root:Xp29at5F37@server1:3306 --slave=root:Xp29at5F37@server2:3306 --rpl-user=rpl:o67DhtaW
# master on server1: ... connected.
# slave on server2: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
Showing the replication topology
mysqlrplshow --master=root:Xp29at5F37@server1 --discover-slaves-login=root:Xp29at5F37
# master on server1: ... connected.
# Finding slaves for master: server1:3306
# Replication Topology Graph
server1:3306 (MASTER)
   |
   +--- server2:3306 - (SLAVE)
Checking the replication status
mysqlrplcheck --master=root:Xp29at5F37@server1 --slave=root:Xp29at5F37@server2
# master on server1: ... connected.
# slave on server2: ... connected.
Test Description                              Status
---------------------------------------------------------------------------
Checking for binary logging on master         [pass]
Are there binlog exceptions?                  [pass]
Replication user exists?                      [pass]
Checking server_id values                     [pass]
Checking server_uuid values                   [pass]
Is slave connected to master?                 [pass]
Check master information file                 [pass]
Checking InnoDB compatibility                 [pass]
Checking storage engines compatibility        [pass]
Checking lower_case_table_names settings      [pass]
Checking slave delay (seconds behind master)  [pass]
# ...done.
Creating the failover script on server2
vi /data/sh/mysqlfailover.sh
#!/bin/bash
mysqlrpladmin --slave=root:Xp29at5F37@server2:3306 failover
chmod +x /data/sh/mysqlfailover.sh
Keepalived configuration
Installing keepalived (on both machines)
yum -y install keepalived
chkconfig keepalived on
keepalived configuration (server1)
vi /etc/keepalived/keepalived.conf
vrrp_sync_group VG_1 {
    group {
        inside_network
        outside_network
    }
}
vrrp_instance inside_network {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 3489
    }
    virtual_ipaddress {
        192.168.3.150/24
    }
    nopreempt
}
vrrp_instance outside_network {
    state BACKUP
    interface eth0
    virtual_router_id 50
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 3489
    }
    virtual_ipaddress {
        10.96.153.239/24
    }
    nopreempt
}
keepalived configuration (server2)
vrrp_sync_group VG_1 {
    group {
        inside_network
        outside_network
    }
}
vrrp_instance inside_network {
    state BACKUP
    interface eth1
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 3489
    }
    virtual_ipaddress {
        192.168.3.150
    }
    notify_master /data/sh/mysqlfailover.sh
}
vrrp_instance outside_network {
    state BACKUP
    interface eth0
    virtual_router_id 50
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 3489
    }
    virtual_ipaddress {
        10.96.153.239/24
    }
}
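Two points about this keepalived configuration deserve attention:
- state is set to BACKUP on both servers, server1 adds nopreempt, and server1's priority is higher than server2's. Together these ensure that server1 does not take the VIPs back when it recovers from an outage;
- server2 sets notify_master /data/sh/mysqlfailover.sh, which means that once server2 takes over from server1 it runs this script to promote MySQL on server2 to master.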
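The interplay of priority and nopreempt described in the two points above can be modeled with a toy election function. This is a simplified model written for illustration, not keepalived's actual VRRP state machine; the function name vip_holder and its parameters are hypothetical.

```python
def vip_holder(current, alive, priority, nopreempt):
    """Return the node that should hold the VIPs after one election round.

    current   -- node holding the VIPs now, or None
    alive     -- set of nodes currently running keepalived
    priority  -- mapping of node name to VRRP priority
    nopreempt -- set of nodes configured with nopreempt
    """
    if not alive:
        return None
    best = max(alive, key=lambda node: priority[node])
    if current in alive:
        # A healthy holder keeps the VIPs unless a higher-priority node
        # that is allowed to preempt is present.
        if best != current and best not in nopreempt:
            return best
        return current
    # The previous holder is gone: the best remaining node takes over.
    return best


if __name__ == "__main__":
    priority = {"server1": 101, "server2": 100}
    nopreempt = {"server1"}
    holder = vip_holder(None, {"server1", "server2"}, priority, nopreempt)
    print("normal operation:", holder)
    holder = vip_holder(holder, {"server2"}, priority, nopreempt)
    print("server1 down:", holder)
    holder = vip_holder(holder, {"server1", "server2"}, priority, nopreempt)
    print("server1 back:", holder)
```

In this model the VIPs only return to server1 when server2 stops advertising, which is the same effect the restore procedure later achieves by restarting keepalived on server2.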
Atlas setup
Installing Atlas
Download the latest release from https://github.com/Qihoo360/Atlas/releases
cd /tmp
wget https://github.com/Qihoo360/Atlas/releases/download/2.2.1/Atlas-2.2.1.el6.x86_64.rpm
rpm -i Atlas-2.2.1.el6.x86_64.rpm
Configuring Atlas
cd /usr/local/mysql-proxy/conf
cp test.cnf my.cnf
vi my.cnf
Adjust the following parameters:
proxy-backend-addresses = 192.168.3.150:3306
proxy-read-only-backend-addresses = 192.168.3.101:3306
pwds = root:qtyU1btXOo074Itvx0UR9Q==
event-threads = 8
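Notes:
- proxy-backend-addresses is set to the private VIP;
- proxy-read-only-backend-addresses is set to server2's IP;
- root:qtyU1btXOo074Itvx0UR9Q== sets the database user and password. The password is generated with /usr/local/mysql-proxy/bin/encrypt Xp29at5F37.
For a more detailed explanation of the parameters, see the Atlas configuration guide.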
Starting Atlas
/usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/my.cnf
After that, simply point the application's MySQL configuration at 127.0.0.1:1234.
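Testing a server1 (master) outage
To test whether keepalived works properly, we simulate a server1 outage by running the shutdown command on server1. Then we log in to server2 and run ip addr, which prints the following: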
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:81:9d:42 brd ff:ff:ff:ff:ff:ff
    inet 10.96.153.114/24 brd 10.96.153.255 scope global eth0
    inet 10.96.153.239/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fe81:9d42/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:81:9d:4c brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.101/24 brd 192.168.3.255 scope global eth1
    inet 192.168.3.150/32 scope global eth1
    inet6 fe80::20c:29ff:fe81:9d4c/64 scope link
       valid_lft forever preferred_lft forever
We can see that the public VIP 10.96.153.239 and the private VIP 192.168.3.150 have moved to server2, which shows that keepalived is working properly.
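Checking the ip addr output by eye works for a one-off test. For repeated tests, the same check can be scripted. The sketch below assumes the output format shown above; the helper names bound_addresses and holds_vips are invented for this example, and the sample text is an abbreviated copy of the server2 output.

```python
import re

VIPS = ("10.96.153.239", "192.168.3.150")  # public and private VIP


def bound_addresses(ip_addr_output: str) -> set:
    """Collect the IPv4 addresses listed on 'inet' lines of ip addr output."""
    return set(re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/\d+", ip_addr_output))


def holds_vips(ip_addr_output: str, vips=VIPS) -> bool:
    """Return True if every VIP is bound on this machine."""
    return set(vips) <= bound_addresses(ip_addr_output)


# Abbreviated sample taken from the server2 output above.
SAMPLE = """\
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 10.96.153.114/24 brd 10.96.153.255 scope global eth0
    inet 10.96.153.239/24 scope global secondary eth0
    inet6 fe80::20c:29ff:fe81:9d42/64 scope link
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.3.101/24 brd 192.168.3.255 scope global eth1
    inet 192.168.3.150/32 scope global eth1
"""

if __name__ == "__main__":
    print(sorted(bound_addresses(SAMPLE)))
    print(holds_vips(SAMPLE))
```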
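To test whether the master/slave switch happened automatically, log in to MySQL on server2 and run show slave status\G:
mysql> show slave status\G
Empty set (0.00 sec)
The slave status is empty, which shows that server2 has been promoted to master.
Next, test whether server1 preempts the VIPs. Why test this? If server1 took the VIPs back after recovering, then, because the Atlas write backend is the VIP, SQL writes would go to MySQL on server1 as soon as it came up. Its data is older than server2's, so this would cause data inconsistency. That makes this a very important test.
Start server1 first, then run ip addr on it. The output is: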
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:f1:4f:4e brd ff:ff:ff:ff:ff:ff
    inet 10.96.153.110/24 brd 10.96.153.255 scope global eth0
    inet6 fe80::20c:29ff:fef1:4f4e/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:f1:4f:58 brd ff:ff:ff:ff:ff:ff
    inet 192.168.3.100/24 brd 192.168.3.255 scope global eth1
    inet6 fe80::20c:29ff:fef1:4f58/64 scope link
       valid_lft forever preferred_lft forever
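We can see that server1 did not take the VIPs back, so the test passes. Frustratingly, though, this test did not succeed in a virtual machine environment, and I do not know why.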
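How to restore server1
Set MySQL on server1 as a slave. After server1 recovers from the outage, its MySQL data is older than server2's, so we first make MySQL on server1 a slave of server2.
mysqlreplicate --master=root:Xp29at5F37@server2:3306 --slave=root:Xp29at5F37@server1:3306 --rpl-user=rpl:o67DhtaW
# master on server2: ... connected.
# slave on server1: ... connected.
# Checking for binary logging on master...
# Setting up replication...
# ...done.
The output shows that the setup succeeded.
Check how far MySQL on server1 has synchronized. Because server1 has only just recovered, its data may be far behind server2's, so we first look at the synchronization state between the two. Log in to MySQL on server1 and run the following SQL: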
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: server2
Master_User: rpl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 2894
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 408
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Note the value of Read_Master_Log_Pos, 2894, then log in to MySQL on server2 and run the following SQL:
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000004
Position: 2894
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 9347e042-9044-11e4-b4f0-000c29f14f4e:1-7,f5bbfc15-904a-11e4-b519-000c29819d42:1-6
1 row in set (0.00 sec)
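Note the Position value and compare it with Read_Master_Log_Pos. If the two values are equal or very close, the data is nearly in sync and the switch can go ahead. If they are far apart, wait for synchronization to finish.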
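The comparison above can also be done by a script that reads the two \G outputs. This is a minimal sketch: parse_vertical and io_thread_caught_up are made-up helper names, and the tolerance parameter is an assumption for "very close". Note that Read_Master_Log_Pos only tells us how much of the master's binlog the slave has received. Exec_Master_Log_Pos, which also appears in the full show slave status output, tells us how much has actually been applied, so a stricter check would compare that field as well.

```python
def parse_vertical(output: str) -> dict:
    """Turn 'Key: value' lines of \\G-style output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def io_thread_caught_up(slave: dict, master: dict, tolerance: int = 0) -> bool:
    """Return True if the slave has read the master's binlog up to Position."""
    if slave["Master_Log_File"] != master["File"]:
        return False
    gap = int(master["Position"]) - int(slave["Read_Master_Log_Pos"])
    return 0 <= gap <= tolerance


# Abbreviated samples taken from the outputs above.
SLAVE_STATUS = """\
Master_Host: server2
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 2894
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
"""
MASTER_STATUS = """\
File: mysql-bin.000004
Position: 2894
"""

if __name__ == "__main__":
    slave = parse_vertical(SLAVE_STATUS)
    master = parse_vertical(MASTER_STATUS)
    print(io_thread_caught_up(slave, master))
```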
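Blocking MySQL writes
Before switching, we must block SQL writes. Otherwise the switch would leave the data inconsistent. We block writes in Atlas. On server2, log in to the Atlas admin interface: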
mysql -h127.0.0.1 -P2345 -uuser -ppwd
mysql> SELECT * FROM backends;
+-------------+--------------------+-------+------+
| backend_ndx | address            | state | type |
+-------------+--------------------+-------+------+
| 1           | 192.168.3.150:3306 | up    | rw   |
| 2           | 192.168.3.101:3306 | up    | ro   |
+-------------+--------------------+-------+------+
2 rows in set (0.00 sec)
The output of SELECT * FROM backends; shows that the read-write backend has id 1, so we run SET OFFLINE 1; to take that backend offline.
mysql> SET OFFLINE 1;
+-------------+--------------------+---------+------+
| backend_ndx | address            | state   | type |
+-------------+--------------------+---------+------+
| 1           | 192.168.3.150:3306 | offline | rw   |
+-------------+--------------------+---------+------+
1 row in set (0.00 sec)
mysql> SELECT * FROM backends;
+-------------+--------------------+---------+------+
| backend_ndx | address            | state   | type |
+-------------+--------------------+---------+------+
| 1           | 192.168.3.150:3306 | offline | rw   |
| 2           | 192.168.3.101:3306 | up      | ro   |
+-------------+--------------------+---------+------+
2 rows in set (0.00 sec)
From this point on, clients can no longer write data.
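Before running the switchover it is worth confirming that the read-write backend really is offline. The sketch below parses the text table printed by SELECT * FROM backends; and checks the rw rows. The helper names parse_table and writes_blocked are invented for this example, and the sample is the table shown above.

```python
def parse_table(output: str) -> list:
    """Parse a mysql client text table into a list of row dicts."""
    rows = []
    header = None
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and the row count
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def writes_blocked(rows) -> bool:
    """Return True if there is at least one rw backend and all are offline."""
    rw_rows = [row for row in rows if row["type"] == "rw"]
    return bool(rw_rows) and all(row["state"] == "offline" for row in rw_rows)


SAMPLE = """\
+-------------+--------------------+---------+------+
| backend_ndx | address            | state   | type |
+-------------+--------------------+---------+------+
| 1           | 192.168.3.150:3306 | offline | rw   |
| 2           | 192.168.3.101:3306 | up      | ro   |
+-------------+--------------------+---------+------+
2 rows in set (0.00 sec)
"""

if __name__ == "__main__":
    backends = parse_table(SAMPLE)
    print(backends)
    print(writes_blocked(backends))
```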
Restoring MySQL on server1 as master
mysqlrpladmin --master=root:Xp29at5F37@server2:3306 --new-master=root:Xp29at5F37@server1:3306 --demote-master --discover-slaves-login=root:Xp29at5F37 switchover
# Discovering slaves for master at server2:3306
# Discovering slave at server1:3306
# Found slave: server1:3306
# Checking privileges.
# Performing switchover from master at server2:3306 to slave at server1:3306.
# Checking candidate slave prerequisites.
# Checking slaves configuration to master.
# Waiting for slaves to catch up to old master.
# Stopping slaves.
# Performing STOP on all slaves.
# Demoting old master to be a slave to the new master.
# Switching slaves to new master.
# Starting all slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Switchover complete.
Check again that the restore succeeded.
mysqlrplcheck --master=root:Xp29at5F37@server1 --slave=root:Xp29at5F37@server2
# master on server1: ... connected.
# slave on server2: ... connected.
Test Description                              Status
---------------------------------------------------------------------------
Checking for binary logging on master         [pass]
Are there binlog exceptions?                  [pass]
Replication user exists?                      [pass]
Checking server_id values                     [pass]
Checking server_uuid values                   [pass]
Is slave connected to master?                 [pass]
Check master information file                 [pass]
Checking InnoDB compatibility                 [pass]
Checking storage engines compatibility        [pass]
Checking lower_case_table_names settings      [pass]
Checking slave delay (seconds behind master)  [pass]
# ...done.
Move the VIPs back to server1. Restarting keepalived on server2 makes it release the VIPs so that server1 can pick them up. Run on server2:
/etc/init.d/keepalived restart
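Then run ip addr on each machine to check where the IPs are bound.
Bringing the Atlas backend on server2 back online
On server2, log in with mysql -h127.0.0.1 -P2345 -uuser -ppwd, then run SET ONLINE 1; to bring the backend online (1 is the backend id, which can be looked up with SELECT * FROM backends;).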
mysql> SET ONLINE 1;
+-------------+--------------------+---------+------+
| backend_ndx | address            | state   | type |
+-------------+--------------------+---------+------+
| 1           | 192.168.3.150:3306 | unknown | rw   |
+-------------+--------------------+---------+------+
1 row in set (0.00 sec)
mysql> SELECT * FROM backends;
+-------------+--------------------+-------+------+
| backend_ndx | address            | state | type |
+-------------+--------------------+-------+------+
| 1           | 192.168.3.150:3306 | up    | rw   |
| 2           | 192.168.3.101:3306 | up    | ro   |
+-------------+--------------------+-------+------+
2 rows in set (0.00 sec)
At this point server1 is the master again.
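Architecture design for two-node high availability, load balancing, and MySQL (read/write splitting, automatic master/slave failover)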