Real-time data backup on Linux with rsync+inotify

Source: Internet
Author: User

First, the advantages and disadvantages of rsync

Compared with traditional backup methods such as cp and tar, rsync offers better security, faster backups, and support for incremental backup. rsync is well suited to backups that do not have strict real-time requirements, such as regularly backing up file server data to a remote server or periodically mirroring a local disk.
As application systems grow, the requirements for data security and reliability rise as well, and rsync begins to show several shortcomings in high-end business systems. First, to synchronize data rsync must scan all files and compare them before it can transfer the differences. When the number of files reaches millions or even tens of millions, scanning every file takes a long time, and since usually only a small fraction of the files has changed, this is very inefficient. Second, rsync cannot monitor and synchronize data in real time. It can be triggered periodically by a Linux daemon, but there is always an interval between two runs, so the server and client data may be inconsistent during that interval and the data cannot be fully recovered if the application fails. For these reasons, the rsync+inotify combination emerged.

Second, a first look at inotify

inotify is a powerful, fine-grained, asynchronous file system event monitoring mechanism. The Linux kernel has supported inotify since version 2.6.13. Through inotify you can monitor additions, deletions, modifications, moves, and other fine-grained events in a file system. Third-party software can use this kernel interface to watch for changes to files, and inotify-tools is one such package.
As mentioned in the previous section, rsync can synchronize files when triggered, but when the trigger is the crontab daemon, the synchronized data lags behind the actual data. inotify can monitor changes in the file system and trigger an rsync run whenever a file changes, which solves the problem of synchronizing data in real time.

Third, installing inotify-tools

Because inotify requires kernel support, verify that the Linux kernel is version 2.6.13 or later before installing inotify-tools. If the kernel is older than 2.6.13, you need to recompile it with inotify support. You can also check whether the kernel supports inotify as follows:
[root@localhost webdata]# uname -r
2.6.18-164.11.1.el5pae
[root@localhost webdata]# ll /proc/sys/fs/inotify
total 0
-rw-r--r-- 1 root root 0 04-13 19:56 max_queued_events
-rw-r--r-- 1 root root 0 04-13 19:56 max_user_instances
-rw-r--r-- 1 root root 0 04-13 19:56 max_user_watches
If these three entries are listed, the system supports inotify by default and you can go on to install inotify-tools.
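The manual check above can be wrapped in a small script. The following is a minimal sketch rather than part of the original procedure: the function names version_at_least and inotify_dir_present are made up for illustration, and the comparison only looks at the first three dot-separated numeric fields of the kernel version.

```shell
#!/bin/bash
# Sketch: check whether the running kernel is new enough for inotify.
# The function names are illustrative, not part of any standard tool.

# Return 0 if version $1 is >= version $2, comparing the first three
# dot-separated numeric fields (e.g. 2.6.18-164.11.1.el5pae vs 2.6.13).
version_at_least() {
    local have=${1%%-*} want=$2 i=1 a b   # drop the "-164.11.1..." suffix
    while [ "$i" -le 3 ]; do
        a=$(printf '%s\n' "$have" | cut -d. -f"$i" | sed 's/[^0-9].*$//')
        b=$(printf '%s\n' "$want" | cut -d. -f"$i" | sed 's/[^0-9].*$//')
        a=${a:-0}
        b=${b:-0}
        if [ "$a" -gt "$b" ]; then return 0; fi
        if [ "$a" -lt "$b" ]; then return 1; fi
        i=$((i + 1))
    done
    return 0
}

# Return 0 if the inotify tunables exist. The optional argument lets the
# function be pointed at another directory; by default it checks /proc.
inotify_dir_present() {
    local dir=${1:-/proc/sys/fs/inotify}
    [ -e "$dir/max_user_watches" ]
}

if version_at_least "$(uname -r)" 2.6.13 && inotify_dir_present; then
    echo "inotify is supported"
else
    echo "inotify support not detected"
fi
```

The directory test is the more reliable of the two checks, since a kernel can be new enough and still be built without inotify.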
You can download a suitable inotify-tools version from http://inotify-tools.sourceforge.net/ and then compile and install it:
[root@localhost ~]# tar zxvf inotify-tools-3.14.tar.gz
[root@localhost ~]# cd inotify-tools-3.14
[root@localhost inotify-tools-3.14]# ./configure
[root@localhost inotify-tools-3.14]# make
[root@localhost inotify-tools-3.14]# make install
[root@localhost inotify-tools-3.14]# ll /usr/local/bin/inotifywa*
-rwxr-xr-x 1 root root 37264 04-14 13:42 /usr/local/bin/inotifywait
-rwxr-xr-x 1 root root 35438 04-14 13:42 /usr/local/bin/inotifywatch
After inotify-tools is installed, two commands are available: inotifywait and inotifywatch. inotifywait waits for specific events on a file or set of files; it can monitor any set of files and directories and can monitor an entire directory tree recursively.
inotifywatch collects statistics on the monitored file system, including how many times each inotify event occurred.

Fourth, inotify related parameters

inotify defines the following interface parameters, which limit how much kernel memory inotify can consume. Because these parameters are held in memory, you can adjust them at runtime according to application needs. Each is introduced briefly below.
/proc/sys/fs/inotify/max_queued_events
The maximum number of events that can be queued for an inotify instance created by calling inotify_init. Events beyond this limit are discarded, but an IN_Q_OVERFLOW event is generated.
/proc/sys/fs/inotify/max_user_instances
The maximum number of inotify instances that each real user ID can create.
/proc/sys/fs/inotify/max_user_watches
The maximum number of watches (monitored files or directories) that each real user ID can create. If you monitor a large number of files, increase this value accordingly, for example:
echo 30000000 > /proc/sys/fs/inotify/max_user_watches
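Before changing a limit it helps to see the current values. The sketch below prints all three; the function name show_inotify_limits and its optional directory argument are illustrative additions. The sysctl names in the trailing comments are the standard fs.inotify.* keys that correspond to the /proc files above.

```shell
#!/bin/bash
# Sketch: print the three inotify limits. The optional directory argument
# exists only so the function can be pointed elsewhere; by default it
# reads the real files under /proc.
show_inotify_limits() {
    local dir=${1:-/proc/sys/fs/inotify} name
    for name in max_queued_events max_user_instances max_user_watches; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_inotify_limits

# To raise a limit until the next reboot (run as root):
#   sysctl -w fs.inotify.max_user_watches=30000000
# To keep it across reboots, add this line to /etc/sysctl.conf and run "sysctl -p":
#   fs.inotify.max_user_watches = 30000000
```

A value written with echo or sysctl -w is lost on reboot, which is why the persistent setting is listed separately.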

Fifth, inotifywait related parameters

inotifywait waits for monitored events and works well in shell scripts. Some common parameters:
-m, --monitor: keep listening for events instead of exiting after the first one.
-r, --recursive: watch a directory and all of its subdirectories recursively.
-q, --quiet: print less output, so that only the monitored events are printed.
-e, --event: specify the events to monitor. Common events are modify, delete, create, and attrib.
For more details, see man inotifywait.
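To show how these options combine in a shell script, here is a minimal sketch of a loop that consumes inotifywait output. The function name log_events and the output format "%w%f %e" (path, a space, then the event list) are choices made for this example, not requirements of inotifywait.

```shell
#!/bin/bash
# Sketch: turn inotifywait output lines of the form "<path> <EVENTS>"
# into a short log line. The function name log_events is illustrative.
log_events() {
    local line path events
    while IFS= read -r line; do
        events=${line##* }    # text after the last space, e.g. CREATE,ISDIR
        path=${line% *}       # everything before that last space
        echo "event=${events} path=${path}"
    done
}

# Typical use (requires inotify-tools; runs until interrupted):
#   inotifywait -mrq --format '%w%f %e' -e modify,delete,create,attrib /web/wwwroot/ | log_events
```

Splitting on the last space rather than the first keeps paths that contain spaces intact, since the event list itself never contains one.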

Sixth, an rsync+inotify enterprise application case

Case description

This is a CMS content publishing system whose back end is deployed as a load-balanced cluster consisting of one load scheduling node, three service nodes, and one content publishing node. The content publishing node generates static pages from the data users publish and pushes those pages to the three service nodes. The load scheduling node dispatches user requests to a service node according to the load-balancing algorithm. The page data that users access at the front end must always be current and consistent.

Solution

To keep the data that users access consistent and up to date, the data on the three service nodes must always match the data on the content publishing node. That calls for a file synchronization tool, here rsync, and for real-time behavior, which calls for inotify. inotify monitors file changes on the content publishing node, and whenever a file changes it starts rsync to synchronize the files to the three service nodes in real time.

System environment

All servers here run Linux. The kernel versions and node information are shown in Table 1:

Table 1



1 Installing rsync and inotify-tools

inotify-tools monitors file system changes, so it must be installed on the content publishing node; the service nodes do not need it. rsync must be installed on the web1, web2, web3, and webserver nodes. The installation is straightforward and is not covered here.
In this case, the content publishing node (the server) acts as the rsync client and the three service nodes act as rsync servers, so the whole synchronization is a push of data from the client to the server side. This is the opposite of the cases described earlier.

2 Configuring rsync on three service nodes

The rsync configuration files of the three service nodes are given below for reference; readers can modify them to suit their own situation.

The web1 node rsyncd.conf configuration is as follows:

uid = nobody
gid = nobody
use chroot = no
max connections = 10
strict modes = yes
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsyncd.log
[web1]
path = /web1/wwwroot/
comment = web1 file
ignore errors
read only = no
write only = no
hosts allow = 192.168.12.134
hosts deny = *
list = false
uid = root
gid = root
auth users = web1user
secrets file = /etc/web1.pass

The web2 node rsyncd.conf configuration is as follows:

uid = nobody
gid = nobody
use chroot = no
max connections = 10
strict modes = yes
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsyncd.log
[web2]
path = /web2/wwwroot/
comment = web2 file
ignore errors
read only = no
write only = no
hosts allow = 192.168.12.134
hosts deny = *
list = false
uid = root
gid = root
auth users = web2user
secrets file = /etc/web2.pass

The web3 node rsyncd.conf configuration is as follows:

uid = nobody
gid = nobody
use chroot = no
max connections = 10
strict modes = yes
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsyncd.log
[web3]
path = /web3/wwwroot/
comment = web3 file
ignore errors
read only = no
write only = no
hosts allow = 192.168.12.134
hosts deny = *
list = false
uid = root
gid = root
auth users = web3user
secrets file = /etc/web3.pass

After rsyncd.conf has been configured on all three service nodes, start the rsync daemon on each of them, and then add the rsync service to the boot file:
echo "/usr/local/bin/rsync --daemon" >> /etc/rc.local
At this point, the three web service nodes are fully configured.
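The configurations above name a secrets file on each service node (for example /etc/web1.pass), and the client script later passes --password-file=/etc/server.pass, but the article does not show how these files are created. The following is a minimal sketch under stated assumptions: write_private_file is a hypothetical helper, and CHANGE_ME is a placeholder password. The daemon-side file holds user:password lines, while the client-side file holds only the password; because strict modes = yes is set, rsync requires that the secrets file not be readable by other users.

```shell
#!/bin/bash
# Sketch: create an rsync secrets/password file readable only by its owner.
# write_private_file is an illustrative helper; the password is a placeholder.
write_private_file() {
    local file=$1 content=$2
    ( umask 077; printf '%s\n' "$content" > "$file" )
    chmod 600 "$file"    # also tightens a file that already existed
}

# On the web1 service node (daemon side, "user:password" format), as root:
#   write_private_file /etc/web1.pass 'web1user:CHANGE_ME'
# On the content publishing node (client side, password only), as root:
#   write_private_file /etc/server.pass 'CHANGE_ME'
```

Since the client script uses one password file for all three users, this arrangement assumes the three auth users share the same password.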

3 Configuring the content publishing node

The main task on the content publishing node is to synchronize the generated static web pages to the three service nodes in the cluster in real time. This can be done with a shell script roughly like the following:
#!/bin/bash
# Addresses of the three service nodes
host1=192.168.12.131
host2=192.168.12.132
host3=192.168.12.133
# Directory to watch and push
src=/web/wwwroot/
# rsync module names on the service nodes
dst1=web1
dst2=web2
dst3=web3
# rsync auth users on the service nodes
user1=web1user
user2=web2user
user3=web3user
# Watch $src recursively and push to every node each time an event arrives
/usr/local/bin/inotifywait -mrq --timefmt '%d/%m/%y %H:%M' --format '%T %w%f%e' -e modify,delete,create,attrib $src \
| while read files
do
    /usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass $src $user1@$host1::$dst1
    /usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass $src $user2@$host2::$dst2
    /usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass $src $user3@$host3::$dst3
    echo "${files} was rsynced" >>/tmp/rsync.log 2>&1
done

The script is explained as follows:

--timefmt: specifies the output format of the time.
--format: specifies the details printed for each changed file.
These two parameters are generally used together to define the output format. The resulting log looks similar to the following:
15/04/10 00:29 /web/wwwroot/ixdba.shDELETE,ISDIR was rsynced
15/04/10 00:30 /web/wwwroot/index.htmlMODIFY was rsynced
15/04/10 00:31 /web/wwwroot/pcre-8.02.tar.gzCREATE was rsynced
This script uses inotify to monitor changes in the directory and trigger an rsync run. Because the trigger comes from the kernel as events occur, it is far more efficient than approaches that scan the entire directory tree.
Sometimes, when a large file is written to the inotify-monitored directory (here /web/wwwroot/), the write takes a while and inotify keeps reporting updates to the file. Each report triggers another rsync run, which consumes a lot of system resources. In this case it is better to trigger rsync only after the file has been completely written, which you can do by changing the monitored events to "-e close_write,delete,create,attrib".
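A complementary way to reduce repeated runs, different from the close_write change described above, is to collapse a burst of events into a single synchronization. The sketch below is an assumption-laden illustration, not the article's method: coalesce_and_sync is a hypothetical function, sync_all is a stand-in that only counts calls (in real use its body would be the three rsync commands), and it requires bash because it relies on read -t.

```shell
#!/bin/bash
# Sketch: collapse a burst of inotify events into one sync.
# sync_all is a stand-in for the rsync commands; here it only counts calls.
sync_count=0
sync_all() {
    sync_count=$((sync_count + 1))
}

# Read events from stdin. After the first event of a burst, keep draining
# events until none arrives for $1 seconds (default 2), then sync once.
coalesce_and_sync() {
    local quiet=${1:-2} line
    while IFS= read -r line; do
        while IFS= read -r -t "$quiet" line; do
            :    # discard further events that belong to the same burst
        done
        sync_all
    done
}

# Typical use (requires inotify-tools; runs until interrupted):
#   inotifywait -mrq --format '%w%f %e' -e modify,delete,create,attrib /web/wwwroot/ | coalesce_and_sync 2
```

Events that arrive while a sync is running stay queued in the pipe and start the next burst, so changes are delayed by at most the quiet period plus one sync, not lost.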
Next, name the script inotifyrsync.sh, place it in the /web/wwwroot directory, make it executable, and run it in the background:
chmod 755 /web/wwwroot/inotifyrsync.sh
/web/wwwroot/inotifyrsync.sh &
Finally, add the script to the system startup file:
echo "/web/wwwroot/inotifyrsync.sh &" >> /etc/rc.local
This completes the configuration of the content publishing node.
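The three rsync lines in the script differ only in user, host, and module, so they can also be driven from a single list, which makes a copy-paste slip in one of the variables less likely. This is a minimal sketch, not the article's script: the DRY_RUN switch and the function name sync_all are illustrative, and it assumes the second node's user is web2user, as the web2 configuration's auth users setting suggests.

```shell
#!/bin/bash
# Sketch: run the same rsync push against every service node from one list.
# Set DRY_RUN=1 to print the commands instead of executing them.
src=/web/wwwroot/
targets="web1user@192.168.12.131::web1
web2user@192.168.12.132::web2
web3user@192.168.12.133::web3"

sync_all() {
    local target
    for target in $targets; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "/usr/bin/rsync -vzrtopg --delete --progress --password-file=/etc/server.pass $src $target"
        else
            /usr/bin/rsync -vzrtopg --delete --progress \
                --password-file=/etc/server.pass "$src" "$target"
        fi
    done
}

# Preview the commands without touching the service nodes:
#   DRY_RUN=1 sync_all
```

Adding a fourth service node then means adding one line to the targets list.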

4 Testing rsync+inotify real-time synchronization

After all configuration is complete, add, delete, or modify a file in the /web/wwwroot directory of the content publishing node, and then check the corresponding directories on the three service nodes to see whether the change follows. If the directories on the three service nodes change in step with the directory on the content publishing node, the system is configured successfully.
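Checking by eye works for a handful of files. For a larger tree, one option is to compare checksum manifests taken on each node. The sketch below is only an illustration: dir_manifest is a hypothetical helper, and it uses the POSIX cksum utility, which detects accidental differences but is not a cryptographic check.

```shell
#!/bin/bash
# Sketch: print a sorted "checksum size path" manifest for a directory tree.
# dir_manifest is an illustrative helper name.
dir_manifest() {
    local dir=${1:-.}
    ( cd "$dir" && find . -type f -exec cksum {} + | LC_ALL=C sort -k 3 )
}

# On the content publishing node:
#   dir_manifest /web/wwwroot > /tmp/publish.manifest
# On a service node, for example web1:
#   dir_manifest /web1/wwwroot > /tmp/web1.manifest
# Copy one file to the other node and compare them with diff.
```

Paths in the manifest are relative to the directory given, so manifests from /web/wwwroot and /web1/wwwroot are directly comparable.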
