Real-time data synchronization in Linux under Rsync+inotify

Tags inotify rsync

I. rsync


1. What is rsync


rsync is a data synchronization tool that can synchronize files within a single host or between hosts. When it runs as a service, it listens on TCP port 873. The rsync algorithm transfers only the parts of a file that have changed rather than the whole file, so transfers are fast.

This makes rsync a common choice as a backup tool.



1.1 Rsync Basic Features:


1. Can mirror an entire directory tree or file system

2. Easily preserves the original file permissions, timestamps, soft and hard links, and so on (through rsync options such as -a)

3. High data transfer efficiency

4. Can transfer data through a remote shell such as rsh or ssh (ssh for secure transfer), or connect directly over a socket

5. Supports anonymous transfers



1.2 rsync command syntax:


The command format of rsync can be divided into the following types:

rsync [OPTION]... SRC DEST
rsync [OPTION]... SRC [USER@]HOST:DEST
rsync [OPTION]... [USER@]HOST:SRC DEST
rsync [OPTION]... [USER@]HOST::SRC DEST
rsync [OPTION]... SRC [USER@]HOST::DEST
rsync [OPTION]... rsync://[USER@]HOST[:PORT]/SRC [DEST]


These command formats correspond to four working modes of rsync.


1.3 rsync Working mode:


First mode: local mode, also called shell mode:

Used when neither the source nor the destination path contains a host name followed by a colon. rsync copies files on the local machine, and the mode can be used as soon as rsync is installed; no service needs to be started.


Second mode: remote shell mode, which uses a protocol such as SSH to carry the transfer:

Used when the host name in the source or destination path is followed by a single colon. The connection goes through a remote shell program such as ssh, which can be selected with the rsync option -e ssh. Otherwise it behaves like local mode.


Third mode: list mode, which only lists the content of the source (-nv):

Lists what would be synchronized without actually performing the copy.


Fourth mode: service mode, in which rsync works as a daemon and accepts synchronization requests from clients:

This is also a commonly used mode. It is selected when the host name in the source or destination path is followed by two colons, or when an rsync:// URL is used. No remote shell is needed, but an rsync daemon must be running on one machine, on port 873 by default. The daemon can be started as a standalone process with rsync --daemon, or managed through the xinetd super-server.
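
The rule that separates these modes by path syntax can be sketched as a small shell function. This is only an illustration: rsync_path_kind is a hypothetical helper, and the host name host1 and user backup are made up. It assumes the convention that only a colon before the first slash marks a remote host.

```shell
# Hypothetical helper: classify one rsync path argument by its syntax.
rsync_path_kind() {
    case "$1" in
        rsync://*) echo "daemon"; return ;;
    esac
    # Only a colon that appears before the first slash marks a remote host.
    prefix=${1%%/*}
    case "$prefix" in
        *::*) echo "daemon" ;;         # HOST::MODULE
        *:*)  echo "remote-shell" ;;   # HOST:PATH
        *)    echo "local" ;;          # plain path
    esac
}

rsync_path_kind /etc/fstab
rsync_path_kind backup@host1:/srv/data
rsync_path_kind backup@host1::tools
rsync_path_kind rsync://host1:873/tools
```

If either the source or the destination is classified as daemon or remote-shell, the whole transfer uses that mode; list mode is an option (-n) layered on top of any of them.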


When rsync runs as a daemon, it needs a user identity. The daemon must run as root if you want to enable chroot, listen on a port below 1024 (such as the default 873), or set file ownership. Otherwise it can run as a non-root user, provided that user has read and write permission on the data, log, and lock files of the corresponding module. In daemon mode rsync also needs a configuration file, rsyncd.conf. You do not have to restart the daemon after modifying this file, because it is re-read on every client connection.


The remote machine that runs the rsync daemon is generally called the rsync server, and the machine where the rsync command is run is called the client.



1.4 Options for the rsync command:

-n: dry run; tests the synchronization without actually performing it

-v: verbose output mode

-q: quiet mode, no output

-c: checksum; compares files by checksum instead of by size and modification time

-r: recursive copy


Note: In an rsync command, if the source path is a directory and ends with /, the contents of the directory are copied, not the directory itself. If the source path has no trailing /, the directory itself and all files in it are synchronized.


A trailing / on the destination path makes no difference.


For example:

# rsync -r /etc/yum.repos.d /tmp/test

The source path has no trailing /, so the yum.repos.d directory itself and all the files in it are copied. The resulting directory structure is /tmp/test/yum.repos.d.


# rsync -r /etc/yum.repos.d/ /tmp/test

The source path ends with /, so only the files under /etc/yum.repos.d are copied, directly into /tmp/test/.
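
The trailing-slash rule from the note above can be written down as a tiny function. This is a sketch only: landing_path is a hypothetical helper and /srv/site is a made-up source directory. It does not call rsync; it just prints where a file that sits directly inside the source directory would end up.

```shell
# Hypothetical helper: print where "rsync -a SRC DEST" would place a file
# that lives directly inside the source directory SRC.
landing_path() {
    src=$1
    dest=${2%/}          # a trailing slash on the destination does not matter
    file=$3
    case "$src" in
        */) echo "$dest/$file" ;;                      # contents only
        *)  echo "$dest/$(basename "$src")/$file" ;;   # the directory itself
    esac
}

landing_path /srv/site  /tmp/test index.html
landing_path /srv/site/ /tmp/test index.html
```

The first call prints a path that includes the site directory; the second prints a path directly under /tmp/test.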


-a: archive mode, preserves the original attributes of files

-p: preserve file permissions (rwx)

-t: preserve file timestamps

-l: preserve symbolic links

-g: preserve the owning group

-o: preserve the owner

-D: preserve device files


So -a is equivalent to -rlptgoD.


-e ssh: use SSH as the transport


-z: compress data during the transfer


--progress: show transfer progress

--stats: show statistics about the transfer


2. Synchronization test:


Now that the working modes and options are clear, you can run a synchronization test with rsync.


2.1 Local directory synchronization:

# rsync -a --progress /etc/fstab /tmp/back


The output shows the file list and transfer rate as /etc/fstab is copied to /tmp/back. Run the command again and you will see only "sending incremental file list", because nothing has changed.



Keep the following points in mind:


Deleting the source file /etc/fstab does not delete /tmp/back/fstab on the target unless the --delete option is added

Any change to a file's attributes (such as its modification time or permissions) or to its contents causes the file to be treated as modified

With the -u (--update) option, a file is not synchronized if the copy in the target directory is newer than the one in the source directory

A trailing slash on the source path changes the meaning: with a slash, only the files in the directory are copied; without a slash, the directory itself is copied along with the files in it
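
The --delete point can be tried safely in a throwaway directory. The sketch below is an illustration under assumptions: delete_demo is a hypothetical function name, the file names are made up, and the function simply reports "rsync not installed" when rsync is missing from the machine.

```shell
# Demo in a temporary directory: what happens to a file that was deleted
# from the source, first without and then with --delete.
delete_demo() {
    command -v rsync >/dev/null 2>&1 || { echo "rsync not installed"; return 0; }
    work=$(mktemp -d) || return 1
    mkdir "$work/src" "$work/dst"
    echo one > "$work/src/a.txt"
    echo two > "$work/src/b.txt"
    rsync -a "$work/src/" "$work/dst/"          # initial copy

    rm "$work/src/b.txt"                        # delete on the source side
    rsync -a "$work/src/" "$work/dst/"
    [ -e "$work/dst/b.txt" ] && echo "without --delete: b.txt still in target"

    rsync -a --delete "$work/src/" "$work/dst/"
    [ -e "$work/dst/b.txt" ] || echo "with --delete: b.txt removed from target"
    rm -rf "$work"
}

delete_demo
```

Because --delete removes files on the target, it is worth combining it with -n first to preview what would be deleted.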



2.2 Remote server Synchronization


To transfer files between servers with rsync, one machine must run the rsync service. The service relies on two files: the main configuration file and the password file it refers to. The configuration states the user name and group that the service runs as; these matter when file permissions and related attributes are changed, and wrong values can lead to permission errors. The configuration also defines the security settings of each module. The service is organized by module: each module name is chosen by you, and a module can require user name and password authentication, restrict client IP addresses, set whether the directory is writable, and so on. Different modules serve directories with different synchronization requirements.



2.2.1 Setting up the rsync server side

# yum -y install xinetd

# chkconfig rsync on


2.2.2 Providing the configuration file for rsync

/etc/rsyncd.conf

This configuration file is divided into two kinds of sections:

Global configuration section: exactly one

Shared (module) configuration sections: any number, each defined under a name of your choice in the form [shared_name]



2.2.3 Configuration Example:


# Global settings
uid = nobody
gid = nobody
use chroot = no
max connections = 10
# strict mode
strict modes = yes
pid file = /var/run/rsyncd.pid
log file = /var/log/rsyncd.log

# Directory to be synced
[tools]
path = /data
# whether errors may be ignored
ignore errors = yes
read only = no
write only = no
# whitelist
hosts allow = 172.16.0.0/16
hosts deny = *
# allow the module to be listed?
list = true
# This share runs as root; settings in a shared section override the global section
uid = root
gid = root
auth users = jingming
secrets file = /etc/rsyncd.secrets


This configures file transfer over a direct socket. [tools] is a custom shared-directory definition that specifies the path to synchronize, the authorized users, the password file, which client IP addresses are allowed to synchronize, and so on. The parameters can be adjusted to your own needs; see the manual for details:

# man rsyncd.conf


The password file /etc/rsyncd.secrets has the format:

username:password

The content is plain text, one user and password per line. The passwords here are set freely and are independent of the system users.

The users defined here must match the auth users setting in /etc/rsyncd.conf.


Set the permissions:

# chmod 600 /etc/rsyncd.secrets


This ensures that no one except root can access the file.
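
Creating the password file and restricting it in one step avoids a window in which the file is readable by others. The following is a sketch only: write_secrets is a hypothetical helper, the password shown is a placeholder, and the demo writes to a temporary file rather than to /etc/rsyncd.secrets.

```shell
# Hypothetical helper: write a "user:password" entry to a secrets file
# that only its owner can read or write.
write_secrets() {
    file=$1 user=$2 pass=$3
    (
        umask 077                                  # new files get mode 0600
        printf '%s:%s\n' "$user" "$pass" > "$file"
    )
    chmod 600 "$file"                              # also fixes an existing file
}

# Demo against a temporary file.
demo_file=$(mktemp)
write_secrets "$demo_file" jingming 'placeholder-password'
ls -l "$demo_file" | cut -c1-10
rm -f "$demo_file"
```

On the real server the first argument would be the path named by secrets file in /etc/rsyncd.conf.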




2.2.4 Start the rsync service

Edit the /etc/xinetd.d/rsync file and change disable to no, then restart xinetd:

# service xinetd restart


Alternatively, start rsync as a standalone daemon:

# /usr/bin/rsync --daemon --config=/etc/rsyncd.conf


To keep rsync from writing too many useless entries to /var/log/messages (where they can bury important information), it is recommended to comment out the log_on_success line in /etc/xinetd.conf:

# log_on_success = PID HOST DURATION EXIT


If you are using a firewall, add a rule that opens TCP port 873:


# iptables -A INPUT -p tcp -m state --state NEW -m tcp --dport 873 -j ACCEPT
# iptables -L
# netstat -tnl | grep 873
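
A plain grep for 873 also matches unrelated numbers such as port 8730, so a slightly stricter check can be useful in scripts. The sketch below is an illustration: has_listener is a hypothetical helper that reads netstat -tnl style text on standard input and looks at the local address column only, and the sample line is made up.

```shell
# Hypothetical helper: succeed if "netstat -tnl" style input on stdin
# contains a socket whose local address ends in the given port.
has_listener() {
    awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Demo with a made-up sample line instead of live netstat output.
sample='tcp        0      0 0.0.0.0:873       0.0.0.0:*       LISTEN'
if printf '%s\n' "$sample" | has_listener 873; then
    echo "port 873 is listening"
fi
```

On the server the live check would be netstat -tnl | has_listener 873.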



It is recommended to turn off SELinux, which may cause synchronization errors due to strong access control.


2.3 Testing synchronization from the client

# Synchronize the local /root/ directory to the /data directory on the remote 192.168.18.100 (the path specified in the [tools] section of the shared configuration)
# rsync -ar --progress /root/ jingming@192.168.18.100::tools


# Synchronize the remote /data directory to the local /tmp/back
# rsync -ar --progress jingming@192.168.18.100::tools /tmp/back


You can also give the client a password file so that the copy runs without interaction:

# rsync -arp --progress --password-file=/etc/rsync_client.pwd jingming@192.168.18.100::tools /tmp

The password file contains only the password; remember to set its permissions to 600:

# echo "password_for_jingming" > /etc/rsync_client.pwd
# chmod 600 /etc/rsync_client.pwd



As the first two commands show, the distinction between server and client is blurred here. In both cases the rsync daemon runs on the remote 192.168.18.100. The first command pushes a local directory to the remote machine, so the remote server holds the backup. The second command pulls files from the remote machine, so the local server holds the backup; it can also be seen as restoring data to the local server.



2.4 Limitations of rsync:


Compared with traditional backup methods such as cp and tar, rsync offers better security, faster backups, and support for incremental backup. It meets backup requirements that are not strongly real-time, such as regularly backing up file server data to a remote server or regularly mirroring a local disk.


As application systems grow, the demands on data security and reliability rise, and rsync shows several shortcomings in high-end business systems. First, to synchronize data rsync must scan and compare all files before it can transfer the differences. When the number of files reaches the millions or even tens of millions, scanning them all is very time consuming, and since usually only a small part has changed, this is very inefficient. Second, rsync cannot monitor and synchronize data in real time. Synchronization can be triggered from crontab, but there is always a time gap between two runs, so server and client data can be inconsistent and data cannot be fully recovered after an application failure. For these reasons, the rsync+inotify combination emerged.




II. inotify-tools


2.1 What is inotify


inotify is a powerful, fine-grained, asynchronous file system event monitoring mechanism that has been part of the Linux kernel since 2.6.13. It lets a monitoring program open a single file descriptor and watch one or more files or directories for a set of events such as open, close, move/rename, delete, create, or attribute change.


CentOS 6 already supports it:

Run # ll /proc/sys/fs/inotify and check for the three entries below. If they are not listed, inotify is not supported:


# ll /proc/sys/fs/inotify/
total 0
-rw-r--r-- 1 root root 0 Aug 21 19:35 max_queued_events
-rw-r--r-- 1 root root 0 Aug 21 19:35 max_user_instances
-rw-r--r-- 1 root root 0 Aug 21 19:35 max_user_watches



/proc/sys/fs/inotify/max_queued_events is the maximum number of events that can be queued in an inotify instance created by inotify_init. Events beyond this limit are discarded, but an IN_Q_OVERFLOW event is generated.

/proc/sys/fs/inotify/max_user_instances is the maximum number of inotify instances that each real user ID can create.

/proc/sys/fs/inotify/max_user_watches is the maximum number of watches (monitored files or directories) that each real user ID can create. If you monitor a large number of files, increase this value as the situation requires.
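
The three limits can be read in one pass before deciding whether any of them needs raising. This is a minimal sketch: show_inotify_limits is a hypothetical helper, and it prints "(not available)" for an entry that does not exist, for example on a kernel without inotify.

```shell
# Hypothetical helper: print the three inotify limits from the given
# directory (default: /proc/sys/fs/inotify).
show_inotify_limits() {
    dir=${1:-/proc/sys/fs/inotify}
    for name in max_queued_events max_user_instances max_user_watches; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_inotify_limits

# To raise a limit until the next reboot (as root), for example:
#   sysctl -w fs.inotify.max_user_watches=<new value>
```

The value to choose depends on how many directories the monitored tree contains, since a recursive watch needs one watch per directory.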




2.2 What is inotify-tools:


inotify-tools is a C library that provides a programming interface to inotify on Linux, together with a set of command-line tools for monitoring file system events. It is written in C and has no dependencies other than kernel support for inotify. It provides two tools: inotifywait, which monitors changes to files or directories, and inotifywatch, which collects statistics on file system access events.



2.2.1 Compiling and installing inotify-tools

# tar xf inotify-tools-3.13.tar.gz
# cd inotify-tools-3.13
# ./configure
# make && make install



By default, the inotifywait and inotifywatch commands are installed to /usr/local/bin.





2.3 inotifywait usage examples:


Monitor changes to files in the /root/tmp directory:

# /usr/local/bin/inotifywait -mrq --timefmt '%Y/%m/%d-%H:%M:%S' --format '%T %w%f' \
  -e modify,delete,create,move,attrib /root/tmp/



This command continuously monitors the /root/tmp directory and its subdirectories for file changes. The monitored events are modify, delete, create, move, and attribute change, and each event is printed to the screen. Once the command is running, creating or modifying a file under the monitored directory produces output. For example, when monitoring /tmp/backup:


# inotifywait -mrq --timefmt '%Y/%m/%d-%H:%M:%S' --format '%T %w%f' \
> -e modify,delete,create,move,attrib /tmp/backup
2015/08/21-19:42:29 /tmp/backup/1.txt
2015/08/21-19:42:39 /tmp/backup/2.txt
2015/08/21-19:42:39 /tmp/backup/2.txt
2015/08/21-19:42:47 /tmp/backup/.3.txt.swp
2015/08/21-19:42:47 /tmp/backup/.3.txt.swx
2015/08/21-19:42:47 /tmp/backup/.3.txt.swx
2015/08/21-19:42:47 /tmp/backup/.3.txt.swp
2015/08/21-19:42:47 /tmp/backup/.3.txt.swp
2015/08/21-19:42:47 /tmp/backup/.3.txt.swp
2015/08/21-19:42:51 /tmp/backup/.3.txt.swp
2015/08/21-19:42:54 /tmp/backup/.3.txt.swp
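
The sample output shows that a single edit can produce the same line several times in a row. When the output feeds another command, identical consecutive lines can be collapsed first. The function below is a sketch, not part of inotify-tools: dedupe_events is a hypothetical helper and the sample lines are made up.

```shell
# Hypothetical helper: drop an event line when it is identical to the
# line immediately before it.
dedupe_events() {
    prev=
    while IFS= read -r line; do
        [ "$line" = "$prev" ] && continue
        printf '%s\n' "$line"
        prev=$line
    done
}

# Demo with three sample lines; the repeated second line is printed once.
printf '%s\n' \
    '2015/08/21-19:42:29 /tmp/backup/1.txt' \
    '2015/08/21-19:42:39 /tmp/backup/2.txt' \
    '2015/08/21-19:42:39 /tmp/backup/2.txt' | dedupe_events
```

In real use the function would sit in the pipeline directly after inotifywait. Lines only collapse when both the timestamp and the path match, so events in different seconds are still reported separately.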



III. Combining rsync with inotify-tools for real-time synchronization


The core of this step is to create a script, rsync.sh, on the client. It uses inotifywait to monitor changes in a local directory and triggers rsync to transfer the changed files to the remote backup server.

To stay closer to real-world use, some subdirectories and files are excluded from synchronization, such as /tmp/backup/log and temporary files.



3.1 Creating the list of files excluded from synchronization


There are two ways to exclude files or directories that do not need to be synchronized.

The first is to let inotify monitor the entire directory and add an exclusion option only to rsync, which is simple.

The second is to exclude part of the directory tree from inotify monitoring and also add the exclusion option to rsync, which reduces unnecessary network bandwidth and CPU consumption. We choose the second approach.



3.1.1 inotifywait exclusions


This is done on the client. Assume that the files under /tmp/src/mail/2014/ and /tmp/src/mail/2015/cache/ are not synchronized and therefore need no monitoring, while all other files and directories under /tmp/src/ are synchronized. (For temporary files that are held open, you can monitor close_write instead of modify.)


inotifywait can exclude directories from monitoring in two ways, --exclude <pattern> and --fromfile <file>, and both can be used together. The former accepts a regular expression, while the latter can only list specific directories or files.


# vim /etc/inotify_exclude.lst
/tmp/src/pfd
@/tmp/src/2014

In this file, a path prefixed with @ is excluded from monitoring and the other paths are watched.


If what you want to exclude is more complex, you must use a regular expression, which can only be given through the inotifywait --exclude option,

such as --exclude '(.*/*\.log|.*/*\.swp)$|^/tmp/src/mail/(2014|201.*/cache.*)'

This excludes the 2014 directory under /tmp/src/mail/, all files or directories whose names start with cache under the 201* directories, and all files under /tmp/src that end with .log or .swp.
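
Before putting a pattern like this into production, it can be checked against a few sample paths. The sketch below assumes that grep -E is a close enough stand-in for the POSIX extended regular expressions that inotifywait uses; is_excluded is a hypothetical helper and the sample paths are made up.

```shell
# The exclusion pattern from the text, tested with grep -E.
exclude_re='(.*/*\.log|.*/*\.swp)$|^/tmp/src/mail/(2014|201.*/cache.*)'

# Hypothetical helper: succeed if the path matches the exclusion pattern.
is_excluded() {
    printf '%s\n' "$1" | grep -Eq "$exclude_re"
}

for p in /tmp/src/app/error.log \
         /tmp/src/mail/2014/a.eml \
         /tmp/src/mail/2015/cache/x \
         /tmp/src/mail/2015/inbox/1.eml; do
    if is_excluded "$p"; then
        echo "excluded  $p"
    else
        echo "monitored $p"
    fi
done
```

Only the last sample path is reported as monitored; the first three match the pattern.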



3.1.2 rsync exclusions


If you exclude directories from inotifywait monitoring, you must also exclude the corresponding directories in rsync. Otherwise, as soon as any synchronization is triggered, the directories that should not be synchronized will be synchronized as well. Like inotifywait, rsync has two ways to write exclusions: --exclude and --exclude-from.


I also prefer to list the excluded directories in a separate file, which is easier to manage.

When using --exclude-from=FILE, the path to FILE itself is absolute, but the entries inside FILE are relative paths, such as:

/etc/rsyncd.d/rsync_exclude.lst:
mail/2014/
mail/201*/201*/201*/.??*
mail??*
src/*.html*
src/js/
src/ext3/
src/2014/20140[1-9]/
src/201*/201*/201*/.??*
membermail/
membermail??*
membermail/201*/201*/201*/.??*


The excluded content includes the 2014 directory under mail, temporary or hidden files under paths like 2015/201501/20150101/, and so on.



3.2 The client script that synchronizes to the remote server:


The following is the complete synchronization script, rsync.sh; trim it as needed.

#!/bin/bash
# rsync auto sync script with inotify
# 2015-08-21 richie jing

# variables
current_date=$(date +%Y%m%d_%H%M%S)
source_path=/tmp/src/
log_file=/var/log/rsync_client.log

# rsync
rsync_server=192.168.18.100
rsync_user=jingming
rsync_pwd=/etc/rsync_client.pwd
rsync_module=tools
inotify_exclude='(.*/*\.log|.*/*\.swp)$|^/tmp/src/mail/(2014|20.*/.*che.*)'
rsync_exclude='/etc/rsyncd.d/rsync_exclude.lst'

# rsync client pwd check
if [ ! -e "${rsync_pwd}" ]; then
    echo -e "rsync client password file ${rsync_pwd} does not exist!"
    exit 1
fi

# inotify function: run rsync each time inotifywait reports an event
inotify_fun() {
    /usr/local/bin/inotifywait -mrq --timefmt '%Y/%m/%d-%H:%M:%S' --format '%T %w %f' \
        --exclude "${inotify_exclude}" -e modify,delete,create,move,attrib "${source_path}" \
        | while read -r file
    do
        /usr/bin/rsync -auvrtzopgP --exclude-from="${rsync_exclude}" --progress --bwlimit=200 --password-file="${rsync_pwd}" "${source_path}" "${rsync_user}@${rsync_server}::${rsync_module}"
    done
}

# inotify log: run in the background and append all output to the log file
inotify_fun >> "${log_file}" 2>&1 &



The --bwlimit=200 option limits the transfer rate to at most 200 KB/s. In practice, running without a rate limit was found to cause significant CPU consumption.
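
The loop in rsync.sh starts one rsync run of the whole source path for every event line. One possible refinement, shown here only as a sketch and not as the author's method, is to reduce a batch of event lines to the set of directories that changed, so that later steps can act once per directory. changed_dirs is a hypothetical helper; it assumes the script's '%T %w %f' output format and paths without spaces.

```shell
# Hypothetical helper: from event lines in "TIME DIR FILE" form, print
# each directory that appears, once, in sorted order.
changed_dirs() {
    awk '{ print $2 }' | sort -u
}

# Demo with made-up event lines.
printf '%s\n' \
    '2015/08/21-19:42:29 /tmp/src/docs/ 1.txt' \
    '2015/08/21-19:42:39 /tmp/src/docs/ 2.txt' \
    '2015/08/21-19:42:47 /tmp/src/img/ logo.png' | changed_dirs
```

For the three sample events this prints two directories, /tmp/src/docs/ and /tmp/src/img/.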


Run the script on the client to synchronize the directory in real time:

# ./rsync.sh



Blog reference:

http://segmentfault.com/a/1190000002427568

Thanks to the blog author for sharing.

This article is from the "Richier" blog; please keep this source attribution: http://richier.blog.51cto.com/1447532/1686847
