sersync is my first open-source project. It is mainly used for server synchronization, web mirroring, and similar tasks, and it is built on Boost 1.41.0, the inotify API, and the rsync command.
Test environment: CentOS and Ubuntu.
Source code address:
http://code.google.com/p/sersync/
At present, the most widely used tools for this kind of synchronization are inotify-tools and openduckbill, an open-source Google project that depends on inotify-tools. Both solutions are script-based and rest on the same idea: combine inotify with the rsync command. Compared with these two projects, this project has the following advantages:
1. sersync is written in C++ and filters out the temporary files and repeated file operations that the Linux file system generates (described later). This reduces runtime overhead and network usage when rsync is invoked, so synchronization is faster.
2. Compared with the two projects above, sersync is very simple to configure: download the source package from http://code.google.com/p/sersync/downloads/list (32-bit and 64-bit versions are available). The binary I compiled in the bin directory can be used directly with the XML file in the same directory.
3. Unlike other open-source scripts, this project synchronizes with multiple threads, which helps keep multiple servers in sync in real time, especially when large files are being synchronized.
4. This project has its own error handling mechanism. Files that fail to synchronize are retried through a failure queue. If they fail again, they are recorded and re-synchronized every 10 hours.
5. This project has a built-in crontab function. Enable it in the XML configuration file and a full synchronization is performed at the interval you require.
6. This project comes with socket and HTTP protocol extensions to support secondary development.
Basic Architecture:
inotify and rsync are used together to synchronize servers in real time. inotify monitors file system events. rsync is a widely used synchronization algorithm whose advantage is that it operates only on the parts of a file that differ, so it is far more efficient than mirroring through a mounted file system.
Design Analysis
As shown in the figure, the threads in the thread group are daemon threads that wait on the event queue. When data arrives in the queue, the threads in the group wake up one by one; when the queue holds many inotify events, all of them wake up and work together. The purpose of this design is to process multiple inotify events at the same time and make full use of the server's concurrency (number of cores x 2 + 2).
It is called a thread group because each thread, while working, creates one sub-thread per remote server. The sub-threads ensure that a file is synchronized to every server at the same time, so even when the file is large, each remote server receives it simultaneously.
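The thread group described above can be sketched roughly as follows. This is a minimal Python illustration, not sersync's C++ implementation: the names worker_count, sync_to_server, and run_thread_group are hypothetical, and the real rsync call is replaced by a stub that only records what would be sent.

```python
import queue
import threading

def worker_count(cores):
    # Pool size from the text: number of cores x 2 + 2.
    return cores * 2 + 2

def sync_to_server(server, path, results, lock):
    # Stand-in for the real rsync call (hypothetical): record the pair only.
    with lock:
        results.append((server, path))

def worker(events, servers, results, lock):
    while True:
        path = events.get()
        if path is None:  # sentinel: no more events for this worker
            break
        # One sub-thread per remote server, so all servers are served together.
        subs = [threading.Thread(target=sync_to_server,
                                 args=(server, path, results, lock))
                for server in servers]
        for t in subs:
            t.start()
        for t in subs:
            t.join()

def run_thread_group(paths, servers, cores=4):
    events = queue.Queue()
    results, lock = [], threading.Lock()
    pool = [threading.Thread(target=worker,
                             args=(events, servers, results, lock))
            for _ in range(worker_count(cores))]
    for t in pool:
        t.start()
    for p in paths:
        events.put(p)
    for _ in pool:
        events.put(None)  # one sentinel per worker
    for t in pool:
        t.join()
    return results
```

With 4 cores, worker_count gives 10, which matches the default thread count mentioned in the usage section.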
The service thread has three roles. The first is to process files that failed to synchronize and synchronize them again. The second concerns files that fail again: the rsync_fail_log.sh script is generated to record these failed events, and the script is executed once every 10 hours and then cleared. The third role is the crontab function, which synchronizes all paths at a set interval.
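The retry behaviour of the service thread might look like the sketch below. It is an illustration under assumptions: sync and command_for are caller-supplied, hypothetical callables, because the text does not give the exact rsync command line or the format of the lines in rsync_fail_log.sh. The 10-hour timer is left out.

```python
import subprocess

def sync_with_retry(paths, sync, command_for, fail_log="rsync_fail_log.sh"):
    """sync(path) returns True on success; command_for(path) returns the
    shell line to record for a path that failed twice (both hypothetical)."""
    first_failures = [p for p in paths if not sync(p)]            # first attempt
    second_failures = [p for p in first_failures if not sync(p)]  # retry once
    with open(fail_log, "a") as script:
        for p in second_failures:
            script.write(command_for(p) + "\n")
    return second_failures

def run_and_clear(fail_log="rsync_fail_log.sh"):
    # Run the recorded commands with sh, then empty the script.
    result = subprocess.run(["sh", fail_log])
    open(fail_log, "w").close()
    return result.returncode
```

In a real deployment run_and_clear would be called by a timer every 10 hours, as the text describes.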
The filter queue exists to remove duplicate inotify events generated within a short period of time. For example, when a folder is deleted, inotify generates deletion events both for the folder and for the files and subfolders inside it. When the folder deletion event enters the filter queue, all deletion events for files inside that folder that were queued earlier are filtered out, so only one event remains and the synchronization burden is reduced. Temporary files and repeated operations are also generated when a file is modified.
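The folder-deletion case can be sketched as follows. This is a simplified Python model of the idea, not the actual sersync queue; the event names delete_file and delete_dir and the function add_event are made up for illustration.

```python
def add_event(pending, kind, path):
    """pending is a list of (kind, path) tuples waiting to be synchronized."""
    if kind == "delete_dir":
        prefix = path.rstrip("/") + "/"
        # Drop deletions already queued for anything inside this folder.
        pending[:] = [(k, p) for (k, p) in pending
                      if not (k.startswith("delete") and p.startswith(prefix))]
    if (kind, path) not in pending:  # skip exact duplicates
        pending.append((kind, path))
    return pending

pending = []
add_event(pending, "delete_file", "/opt/tongbu/d/a.txt")
add_event(pending, "delete_file", "/opt/tongbu/d/b.txt")
add_event(pending, "delete_dir", "/opt/tongbu/d")
print(pending)  # only the folder deletion is left
```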
Example:
When we save and quit with :wq while editing a file named test in vi, the following events are generated:
Even after files whose names start with "." or end with "~" are filtered out, three operations on the test file remain: delete, create, and save. After the filter queue, only one event is left, which also improves efficiency to some extent.
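A rough sketch of this filtering is shown below. The event list is invented for illustration (the original listing is not reproduced here), and is_temporary and collapse are hypothetical helper names, not sersync functions.

```python
import os

def is_temporary(path):
    # Names starting with "." or ending with "~" are treated as editor temp files.
    name = os.path.basename(path)
    return name.startswith(".") or name.endswith("~")

def collapse(events):
    """events is a list of (kind, path); return each real path once, in order."""
    paths = []
    for _kind, path in events:
        if is_temporary(path):
            continue
        if path not in paths:
            paths.append(path)
    return paths

# Illustrative events only, loosely modelled on saving a file in an editor.
events = [
    ("create", "/opt/tongbu/.test.swp"),
    ("delete", "/opt/tongbu/test"),
    ("create", "/opt/tongbu/test"),
    ("close_write", "/opt/tongbu/test"),
    ("delete", "/opt/tongbu/test~"),
]
print(collapse(events))
```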
For the events that inotify recognizes, see this blog post:
http://hi.baidu.com/johntech/blog/item/e4a31a3db1ee1ce755e723f4.html
The following describes how to use and configure sersync.
I. Compilation
The source code is in the src directory.
The required static libraries are in the lib directory.
The bin directory holds the final binary.
Run make in the sersync directory; the generated binary is placed in the bin directory.
Under normal circumstances you do not need to compile anything: the prebuilt 64-bit executable in the bin directory can be deployed directly.
II. Installation
1. Pre-installation configuration
Before use, make sure that rsyncd.conf has been configured and the rsync daemon is running. A simple configuration is as follows:
Edit the rsync configuration file on each master/slave server to be synchronized. For details about the configuration parameters, search Google.
vi /etc/rsyncd.conf
uid = root
gid = root
max connections = 36000
use chroot = no
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsyncd.lock
[tongbu]
path = /opt/tongbu
comment = xoyo video files
ignore errors
read only = no
hosts allow = 192.168.8.40/26 192.168.138.94/24
hosts deny = *
Then start the rsync daemon on each server as follows:
rsync --daemon
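Before going further, it can help to confirm that the daemon answers and exposes the module. The sketch below relies on the standard rsync behaviour that "rsync host::" lists the modules of a daemon; the helper names are hypothetical, and the IP address and module name are the ones used in this article. The check has to run from a host permitted by "hosts allow".

```python
import subprocess

def list_modules_command(host):
    # "rsync host::" asks the rsync daemon on that host for its module list.
    return ["rsync", f"{host}::"]

def module_available(host, module, run=subprocess.run):
    result = run(list_modules_command(host), capture_output=True, text=True)
    if result.returncode != 0:
        return False
    # Each listing line starts with the module name, followed by its comment.
    return any(line.split()[:1] == [module]
               for line in result.stdout.splitlines())

if __name__ == "__main__":
    print(module_available("192.168.8.38", "tongbu"))
```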
2. Install this software
Because the binary is statically compiled, you only need to modify the configuration file on the xoyo server and run the executable in the bin directory directly.
tar zxvf sersync2.1.tar.gz
cd sersync
Before use, edit the configuration file:
vi confxml.xml
Depending on the modules you use, modify the configuration file as follows:
1. Use the server synchronization function
XML configuration file usage
As shown in the configuration file:
You only need to modify the content under the sersync tag:
<sersync>
<localpath watch="/opt/tongbu">
<remote ip="192.168.8.38" name="tongbu"/>
<remote ip="192.168.8.39" name="tongbu"/>
</localpath>
<crontab start="true" schedule="30"/>
</sersync>
In the watch attribute of the localpath tag, enter the local path to be synchronized. In each remote tag, enter the IP address of the remote host and the name of the rsync module to synchronize to. If the start attribute of the crontab tag is set to true, the schedule attribute determines how often the monitored directory is fully synchronized.
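To make the meaning of each attribute concrete, here is a small Python sketch that reads such a section with the standard library. It is only an illustration of the structure, not how sersync itself parses the file; read_sersync_section is a hypothetical name, and the tag names are written in lowercase.

```python
import xml.etree.ElementTree as ET

CONFIG = """
<sersync>
  <localpath watch="/opt/tongbu">
    <remote ip="192.168.8.38" name="tongbu"/>
    <remote ip="192.168.8.39" name="tongbu"/>
  </localpath>
  <crontab start="true" schedule="30"/>
</sersync>
"""

def read_sersync_section(xml_text):
    root = ET.fromstring(xml_text)
    localpath = root.find("localpath")
    crontab = root.find("crontab")
    return {
        # Local directory that is monitored and pushed out.
        "watch": localpath.get("watch"),
        # (remote host IP, rsync module name) for every target server.
        "remotes": [(r.get("ip"), r.get("name"))
                    for r in localpath.findall("remote")],
        # Whether the periodic full synchronization is switched on.
        "crontab_enabled": crontab.get("start") == "true",
        "schedule": int(crontab.get("schedule")),
    }

print(read_sersync_section(CONFIG))
```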
The sersync directory contains many files, but if you do not need to compile, you only need sersync2.1 and confxml.xml from the bin directory. The rest are source code files and library files.
Execution and parameter configuration
./sersync2.1 -h  view the help
./sersync2.1 -r  synchronize the entire path to the remote servers before real-time synchronization starts
./sersync2.1 -d  enable daemon mode and run in the background
./sersync2.1 -o  specify the configuration file name. If the configuration file is not named confxml.xml, use '-o xxxxx.xml'
./sersync2.1 -n  specify the number of synchronization daemon threads. The default is 10, which suits current 4-core servers. To increase or decrease it, use '-n qty'.
The common usage is:
./sersync2.1 -d -r
Log file description
During execution, the rsync_fail_log.sh file is generated. If a file fails to synchronize, it is first put back into a queue and retransmitted. If it fails again, it is recorded in rsync_fail_log.sh. This file is executed automatically every 10 hours and then cleared.
CDN refresh function usage
XML configuration file usage
To use the CDN refresh module, configure the following XML:
<plugin name="refreshcdn">
<localpath watch="/data0/htdocs/cms.xoyo.com/site/">
<cdninfo domainname="ccms.chinache.com" port="80" username="yourname" passwd="yourpasswd"/>
<sendurl base="http://pic.xoyo.com/cms"/>
<regexurl regex="false" match="cms.xoyo.com/site([/a-zA-Z0-9]*).xoyo.com/images"/>
</localpath>
</plugin>
The watch attribute of localpath is the directory to be monitored.
The cdninfo tag specifies the domain name, port number, user name, and password of the CDN interface.
The sendurl tag is the prefix of the URL to be refreshed.
When the regex attribute of the regexurl tag is true, the regular expression in the match attribute is applied to the path reported by inotify, and the part captured by the expression is used as part of the URL.
Example:
Suppose the file event is for /data0/htdocs/cms.xoyo.com/site/jx3.xoyo.com/images/a/123.txt
With the match expression above applied, the final refresh path is:
http://pic.xoyo.com/cms/jx3/a/123.txt
If the regex attribute is false, the final refresh path is:
http://pic.xoyo.com/cms/jx3.xoyo.com/images/a/123.txt
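The two results can be reproduced with the short Python sketch below. It is one possible reading of the rule, not sersync's code: it assumes that the matched part of the path is replaced by the captured group when regex is true, and that the path relative to the watched directory is appended when regex is false. The function name refresh_url is hypothetical.

```python
import re

WATCH = "/data0/htdocs/cms.xoyo.com/site/"
BASE = "http://pic.xoyo.com/cms"
MATCH = r"cms.xoyo.com/site([/a-zA-Z0-9]*).xoyo.com/images"

def refresh_url(event_path, use_regex):
    if use_regex:
        m = re.search(MATCH, event_path)
        if m:
            # Keep the captured group, then whatever follows the match.
            return BASE + m.group(1) + event_path[m.end():]
    # Otherwise append the path relative to the watched directory.
    return BASE + "/" + event_path[len(WATCH):]

path = "/data0/htdocs/cms.xoyo.com/site/jx3.xoyo.com/images/a/123.txt"
print(refresh_url(path, use_regex=True))
print(refresh_url(path, use_regex=False))
```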
Execution and parameter configuration
./sersync -d -m refreshcdn
Log file description
Execution generates an error.log file, which records the information received from the CDN and the refreshed paths.
Expansion modules
Socket interface
When the socket module is enabled, the file path information reported by inotify is sent to the specified IP address and port:
./sersync -d -m socket
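For secondary development, the receiving end could be as simple as the sketch below. The wire format is an assumption: the text does not say how sersync frames the path information, so this example treats the payload as newline-separated text. The port number 8009 and the function names are hypothetical.

```python
import socket

def receive_paths(conn, bufsize=4096):
    """Read until the sender closes the connection and split into lines.
    Newline-separated paths are an assumption, not a documented format."""
    chunks = []
    while True:
        data = conn.recv(bufsize)
        if not data:
            break
        chunks.append(data)
    text = b"".join(chunks).decode("utf-8", errors="replace")
    return [line for line in text.splitlines() if line]

def serve_once(host="127.0.0.1", port=8009):
    # Accept a single connection and return the paths it delivered.
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn:
            return receive_paths(conn)
```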
HTTP interface
The HTTP module can POST the events monitored by inotify to the host of the specified domain name:
./sersync -d -m http
Here, mask is the inotify event mask; 8 means the file was modified and saved. For other inotify event masks, search Google.
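As a quick reference, the mask value can be decoded with the sketch below. The numeric values are the ones defined in the Linux header sys/inotify.h, where 8 is IN_CLOSE_WRITE (a file opened for writing was closed); decode_mask is a hypothetical helper name.

```python
# Event bits as defined in <sys/inotify.h> on Linux.
INOTIFY_FLAGS = {
    0x00000001: "IN_ACCESS",
    0x00000002: "IN_MODIFY",
    0x00000004: "IN_ATTRIB",
    0x00000008: "IN_CLOSE_WRITE",
    0x00000010: "IN_CLOSE_NOWRITE",
    0x00000020: "IN_OPEN",
    0x00000040: "IN_MOVED_FROM",
    0x00000080: "IN_MOVED_TO",
    0x00000100: "IN_CREATE",
    0x00000200: "IN_DELETE",
    0x00000400: "IN_DELETE_SELF",
    0x00000800: "IN_MOVE_SELF",
    0x40000000: "IN_ISDIR",
}

def decode_mask(mask):
    # Return the names of all event bits set in the mask.
    return [name for bit, name in INOTIFY_FLAGS.items() if mask & bit]

print(decode_mask(8))
```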