A tutorial on using rsync

Contents
  1. Introduction
  2. How does it work?
  3. Setting up a server
  4. Using rsync itself
  5. Rsync on the net

Introduction

Rsync is a wonderful little utility that's amazingly easy to set up on your machines. Rather than relying on a scripted FTP session or some other form of file transfer script, rsync copies only the diffs of files that have actually changed, compressed and, if you want, through SSH for security. That's a mouthful, but what it means is:

  • Diffs - only the pieces of files that have actually changed are transferred, rather than the whole file. This makes updates faster, especially over slower links like modems. FTP would transfer the entire file, even if only one byte changed.
  • Compression - the tiny pieces of diffs are then compressed on the fly, further saving you file transfer time and reducing the load on the network.
  • Secure Shell - the security-conscious among you will like this, and you should all be using it. The stream from rsync is passed through the SSH protocol to encrypt your session instead of rsh, which is also an option (and required if you don't use SSH; enable it in your /etc/inetd.conf and restart your inet daemon if you disabled it for security).

Rsync is rather versatile as a backup/mirroring tool, offering features above and beyond those just listed. I personally use it to synchronize website trees from staging to production servers and to back up key areas of the filesystems, both automatically through cron and by a CGI script. Here are some other key features of rsync:

  • Support for copying links, devices, owners, groups and permissions
  • Exclude and exclude-from options similar to GNU tar
  • A CVS exclude mode for ignoring the same files that CVS would ignore
  • Does not require root privileges
  • Pipelining of file transfers to minimize latency costs
  • Support for anonymous or authenticated rsync servers (ideal for mirroring)

How does it work?

You must set up one machine or another of a pair to be an "rsync server" by running rsync in daemon mode ("rsync --daemon" at the command line) and setting up a short, easy configuration file (/etc/rsyncd.conf). Below I'll detail a sample configuration file. The options are readily understood and few in number, yet quite powerful.

Any number of machines with rsync installed may then synchronize to and/or from the machine running the rsync daemon. You can use this to make backups, mirror filesystems, distribute files or any number of similar operations. Through the use of the "rsync algorithm", which transfers only the diffs between files (similar to a patch file) and then compresses them, you are left with a very efficient system.
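To make the "diffs only" idea a bit more concrete, here is a minimal sketch of the kind of weak rolling checksum the rsync technical report describes: two running sums over a window of bytes that can be slid along one byte at a time without rescanning the whole window. This is my own illustration in POSIX shell, not rsync's actual code. The function names (weak_sum, roll) and the sample string are made up for the example, and real rsync pairs this cheap checksum with a stronger one before it trusts a match.

```shell
M=65536   # both checksum components are kept modulo 2^16

# bytes_of STRING: print the bytes of STRING as decimal numbers
bytes_of() {
  printf '%s' "$1" | od -An -v -tu1
}

# weak_sum STRING: set the globals A and B for the whole string.
# A is the plain sum of the bytes; B weights each byte by its distance
# from the end of the window (first byte weighs most).
# (POSIX sh has no local variables, so n and byte are globals too.)
weak_sum() {
  A=0
  B=0
  n=${#1}
  for byte in $(bytes_of "$1"); do
    A=$(( (A + byte) % M ))
    B=$(( (B + n * byte) % M ))
    n=$(( n - 1 ))
  done
}

# roll OUT IN N: slide an N-byte window one byte to the right, where OUT is
# the byte value leaving the window and IN is the byte value entering it.
roll() {
  A=$(( ((A - $1 + $2) % M + M) % M ))
  B=$(( ((B - $3 * $1 + A) % M + M) % M ))
}

data='the quick brown fox'
window=4
set -- $(bytes_of "$data")   # positional parameters now hold the byte values

weak_sum 'the '              # checksum of bytes 1 to 4
roll "$1" "$5" "$window"     # drop byte 1, take in byte 5
rolled="$A $B"

weak_sum 'he q'              # recompute bytes 2 to 5 from scratch
direct="$A $B"

echo "rolled: $rolled"
echo "direct: $direct"
```

Both lines should print the same pair of numbers. That cheap update is what lets the side holding the new file test a window at every byte offset against the block checksums it was sent, so only the stretches with no matching block have to travel over the wire.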

For those of you new to Secure Shell ("ssh" for short): you should be using it! There's a very useful and quite thorough "Getting Started with SSH" document available. You may also want to visit the Secure Shell web site, or just hit the master FTP site in Finland and grab it for yourself. It provides a secure, encrypted "pipe" for your network traffic. You should be using it instead of telnet, rsh or rlogin, and use the replacement "scp" command instead of "rcp".

Setting up a server

You must set up a configuration file on the machine meant to be a server and run the rsync binary in daemon mode. Even your rsync client machines can run rsync in daemon mode for two-way transfers. You can do this automatically for each connection via the inet daemon, or at the command line in standalone mode to leave it running in the background for often-repeated rsyncs. I personally use it in standalone mode, like Apache. I have a crontab entry that synchronizes a web site directory hourly. Plus there is a CGI script that folks fire off frequently during the day for immediate updating of content. This is a lot of rsync calls!

If you start the rsync daemon through your inet daemon, then you incur much more overhead with each rsync call. You basically restart the rsync daemon for every connection your server machine gets! It's the same reasoning as starting Apache in standalone mode rather than through the inet daemon: it's quicker and more efficient to start rsync in standalone mode if you anticipate a lot of rsync traffic. Otherwise, for the occasional transfer, follow the procedure to fire off rsync via the inet daemon. This way the rsync daemon, as small as it is, doesn't sit in memory if you only use it once a day or whatever. Your call.
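The two ways of starting the daemon might look roughly like this. The inetd and services entries follow the form given in the rsyncd.conf man page as far as I recall it; the /usr/bin/rsync location is an assumption, so adjust it to wherever your binary lives. The sketch only prints the lines, it does not touch any system file.

```shell
# Standalone: run this once (for example from a boot script). The daemon
# detaches and keeps listening on the rsync port, 873 by default.
standalone_cmd='rsync --daemon'

# Via inetd: one entry for /etc/services (if it is not already there) ...
services_line='rsync           873/tcp'

# ... and one for /etc/inetd.conf, after which inetd has to re-read its
# configuration. The path to the rsync binary is an assumption.
inetd_line='rsync   stream  tcp     nowait  root   /usr/bin/rsync rsyncd --daemon'

printf '%s\n' "$standalone_cmd" "$services_line" "$inetd_line"
```

With the inetd entry a fresh rsync process is started for every incoming connection, which is exactly the overhead discussed above.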

Below is a sample rsync configuration file. It is placed in your /etc directory as rsyncd.conf.

motd file = /etc/rsyncd.motd
log file = /var/log/rsyncd.log
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock

[simple_path_name]
   path = /rsync_files_here
   comment = My Very Own Rsync Server
   uid = nobody
   gid = nobody
   read only = no
   list = yes
   auth users = username
   secrets file = /etc/rsyncd.scrt

The options that you would modify right from the start are the site-specific values in the sample above. I'll start at the top, line by line, and go through what you should pay attention to. What the sample above does is set up a single "path" for rsync transfers to that machine.

Starting at the top are four lines specifying files and their paths for rsync running in daemon mode. The first is a "message of the day" (motd) file like you would use for an FTP server. This is a file whose contents get displayed when clients connect to this machine. Use it as a welcome, warning or simply identification. The next line specifies a log file to which diagnostic and normal run-time messages are sent. The PID file contains the "process ID" (PID) number of the running rsync daemon. A lock file is used to ensure that things run smoothly. These options are global to the rsync daemon.

The next block of lines is specific to a "path" that rsync uses. The options contained therein have effect only within the block (they're local, not global options). Start with the "path" name. It's somewhat confusing that rsync uses the term "path", as it's not necessarily a full pathname. It serves as an "rsync area nickname" of sorts. It's a short, easy to remember (and type!) name that you assign to a certain filesystem path with all the options you specify. Here are the things you need to set up first and foremost:

  • Path - this is the actual filesystem path where the files are rsync'ed from and/or to.
  • Comment - a short, descriptive explanation of what and where the path points to, shown in listings.
  • Auth users - you really should put this in to restrict access to only a pre-defined user that you specify in the following secrets file. It does not have to be a valid system user.
  • Secrets file - the file containing plaintext key/value pairs of usernames and passwords.
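As a rough sketch of what the secrets file holds: one "username:password" pair per line, kept readable only by its owner. The example below writes to a scratch file from mktemp rather than to /etc/rsyncd.scrt, and the password is obviously a placeholder.

```shell
# Scratch location for this sketch; on a real server this would be the
# file named by "secrets file" in rsyncd.conf (/etc/rsyncd.scrt above).
secrets_file=$(mktemp)

# One "username:password" pair per line. "username" matches the
# "auth users" entry in the sample; the password is a placeholder.
printf '%s:%s\n' 'username' 'change-this-password' > "$secrets_file"

# Keep the file private. To my understanding the daemon's "strict modes"
# check refuses a secrets file that other users can read.
chmod 600 "$secrets_file"

ls -l "$secrets_file"
```

Since the passwords sit there in plain text, the file permissions are the only thing protecting them on the server.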

One thing you should seriously consider is the "hosts allow" and "hosts deny" options for your path. Enter the IPs or hostnames that you wish to specifically allow or deny! If you don't do this, or at least use the "auth users" option, then basically that area of your filesystem is wide open to the world, to anyone using rsync! Something I seriously think you should avoid...
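Here is a minimal sketch of a path block locked down this way. It writes the fragment to a scratch file so you can look it over before merging anything into /etc/rsyncd.conf. The network range and the staging host name are placeholders I made up, so substitute your own.

```shell
module_conf=$(mktemp)

cat > "$module_conf" <<'EOF'
[simple_path_name]
   path = /rsync_files_here
   auth users = username
   secrets file = /etc/rsyncd.scrt
# Placeholder clients. Per the man page, a connection that matches none of
# the "hosts allow" patterns is rejected.
   hosts allow = 192.168.1.0/24 staging.example.com
EOF

cat "$module_conf"
```

"hosts deny" takes the same kind of pattern list if you would rather name the machines to keep out.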

Check the rsyncd.conf man page with "man rsyncd.conf" and read it very carefully where security options are concerned. You don't want just anyone to come in and rsync up an empty directory with the "--delete" option, now do you?

The other options are all explained in the man page for rsyncd.conf. Basically, the above options specify that transfers run as the given UID/GID, that the filesystem path is read/write and that the rsync path shows up in rsync listings. I keep the rsync secrets file in /etc/ along with the configuration and motd files, and I prefix them with "rsyncd." to keep them together.

Using rsync itself

Now on to actually using rsync, that is, initiating an rsync transfer. It's the same binary as the daemon, just without the "--daemon" flag. Its simplicity is a virtue. I'll start with the command line below, which I use in a script to synchronize a web tree.

rsync --verbose --progress --stats --compress --rsh=/usr/local/bin/ssh \
      --recursive --times --perms --links --delete \
      --exclude "*bak" --exclude "*~" \
      /www/* webserver:simple_path_name

Let's go through it one line at a time. The first line calls rsync itself and specifies the options "verbose", "progress" and "stats" so that you can see what's going on this first time around. The "compress" and "rsh" options specify that you want your stream compressed and sent through SSH (remember from above?) for security's sake.

The next line specifies how rsync itself operates on your files. You're telling rsync here to go through your source pathname recursively with "recursive" and to preserve the file timestamps and permissions with "times" and "perms". Copy symbolic links with "links" and delete things from the remote rsync server that are also deleted locally with "delete".
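Because "delete" removes files on the receiving side, it can be worth previewing a run first. Here is a small sketch using two scratch directories on the local machine and rsync's --dry-run option. The directory and file names are invented for the demonstration, and the rsync calls are skipped if rsync isn't installed.

```shell
tree_dir=$(mktemp -d)     # stands in for the local web tree
mirror_dir=$(mktemp -d)   # stands in for the copy on the server

touch "$tree_dir/keep.html" "$mirror_dir/keep.html" "$mirror_dir/stale.html"

if command -v rsync >/dev/null 2>&1; then
  # Preview: lists what would be transferred or deleted, changes nothing.
  rsync --recursive --times --perms --links --delete --dry-run --verbose \
        "$tree_dir/" "$mirror_dir/"
  [ -e "$mirror_dir/stale.html" ] && echo "stale.html survived the dry run"

  # The real thing: stale.html is gone from the mirror afterwards.
  rsync --recursive --times --perms --links --delete \
        "$tree_dir/" "$mirror_dir/"
  ls "$mirror_dir"
fi
```

The trailing slash on the source directory tells rsync to copy its contents rather than the directory itself.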

Now we have a line where there's quite a bit of power and flexibility. You can specify GNU tar-like include and exclude patterns here. In this example, I'm telling rsync to ignore some backup files that are common in this web tree ("*.bak" and "*~" files). You can put whatever you want to match here, suited to your specific needs. You can leave this line out and rsync will copy all your files as they are locally to the remote machine. It depends on what you want.
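If the list of patterns grows, the exclude-from option mentioned among the features lets you keep them in a file instead. Below is a minimal local sketch; the scratch directories and file names are invented for the demonstration, and the rsync call is skipped if rsync isn't installed.

```shell
site_dir=$(mktemp -d)     # stands in for the web tree
copy_dir=$(mktemp -d)     # stands in for the destination
exclude_list=$(mktemp)    # one pattern per line

touch "$site_dir/index.html" "$site_dir/index.html~" "$site_dir/old.bak"
printf '%s\n' '*bak' '*~' > "$exclude_list"

if command -v rsync >/dev/null 2>&1; then
  rsync --recursive --times --exclude-from="$exclude_list" \
        "$site_dir/" "$copy_dir/"
  ls "$copy_dir"          # only index.html should have been copied
fi
```

The two patterns are the same ones used on the command line above, so the result matches the two "exclude" options.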

Finally, the line that specifies the source pathname, the remote rsync machine and the rsync "path". The first part, "/www/*", specifies where on my local filesystem I want rsync to grab the files from for transmission to the remote rsync server. The next word, "webserver", should be the DNS name or IP address of your rsync server. It can be "w.x.y.z" or "rsync.mydomain.com" or even just "webserver" if you have a nickname defined in your /etc/hosts file, as I do here. The single colon specifies that you want the whole mess sent through your SSH tunnel. This is an important point to pay attention to! If you use two colons, you are addressing the rsync daemon instead of a remote shell login, and a direct daemon connection is not encrypted. Oops. The last part of that line, "simple_path_name", names the destination on the server, matching the rsync "path" from the sample above. (Strictly, after a single colon rsync reads this as a directory relative to the remote login directory; the module names in rsyncd.conf are looked up when you use the two-colon form.)

Yes, that's it! If you run the above command on your local rsync client, then you will transfer the entire "/www/*" tree to the remote "webserver" machine, except for backup files, preserving file timestamps and permissions, compressed and secure, with visual feedback on what's happening.

Note that in the above example, I used GNU-style long options so that you can see what the command line is all about. You can also use the single-letter abbreviations to do the same thing. Try running rsync with the "--help" option alone and you can see what syntax and options are available.
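For comparison, here is how the same command might look with single-letter options where rsync has them. As far as I know, "progress", "stats", "delete" and "exclude" are normally written out in full. The command is wrapped in a shell function (sync_short is just a name I picked) so that nothing is transferred until you call it.

```shell
# -v verbose, -z compress, -e remote shell to use,
# -r recursive, -t times, -p perms, -l links
sync_short() {
  rsync -v --progress --stats -z -e /usr/local/bin/ssh \
        -r -t -p -l --delete \
        --exclude "*bak" --exclude "*~" \
        /www/* webserver:simple_path_name
}
```

Running sync_short at the prompt then does the same as the long form.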

Rsync on the net

You can find the rsync distribution at the rsync home page or hit the rsync FTP site.

There are also various pages of information on rsync out there, most of which reside on the rsync web site. Below are three documents that you should also read thoroughly before using rsync so that you understand it well:

  • Rsync man page
  • rsyncd.conf man page
  • Rsync algorithm Technical Report



