Automatic Backup on Linux

Easy, independent, secure, and distributed network backup

Level: Intermediate

Carlos Justiniano
Software Designer, Ecuity, Inc.

Losing important data can be disastrous. Despite this, countless professionals neglect to back up their data. Although the reasons vary, the most common explanation is that routine backups are tedious. Because machines excel at mundane, repetitive tasks, automating the backup process is the key to reducing both the tedium of the work and the procrastination that comes with it.

If you use Linux, you already have access to extremely powerful tools for creating custom backup solutions. The solutions in this article use open source tools that ship with nearly every Linux distribution to perform backups ranging from simple archives to more advanced, secure network backups.

Simple backup
This article proceeds step by step. The approach is quite straightforward as long as you follow the basic steps.

Before studying the more advanced distributed backup solution, let's first look at a simple but powerful archiving mechanism. We will examine a handy script named arc, which lets us create backup snapshots from a Linux shell prompt.

Listing 1. Arc shell script

#!/bin/sh
tar czvf "$1.$(date +%Y%m%d-%H%M%S).tgz" "$1"
exit $?

The arc script takes a single file or directory name as a parameter, creates a compressed archive, and embeds the current date and time in the archive's name. For example, if you have a directory named beoserver, you can invoke the arc script with the beoserver directory name to create a compressed archive such as beoserver.20040321-014844.tgz.

The date command embeds a date and time stamp to help you organize your archives. The format is year, month, day, hour, minute, and second, although the seconds field is arguably overkill. See the date manual page (man date) for other options. Also, in Listing 1 we pass the -v (verbose) option to tar, which makes tar list the files it is archiving. If you prefer a silent backup, remove the -v option.

Listing 2. Archiving the beoserver directory

$ ls 
arc beoserver
$ ./arc beoserver
$ ls
arc beoserver beoserver.20040321-014844.tgz
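
As a rough sketch of how the arc idea might be hardened, the variant below wraps the same tar call in a function that checks its argument and quotes every expansion. The name arc_safe and the trailing-slash handling are illustrative assumptions, not part of the original script.

```sh
#!/bin/sh
# arc_safe: hypothetical variant of the arc script, written as a function.
# Usage: arc_safe <file-or-directory>
arc_safe() {
    # Refuse to run without exactly one existing target.
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "usage: arc_safe <file-or-directory>" >&2
        return 2
    fi
    # Strip one trailing slash so "beoserver/" and "beoserver" name the same archive.
    target=${1%/}
    archive="$target.$(date +%Y%m%d-%H%M%S).tgz"
    # Quote every expansion so names containing spaces survive.
    # The archive name is printed only when tar succeeds.
    tar czf "$archive" "$target" && echo "$archive"
}
```

Called as arc_safe beoserver, the function prints the name of the archive it created, which makes it easy to pass the result on to a later step.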

Advanced backup
This simple backup is useful, but it is still a manual process. Best practice is to back up data to multiple media and to separate geographic locations. The central idea is to avoid relying on any single storage medium or any single location.

The next example addresses this challenge. We will look at the hypothetical distributed network shown in Figure 1, in which a system administrator has access to two remote servers and one offsite storage server.

Figure 1. Distributed Network

Backup files on servers #1 and #2 will be securely transmitted to the offsite storage server, and the entire distributed backup process will run on a regular basis without human intervention. We will use a set of standard tools that are part of the Open Secure Shell tool suite (OpenSSH), along with the tape archiver (tar) and the cron task scheduling service. The overall plan is to use cron for scheduling, shell scripts and the tar application for the backup process, OpenSSH secure shell (ssh) for encrypted remote access and authentication, and secure shell copy (scp) to automate file transfers. Consult each tool's manual page for additional information.

Use a public/private key for secure remote access
In the context of digital security, a key is a piece of data used to encrypt or decrypt other pieces of data. The public/private key scheme is interesting because data encrypted with a public key can be decrypted only with the corresponding private key. You can freely distribute your public key so that others can encrypt messages intended for you. One reason public/private key schemes revolutionized digital security is that the sender and receiver do not have to share a common password. Among other contributions, public/private key cryptography has made e-commerce and other secure transactions possible. In this article, we will create and use public and private keys to build a highly secure distributed backup solution.

Each machine involved in the backup process must run the OpenSSH secure shell service (sshd), with port 22 reachable through any intervening firewall. If you already access remote servers, chances are you are using secure shell.

Our goal is to give machines secure access to one another without anyone having to type a password. Some people think the easiest way is to set up password-less accounts: do not do this. It is not safe. The approach used in this article takes perhaps an hour to set up and yields a system that is just as convenient as a "password-less" account but is generally considered secure.

First, make sure that OpenSSH is installed. Then check the version number. At the time of writing, the latest OpenSSH release was version 3.8, released on February 24, 2004. You should consider using a recent, stable release, and at the very least a version newer than 2.x. Visit the OpenSSH Security web page for details about the weaknesses of specific older versions (see the references below). So far, OpenSSH has been very stable and has proven free of many of the vulnerabilities reported for other SSH tools.

At a shell prompt, type ssh with the capital -V option to check the version number:

$ ssh -V
OpenSSH_3.5p1, SSH protocols 1.5/2.0, OpenSSL 0x0090701f

If ssh reports a version newer than 2.x, the machine is in relatively good shape. Even so, we recommend running the latest stable release of all your software, which is especially important for security-related software.
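
The version check can also be scripted. The sketch below extracts the major version from a banner such as the one shown above and tests whether it is newer than 2. The function names ssh_major and check_ssh_version are hypothetical, and the banner is passed in as an argument so the logic can be tried on a machine without ssh installed.

```sh
# ssh_major: print the major version number found in an "ssh -V" banner.
# Usage: ssh_major "<banner text>"
ssh_major() {
    # "OpenSSH_3.5p1, SSH protocols ..." -> "3"
    printf '%s\n' "$1" | sed -n 's/^OpenSSH_\([0-9][0-9]*\)\..*/\1/p'
}

# check_ssh_version: succeed only when the major version is newer than 2.
check_ssh_version() {
    major=$(ssh_major "$1")
    [ -n "$major" ] && [ "$major" -gt 2 ]
}
```

Because ssh -V writes its banner to standard error, a live check would look like check_ssh_version "$(ssh -V 2>&1)".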

The first step is to log on to the offsite storage server using an account that will have the privilege of accessing server 1 and server 2 (see Figure 1).

$ ssh

After logging on to the offsite storage server, use the ssh-keygen program with the -t dsa option to create a public/private key pair. The -t option is required and specifies the type of key to generate. We will use the Digital Signature Algorithm (DSA), which allows us to use the newer SSH2 protocol. See the ssh-keygen manual page for details.

While ssh-keygen runs, it asks where to store the keys before it asks for a passphrase. When asked where to store the key, simply press Enter; ssh-keygen then creates a hidden directory named .ssh (if one does not already exist) along with two files, one holding the public key and one holding the private key.

An interesting feature of ssh-keygen is that it lets you simply press Enter when prompted for a passphrase. If you do not supply a passphrase, ssh-keygen generates a private key that is not encrypted. As you can imagine, this is not a good idea. When asked for a passphrase, be sure to enter a reasonably long string of characters, preferably a mix of character types rather than a simple password-like word.

Listing 3. Always choose a good passphrase

[offsite]:$ ssh-keygen -t dsa 
Generating public/private dsa key pair.
Enter file in which to save the key (/home/accountname/.ssh/id_dsa):
Enter passphrase (empty for no passphrase): (enter passphrase)
Enter same passphrase again: (enter passphrase)
Your identification has been saved in /home/accountname/.ssh/id_dsa.
Your public key has been saved in /home/accountname/.ssh/
The key fingerprint is:
7e:5e:b2:f2:d4:54:58:6a:fa:6b:52:9c:da:a8:53:1b accountname@offsite

Because the .ssh directory that ssh-keygen creates is a hidden "dot" directory, you need to pass the -a option to the ls command to see it:

[offsite]$ ls -a
. .. .bash_logout .bash_profile .bashrc .emacs .gtkrc .ssh

Change into the hidden .ssh directory and list its contents:

[offsite]$ cd .ssh
[offsite]$ ls -lrt

The hidden .ssh directory now holds a private key (id_dsa) and a public key. You can examine each key file with a text editor such as vi or emacs, or simply with the less or cat command. You will see that the contents consist of base64-encoded characters.

Next, we need to copy and install the public key on server 1 and server 2. Do not use FTP. It makes more sense to use the secure copy program (scp) to transfer the public key to each remote machine.

Listing 4. Installing the public key on a remote server

[offsite]$ scp .ssh/'s password: (enter password, not new
passphrase!) 100% |*****************************| 614 00:00

[offsite]$ scp .ssh/'s password: (enter password, not new
passphrase!) 100% |*****************************| 614 00:00

Once the new public key is installed, we will be able to log on to each machine using the passphrase we specified when creating the key pair. For now, log on to each machine and append the offsite.pub file to a file named authorized_keys, which is stored in the .ssh directory of each remote machine. We can use a text editor or simply use the cat command to append the contents of offsite.pub to the authorized_keys file:

Listing 5. Adding offsite.pub to the list of authorized keys

[offsite]$ ssh's password: (enter password, not new
[server1]$ cat offsite.pub >> .ssh/authorized_keys
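
Running the append step twice leaves a duplicate line in authorized_keys. A possible guard, sketched below, appends the key only when an identical line is not already present. The function name install_pubkey is a hypothetical helper, and the sketch assumes the public key file holds a single line, as the files written by ssh-keygen do.

```sh
# install_pubkey: append a public key file to an authorized_keys file
# unless an identical line is already there.
# Usage: install_pubkey <pubkey-file> <authorized_keys-file>
install_pubkey() {
    pubkey=$1
    authfile=$2
    [ -r "$pubkey" ] || { echo "cannot read $pubkey" >&2; return 1; }
    # Create the target with owner-only permissions if it does not exist yet.
    if [ ! -e "$authfile" ]; then
        ( umask 077 && : > "$authfile" )
    fi
    # -F: fixed strings, -x: whole-line match, -q: quiet, -f: read patterns from file.
    if grep -qxF -f "$pubkey" "$authfile"; then
        echo "key already installed"
    else
        cat "$pubkey" >> "$authfile"
        echo "key appended"
    fi
}
```

On the remote machine this would be called as install_pubkey offsite.pub .ssh/authorized_keys.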

The next step is to add some extra security. First, we change the access permissions of .ssh so that only the owner has read, write, and execute privileges. Next, we make sure that the authorized_keys file can be accessed only by the owner. Finally, we delete the previously uploaded offsite.pub key file because it is no longer needed. Setting proper access permissions matters because the OpenSSH server may refuse to use keys that have insecure permissions.

Listing 6. Use chmod to modify permissions

[server1]$ chmod 700 .ssh 
[server1]$ chmod 600 .ssh/authorized_keys
[server1]$ rm offsite.pub
[server1]$ exit
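
To confirm that the permissions ended up as intended, a small check along the following lines may help. It is only a sketch: check_ssh_perms is a hypothetical name, and the modes 700 and 600 are the ones used in Listing 6.

```sh
# check_ssh_perms: report whether a .ssh directory and its authorized_keys
# file carry the owner-only modes 700 and 600.
# Usage: check_ssh_perms <path-to-.ssh-directory>
check_ssh_perms() {
    dir=$1
    status=0
    # find prints the path only when type and mode match exactly;
    # -prune keeps it from descending into the directory.
    if [ -z "$(find "$dir" -prune -type d -perm 700 2>/dev/null)" ]; then
        echo "$dir: expected a directory with mode 700" >&2
        status=1
    fi
    if [ -z "$(find "$dir/authorized_keys" -prune -type f -perm 600 2>/dev/null)" ]; then
        echo "$dir/authorized_keys: expected a file with mode 600" >&2
        status=1
    fi
    return "$status"
}
```

Running check_ssh_perms "$HOME/.ssh" on each server returns success only when both modes match.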

After completing the same steps on server 2, we can return to the offsite storage machine to test the new passphrase-based access. On the offsite server, type the following:

[offsite]$ ssh -v

When verifying that your account can now reach the remote server using the new passphrase rather than the original password, use the -v (verbose) flag to display debugging information. The debugging output lets you observe at a high level how the authentication process works and shows important information you would not otherwise see. You will not need the -v flag for later connections, but it is quite useful when testing a connection.

Use ssh-agent to automate machine access
The ssh-agent program acts like a gatekeeper that securely provides access to security keys as needed. Once started, ssh-agent runs in the background and can be used by ssh, scp, and other OpenSSH applications. This allows the ssh program to request an already decrypted key rather than asking you for the private key's passphrase every time it is needed.

Let's take a closer look at ssh-agent. When the ssh-agent is running, it will output the shell command:

Listing 7. ssh-agent application

[offsite]$ ssh-agent 
SSH_AUTH_SOCK=/tmp/ssh-XX1O24LS/agent.14179; export SSH_AUTH_SOCK;
SSH_AGENT_PID=14180; export SSH_AGENT_PID;
echo Agent pid 14180;

We can use the shell's eval command to have the shell execute the commands that ssh-agent prints:

[offsite]$ eval `ssh-agent`
Agent pid 14198

The eval command tells the shell to evaluate (execute) the commands generated by the ssh-agent program. Be sure to use backticks (`) rather than single quotes! Once executed, the eval `ssh-agent` statement prints the agent's process identifier. Behind the scenes, the SSH_AUTH_SOCK and SSH_AGENT_PID shell variables have been exported and are now available. You can display their values in the shell console:

[offsite]$ echo $SSH_AUTH_SOCK

$SSH_AUTH_SOCK (short for SSH authentication socket) is the location of a local socket through which applications can communicate with ssh-agent. Add the eval `ssh-agent` statement to your ~/.bash_profile file to ensure that SSH_AUTH_SOCK and SSH_AGENT_PID are always set.

ssh-agent is now a background process, visible with the top and ps commands.
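
To make the role of eval concrete, the sketch below evaluates a captured sample of agent output instead of starting a real agent. The socket path and PID are illustrative values copied from the listing style above, and show_agent_env is a hypothetical helper that runs in a subshell so the calling shell's own agent settings stay untouched.

```sh
# A captured sample of the text ssh-agent prints; the values are
# illustrative and do not come from a live agent.
agent_output='SSH_AUTH_SOCK=/tmp/ssh-XX1O24LS/agent.14179; export SSH_AUTH_SOCK;
SSH_AGENT_PID=14180; export SSH_AGENT_PID;
echo Agent pid 14180;'

# show_agent_env: evaluate agent-style output and print the variables it sets.
# The parentheses make the function body a subshell, so nothing leaks
# into the shell that calls it.
show_agent_env() (
    # Until eval runs it, the text is just a string; eval executes the
    # assignments and exports it contains.
    eval "$1" >/dev/null
    echo "socket=$SSH_AUTH_SOCK pid=$SSH_AGENT_PID"
)
```

Calling show_agent_env "$agent_output" prints the socket path and PID that the evaluated text assigned.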

We are now ready to share our passphrase with ssh-agent. To do this, we use a program named ssh-add, which adds (sends) our passphrase to the running ssh-agent program.

Listing 8. Using ssh-add for passphrase-free login

[offsite]$ ssh-add 
Enter passphrase for /home/accountname/.ssh/id_dsa: (enter passphrase)
Identity added: /home/accountname/.ssh/id_dsa

Now, when we access server1, we are not prompted for a passphrase:

[offsite]$ ssh
[server1]$ exit

If you are not yet convinced, try killing (kill -9) the ssh-agent process and reconnecting to server1. This time, you are asked for the passphrase of the private key stored in id_dsa in the .ssh directory:

[offsite]$ kill -9 $SSH_AGENT_PID
[offsite]$ ssh
Enter passphrase for key '/home/accountname/.ssh/id_dsa':

Use keychain to simplify key access
So far, we have learned about several OpenSSH programs (ssh, scp, ssh-agent, and ssh-add), and we have created and installed private and public keys to enable a secure, automated logon process. You may have noticed that most of the setup work needs to be done only once. For example, creating the keys, installing them, and having ssh-agent start from .bash_profile need to happen only once per machine. That is the good news.

The bad news is that ssh-add must be invoked every time we log on to the offsite machine, and ssh-agent is not directly compatible with the cron scheduling process that we need in order to automate our backups. cron processes cannot communicate with ssh-agent because cron jobs run as child processes of cron and therefore do not inherit the $SSH_AUTH_SOCK shell variable.
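
The effect of a missing $SSH_AUTH_SOCK can be imitated without cron. In the sketch below, env -i starts a shell with an empty environment, which only roughly approximates the sparse environment cron gives its jobs. The function names agent_status and cron_like_status are hypothetical.

```sh
# agent_status: say whether the current environment can reach an ssh-agent.
agent_status() {
    if [ -n "${SSH_AUTH_SOCK:-}" ]; then
        echo "agent socket: $SSH_AUTH_SOCK"
    else
        echo "no agent socket in this environment"
    fi
}

# cron_like_status: run the same check in a shell started with an empty
# environment, so nothing exported by the login shell is inherited.
cron_like_status() {
    env -i /bin/sh -c 'if [ -n "${SSH_AUTH_SOCK:-}" ]; then echo "agent socket: $SSH_AUTH_SOCK"; else echo "no agent socket in this environment"; fi'
}
```

In a login shell with a running agent, agent_status reports the socket while cron_like_status does not, which is the gap keychain closes.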

Fortunately, there is a solution that not only removes the limitations of ssh-agent and ssh-add but also lets us use cron to automate all sorts of processes that need secure passwordless access to other machines. In his three-part developerWorks series published in 2001, OpenSSH key management (see references), Daniel Robbins introduced a shell script called keychain, a front end for ssh-add and ssh-agent that simplifies the whole passwordless process. Over time, the keychain script has undergone many improvements and is now maintained by Aron Griffin. At the time of writing, its latest version, 2.3.2-1, was released on June 17, 2004.

The keychain shell script is too long to list in this article because the carefully written script includes extensive error checking, rich documentation, and a great deal of cross-platform code. However, keychain can easily be downloaded from the project's Web site (see references).

Once keychain is downloaded and installed, using it is easy. Simply log on to each machine and add the following two lines to each .bash_profile file:

keychain id_dsa
. ~/.keychain/$HOSTNAME-sh

The first time you log back on to each machine, keychain asks you for the passphrase. However, unless the machine is restarted, keychain does not ask you to re-enter the passphrase on later logons. Best of all, cron jobs can now use OpenSSH commands to securely access remote machines without interactive use of passphrases. We now have both better security and ease of use.

Listing 9. Initialization on each machine

KeyChain 2.3.2; 
Copyright 2002-2004 Gentoo Technologies, Inc.; Distributed under the

* Initializing /home/accountname/.keychain/localhost.localdomain-sh
* Initializing /home/accountname/.keychain/localhost.localdomain-csh
* Starting ssh-agent
* Adding 1 key(s)...
Enter passphrase for /home/accountname/.ssh/id_dsa: (enter passphrase)
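
A script started by cron does not read .bash_profile, so it has to pick up the agent settings itself. One way to do that, sketched here under assumptions, is to source the file keychain writes before calling ssh or scp. The helper name load_keychain_env is hypothetical, and falling back to uname -n when HOSTNAME is unset is an assumption about how the file is named on a given system.

```sh
# load_keychain_env: source the environment file written by keychain so
# that SSH_AUTH_SOCK and SSH_AGENT_PID are set in a non-login shell.
load_keychain_env() {
    # HOSTNAME is set by bash; plain sh may not have it (assumed fallback).
    host=${HOSTNAME:-$(uname -n)}
    envfile="$HOME/.keychain/$host-sh"
    if [ -r "$envfile" ]; then
        . "$envfile"
    else
        echo "keychain file not found: $envfile" >&2
        return 1
    fi
}
```

A backup script could call load_keychain_env || exit 1 as its first step, so that it stops early when no agent information is available.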

Script backup process
Our next task is to create shell scripts that perform the necessary backup steps. The goal is to perform a complete database backup of servers 1 and 2. In our example, each server runs the MySQL database server. We use the mysqldump command-line utility to export a few database tables to an SQL import file.

Listing 10. The dbbackup.sh shell script for server 1

#!/bin/sh

# change into the backup_agent directory where data files are stored;
# stop if the directory is missing so nothing is written elsewhere.
cd /home/backup_agent || exit 1

# use mysqldump utility to export the sites database tables
# (the trailing backslash continues the command onto the next line)
mysqldump -u sitedb -pG0oDP@sswrd --add-drop-table sitedb --tables \
    tbl_ccode tbl_machine tbl_session tbl_stats > userdb.sql

# compress and archive
tar czf userdb.tgz userdb.sql

On server 2, we place a similar script that backs up the tables unique to that site's database. Each script is marked as executable with:

[server1]:$ chmod +x dbbackup.sh

With dbbackup.sh in place on servers 1 and 2, we return to the offsite data server, where we create a shell script that invokes each remote dbbackup.sh script and then transfers the compressed (.tgz) data files.

Listing 11. The shell script used on the offsite data server


# use ssh to remotely execute the script on server 1
/usr/bin/ssh "/home/backup_agent/"

# use scp to securely copy the newly archived userdb.tgz file
# from server 1. Note the use of the date command to timestamp
# the file on the offsite data server.
/home/backups/userdb-$(date +%Y%m%d-%H%M%S).tgz

# execute on server 2
/usr/bin/ssh "/home/backup_agent/"

# use scp to transfer transdb.tgz to offsite server.
/home/backups/transdb-$(date +%Y%m%d-%H%M%S).tgz

The shell script uses the ssh command to execute scripts on the remote servers. Because we have set up passphrase-free access, ssh can run commands on server 1 and server 2 remotely from the offsite server. Thanks to keychain, the whole authentication process now happens automatically.
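
The run-then-copy pattern described above can be sketched as a pair of small functions. Everything host-specific here is a placeholder: the function names run and fetch_backup are hypothetical, and the sketch defaults to a dry run that only prints the ssh and scp commands it would execute.

```sh
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# fetch_backup: run the remote backup script, then copy its archive here
# under a timestamped name.
# Usage: fetch_backup <user@host> <remote-script> <remote-archive> <local-prefix>
fetch_backup() {
    stamp=$(date +%Y%m%d-%H%M%S)
    # Skip the copy if the remote script fails.
    run /usr/bin/ssh "$1" "$2" || return 1
    run /usr/bin/scp "$1:$3" "$4-$stamp.tgz"
}
```

With a made-up account and host, a call might look like fetch_backup backup_agent@server1.example.com /home/backup_agent/dbbackup.sh /home/backup_agent/userdb.tgz /home/backups/userdb. Setting DRY_RUN=0 makes the same call perform the transfer.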

Our next and final step is to schedule the execution of the shell script on the offsite data storage server. We add two entries to the cron scheduler so that the backup script runs twice a day, once at 3:34 a.m. and again at 8:34 p.m. On the offsite server, invoke the crontab program with the edit (-e) option.

[offsite]:$ crontab -e

crontab invokes the default editor specified by the VISUAL or EDITOR shell environment variable. Enter the two entries, then save and close the file.

Listing 12. Crontab entries on the offsite server

34 3 * * * /home/backups/ 
34 20 * * * /home/backups/

A crontab line has two main parts: a schedule section followed by a command section. The schedule is divided into fields that specify when the command should run:

Listing 13. Crontab format

+---- minute 
| +----- hour
| | +------ day of the month
| | | +------ month
| | | | +---- day of the week
| | | | | +-- command to execute
| | | | | |
34 3 * * * /home/backups/
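
The field layout can be checked mechanically. The sketch below splits one crontab line into its five schedule fields and the command. The function name explain_cron and the script path used in the usage note are hypothetical.

```sh
# explain_cron: split one crontab line into its five schedule fields and
# the command, then print them with labels.
# Usage: explain_cron "<crontab line>"
explain_cron() {
    # set -f stops the shell from expanding "*" into file names.
    set -f
    # Unquoted on purpose: word splitting separates the fields.
    set -- $1
    set +f
    if [ "$#" -lt 6 ]; then
        echo "not a complete crontab line" >&2
        return 1
    fi
    echo "minute=$1 hour=$2 day-of-month=$3 month=$4 day-of-week=$5"
    shift 5
    echo "command=$*"
}
```

For example, explain_cron '34 3 * * * /home/backups/run.sh' labels 34 as the minute and 3 as the hour, matching the first entry in Listing 12.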

Verify your backup
You should routinely check your backups to make sure the process is working correctly. Automation removes the tedium, but it is no excuse for being lazy. If your data is worth backing up, it is also worth spot-checking from time to time.

Consider adding a cron job that reminds you to check your backups at least once a month. It is also a good idea to change your security keys regularly, and you can schedule a cron job to remind you of that as well.
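
A basic spot check can itself be scripted. The sketch below only confirms that an archive is non-empty and can be listed from start to finish; it does not prove that the data inside is correct. The function name verify_archive is hypothetical.

```sh
# verify_archive: check that a .tgz file exists, is not empty, and can be
# listed end to end. Prints the number of entries on success.
# Usage: verify_archive <archive.tgz>
verify_archive() {
    [ -s "$1" ] || { echo "$1: missing or empty" >&2; return 1; }
    # tar t lists the contents; a damaged gzip stream or tar header makes it fail.
    listing=$(tar tzf "$1" 2>/dev/null) || { echo "$1: unreadable archive" >&2; return 1; }
    [ -n "$listing" ] || { echo "$1: archive has no entries" >&2; return 1; }
    printf '%s\n' "$listing" | wc -l | tr -d ' '
}
```

Run against the newest file in the backup directory, a non-zero exit status is a signal to look at the backup job before the archive is needed.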

Other security measures
For additional security, you can install and configure an intrusion detection system (IDS), such as Snort, on each machine. Presumably, an IDS will notify you when an intrusion is under way or has recently occurred. With an IDS in place, you can add further levels of security, such as digitally signing and encrypting your backups.

Popular open source tools such as GNU Privacy Guard (GnuPG), OpenSSL, and ncrypt support securing archive files from shell scripts, but doing so without the extra level of protection an IDS provides is not recommended (see references for more information on Snort).

This article has shown how to run scripts on remote servers and how to perform secure, automated file transfers. I hope it inspires you to think about protecting your own important data and to build new solutions with open source tools such as OpenSSH and Snort.

- On the official OpenSSH home page and OpenSSH Security page, you will find downloads, documentation, and more.

- Read Daniel Robbins's three-part IBM developerWorks series "OpenSSH key management" (developerWorks, 2001) and download his keychain application.

- To learn more about SSH, Carlos recommends O'Reilly's SSH, The Secure Shell: The Definitive Guide (O'Reilly & Associates, 2001).

- The Snort intrusion detection system (IDS) is a leading open source product designed to detect and report unauthorized access or suspicious behavior. If you plan to automate the signing and encryption of archive files, you should use an IDS.

- You can use GNU Privacy Guard (GnuPG), OpenSSL, and ncrypt to sign and encrypt archived backup files from shell scripts.

- If you have never used them, see the tips on TCP wrappers and xinetd.

- Perl enthusiasts may also be interested in "Using Perl to automate UNIX system management" (developerWorks, 2001), "Introducing cfengine for system management" (developerWorks, 2002), and "Using Perl for application configuration" (developerWorks, 2000), all by Ted Zlatanov.

- The developerWorks article "Windows to Linux: Part 1. backup and recovery" (developerWorks, 8th) provides tips on backup strategy.


About the author
Carlos Justiniano is a software designer for Ecuity, Inc. He is interested in communication and distributed computing, and he has written articles for many technical journals. He is also the founder and designer of the Linux-based ChessBrain project, which earned a 2005 Guinness World Record related to distributed computing.

Full text from: IBM developerWorks Chinese website
