This article describes how to use backup scripts, standard commands, and commercially available software to protect disk data securely and efficiently in a Linux environment.
A backup and recovery system backs data up before an incident and restores it afterwards. In today's rapidly developing network environment, no networked information system can guarantee absolute security: wherever there is a network, there are threats from it. Although increasingly sophisticated intrusion detection systems and firewalls have been introduced to defend against network attacks and intrusion, attackers' methods are becoming more sophisticated as well. They can always find vulnerabilities and weaknesses in these systems, so the number of security incidents caused by network intrusion increases year by year.
Since the absolute security of critical systems on the network cannot be guaranteed, backup and recovery technology is needed. It uses storage media and a defined strategy to back up the system's business data regularly, so that when data is accidentally lost it can be restored as soon as possible and the user's loss kept to a minimum. It is an important core technology in the field of information security.
Backup Technology under Linux
For backup and recovery, Linux provides tools such as tar, cpio, and dump. Users therefore do not have to purchase software; they can choose the backup and recovery tool that suits their system and implement basic backup and recovery with it.
Introduction to the tar Tool
tar is a classic Unix command that has been ported to Linux. The name is an abbreviation of "tape archive": it was originally designed to package files onto tape, and today it is mostly used to back up a partition or a set of important directories. tar can package an entire directory tree, which makes it especially useful for backups. An archive can be restored in its entirety, or individual files and directories can be extracted from it. Backups can be saved to a file or to a tape device. At restore time, files can be redirected to a different directory (or system) from the one in which they were originally saved. tar is file system-independent, so it can be used on ext2, ext3, JFS, Reiser, and other file systems.
Using tar is very similar to using file utilities such as WinZip or WinRAR in Windows. Point it at a destination (a file or a device) and then name the files you want to package. The archive can be compressed on the fly with a standard compression type, or with an external compression program of your choice. To compress or decompress with gzip, use tar's -z option; for bzip2, use the -j option.
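Here is a simple example of using tar for data backup:
tar czvf - /root/code > /tmp/code_bak.tgz (packs all program files in the /root/code directory into /tmp/code_bak.tgz)
tar xzvf /tmp/code_bak.tgz -C / (restores the backed-up directory to its original location; GNU tar strips the leading / from member names when archiving, so the files are stored as root/code/... and are extracted relative to the directory given with -C)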
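The two commands above can be wrapped in small helper functions. The following is a minimal sketch, not a definitive implementation: the function names backup_dir, restore_dir, and verify_archive are hypothetical, and the archive is assumed to be gzip-compressed.

```shell
#!/bin/sh
# Hypothetical helpers; the function names are illustrative, not from the article.

# backup_dir SRC ARCHIVE: pack directory SRC into the gzip-compressed ARCHIVE.
# -C changes into the parent directory first, so member names are stored
# relative to it (for example "code/main.c") rather than as absolute paths.
backup_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# verify_archive ARCHIVE: read the whole archive and list it; the command
# fails if the archive cannot be read.
verify_archive() {
    tar -tzf "$1" > /dev/null
}

# restore_dir ARCHIVE DEST: unpack ARCHIVE below DEST, creating DEST if needed.
restore_dir() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

For example, `backup_dir /root/code /tmp/code_bak.tgz && verify_archive /tmp/code_bak.tgz` creates and checks the archive, and `restore_dir /tmp/code_bak.tgz /tmp/restore_test` unpacks it into a scratch directory so the result can be inspected before anything is overwritten.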
Introduction to the cpio Tool
The cpio command copies files into or out of cpio or tar archives. It is compatible with the tar format, and it also has features that the tar command does not offer:
It supports both the cpio and tar archive formats;
It supports many older tape data formats;
It can read the names of the files to archive through a pipe.
At present, only a few Linux packages are released in cpio format. Users who are interested in the details of the cpio command can read its manual with the "man cpio" command.
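The ability to read file names through a pipe is what makes cpio combine naturally with find. The sketch below illustrates the idea under a few assumptions: cpio is installed, the archive path is absolute and lies outside the source directory, and the function names cpio_backup and cpio_restore are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers. Each function body runs in a subshell, so the cd
# does not affect the caller.

# cpio_backup SRC ARCHIVE: find prints one file name per line, and cpio -o
# reads that list from standard input and writes the archive to standard
# output. ARCHIVE must be an absolute path outside SRC.
cpio_backup() (
    cd "$1" || exit 1
    find . -depth -print | cpio -o > "$2"
)

# cpio_restore ARCHIVE DEST: -i extracts the archive read from standard
# input, and -d creates leading directories where needed.
cpio_restore() (
    mkdir -p "$2" && cd "$2" || exit 1
    cpio -id < "$1"
)
```

Because the file list comes from find, the selection can be narrowed with any find test (for example -newer or -name) without changing the cpio side of the pipeline.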
Introduction to dump and restore
dump performs a function similar to tar, but it works on whole file systems rather than individual files. dump examines the files on an ext2 file system and determines which of them need to be backed up. These files are copied to a given disk, tape, or other storage medium for safekeeping. On most media, the capacity is determined by writing until an end-of-media indication is returned.
The companion program to dump is restore, which restores files from a dump image and so performs the reverse function. You can restore a full backup of the file system first and then layer subsequent incremental backups on top of it. You can also restore a single file or directory tree from a full or partial backup.
Both dump and restore can work over the network, so users can back up to or restore from a remote device. They work with tape drives and file devices and offer a wide range of options. However, both are limited to ext2 and ext3 file systems. If you are using JFS, Reiser, or another file system, you will need a different utility, such as tar. For example:
dump -0f /dev/nst0 / (makes a level 0, that is, full, backup of the root ext2 file system to the first SCSI tape device)
restore -xf /dev/nst0 /home/code (extracts everything under /home/code from the backup on the SCSI tape device to disk)
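Full and incremental dumps are distinguished by the dump level. As a rough sketch, the helper below only prints the command for a given level instead of running it, since dump needs root privileges and a real device. The function name dump_cmd and the schedule of using the day of the week as the level are assumptions made for illustration.

```shell
#!/bin/sh
# dump_cmd LEVEL DEVICE FILESYSTEM: print the dump command for one run.
# Level 0 is a full dump. A higher level N saves only the files changed since
# the most recent dump of a lower level. -u records the run in /etc/dumpdates,
# which later incremental runs consult.
dump_cmd() {
    echo "dump -${1}u -f $2 $3"
}

# Hypothetical schedule: use the day of the week (0 = Sunday) as the level,
# which gives a full dump on Sunday and an incremental dump on every other
# day. Pipe the output to sh as root to actually run it:
#   dump_cmd "$(date +%w)" /dev/nst0 /home
#
# To rebuild the file system afterwards, change into a freshly created and
# mounted file system and run "restore -rf DEVICE" once per dump, starting
# with level 0 and continuing in the order the dumps were taken.
```

Printing the command first makes it easy to review what would be written to tape before committing to a run.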
Commercial storage backup tools, such as Tivoli Storage Manager, are also available. These are visual tools that let users back up and restore through a graphical user interface. Note that they are commercial software and therefore not free. For many users they provide functionality similar to that of the backup and recovery tools bundled with Linux described above, but in a friendlier form that spares users the tedium of the command line.
Backup consumes material and financial resources. Weighing the benefit of a backup against the resources it consumes is a major question for network security staff, because spending more than the backup is worth means inefficiency or even failure. Data backup in a Linux environment should therefore be tailored to the actual situation. In particular, key directories should be identified so that backups are targeted and unnecessary waste is avoided.
In general, the following directories need to be backed up, because they play a pivotal role in the system:
/etc contains all the core configuration files, including the password files, network configuration, system name, firewall rules, NFS file system configuration files, and other global system settings;
/var contains information used by the system daemons (services), including DNS configuration, DHCP leases, message buffer files, HTTP server files, and so on;
/home contains the default home directories of all users, including their personal settings, downloaded files, and other important information that users have stored on the system;
/root is the home directory of the root user;
/bin stores important commands such as ls and ps that are commonly used on many systems; these commands are extremely vulnerable to tampering by attackers.
Conversely, the following do not need to be backed up on a Linux system:
/proc never needs to be backed up. It is not a real file system but a virtual view of the kernel and its environment, including files such as /proc/cpuinfo and /proc/meminfo. It reflects the state and memory of the running system, so its contents cease to exist once the system shuts down or restarts;
/dev contains the files that represent hardware devices. If you plan to restore to a blank system, you can back up /dev; if you plan to restore to an already installed Linux system, backing up /dev is not necessary;
Some soft links (files that point to other files on disk) also do not need to be backed up, because they store only the location of the target file. You can identify them with the "ls -l" command, as shown below, where linux-2.4 is a soft link that points to the linux-2.4.7-10 directory.
lrwxrwxrwx 1 root 14 June linux-2.4 -> linux-2.4.7-10
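A targeted backup of this kind can be sketched by naming the key directories explicitly, so that virtual trees such as /proc are never visited at all. The function names, the KEY_DIRS variable, and the choice of a gzip-compressed tar archive are assumptions for illustration.

```shell
#!/bin/sh
# Key directories named in the text, relative to the root being backed up.
KEY_DIRS="etc var home root bin"

# backup_key_dirs ROOT ARCHIVE: archive only those key directories that exist
# under ROOT. Anything not listed (proc, dev, ...) is skipped simply because
# it is never named. ARCHIVE should be an absolute path. The function body
# runs in a subshell, so the cd does not affect the caller.
backup_key_dirs() (
    cd "$1" || exit 1
    present=""
    for d in $KEY_DIRS; do
        if [ -d "$d" ]; then present="$present $d"; fi
    done
    [ -n "$present" ] || exit 1
    # $present is deliberately unquoted so it splits into one argument per directory.
    tar -czf "$2" $present
)

# list_symlinks DIR: print every soft link under DIR together with its
# target, similar to what "ls -l" shows for a single link.
list_symlinks() {
    find "$1" -type l -exec sh -c 'printf "%s -> %s\n" "$1" "$(readlink "$1")"' sh {} \;
}
```

On a live system this would be called as `backup_key_dirs / /backup/key.tgz` with root privileges, where /backup/key.tgz is a hypothetical destination outside the listed directories.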
Backup Technology and Basic Classification
Generally speaking, a complete network data backup system must meet the following requirements:
The volume of data to back up is relatively large, so backups of critical business systems need to be automated to reduce the system administrator's workload;
A backup server forms a backup center in which the application systems of the various platforms and other information are backed up centrally. The system administrator can manage, monitor, and configure the backup system from any workstation, which combines distributed processing with centralized management;
Users can easily and quickly restore a damaged file system in its entirety, as well as all kinds of data;
The backup system should also take into account the impact of network bandwidth on backup performance, the choice and security of the backup server platform, appropriate redundancy in backup capacity, good scalability, and similar factors.
Backup work requires a strategy. A backup strategy defines what to back up, when to back it up, and how. Users need to develop a backup strategy based on their actual situation. The following three backup strategies are currently in common use:
1. Full backup
A full backup strategy makes a full backup of the system every day. Its benefit is that lost data can be recovered directly from the most recent backup when a disaster causes data loss. However, it also has drawbacks. First, backing up the entire system every day means that most of the data in each backup duplicates the previous one. This duplicate data occupies a lot of space, which means higher cost for the user. Second, because so much data has to be backed up, each backup takes a long time. For users and organizations with busy operations and a limited backup window, this strategy is unwise. Finally, the short interval between backups produces an excessive number of full backups, which wastes hardware resources unnecessarily.
2. Incremental backup
An incremental backup strategy backs up each day only the data that was added or modified that day. Its advantage is that it saves storage space and shortens backup time. Its disadvantage is that recovery after a disaster is more troublesome: the last full backup and every incremental backup made since then must be restored in order. Its reliability is also lower, because the backups form a chain and a problem with any one of them affects everything after it. In addition, data changed between two backups is unrecoverable, so users need to weigh the backup interval carefully to get good results.
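One possible way to implement incremental backups is GNU tar's --listed-incremental option, which keeps a snapshot file describing what has already been saved. The sketch below assumes GNU tar; the function names and file layout are hypothetical, and the snapshot and archive paths are assumed to be absolute.

```shell
#!/bin/sh
# incremental_backup SRC SNAPSHOT ARCHIVE (GNU tar only):
# SNAPSHOT records what tar has already saved. If it does not exist yet, the
# run is a full (level 0) backup. Otherwise only files added or changed since
# the previous run are stored, and SNAPSHOT is updated for the next run.
incremental_backup() {
    tar --create --gzip \
        --listed-incremental="$2" \
        --file="$3" \
        -C "$(dirname "$1")" "$(basename "$1")"
}

# restore_chain DEST ARCHIVE...: extract the full backup first, then every
# incremental backup in the order it was taken. Passing /dev/null as the
# snapshot tells tar to treat the archives as incremental without reading or
# writing a snapshot file.
restore_chain() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for a in "$@"; do
        tar --extract --gzip --listed-incremental=/dev/null \
            --file="$a" -C "$dest" || return 1
    done
}
```

The restore function makes the drawback described above concrete: every archive in the chain has to be present and readable, and they must be applied in order.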
3. Differential backup
With a differential backup strategy, the administrator first makes a full system backup (for example, on Sunday) and then, on each of the following days, backs up to tape all data that is new or has been modified since that Sunday. The differential strategy keeps the advantages of the other two strategies while avoiding their main drawbacks. First, it does not require a full backup of the system every day, so each backup takes less time and saves space; second, disaster recovery is convenient, because only the full backup and the most recent differential backup are needed.
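A differential scheme can be sketched with a marker file that is stamped when the full backup starts and never changed by the differential runs. The function names and the marker-file approach are assumptions for illustration; the archive and marker paths are assumed to be absolute and outside the source directory, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# full_backup SRC ARCHIVE MARKER: stamp MARKER with the time the run starts,
# then archive everything under SRC. Stamping first means a file changed
# during the run is picked up again by the next differential backup.
full_backup() {
    touch "$3" || return 1
    tar -czf "$2" -C "$1" .
}

# differential_backup SRC ARCHIVE MARKER: archive every file modified since
# the last FULL backup. MARKER is not touched here, so each run is cumulative
# relative to that full backup. "-T -" makes tar read the list of file names
# from standard input. The function body runs in a subshell, so the cd does
# not affect the caller.
differential_backup() (
    cd "$1" || exit 1
    find . -type f -newer "$3" -print | tar -czf "$2" -T -
)
```

To recover, extract the full archive and then only the latest differential archive on top of it, which is the convenience the differential strategy is chosen for.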