The system administrator manages an enterprise's most valuable asset: its data. Linux holds half of the market for enterprise-level server operating systems, which makes the Linux system administrator the most important administrator of that asset. The administrator's responsibility is to use limited IT resources to store the most valuable data. In 1991, when IBM launched a 3.5-inch 1 GB hard drive, an administrator could know every file on the drive and manage files by hand. Today, petabyte-scale storage devices bring unprecedented challenges to file management.
Deleting files is something every Linux user can do. But which of the following deletion tasks can you carry out?
- Delete every file that ends with a specified suffix across an entire file system.
- Delete one specified file from a file system that holds 1 million objects.
- Delete the 100,000 files created on a specified date from a file system that holds tens of millions of files.
- In a file system that holds hundreds of millions of files, run a daily cleanup that deletes the millions of files generated one year earlier.
The rest of this article describes policies and methods for implementing the deletion tasks above. If those tasks are easy for you, you can skip this article.
File system cleanup tasks fall into two broad categories: cleanup of expired files and cleanup of junk files.
- Expired files
All data has a life cycle. The data life cycle curve shows that data has its greatest value for a period of time after it is generated, and that its value then declines over time. When the life cycle ends, the expired files should be deleted to release storage space for valuable data.
- Junk files
While a system runs, it produces various temporary files: temporary files created by running applications, trace files generated by system errors, core dumps, and so on. Once these files have been processed, they have no further value and can collectively be called junk files. Cleaning up junk files promptly helps system maintenance and management and keeps the system running stably and efficiently.
Overview of automatic file cleanup: features and methods
To delete a file whose absolute path is known, rm is enough. If you know only the file name and not the path, you can locate the file with find and then delete it. By extension, any file that can be found by preset conditions can be deleted. This is the basic idea of automatic file cleanup: generate a list of files to delete based on preset conditions, then run a regular cleanup task that deletes the files on the list.
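The list-then-delete idea above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own tooling: the function names `build_delete_list` and `delete_listed`, and the `dry_run` switch, are assumptions introduced here, and the matching condition is passed in as an ordinary function.

```python
import os
import tempfile
from typing import Callable, List


def build_delete_list(root: str, matches: Callable[[str], bool]) -> List[str]:
    """Walk root and return the regular files whose path satisfies matches."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if matches(path):
                selected.append(path)
    return selected


def delete_listed(paths: List[str], dry_run: bool = True) -> int:
    """Delete the listed files and return how many were actually removed.

    With dry_run=True nothing is deleted; the candidates are only printed.
    """
    removed = 0
    for path in paths:
        if dry_run:
            print("would delete:", path)
            continue
        try:
            os.remove(path)
            removed += 1
        except FileNotFoundError:
            # The file vanished between the scan and the delete; ignore it.
            pass
    return removed


if __name__ == "__main__":
    # Demonstration on a throwaway directory, never on a real file system.
    demo_root = tempfile.mkdtemp()
    for demo_name in ("a.tmp", "b.log", "c.tmp"):
        with open(os.path.join(demo_root, demo_name), "w") as fh:
            fh.write("x")
    candidates = build_delete_list(demo_root, lambda p: p.endswith(".tmp"))
    delete_listed(candidates, dry_run=True)
```

Separating the scan from the delete lets the list be reviewed, or the delete run in dry-run mode, before anything is removed.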
Expired files are identified by their timestamps. Different file systems may offer different time attributes, such as file creation time, access time, and expiration time. Because most expired files are stored in archive systems, their number is huge: on a large system, hundreds of thousands or even millions of files may expire every day. At that scale, scanning the file system and generating the file list takes a long time, so cleanup performance is something administrators of such systems must take into account.
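As a rough sketch of timestamp-based selection, the following Python generator yields files older than a given age. It assumes the modification time (`st_mtime`) is the attribute that marks expiry, which may not match the attribute a given archive system uses; the name `iter_expired` and its parameters are hypothetical. It yields paths one at a time instead of building the whole list in memory, which matters when millions of files match.

```python
import os
import time
from typing import Iterator, Optional

SECONDS_PER_DAY = 86400


def iter_expired(root: str, max_age_days: float,
                 now: Optional[float] = None) -> Iterator[str]:
    """Yield regular files under root last modified more than max_age_days ago."""
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * SECONDS_PER_DAY
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            entries = os.scandir(current)
        except OSError:
            # Directory unreadable or removed during the scan; skip it.
            continue
        with entries:
            for entry in entries:
                try:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif (entry.is_file(follow_symlinks=False)
                          and entry.stat(follow_symlinks=False).st_mtime < cutoff):
                        yield entry.path
                except OSError:
                    # Entry disappeared or cannot be examined; move on.
                    continue


if __name__ == "__main__":
    # List files under the current directory not modified for a year.
    for expired_path in iter_expired(".", 365):
        print(expired_path)
```

A single-threaded walk like this still visits every directory entry, so on very large file systems it illustrates the selection logic, not a way around the scan cost.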
Junk files may be stored in a specific directory, may end with a particular suffix, or may be zero-length or extremely large files produced by system errors. Their number is generally small, but there are many types and the situation is complicated. Drawing on experience, the system administrator needs to define more detailed search conditions, scan regularly, generate a file list, and then process the files further.
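The junk-file conditions described above (location, suffix, zero or very large size) can be combined into one classification function. The sketch below is illustrative only: the suffixes, the directory, and the size threshold are hypothetical placeholders that an administrator would replace with site-specific values, and the result is a list of candidates for review, not a verdict.

```python
import os
from typing import Dict, Optional

# Hypothetical conditions; adjust them to the system being managed.
JUNK_SUFFIXES = (".tmp", ".trc", ".core")
JUNK_DIRS = ("/var/tmp/app-scratch",)
MAX_SIZE_BYTES = 10 * 1024 ** 3  # treat anything above 10 GiB as suspicious


def junk_reason(path: str, size: int) -> Optional[str]:
    """Return why a file looks like junk, or None if no condition matches."""
    if size == 0:
        return "empty"
    if size > MAX_SIZE_BYTES:
        return "oversized"
    if path.endswith(JUNK_SUFFIXES):
        return "suffix"
    if any(path.startswith(d.rstrip("/") + "/") for d in JUNK_DIRS):
        return "junk directory"
    return None


def scan_for_junk(root: str) -> Dict[str, str]:
    """Map each junk candidate under root to the reason it was flagged."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable
            reason = junk_reason(path, info.st_size)
            if reason is not None:
                found[path] = reason
    return found


if __name__ == "__main__":
    for junk_path, why in sorted(scan_for_junk(".").items()):
        print(why, junk_path)
```

Recording the reason next to each path makes the list easier to review before any further processing.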
Introduction to related Linux commands
Common file system management commands include ls, rm, and find. Because these are everyday system management commands, we do not cover them again here; for detailed usage, see each command's help or the Linux user manual. Large-scale data is generally stored in dedicated file systems, which provide their own commands for file system management. This article uses the IBM GPFS file system as an example and briefly introduces several of its file system management commands.
- mmlsattr
This command is used to view extended attributes of files in the GPFS file system, such as storage pool information and expiration time.
- mmapplypolicy
GPFS manages files through policies. This command applies a user-defined policy file to a GPFS file system and performs the operations that the policy specifies, and it does so very efficiently.
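To make the policy-driven approach more concrete, here is a hedged Python sketch that writes a one-rule policy file and assembles an mmapplypolicy command line without running it. The rule name `purge_expired`, the choice of `CREATION_TIME` as the expiry attribute, and the helper names are assumptions made for illustration; the rule text and the `-P` and `-I` options follow the GPFS policy documentation as I understand it and should be checked against the installed release before use.

```python
from typing import List


def write_delete_policy(max_age_days: int, policy_path: str) -> str:
    """Write a single DELETE rule to policy_path and return the rule text.

    Assumption: files count as expired when their creation time is more
    than max_age_days days in the past.
    """
    rule = (
        "RULE 'purge_expired' DELETE\n"
        "  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(CREATION_TIME)) > "
        f"{int(max_age_days)}\n"
    )
    with open(policy_path, "w") as fh:
        fh.write(rule)
    return rule


def build_mmapplypolicy_command(device: str, policy_path: str,
                                test_only: bool = True) -> List[str]:
    """Return the argument list for mmapplypolicy; nothing is executed here.

    With test_only=True the command uses -I test, which evaluates the
    policy and reports the selected files without acting on them.
    """
    mode = "test" if test_only else "yes"
    return ["mmapplypolicy", device, "-P", policy_path, "-I", mode]


if __name__ == "__main__":
    # "gpfs0" is a hypothetical device name.
    write_delete_policy(365, "purge.pol")
    print(" ".join(build_mmapplypolicy_command("gpfs0", "purge.pol")))
```

Starting in test mode and reviewing which files the policy selects is a reasonable precaution before switching to a run that actually deletes.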