Basic Linux File System Knowledge

Source: Internet
Author: User

Over the past two days I read a practical tutorial (ora 6). The following are my notes on Linux file systems:
1. Linux File System allocation policy:

Block allocation and extent allocation
Block allocation: disk blocks are allocated to a file one at a time, as needed, which avoids wasting storage space. However, as the file grows, its blocks end up non-contiguous, which increases disk seek time.
Each time the file is extended, the block allocation algorithm must also write the structural information about the new block, that is, the metadata. Metadata is always written to the storage device together with the file data, and a change to the file is complete only after all of its metadata operations have finished. Metadata operations can therefore significantly reduce the performance of the whole file system.
Extent allocation: when a file is created, a run of contiguous blocks is allocated at once, and when the file is extended, many blocks are again allocated at once. Metadata is written when the file is created. As long as the file does not outgrow the blocks already allocated to it, no further metadata is written until more blocks have to be allocated.
Extent allocation hands out blocks in groups, which reduces the time spent writing data to SCSI devices and performs well for sequential reads. For random reads, however, it behaves much like block allocation.
The size of the block group (block cluster) is fixed at compile time, and it has a large impact on file system performance.
Note: metadata is information about a file, such as its permissions, owner, and creation, access, or change times.
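
The trade-off between the two policies can be sketched with a little arithmetic. The following is only a toy model, not how any real allocator works: it assumes one metadata update per allocation event, and the function names and the extent size of 8 blocks are made up for illustration.

```python
def metadata_writes_block_allocation(file_blocks: int) -> int:
    """Block allocation: one metadata update for every block added to the file."""
    return file_blocks


def metadata_writes_extent_allocation(file_blocks: int, extent_blocks: int) -> int:
    """Extent allocation: one metadata update per extent (run of contiguous blocks)."""
    # Ceiling division: a partly filled extent still had to be allocated once.
    return -(-file_blocks // extent_blocks)


def unused_blocks_extent_allocation(file_blocks: int, extent_blocks: int) -> int:
    """Blocks reserved for the file but not yet holding data (the cost of extents)."""
    extents = metadata_writes_extent_allocation(file_blocks, extent_blocks)
    return extents * extent_blocks - file_blocks


if __name__ == "__main__":
    blocks = 100
    print("block allocation metadata writes: ", metadata_writes_block_allocation(blocks))
    print("extent allocation metadata writes:", metadata_writes_extent_allocation(blocks, 8))
    print("blocks reserved but unused:       ", unused_blocks_extent_allocation(blocks, 8))
```

In this model a 100-block file costs 100 metadata updates with block allocation but only 13 with 8-block extents, at the price of 4 blocks that are reserved and not yet used.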
2. File record format
Linux uses an index node (inode) to record information about a file. An inode is a data structure that holds the file's length, creation and modification times, permissions, ownership, and location on disk.
A file system maintains an array of inodes. Each file or directory corresponds to exactly one element of this array. The index of an inode within the array is called its inode number.
A Linux file system stores a file's inode number together with its file name in the directory. A directory is therefore just a table that pairs file names with inode numbers, and each such pair is called a link.
Each inode number corresponds to exactly one file, but one inode number can correspond to several file names.
Links are divided into soft links and hard links. Soft links are also called symbolic links.
Hard link: the original file name and the link name both point to the same inode, and therefore to the same data on disk. A directory cannot be hard linked, a hard link cannot span file systems (it cannot cross partitions), and only one copy of the file data exists on disk.
The file's data is actually deleted only when the last link to its inode is removed. A hard link can therefore protect against accidental deletion.
Soft link: use the ln -s command to create a symbolic link to a file. A symbolic link is a special kind of file in Linux whose data is the path name of the file it points to. It offers no protection against accidental deletion.
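
The difference between the two kinds of link can be observed from Python's standard library. This is a small sketch that works in a temporary directory; the file names original.txt, hard.txt, and soft.txt and the function name link_demo are arbitrary choices for the demonstration.

```python
import os
import tempfile


def link_demo() -> dict:
    """Create a file, a hard link, and a symbolic link, then delete the original name."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")

        with open(original, "w") as f:
            f.write("hello")

        os.link(original, hard)     # like: ln original.txt hard.txt
        os.symlink(original, soft)  # like: ln -s original.txt soft.txt

        result = {
            # Both names refer to the same inode.
            "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
            "link_count": os.stat(original).st_nlink,
            # The symbolic link's data is just the path it points to.
            "symlink_target": os.readlink(soft),
        }

        os.remove(original)  # removes one name; the inode survives via hard.txt
        with open(hard) as f:
            result["content_via_hard_link"] = f.read()
        result["link_count_after_delete"] = os.stat(hard).st_nlink
        # os.path.exists follows the symbolic link, whose target name is now gone.
        result["symlink_still_resolves"] = os.path.exists(soft)
        return result


if __name__ == "__main__":
    for key, value in link_demo().items():
        print(f"{key}: {value}")
```

After the original name is removed, the data is still readable through the hard link, while the symbolic link is left dangling.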
3. File System Type:

ext2: the common file system in early Linux
ext3: an upgraded version of ext2 with journaling
ramfs: an in-memory file system; fast
NFS: Network File System, developed by Sun; mainly used for remote file sharing
MS-DOS: the MS-DOS file system
VFAT: the file system used by Windows 95/98
FAT: the FAT file system, which Windows XP can also use
NTFS: the file system used by Windows NT/XP
HPFS: the file system used by OS/2
proc: virtual process file system
ISO 9660: the file system used by most CDs
UFS: the file system used by Sun OS
ncpfs: the file system used by Novell servers
smbfs: Samba shared file system
XFS: an advanced journaling file system developed by SGI; supports very large files
JFS: the journaling file system used by IBM AIX
ReiserFS: a file system based on a balanced tree structure
UDF: Universal Disk Format, a file system for rewritable data discs
4. Virtual File System (VFS)

All file systems supported by Linux are called logical file systems. On top of these traditional logical file systems, Linux adds an interface layer called the Virtual File System (VFS).
The VFS sits at the top of the file system stack. It manages the various logical file systems, hides the differences between them, and provides a uniform interface for accessing files and devices.
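
The idea of one interface in front of many file systems can be illustrated with a toy model. This is not the kernel's VFS API; every class and method name below (FileSystem, ToyDiskFS, ToyProcFS, ToyVFS) is hypothetical and exists only to show the dispatch by mount point.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """Hypothetical interface that every toy logical file system implements."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class ToyDiskFS(FileSystem):
    """Serves file contents stored in a dictionary (stands in for an on-disk file system)."""

    def __init__(self, files: dict):
        self.files = dict(files)

    def read(self, path: str) -> bytes:
        return self.files[path]


class ToyProcFS(FileSystem):
    """Generates contents on demand (stands in for a virtual file system)."""

    def read(self, path: str) -> bytes:
        return f"generated for {path}\n".encode()


class ToyVFS:
    """Routes each path to the file system mounted at the longest matching mount point."""

    def __init__(self):
        self.mounts = {}

    def mount(self, mount_point: str, fs: FileSystem) -> None:
        self.mounts[mount_point.rstrip("/") or "/"] = fs

    def read(self, path: str) -> bytes:
        for mount_point in sorted(self.mounts, key=len, reverse=True):
            prefix = "" if mount_point == "/" else mount_point
            if path == mount_point or path.startswith(prefix + "/"):
                relative = path[len(prefix):] or "/"
                return self.mounts[mount_point].read(relative)
        raise FileNotFoundError(path)


if __name__ == "__main__":
    vfs = ToyVFS()
    vfs.mount("/", ToyDiskFS({"/etc/hostname": b"box\n"}))
    vfs.mount("/proc", ToyProcFS())
    print(vfs.read("/etc/hostname"))
    print(vfs.read("/proc/version"))
```

The caller uses the same read call for both paths and never needs to know which file system answered.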
5. Logical Structure of the file
The logical structure of a file falls into two categories: unstructured byte-stream files and structured record files.
A file made up of a byte stream (a byte sequence) is an unstructured, or stream, file. The system does not consider any logical structure inside the file and simply treats it as a sequence of bytes, which makes it easy to add content anywhere in the file.
A file made up of records is called a record file. The record is the basic unit of information in this type of file. Record files are generally used for information management.
6. File Type

Regular file: generally a stream file
Directory file: used to organize and manage all the files in the system
Link file: used to share a file across different directories
Device file: includes block device files and character device files. Block device files represent disks, CDs, and similar devices. Character device files represent devices such as terminals and keyboards, which are accessed character by character.
Named pipe (FIFO) file: a mechanism for communication between processes
Socket file: a file type used for network communication
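
The type of any file can be read from its mode bits. The sketch below uses Python's os and stat modules; the helper names file_type and file_type_demo are made up for this example, and the demo creates its sample files in a temporary directory.

```python
import os
import stat
import tempfile


def file_type(path: str) -> str:
    """Return a descriptive name for the type of the file at `path`."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symbolic links
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISFIFO, "named pipe (FIFO)"),
        (stat.S_ISSOCK, "socket"),
    ]
    for predicate, name in checks:
        if predicate(mode):
            return name
    return "unknown"


def file_type_demo() -> dict:
    """Create one file of several types and classify each of them."""
    names = ("regular", "directory", "symlink", "fifo")
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "regular"), "w").close()
        os.mkdir(os.path.join(d, "directory"))
        os.symlink(os.path.join(d, "regular"), os.path.join(d, "symlink"))
        os.mkfifo(os.path.join(d, "fifo"))
        return {name: file_type(os.path.join(d, name)) for name in names}


if __name__ == "__main__":
    for name, kind in file_type_demo().items():
        print(f"{name}: {kind}")
```

On a typical Linux system, calling file_type("/dev/null") reports a character device.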
7. File structure: index node and data
Index node: also known as an inode. It is a record in the file system structure that holds information about the corresponding file, including the file's permissions, size, storage location, and creation date. The file name is not stored in the inode; as described in section 2, it is kept in the directory entry that points to the inode. The inodes of all files in a file system are stored in the inode table.
Data: the actual content of the file. It can be empty or very large, and it can have its own structure.
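
Much of the information held in an index node is exposed to programs through the stat system call. The following sketch reads it with Python's os.stat; the helper names inode_summary and inode_demo and the dictionary keys are chosen only for this example.

```python
import os
import stat
import tempfile


def inode_summary(path: str) -> dict:
    """Collect a few of the fields that os.stat reports for a file."""
    st = os.stat(path)
    return {
        "inode_number": st.st_ino,
        "permissions": stat.filemode(st.st_mode),  # e.g. "-rw-------"
        "owner_uid": st.st_uid,
        "size_bytes": st.st_size,
        "hard_links": st.st_nlink,
        "modified": st.st_mtime,
    }


def inode_demo() -> dict:
    """Write five bytes to a temporary file and summarize its index node."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"hello")
        f.flush()
        return inode_summary(f.name)


if __name__ == "__main__":
    for key, value in inode_demo().items():
        print(f"{key}: {value}")
```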
8. ext2 File System

The data block size of an ext2 file system is generally 1024 B, 2048 B, or 4096 B.
The inode used by the ext2 file system:
The inode uses a multi-level index structure made up of direct pointers and three indirect pointers. There are 12 direct pointers, each pointing directly to a data block that holds file data. The three indirect pointers that follow (single, double, and triple indirect) let the structure accommodate larger files.
For example, assume a data block size of 1024 B. The 12 direct pointers can address a file of at most 12 KB. When the file grows beyond 12 KB, the single indirect pointer is used: the data block it points to stores a set of block pointers, which in turn point to the data blocks holding the actual data.
If each pointer occupies 4 B, a pointer block holds 1024/4 = 256 pointers, so the direct pointers and the single indirect pointer together can address 1024*12 + 1024*256 B = 268 KB. When the file grows beyond 268 KB, the double indirect pointer is used, and after that the triple indirect pointer.
The maximum file size addressable through the direct, single indirect, double indirect, and triple indirect pointers is:
1024*12 + 1024*256 + 1024*256*256 + 1024*256*256*256 B = 16,843,020 KB, about 16 GB
If the data block size is 2048 B and a pointer occupies 4 B, the maximum file size is: 2048*12 + 2048*512 + 2048*512*512 + 2048*512*512*512 B = 268,960,792 KB, about 256 GB (taking 1 GB = 1024*1024 KB, as in the 16 GB figure above)
If the data block size is 4096 B and a pointer occupies 4 B, the maximum file size is: 4096*12 + 4096*1024 + 4096*1024*1024 + 4096*1024*1024*1024 B = 4,299,165,744 KB, about 4 TB. These figures are the limits implied by the pointer structure alone; the practical ext2 limit can be lower because of other on-disk field sizes.
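
The arithmetic above can be checked with a short script. It is a sketch under the same assumptions as the text (12 direct pointers, 4-byte pointers); the function names max_file_size_bytes and pointer_level are made up for this example.

```python
def max_file_size_bytes(block_size: int, pointer_size: int = 4, direct: int = 12) -> int:
    """Largest file addressable through direct plus single/double/triple indirect pointers."""
    ptrs = block_size // pointer_size  # pointers that fit in one block
    blocks = direct + ptrs + ptrs ** 2 + ptrs ** 3
    return blocks * block_size


def pointer_level(block_index: int, block_size: int,
                  pointer_size: int = 4, direct: int = 12) -> str:
    """Which kind of pointer reaches the file's block number `block_index` (0-based)."""
    ptrs = block_size // pointer_size
    levels = [
        ("direct", direct),
        ("single indirect", ptrs),
        ("double indirect", ptrs ** 2),
        ("triple indirect", ptrs ** 3),
    ]
    for name, capacity in levels:
        if block_index < capacity:
            return name
        block_index -= capacity
    raise ValueError("block index is beyond the maximum file size")


if __name__ == "__main__":
    for size in (1024, 2048, 4096):
        kib = max_file_size_bytes(size) // 1024
        print(f"block size {size} B: {kib:,} KB")
    print(pointer_level(267, 1024))  # last block within the first 268 KB
    print(pointer_level(268, 1024))  # first block beyond 268 KB
```

With 1024-byte blocks, block 267 is still reached through the single indirect pointer, and block 268 is the first one that needs the double indirect pointer, which matches the 268 KB boundary in the text.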
Note: run the command tune2fs -l /dev/sda5 to view the file system parameters.
Maximum file name length in ext2: 255 characters

Disadvantages of the ext2 file system:
When writing file content, ext2 does not write the file's metadata at the same time. It writes the file content first and writes the metadata later, when the system is idle. If the system crashes or loses power in between, the file system is left in an inconsistent state.
When the system restarts, Linux runs the file system check program (fsck), which scans the entire file system and tries to repair it, but a successful repair is not guaranteed.
9. ext3 File System:

Ext3 is based on the ext2 code, so its on-disk format is the same as that of ext2 and it uses the same metadata.
An ext2 file system can be converted to ext3 without data loss: tune2fs -j /dev/sda6

The journaling block device layer (JBD) implements the journaling function of ext3. JBD is not specific to ext3; it is designed to add journaling to a block device.
When a file is modified, the ext3 code notifies JBD of the change; such a unit of modification is called a transaction. After a crash, the journal is replayed so that an interrupted transaction does not leave the file system inconsistent.
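
The replay idea can be illustrated with a toy write-ahead journal. This is a simplified model, not the JBD implementation: it assumes that only transactions with a commit record are applied during replay and that unfinished ones are discarded. The class ToyJournal and the block names are hypothetical.

```python
class ToyJournal:
    """Append-only log of begin / write / commit records."""

    def __init__(self):
        self.records = []

    def begin(self, txid: int) -> None:
        self.records.append(("begin", txid))

    def write(self, txid: int, block: str, data: str) -> None:
        self.records.append(("write", txid, block, data))

    def commit(self, txid: int) -> None:
        self.records.append(("commit", txid))

    def replay(self, disk: dict) -> dict:
        """Apply the writes of committed transactions; skip uncommitted ones."""
        committed = {record[1] for record in self.records if record[0] == "commit"}
        for record in self.records:
            if record[0] == "write" and record[1] in committed:
                _, _, block, data = record
                disk[block] = data
        return disk


if __name__ == "__main__":
    journal = ToyJournal()

    journal.begin(1)
    journal.write(1, "inode 7", "size=2")
    journal.commit(1)

    journal.begin(2)
    journal.write(2, "inode 9", "size=5")
    # Simulated crash: transaction 2 never reaches its commit record.

    print(journal.replay({}))
```

After the simulated crash, replay restores the committed change to "inode 7" and ignores the half-finished change to "inode 9", so the on-disk state never reflects a partial transaction.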

The three data modes of the journal:
1) data=writeback: file data is not journaled in any form (only metadata is journaled). This mode gives the highest overall performance.
2) data=ordered: only metadata is journaled, but the metadata and its related data are grouped into a single unit called a transaction. This mode ensures data reliability and file system consistency. Performance is lower than in data=writeback mode but higher than in data=journal mode.
3) data=journal: both data and metadata are fully journaled. All new data is written to the journal first and only then to its final location. After a crash, the journal can be replayed to bring data and metadata back to a consistent state. Overall this is the slowest mode, but when data has to be read from and written to the disk at the same time, it can be the fastest of the three.
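
One reason data=journal is usually the slowest mode is that file data is written twice, once to the journal and once to its final location. The sketch below is a rough model of that cost, assuming metadata is journaled in every mode and ignoring caching, batching, and seek effects; the names JournalMode, MODES, and bytes_written are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalMode:
    journals_metadata: bool
    journals_data: bool
    data_safe_before_metadata_commit: bool


MODES = {
    "writeback": JournalMode(True, False, False),
    "ordered": JournalMode(True, False, True),
    "journal": JournalMode(True, True, True),
}


def bytes_written(mode: str, data_bytes: int, metadata_bytes: int) -> int:
    """Total bytes sent to disk for one update in this simplified model."""
    m = MODES[mode]
    total = data_bytes + metadata_bytes  # writes to the final locations
    if m.journals_metadata:
        total += metadata_bytes          # extra copy of the metadata in the journal
    if m.journals_data:
        total += data_bytes              # extra copy of the data in the journal
    return total


if __name__ == "__main__":
    for name in MODES:
        print(name, bytes_written(name, data_bytes=4096, metadata_bytes=512))
```

In this model, writeback and ordered write the same number of bytes and differ only in ordering guarantees, while journal mode roughly doubles the data traffic.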
Maximum file name length in ext3: 255 characters
Advantages of the ext3 file system: availability, data integrity, speed, and compatibility
10. reiserfs File System

ReiserFS was developed by Hans Reiser and his team. It was designed entirely from scratch and is one of the earliest journaling file systems used in Linux.
ReiserFS features
Advanced journaling mechanism
ReiserFS has an advanced journaling (logging) mechanism. The journal is written to the hard disk before each actual data modification, which greatly improves the safety of files and data.
Efficient disk space utilization
ReiserFS does not allocate an inode for some small files. Instead, it packs these files together and stores them in the same disk block, whereas other file systems place each small file in a disk block of its own.
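
The space saved by packing small files can be estimated with simple arithmetic. The sketch below is an idealized model that ignores any per-file bookkeeping overhead; the function names and the 4096-byte block size are assumptions made for the example.

```python
def blocks_one_file_per_block(file_sizes, block_size: int = 4096) -> int:
    """Blocks used when every file starts in a block of its own."""
    # Ceiling division per file: a partly filled block is still a whole block.
    return sum(-(-size // block_size) for size in file_sizes)


def blocks_packed(file_sizes, block_size: int = 4096) -> int:
    """Blocks used when the files' bytes are packed back to back (idealized)."""
    return -(-sum(file_sizes) // block_size)


if __name__ == "__main__":
    small_files = [100] * 40  # forty files of 100 bytes each
    print("one file per block:", blocks_one_file_per_block(small_files))
    print("packed:            ", blocks_packed(small_files))
```

Forty 100-byte files occupy 40 blocks when each file gets its own block, but fit in a single block when packed.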
Unique search method
ReiserFS lookups are based on a fast balanced tree, an efficient search structure with excellent performance. When searching among a large number of files, ReiserFS is much faster than ext2. ReiserFS stores files in a B* tree, while other file systems use B+ trees; B* tree lookups are reported to be much faster than B+ tree lookups, so ReiserFS locates files quickly.
In practice, ReiserFS is reported to be five times faster than ext2 when handling files smaller than 4 KB. With tail packing enabled (the default), ReiserFS stores about 6% more data than ext2.
Support for very large disks
ReiserFS has been used on high-end Unix systems and can easily manage file systems of hundreds of GB. The maximum file system size it supports is 16 TB, which makes it well suited to enterprise applications.
Excellent performance
Thanks to its efficient storage and fast small-file I/O, starting the X Window System on a PC that uses ReiserFS takes about one third less time than on the same machine using ext2. In addition, ReiserFS supports single files of 4 GB, which makes it a better choice for large database systems on Linux.
