Linux Flash File System Analysis

You may have heard of the Journaling Flash File System (JFFS) and Yet Another Flash File System (YAFFS), but do you know what it means to build a file system on top of a flash device? This article introduces the flash file systems available for Linux, explores how they cope with the underlying consumable devices (flash parts) through wear leveling, and surveys the major flash file systems and their basic designs.

Solid-state drives are very popular today, but embedded systems have been using solid-state storage for much longer. You will find flash memory in personal digital assistants (PDAs), mobile phones, MP3 players, digital cameras, USB flash drives (UFDs), and even laptops. In many cases the file systems in these commercial devices are customized and proprietary, but they all face the challenges described below.

Flash-based file systems come in a variety of forms. This article discusses several read-only file systems and reviews the available read/write file systems and how they work. But first, let's look at flash devices and the challenges they present.

Flash Memory Technology

Flash memory (which can be implemented with several different technologies) is non-volatile memory, which means its contents are preserved when power is removed.

The two most common types of flash device are NOR and NAND. NOR-based flash is the older technology; it offers high read performance, but at the cost of capacity. NAND flash provides higher capacity along with fast write and erase performance, but it requires a more complex input/output (I/O) interface.

Flash parts are usually divided into multiple partitions, which allows operations to proceed in parallel (erasing one partition while reading from another). Partitions are divided into blocks, commonly 64KB or 128KB in size. The firmware may further subdivide a block, for example into 512-byte segments, not counting metadata.

Compared with other storage devices (such as RAM disks), flash devices share a common limitation: they require device management. The only write operation a flash device allows is changing a bit from 1 to 0. To undo that change, the entire block must be erased, resetting all bits back to 1. This means that any valid data elsewhere in the block must first be moved if it is to survive. NOR flash can usually be written one byte at a time, while NAND flash must be written in multi-byte units (typically 512 bytes).
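
To make the write and erase rules concrete, here is a minimal Python sketch that models a single block: a write may only clear bits from 1 to 0, and the only way to restore a bit to 1 is to erase the whole block. The class name and block size are illustrative, not taken from any real driver.

    # Minimal model of flash write/erase semantics: writes can only clear
    # bits (1 -> 0); restoring any bit to 1 requires erasing the whole block.
    # Sizes and names here are illustrative, not taken from a real driver.

    class FlashBlock:
        def __init__(self, size=512):
            self.size = size
            self.data = bytearray([0xFF] * size)   # erased state: all bits 1

        def write(self, offset, payload):
            for i, byte in enumerate(payload):
                current = self.data[offset + i]
                # A write can only clear bits; setting a 0 back to 1 without
                # an erase is rejected, as on real NOR/NAND parts.
                if (byte & ~current) & 0xFF:
                    raise ValueError("bit 0->1 transition requires an erase")
                self.data[offset + i] = current & byte

        def erase(self):
            # Erase resets every bit in the block back to 1.
            self.data = bytearray([0xFF] * self.size)

    block = FlashBlock()
    block.write(0, b"\x0F")        # fine: clears bits in an erased byte
    block.erase()                  # must erase before re-setting any bit to 1
    block.write(0, b"\xF0")        # fine again after the erase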

The two memory types also differ in how blocks are erased. Each requires a special erase operation that covers an entire block of flash; NOR technology additionally needs a preparatory step that clears all values before the erase begins. Erase is a distinct, very time-consuming operation on flash devices. Erasure is an electrical process: it drains the electrons from every cell in the block.

A NOR flash device typically takes several seconds to perform an erase, while a NAND device needs only a few milliseconds. A key characteristic of flash devices is the number of erase cycles they can tolerate: on a NOR device each block can be erased roughly 100,000 times, while NAND can reach about 1,000,000 cycles.

Challenges of Flash Memory

In addition to the limitations mentioned above, managing flash devices brings its own challenges. The three biggest are garbage collection, bad block management, and wear leveling.

Garbage Collection

Garbage collection is the process of reclaiming invalid blocks (blocks that contain some amount of invalid data). Reclamation involves moving the valid data to a new block and then erasing the invalid block to make it available again. This process usually runs in the background, or on demand when the file system is low on free space.
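
The following short Python sketch shows the core of that reclamation step under a deliberately simplified model, in which a block is just a list of (data, valid) pairs. Real collectors also track erase counts and typically run in a background thread.

    # Simplified garbage collection: copy the still-valid pages of a victim
    # block into a fresh block, then erase the victim so it can be reused.
    # The data structures are hypothetical simplifications of what a real
    # flash file system tracks per block.

    def collect(victim_pages, free_pages):
        """victim_pages: list of (data, valid) tuples; free_pages: list of data."""
        for data, valid in victim_pages:
            if valid:
                free_pages.append(data)   # relocate valid data first
        victim_pages.clear()              # the erase wipes the invalid data
        return victim_pages               # victim block is now empty and reusable

    victim = [("inode 7 v1", False), ("inode 9 v3", True), ("inode 7 v2", True)]
    fresh = []
    collect(victim, fresh)
    print(fresh)   # ['inode 9 v3', 'inode 7 v2'] -- only valid data survives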

Bad Block Management

As a flash device is used over time it can develop bad blocks; a device may even leave the factory with bad blocks. A bad block is indicated when a flash operation such as an erase fails, or when a write is found to be invalid through the error correction code (ECC).

Once identified, bad blocks are marked in a bad block table within the flash itself. Exactly how this is done depends on the device, but it can be implemented with a separate set of reserved blocks managed apart from the normal data blocks. The process of handling bad blocks, whether they arise at the factory or during use, is called bad block management. In some cases it is implemented in hardware by an internal microcontroller and is therefore transparent to the upper-level file system.
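
As a rough illustration, a driver-side bad block table can be as simple as a set of block numbers that is updated on failure and consulted on every allocation. The device object and its erase method below are hypothetical (any object whose erase raises IOError on failure would do); real drivers persist the table in reserved blocks rather than in memory.

    # Minimal sketch of bad block management: record blocks whose erase fails
    # and skip them when allocating.  The table is kept in a plain set here;
    # a real device stores it in reserved flash blocks.

    bad_blocks = set()

    def erase_block(device, block_no):
        try:
            device.erase(block_no)        # hypothetical driver call
        except IOError:
            bad_blocks.add(block_no)      # mark the block and never use it again
            return False
        return True

    def next_usable_block(candidates):
        for block_no in candidates:
            if block_no not in bad_blocks:
                return block_no
        raise RuntimeError("no usable blocks left")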

Wear Leveling

As mentioned above, flash devices are consumable: a block can endure only a limited number of erase cycles before it becomes a bad block (and must then be flagged by bad block management). Wear-leveling algorithms maximize the life of the flash by spreading this wear evenly. There are two kinds of wear leveling: dynamic and static.

Dynamic wear leveling addresses the limited number of erase cycles per block. Instead of using available blocks at random, a dynamic wear-leveling algorithm uses them evenly, so that every block has the same chance of being used. Static wear leveling addresses an even more interesting problem. Besides the erase-cycle limit, some flash devices also suffer from a maximum number of read cycles between erase cycles: if data sits in a block for too long and is read too many times, it can degrade until it is lost. Static wear-leveling algorithms address this by periodically moving long-lived data to new blocks.
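
A minimal sketch of the dynamic case, assuming a hypothetical per-block erase counter: instead of taking whichever free block is handy, the allocator picks the free block that has been erased the fewest times.

    # Dynamic wear leveling in miniature: pick the free block that has been
    # erased the fewest times so wear spreads evenly.  erase_counts is a
    # hypothetical per-block counter maintained by the file system.

    def pick_block(free_blocks, erase_counts):
        """Return the free block with the lowest erase count."""
        return min(free_blocks, key=lambda b: erase_counts[b])

    erase_counts = {0: 120, 1: 45, 2: 46, 3: 300}
    free_blocks = [0, 2, 3]
    print(pick_block(free_blocks, erase_counts))   # 2 -- the least-worn free block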

System Architecture

So far, I have discussed flash devices and their basic challenges. Now let's look at how these pieces fit together in a layered architecture (see Figure 1). At the top of the architecture is the virtual file system (VFS), which presents a common interface to higher-level applications. Below the VFS sits the flash file system. Next is the flash translation layer (FTL), which manages the flash device as a whole: it allocates blocks from the underlying flash and handles address translation, dynamic wear leveling, and garbage collection. On some flash devices, part of the FTL is implemented in hardware.

Figure 1. Basic architecture of the flash System

The Linux kernel accesses flash through the memory technology device (MTD) interface, a common interface for flash devices. MTD can automatically detect a flash device's bus width and the number of devices needed to implement that width.
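
On systems where MTD is enabled, the kernel lists the registered flash partitions in /proc/mtd (device, total size, erase-block size, and partition name, with sizes in hexadecimal). A small script like the one below can parse that listing; whether the file exists at all depends on the kernel configuration.

    # List the MTD partitions that the kernel has registered.  /proc/mtd is
    # only present when the kernel is built with MTD support; each line holds
    # the device name, total size, erase-block size, and partition label.

    def read_mtd_partitions(path="/proc/mtd"):
        partitions = []
        with open(path) as f:
            next(f)                                  # skip the "dev: size ..." header
            for line in f:
                dev, size, erasesize, name = line.split(None, 3)
                partitions.append({
                    "device": dev.rstrip(":"),
                    "size": int(size, 16),           # sizes are reported in hex
                    "erase_size": int(erasesize, 16),
                    "name": name.strip().strip('"'),
                })
        return partitions

    if __name__ == "__main__":
        for p in read_mtd_partitions():
            print(p)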

Flash File System

Linux supports several flash file systems. The following sections describe the design and advantages of each.

Journaling Flash File System

The Journaling Flash File System (JFFS) was one of the earliest flash file systems for Linux. JFFS is a log-structured file system designed for NOR flash devices. Its design solves many of the problems flash devices pose, but it also introduces new ones.

JFFS treats the flash device as a circular log of blocks. Data written to flash is appended at the tail of the log, blocks at the head are reclaimed, and the space in between is free. When free space runs low, garbage collection runs: the collector moves valid blocks toward the tail of the log, skips invalid or obsolete blocks, and erases them (see Figure 2). As a result, the file system automatically provides both static and dynamic wear leveling. The main disadvantage of this design is that blocks are erased far more often than an optimal erase policy would require, which wears the device out quickly.
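
A toy model of that circular log may help: nodes are appended at the tail, and garbage collection keeps only the newest node for each name, dropping (erasing) the obsolete ones. On-flash layout, checksums, and wear tracking are all omitted, and the node format is purely illustrative.

    # Toy model of the JFFS circular log: new nodes are appended at the tail,
    # and the collector keeps only the newest node per name, discarding the
    # obsolete ones.  This ignores all on-flash layout details.

    from collections import deque

    log = deque()                     # head = oldest nodes, tail = newest

    def write_node(name, data):
        # Writing a new version makes any older node for the same name obsolete.
        log.append({"name": name, "data": data})

    def garbage_collect():
        seen = set()
        survivors = deque()
        # Walk from the tail so only the newest node for each name survives;
        # everything older is obsolete and is simply dropped (i.e. erased).
        for node in reversed(log):
            if node["name"] not in seen:
                survivors.appendleft(node)
                seen.add(node["name"])
        log.clear()
        log.extend(survivors)

    write_node("a.txt", "v1")
    write_node("b.txt", "v1")
    write_node("a.txt", "v2")         # obsoletes the first a.txt node
    garbage_collect()
    print([n["data"] for n in log if n["name"] == "a.txt"])   # ['v2']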

Figure 2. The circular log before and after garbage collection

When JFFS is mounted, its structural details must be read into memory, which slows mounting and consumes extra RAM.

Journaling Flash File System 2

Although JFFS was very useful in its early days, its wear-leveling behavior could easily shorten the life of NOR flash devices. The underlying algorithms were therefore redesigned and the circular log was dropped. The JFFS2 algorithms were also designed with NAND flash devices in mind and add compression support.

In JFFS2, each block in flash is treated individually. JFFS2 achieves full wear leveling on the device by maintaining block lists: the clean list holds blocks whose contents are all valid nodes, the blocks on the dirty list contain at least one obsolete node, and the free list contains blocks that have been erased and are available for use. The garbage collection algorithm picks the block to reclaim in a sensible way: it chooses from the clean or dirty list probabilistically, taking a block from the dirty list 99% of the time (moving its valid contents to another block) and from the clean list 1% of the time (moving its contents to a new block). In both cases, the selected block is erased and placed on the free list (see Figure 3). This lets the garbage collector reuse blocks holding obsolete data while still moving data around the flash to provide static wear leveling.
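
The probabilistic selection described above is easy to sketch. The listing below only illustrates the 99%/1% split; it is not JFFS2's actual code, and the list contents are placeholders.

    # Sketch of JFFS2-style probabilistic victim selection: usually take a
    # block from the dirty list (99%) to reclaim obsolete space, occasionally
    # take a clean block (1%) so static data also gets moved and wear evens out.

    import random

    def pick_victim(clean_list, dirty_list):
        if dirty_list and (not clean_list or random.random() < 0.99):
            return dirty_list.pop(0)    # usual case: reclaim obsolete space
        if clean_list:
            return clean_list.pop(0)    # rare case: relocate valid data for wear leveling
        return None                     # nothing to collect

    clean = ["block 3", "block 7"]
    dirty = ["block 1", "block 4"]
    print(pick_victim(clean, dirty))    # almost always a dirty block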

Figure 3. Block management and garbage collection in jffs2

Yet Another Flash File System

YAFFS is another flash file system developed for NAND flash. The earliest version (YAFFS) supported flash devices with 512-byte pages, but the newer version (YAFFS2) supports newer devices with larger pages and stricter write requirements.

Most flash file systems mark obsolete blocks as such, but YAFFS2 marks newer blocks with monotonically increasing sequence numbers, so valid inodes can be identified quickly when the file system is scanned at mount time. YAFFS also keeps trees in RAM to represent the block structure of the flash device, and supports fast mounting through checkpointing: on a clean unmount the RAM tree structure is saved to the flash device so that it can be quickly read back and restored to RAM at the next mount (see Figure 4). Compared with other flash file systems, mount-time performance is YAFFS2's biggest advantage.
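
The sequence-number idea can be illustrated with a few lines of Python: when the mount-time scan sees two chunks for the same object, the one with the higher sequence number is the current one. The tuple layout below is an illustration, not the real on-flash chunk format.

    # Sketch of how monotonically increasing sequence numbers let a scanner
    # decide which copy of an object is current: the higher sequence wins.

    chunks = [
        # (object_id, sequence_number, data)
        (42, 1001, "old contents"),
        (17, 1002, "other file"),
        (42, 1007, "new contents"),    # later write of object 42
    ]

    latest = {}
    for obj_id, seq, data in chunks:
        if obj_id not in latest or seq > latest[obj_id][0]:
            latest[obj_id] = (seq, data)

    print(latest[42][1])               # "new contents" -- the newer chunk wins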

Figure 4. Block management and garbage collection in yaffs2

Read-Only Compressed File Systems

In some embedded systems there is no need for a writable file system: an immutable one is sufficient. Linux supports several read-only file systems, including cramfs and squashfs.

Cramfs

The cramfs file system is a compressed, read-only Linux file system that can be used on flash devices. cramfs is simple and makes highly efficient use of space, which is why it appears in embedded designs with little memory.

Although cramfs does not compress its metadata, it compresses each page with zlib, which allows random page access (a page is decompressed when it is accessed).

You can use the mkcramfs utility and loopback device to try cramfs.

Squashfs

Squashfs is another compressed, read-only Linux file system that can be used on flash devices. You will find squashfs in many live-CD Linux distributions. In addition to zlib, squashfs can use the Lempel-Ziv-Markov chain algorithm (LZMA) for better compression.

Like cramfs, squashfs can be used on a standard Linux system with the mksquashfs utility and a loopback device.

Conclusion

As with most open source software, development never stops, and new flash file systems are in the works. One interesting alternative still under development is LogFS, which contains some very novel ideas. For example, LogFS keeps its tree structure on the flash device itself, so mount times are comparable to those of traditional file systems such as ext2. It also uses a form of B+ tree for garbage collection. What makes LogFS most interesting, however, is its scalability and its support for very large flash parts.

As flash file systems grow in popularity, you will see more and more research devoted to them. LogFS is one example, but other file systems, such as UBIFS, are also evolving. Flash file system architectures are an interesting area and will continue to be a source of innovation.
