Online Testing of Wear Leveling and Power-Loss Recovery on NAND Flash

Source: Internet
Author: User

With its large capacity and low price, NAND flash has quickly become a favorite storage medium for embedded systems, and research on file systems for it has become increasingly widespread.

This article briefly introduces YAFFS, a commonly used NAND flash file system, and presents the results of online testing of YAFFS for wear leveling (uniform loss) and power-loss recovery. It focuses on the embedded software test scheme and methods, analyzes the test results, and proposes improvements together with the environments in which they apply.

Introduction

With the extensive application of embedded technology in electronic products, data storage and management in embedded systems has become an important research topic. Flash memory offers high speed, large capacity, and low cost, so it is widely used as external storage in embedded systems. An embedded system needs a file system of its own rather than a directly ported general-purpose file system, for two main reasons. First, embedded systems operate under harsh conditions: the supply voltage is unstable, and sudden power loss or improper removal can easily have catastrophic effects, yet general-purpose file systems are not designed with sufficient attention to reliability. Second, the bookkeeping information of a general-purpose file system (such as the FAT table) is modified many times and is stored in fixed blocks of the flash memory. Those blocks are therefore operated on very frequently, which shortens the service life of the flash memory and places higher demands on the software.

To manage complex storage hardware and provide a reliable and efficient storage environment, three mainstream flash file systems have emerged for NAND and NOR flash: TrueFFS, JFFSx, and YAFFS. YAFFS (Yet Another Flash File System) is an embedded file system designed specifically for NAND flash and is suited to large-capacity storage devices. It is a log-structured file system that provides mechanisms such as wear leveling and power-loss protection, which effectively reduce the impact of the problems above on the consistency and integrity of the file system. On this basis, this article introduces a test scheme for the embedded YAFFS file system, performs real-time online testing and analysis of two important performance indicators, wear leveling and power-loss protection, and proposes modifications to the file system for different application environments.

1 YAFFS File System Overview

The YAFFS file system is similar to the JFFS/JFFS2 file systems. JFFS/JFFS2 was originally designed for NOR flash, but NOR flash and NAND flash differ considerably. Although JFFS/JFFS2 can also be used on NAND flash, it makes trade-offs for NOR characteristics in memory usage and startup time, so it is usually not the best choice for NAND.

1.1 Comparison between NOR and NAND

Broadly speaking, NOR is better suited to storing program code. Its capacity is generally small (for example, less than 32 MB) and its price is high. NAND capacity can exceed 1 GB and its price is relatively low, which makes it suitable for data storage. In NAND flash chips below 128 MB, a page generally holds 512 bytes of data. Each page also has 16 bytes of spare space (spare data) that serves as the OOB (out-of-band) area and stores the ECC (error correction code), bad-block markers, and other information. Several pages form a block, generally 32 pages (16 KB). Compared with NOR, NAND is not completely reliable: a certain proportion of bad blocks is permitted when a chip leaves the factory. Data is not accessed through linear addressing but serially, through register operations.

1.2 Storage Method of YAFFS Data on NAND

In keeping with the page-based access of NAND flash, YAFFS organizes files into fixed-size data segments. It uses the 16 bytes of spare space that NAND flash provides per page to store ECC check information and file system organization information. This not only enables error detection and bad-block handling but also speeds up file system loading.

YAFFS organizes a file into fixed-size (512-byte) data segments. Each file has one page dedicated to its file header, which stores information such as the file mode, owner ID, group ID, length, and file name. The data segments of a file are organized into a tree structure. When a file is rewritten, the new data block is always written first, and only then is the old data block deleted from the file. The ECC stored in the page's spare space is used for error detection. When an error occurs, the system retries a certain number of times; if the retries all fail, the page is taken out of use. Taking a NAND flash chip with (512 + 16)-byte pages as an example, the layout of data storage in the YAFFS file system is shown in illustration 1.

2 YAFFS File System Test

2.1 Overall Test Description

The YAFFS file system is open source, so the test is a white-box test: test code is inserted into the code segments of interest. To ensure that the test code does not affect the original code and that the original code can be restored immediately after testing, all test code (including test variables and functions) is enclosed in conditional blocks controlled by the #define fs_test macro.

2.2 Generating the Simulation File

In an embedded environment, running a large number of file system tests over a long period poses many problems, and inserting test code and monitoring data is also difficult. The tests therefore use a PC simulation: the NAND device is simulated by reading and writing a file, and the simulation file is monitored on the PC. To support this, the code defines the types of various NAND devices, and this information is also used to generate the matching simulation file for each simulated device.

Specify the size (file_size_in_meg) and structure (blocks_per_meg, block_size) of the NAND device to be simulated; the file g_filedisk is then generated with the corresponding size and structure.
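The generation step can be sketched as follows. This is a minimal illustration, not the test program's code: the names file_size_in_meg, blocks_per_meg, and block_size come from the text, while the geometry constants (taken from the 512 + 16 byte, 32-page example in section 1.1), the helper names image_bytes and create_nand_image, and the choice to fill the file with 0xFF (the value of erased NAND cells) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Assumed geometry: 512 data bytes + 16 spare bytes per page, 32 pages per block.
constexpr std::size_t kPageData = 512;
constexpr std::size_t kPageSpare = 16;
constexpr std::size_t kPagesPerBlock = 32;

// Bytes one block occupies in the simulation file (data and spare together).
constexpr std::size_t block_size = kPagesPerBlock * (kPageData + kPageSpare);
// Number of blocks per megabyte of data capacity.
constexpr std::size_t blocks_per_meg = (1024 * 1024) / (kPagesPerBlock * kPageData);

// Total size in bytes of the simulation file for a device of the given capacity.
constexpr std::size_t image_bytes(std::size_t file_size_in_meg) {
    return file_size_in_meg * blocks_per_meg * block_size;
}

// Create the simulation file block by block, filled with 0xFF.
// Returns true only if every block was written and the file closed cleanly.
bool create_nand_image(const char* path, std::size_t file_size_in_meg) {
    assert(file_size_in_meg > 0);
    std::FILE* f = std::fopen(path, "wb");
    if (f == nullptr) return false;
    const std::vector<unsigned char> erased_block(block_size, 0xFF);
    const std::size_t blocks = file_size_in_meg * blocks_per_meg;
    bool ok = true;
    for (std::size_t b = 0; b < blocks && ok; ++b) {
        ok = std::fwrite(erased_block.data(), 1, block_size, f) == block_size;
    }
    return std::fclose(f) == 0 && ok;
}
```

With this geometry a 1 MB device has 64 blocks of 16,896 bytes each, so the simulation file is larger than the nominal capacity because it also holds the spare bytes.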

3 Wear Leveling (Uniform Loss) Test

3.1 Purpose

The number of write/erase cycles that each block of a NAND flash device can endure is limited. In application environments that require real-time recording, the number of writes per block should be kept roughly equal in order to maximize the lifetime of the NAND flash device. This test records the number of writes to each block to evaluate how well the YAFFS file system levels wear.

3.2 Test Method

Test bytes are appended after the spare area of each page in the device simulation file and are used to record the number of writes to that page. Because write/erase operations are performed in units of blocks, the recorded count is the same for every page of a block. In later tests, only the space on the first page of each block needs to hold the write count; the space on the other pages can be used for other tests.

The test code is inserted into checkinit() and yaffs_feeraseblockinnand() (yaffs_fileem.cpp). During device initialization, the write counts are created (when a new simulation file is generated) or read (when a simulation file already exists). Whenever the program executes the erase function, the write count is incremented and saved.

Test variables exposed by the test program: the array g_ersnumarray[file_size_in_meg * blocks_per_meg], which records the number of writes per block; g_persmax, which points to the maximum erase count; and g_persmin, which points to the minimum erase count.
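The bookkeeping described above can be sketched as a small counter class. This is an illustration under assumptions, not the inserted test code: the class name EraseCounters and its methods are hypothetical, and they stand in for the roles the text gives to g_ersnumarray, g_persmax, and g_persmin.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the per-block counters.
class EraseCounters {
public:
    // total_blocks corresponds to file_size_in_meg * blocks_per_meg in the text.
    explicit EraseCounters(std::size_t total_blocks) : counts_(total_blocks, 0) {
        assert(total_blocks > 0);
    }

    // To be called from the block-erase hook each time a block is erased.
    void record_erase(std::size_t block) {
        assert(block < counts_.size());
        ++counts_[block];
    }

    unsigned count(std::size_t block) const { return counts_.at(block); }

    // Largest and smallest erase counts over all blocks.
    unsigned max_count() const {
        return *std::max_element(counts_.begin(), counts_.end());
    }
    unsigned min_count() const {
        return *std::min_element(counts_.begin(), counts_.end());
    }

private:
    std::vector<unsigned> counts_;
};
```

The gap between max_count() and min_count() is the figure of interest: the smaller it stays over a long run, the more evenly the blocks are being worn.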

The test program uses two log files, wireout0.log and wireout1.log, to record the number of writes to each block together with the maximum and minimum values. Because the test runs for a long time, two files are used so that an error while writing the record file cannot destroy all the records. The two files are written in turn, which ensures that at least one file holds the most recent record from before a system error.
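The alternating-file scheme can be sketched as follows. The file names follow the text; the record format, the trailing "end" marker, and the function names log_name_for, write_snapshot, and snapshot_complete are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Even-numbered snapshots go to one file, odd-numbered ones to the other.
std::string log_name_for(unsigned long sequence) {
    return (sequence % 2 == 0) ? "wireout0.log" : "wireout1.log";
}

// Overwrite the file chosen by 'sequence' with the current per-block counts.
// The final "end" line marks the snapshot as complete.
bool write_snapshot(unsigned long sequence, const std::vector<unsigned>& counts) {
    std::ofstream out(log_name_for(sequence), std::ios::trunc);
    if (!out) return false;
    out << "sequence " << sequence << '\n';
    for (std::size_t b = 0; b < counts.size(); ++b) {
        out << b << ' ' << counts[b] << '\n';
    }
    out << "end\n";
    out.flush();
    return static_cast<bool>(out);
}

// A snapshot that was interrupted part-way through lacks the "end" line.
bool snapshot_complete(const std::string& path) {
    std::ifstream in(path);
    std::string line, last;
    while (std::getline(in, line)) last = line;
    return last == "end";
}
```

If the program dies while writing snapshot n, the file for snapshot n - 1 is untouched, so the reader can fall back to whichever file is complete and has the higher sequence number.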

3.3 Test Results

The YAFFS file system takes unallocated space "in order" for new writes and erases discarded blocks in the same "order"; both writing and erasing follow the sequence of unallocated or discarded space. When the unused space in the system falls below a preset value, the system reclaims blocks that contain discarded pages. This write and erase policy ensures wear leveling to a certain extent.

Although this mechanism achieves wear leveling to a certain extent, it still has problems and does not suit every embedded application environment. Suppose a 16 MB NAND device is used and 10 MB of it holds relatively fixed, rarely modified data files. Frequently modified files can then be written and erased only within the remaining 6 MB. Because the system has no relocation policy that moves the fixed files so that the blocks they occupy also take part in erasing, wear leveling is not achieved for the device as a whole. In application environments that record a large amount of information in real time, a relocation function should be written that periodically moves the fixed files, so that wear is uniform across the entire NAND device.
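One possible trigger for such a relocation function is sketched below. It is a hypothetical illustration, not part of YAFFS or of the test program: the function name needs_static_move, the threshold parameter, and the idea of selecting the least-erased block as the one holding fixed data are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Decide whether fixed data should be relocated. Returns true when the gap
// between the most-erased and least-erased block exceeds 'threshold'.
// '*coldest' always receives the index of the least-erased block, which is
// the likeliest holder of fixed, rarely modified data.
bool needs_static_move(const std::vector<unsigned>& erase_counts,
                       unsigned threshold,
                       std::size_t* coldest) {
    assert(!erase_counts.empty() && coldest != nullptr);
    const auto mm = std::minmax_element(erase_counts.begin(), erase_counts.end());
    *coldest = static_cast<std::size_t>(mm.first - erase_counts.begin());
    return (*mm.second - *mm.first) > threshold;
}
```

When the function returns true, the caller would copy the contents of the coldest block to a heavily worn free block and release the coldest block for new writes, so that the lightly used blocks start absorbing erase cycles.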

4 Power-Loss Recovery Test

4.1 Purpose

When the system suddenly loses power while a file is being modified, the file system should be able to restore (protect) useful data to the greatest possible extent. Depending on the situation, file protection after power loss falls into three types:

① The old file replaces the newly written file; the new file, whose writing was not completed, is discarded. This method has many applications. For example, if power is lost during a configuration update, reverting to the settings in effect before the power loss is acceptable to the user.

② The new file completely replaces the old file, and whatever part of the new file was written is kept. This method suits text scenarios. For example, although a new text message may be incomplete, the user can still obtain part of the information from it. If the sender information is complete or can be inferred, the sender can be asked to resend the message.

③ The newly written data is kept together with the part of the old file that has not yet been overwritten. This "new and old" method can be applied to dynamically updated files. However, when it is applied to files that are read and written by offset, the result is garbled content.

4.2 Test Method

The test code randomly generates a power-loss signal and simulates the power loss. It is inserted into yaffs_fewritechunktonand() (yaffs_fileem.cpp), and the random power-loss position falls somewhere in the writing of the data area or the spare area. After the power loss is simulated, the program remounts the file system and reads the file that was being updated when power was lost. The result is obtained by comparing it with the original file.

Test variable used by the program: the power-loss category g_tstpoweroff, where 1 means power loss in the data area and 2 means power loss in the spare area. In a full simulation the category is generated randomly. The file testlog.log records the result after each power loss. It is opened in append mode, so each new record is written at the end without affecting earlier records.

4.3 Test Approach

Simulating a power loss means reproducing what a real power loss does: once power returns, the whole system starts again and the file system is started afresh, while all system parameters, the system stack, and the execution context from before the power loss are gone. Such behavior is difficult to simulate during testing. Actually cutting the power is neither safe nor practical. Instead, the exit() function can be used to abort the program in the middle of a write operation: after a random number of bytes has been written, exit() stops the program immediately. The program is then restarted, and the file that was being written at the moment of power loss is read and analyzed to check the file system's power-loss protection.

The exit()-based method does not lend itself to automated testing, and a large number of tests cannot be run by hand. In this test, the author instead uses a try {} and catch {} structure, which both simulates the actual power-loss behavior and lets the automated test run smoothly.
The core code used to simulate power loss is as follows:
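The listing itself is not included in this copy of the article. As a stand-in, here is a minimal sketch of the idea, not the author's code: an exception is thrown part-way through a page write instead of calling exit(), and the catch block takes the place of the reboot. The zone values 1 and 2 follow g_tstpoweroff in the text; the type PowerLoss, the functions write_chunk and run_once, and the fill patterns are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kData = 512;   // data bytes per page
constexpr std::size_t kSpare = 16;   // spare bytes per page

// Thrown in place of exit() to abandon a write at the power-loss position.
struct PowerLoss {
    int zone;                   // 1 = data area, 2 = spare area
    std::size_t bytes_written;  // bytes that reached the page before the cut
};

// Hypothetical stand-in for the chunk-write hook. 'cut_at' is the number of
// bytes written before power fails; a value of kData + kSpare or more means
// the write completes normally.
void write_chunk(std::vector<unsigned char>& page, const unsigned char* data,
                 const unsigned char* spare, std::size_t cut_at) {
    assert(page.size() == kData + kSpare);
    for (std::size_t i = 0; i < kData + kSpare; ++i) {
        if (i == cut_at) throw PowerLoss{i < kData ? 1 : 2, i};
        page[i] = (i < kData) ? data[i] : spare[i - kData];
    }
}

// One test iteration. Returns the zone in which power was lost, or 0 if the
// write completed.
int run_once(std::vector<unsigned char>& page, std::size_t cut_at) {
    unsigned char data[kData];
    unsigned char spare[kSpare];
    std::memset(data, 0xAB, kData);
    std::memset(spare, 0xCD, kSpare);
    try {
        write_chunk(page, data, spare, cut_at);
    } catch (const PowerLoss& loss) {
        // "Reboot" point: a real test would remount the file system here and
        // compare the interrupted file with the original.
        return loss.zone;
    }
    return 0;
}
```

Because control returns to the caller instead of the process ending, run_once can be called in a loop with a randomly chosen cut_at, which is what makes the test automatic.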


4.4 Test Results

The YAFFS NAND file system provides only methods ② and ③ above, and the choice is made when the file is opened. If an existing file is opened with "truncate to 0", the protection method is ②: the new file completely replaces the old one. If the existing file is opened for modification, the protection method is ③, the "new and old" method.

Note: when power is lost while the data area is being written, both protection methods work well and the test passes. When power is lost while the spare area is being written, however, there is a high probability that the file system cannot read the file that was being written, and the file is almost unusable. In an actual power loss, given the ratio of data area to spare area (512:16), the probability that the write is in the spare area at the moment of power loss is 16/(512 + 16) ≈ 3.03%, which is unacceptable. In addition, the file system provides no protection of the old file (method ①). Applications that need it, and there are many, must implement it separately.

4.5 Improvements to YAFFS NAND Power-Loss Protection

4.5.1 Added protection method

The design of the YAFFS NAND file system means that it offers only methods ② and ③ above. To provide method ①, the file system must be extended by adding two functions and two structures:


eonlyold, eonlynew, enewold, and edefault denote the three protection methods and the default method (the behavior of the original file system). The sproinfo structure records not only the protection method but also the file name, which is used when the file is closed.

yaffs_openex() modifies the file-open flags according to the protection method passed in, and explicitly selects the original methods ② and ③. When the protection method is eonlyold, yaffs_openex() opens a separate temporary file and returns its handle to the user; the sproinfo structure that was passed in carries the file name and file handle back for yaffs_closeex() to use when closing the file. When yaffs_closeex() closes a file, it closes it directly if the protection method is ② or ③. If method ① is in use, it first deletes the original file and then renames the new file, which protects the file.

The usage is as follows:
① To use the default method, leave the new parameter at its default value of null. This is directly compatible with existing code.
② To use the extended method, create a sproinfo object and pass its pointer to yaffs_openex(); the same pointer must also be passed to yaffs_closeex(). For example:
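The original declarations and usage example are not included in this copy of the article, so the following is only a sketch of the mechanism on ordinary host files using the C standard library, not the YAFFS extension itself. The enumerator and structure names echo those in the text; the functions sketch_openex and sketch_closeex, their signatures, the tmp_name field, and the ".tmp" suffix are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

enum ProtectMode { eOnlyOld, eOnlyNew, eNewOld, eDefault };

struct SProInfo {
    ProtectMode mode = eDefault;
    std::string name;      // file the caller asked for
    std::string tmp_name;  // temporary file used only for eOnlyOld
};

// Open 'name' according to the protection method. A null 'info' keeps the
// default behavior, so existing callers need no change.
std::FILE* sketch_openex(const std::string& name, SProInfo* info) {
    if (info == nullptr) return std::fopen(name.c_str(), "r+b");
    info->name = name;
    switch (info->mode) {
    case eOnlyOld:  // method 1: write to a temporary file, old file untouched
        info->tmp_name = name + ".tmp";
        return std::fopen(info->tmp_name.c_str(), "wb");
    case eOnlyNew:  // method 2: truncate to 0, new data replaces old
        return std::fopen(name.c_str(), "wb");
    default:        // method 3 and default: modify the existing file in place
        return std::fopen(name.c_str(), "r+b");
    }
}

// Close the file. For eOnlyOld the old file is deleted and the temporary
// file renamed, so the new content becomes visible only after a clean close.
bool sketch_closeex(std::FILE* f, SProInfo* info) {
    assert(f != nullptr);
    if (std::fclose(f) != 0) return false;
    if (info == nullptr || info->mode != eOnlyOld) return true;
    std::remove(info->name.c_str());  // the old file may not exist; that is fine
    return std::rename(info->tmp_name.c_str(), info->name.c_str()) == 0;
}
```

A caller sets info.mode = eOnlyOld, passes &info to sketch_openex(), writes through the returned handle, and passes the same &info to sketch_closeex(). If power is lost before the close, the file under its original name still holds the old content.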

The improved, extended interface was tested. The results for the existing methods are the same as before the improvement, the first protection method (keeping only the old file) is now available, and the program runs well.

4.5.2 Fix for unrecoverable errors when the spare area loses power

A careful study of the source code shows that the unrecoverable failure after power loss in the spare area originates in the self-check of the tag information in the spare area. The yaffs_gettagsfromspare() function reads the tag information from the spare area and calls yaffs_checkeccontags() to check it. However, the original code only corrects the spare data when an ECC check error is found and then returns to the upper-layer function. The upper-layer function merely counts the tag errors and does not handle them. As a result, a power loss in the spare area can lead to unrecoverable errors.

The return type of yaffs_gettagsfromspare() is changed to int so that it can report ECC check errors in the spare area. When yaffs_checkeccontags() reports a tag check error, yaffs_gettagsfromspare() returns this error to its caller. The callers are modified accordingly: when a spare ECC error occurs, they call yaffs_deletechunk() to delete the page, because a spare error caused by power loss cannot be recovered. After this modification, the system runs well.

Conclusion

The YAFFS file system is designed specifically for NAND flash memory and makes low-cost NAND flash chips efficient and robust. However, it still has shortcomings and is not fully suitable for demanding embedded systems. This article tested two important indicators of the YAFFS file system, wear leveling and power-loss recovery, presented the test results, and proposed improvements for the problems found during testing. The tests show that the improved system performs significantly better and can adapt to more application environments.
