How to exploit the I/O performance of NAND Flash

Source: Internet
Author: User

The NAND flash chip is the basic storage unit of an SSD, and advances in NAND flash technology and structure drive the development of the whole flash storage industry. When designing a flash storage system, and especially a NAND flash controller, SSD, or flash card, you need a thorough understanding of how NAND flash is operated, its interface commands, and its timing. A NAND flash chip is physically small and comes in an LGA or TSOP package, but its internal structure is complicated. As storage density has increased, NAND flash has accumulated a set of nested concepts: package, device, die, plane, block, and page. The package is the physical chip we normally see. Each package can contain multiple devices, and each device can be regarded as an independent chip with its own control and data signal lines. A device can be composed of multiple dies. Each die has its own operation registers and status signal, but the dies in a device share the external signal lines. A die is divided into multiple planes; each plane has its own set of data registers, which allows operations on multiple planes to run concurrently under certain conditions. A plane is divided into blocks, and a block is the unit of erasure. Finally, a block is divided into pages, and a page is the basic unit of read and write operations.
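
The nesting described above can be written down as a small data structure. The following is a minimal sketch in Python; the class name `NandGeometry` and every number in the example instance are illustrative assumptions, not values from a datasheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NandGeometry:
    """How many units sit at each level of the hierarchy (hypothetical model)."""
    devices_per_package: int
    dies_per_device: int
    planes_per_die: int
    blocks_per_plane: int
    pages_per_block: int
    page_data_bytes: int

    def pages_per_package(self) -> int:
        # Multiply down the hierarchy: package -> device -> die -> plane -> block -> page.
        return (self.devices_per_package * self.dies_per_device
                * self.planes_per_die * self.blocks_per_plane
                * self.pages_per_block)

    def capacity_bytes(self) -> int:
        return self.pages_per_package() * self.page_data_bytes

    def erase_unit_bytes(self) -> int:
        # A block is the smallest unit that can be erased.
        return self.pages_per_block * self.page_data_bytes


# Hypothetical geometry chosen only to make the arithmetic easy to follow.
example = NandGeometry(devices_per_package=2, dies_per_device=2, planes_per_die=2,
                       blocks_per_plane=1024, pages_per_block=128,
                       page_data_bytes=4096)
print(example.capacity_bytes() // 2**30, "GiB")
print(example.erase_unit_bytes() // 1024, "KiB per erase block")
```

The point of the model is that reads and writes are addressed at page granularity while erases work on whole blocks, which is why the two sizes are computed separately.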

 

Taking Micron flash chips as an example, the MT29F32G chip consists of one die containing two planes. The MT29F64G chip consists of two dies that belong to two separate devices, and each die contains two planes. The basic die structure of the MT29F32G and MT29F64G chips is as follows:

 

[Figure: 1.jpg, http://s3.51cto.com/wyfs02/M01/47/61/wKioL1P6A0WyY8OBAAE8CH7yLs8210.jpg]

 

The address layout of the MT29F32G and MT29F64G chips is defined as follows:

 

[Figure: 2.jpg, http://s3.51cto.com/wyfs02/M00/47/60/wKiom1P6AkXBkR_5AADYgD73rWU029.jpg]

 

The Micron MT29F128G chip has a higher storage density: the whole chip consists of two devices, each device contains two dies, and each die contains two planes. The structure of a device, the basic storage unit inside the MT29F128G chip, is as follows:

 

[Figure: 3.jpg, http://s3.51cto.com/wyfs02/M02/47/61/wKioL1P6A3LSn14mAAGooE66jWI246.jpg]

 

The MT29F128G address layout is defined in the following table:

 

[Figure: 4.jpg, http://s3.51cto.com/wyfs02/M01/47/60/wKiom1P6AnDwDJg_AADVgivbI3Y433.jpg]
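
Since the address tables survive only as images, the sketch below shows just the general idea of packing a die, block, and page number into a row address that is sent over several 8-bit address cycles. The field widths, the use of the lowest block bit as plane select, and all function names are assumptions made for illustration; the real layout has to be taken from the datasheet.

```python
# Hypothetical field widths; the real ones come from the datasheet address table.
PAGE_BITS = 7     # assumed: 128 pages per block
BLOCK_BITS = 12   # assumed: 4096 blocks per die
DIE_BITS = 1      # two dies per device


def pack_row_address(die: int, block: int, page: int) -> int:
    """Pack die, block, and page into one row address (page in the low bits)."""
    if not 0 <= page < (1 << PAGE_BITS):
        raise ValueError("page out of range")
    if not 0 <= block < (1 << BLOCK_BITS):
        raise ValueError("block out of range")
    if not 0 <= die < (1 << DIE_BITS):
        raise ValueError("die out of range")
    return (die << (BLOCK_BITS + PAGE_BITS)) | (block << PAGE_BITS) | page


def unpack_row_address(row: int):
    """Inverse of pack_row_address: returns (die, block, page)."""
    page = row & ((1 << PAGE_BITS) - 1)
    block = (row >> PAGE_BITS) & ((1 << BLOCK_BITS) - 1)
    die = row >> (BLOCK_BITS + PAGE_BITS)
    return die, block, page


def plane_of(block: int) -> int:
    # Assumption: the lowest block-address bit selects the plane.
    return block & 1


def row_address_cycles(row: int, cycles: int = 3):
    """Split the row address into bytes, least significant byte first."""
    return [(row >> (8 * i)) & 0xFF for i in range(cycles)]


row = pack_row_address(die=1, block=5, page=3)
print(hex(row), unpack_row_address(row), plane_of(5), row_address_cycles(row))
```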

 

After learning about the internal structure of NAND Flash, we need to think about how to make good use of the internal structure of NAND flash at the software level to improve the overall I/O performance.

 

Each plane has its own data registers, so can planes be operated concurrently to improve I/O performance? Taking the MT29F128G chip as an example, each die is divided into two physical planes. Each plane contains a 4314-byte data register, a 4314-byte cache register, and a block array consisting of 4K pages. Because the data registers of the two planes are physically independent, the two planes can execute program, read, and erase operations at the same time, which improves the I/O performance of the NAND flash system. The timing diagram for reading data from two planes simultaneously is as follows:

 

[Figure: 5.jpg, http://s3.51cto.com/wyfs02/M00/47/61/wKioL1P6A5-QuJrdAAEie1b7Suw561.jpg]

 

The timing diagram shows that concurrent operations on two planes are not arbitrary. To read from both planes, first load the address of the first plane, then load the address of the second plane. After both addresses are loaded, issue the closing command 30h. The entire die then enters the busy state and the R/B# signal goes low; while the die is busy, no other operation can be performed on it. During this stage, data is loaded from the NAND flash array into the data registers of the two planes. After R/B# returns high, data can be read out of the two planes. Note that reading the data of the second plane requires the 06h-E0h command. Because the two planes have independent data registers but share the operation registers, they cannot perform arbitrary, unrelated operations concurrently.
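
The read steps above can be written as an ordered list of bus cycles. This is a minimal sketch rather than a driver: the 30h and 06h-E0h opcodes come from the text, while the 00h opener before each address is the conventional NAND read opcode and is an assumption here, as are the function and tuple names.

```python
from typing import List, Tuple

# A bus cycle is (kind, value); kind is "cmd", "addr", "wait_ready", or "data_out".
Cycle = Tuple[str, object]

READ_OPEN = 0x00             # assumed: conventional read opener, not named in the text
READ_CONFIRM = 0x30          # closing command from the text
PLANE_SELECT_OPEN = 0x06     # 06h-E0h pair from the text
PLANE_SELECT_CONFIRM = 0xE0


def two_plane_read(addr_plane0: int, addr_plane1: int, nbytes: int) -> List[Cycle]:
    """Return the bus cycles for one two-plane read, in issue order."""
    return [
        # Load both plane addresses before anything is triggered.
        ("cmd", READ_OPEN), ("addr", addr_plane0),
        ("cmd", READ_OPEN), ("addr", addr_plane1),
        # 30h starts the array read on both planes; the die goes busy.
        ("cmd", READ_CONFIRM),
        ("wait_ready", None),            # poll R/B# until it returns high
        ("data_out", nbytes),            # first plane's data register
        # Select the second plane's register before reading it out.
        ("cmd", PLANE_SELECT_OPEN), ("addr", addr_plane1),
        ("cmd", PLANE_SELECT_CONFIRM),
        ("data_out", nbytes),
    ]


sequence = two_plane_read(0x000100, 0x000180, 4314)
for kind, value in sequence:
    print(kind, value)
```

Only one busy period appears in the sequence: both array reads happen inside it, which is where the gain over two separate reads comes from.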

 

The following figure shows the timing of a two-plane concurrent write operation:

 

[Figure: 6.jpg, http://s3.51cto.com/wyfs02/M02/47/60/wKiom1P6Ap_ytIBnAABVRZ4he5U657.jpg]

 

As with concurrent reads, concurrent writes to two planes are not arbitrary: both planes must perform the same operation at the same time, and the commands for both planes must be issued together. For a write, first load the access addresses of the two planes. The 11h command that closes the first address phase does not trigger actual programming; the 10h command that closes the second address phase does. Once programming starts, the R/B# status signal goes low and stays low until programming completes.
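
The write steps can be sketched the same way. The 11h and 10h opcodes come from the text; the 80h opener is the conventional page-program opcode and is an assumption here, as are the data-input cycles and all names. Any extra timing a real part requires between the two phases is not modeled.

```python
from typing import List, Tuple

# A bus cycle is (kind, value); kind is "cmd", "addr", "data_in", or "wait_ready".
Cycle = Tuple[str, object]

PROGRAM_OPEN = 0x80      # assumed: conventional page-program opener, not named in the text
PROGRAM_HOLD = 0x11      # closes the first phase without starting programming
PROGRAM_CONFIRM = 0x10   # closes the second phase and starts programming


def two_plane_program(addr0: int, data0: bytes,
                      addr1: int, data1: bytes) -> List[Cycle]:
    """Return the bus cycles for one two-plane program, in issue order."""
    return [
        # First plane: address and data are loaded, 11h only queues them.
        ("cmd", PROGRAM_OPEN), ("addr", addr0), ("data_in", data0),
        ("cmd", PROGRAM_HOLD),
        # Second plane: 10h triggers programming on both planes at once.
        ("cmd", PROGRAM_OPEN), ("addr", addr1), ("data_in", data1),
        ("cmd", PROGRAM_CONFIRM),
        ("wait_ready", None),   # R/B# stays low until programming completes
    ]


program_sequence = two_plane_program(0x000100, b"\xAA" * 16, 0x000180, b"\x55" * 16)
for kind, value in program_sequence:
    print(kind, value)
```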

 

The following figure shows the timing of a two-plane concurrent erase operation:

 

[Figure: 7.jpg, http://s3.51cto.com/wyfs02/M02/47/60/wKiom1P6ArjwRHRGAACcUz3IgDA718.jpg]

 

As with reads and writes, a two-plane erase first loads the address information for both planes, and the erase is then carried out on both planes concurrently in the background. Compared with issuing the operations serially, this concurrency improves the overall performance of the NAND flash.

 

To summarize: the data registers of the two planes are completely independent but the operation registers are shared, so read, write, and erase operations can be executed concurrently on the two planes only if both planes perform the same operation at the same time. The two planes cannot perform different operations independently and concurrently. This is the limitation of two-plane operations. Even so, if the software layer uses an algorithm that keeps multiple planes working concurrently, I/O performance can improve considerably.
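
One simple way for the software layer to respect this constraint is to batch pending operations so that each batch holds either one operation or two operations of the same kind on different planes. The greedy pairing below is only an illustrative sketch, not an algorithm from the original article; the names `Op` and `pair_for_two_plane` are hypothetical, and the sketch reorders requests freely, which a real flash translation layer could do only where data dependencies allow.

```python
from typing import List, NamedTuple


class Op(NamedTuple):
    kind: str    # "read", "program", or "erase"
    plane: int   # 0 or 1


def pair_for_two_plane(pending: List[Op]) -> List[List[Op]]:
    """Greedily pair each op with the earliest later op of the same kind on the other plane."""
    remaining = list(pending)
    batches: List[List[Op]] = []
    while remaining:
        first = remaining.pop(0)
        partner_index = next(
            (i for i, op in enumerate(remaining)
             if op.kind == first.kind and op.plane != first.plane),
            None,
        )
        if partner_index is None:
            batches.append([first])                      # must run as a single-plane op
        else:
            batches.append([first, remaining.pop(partner_index)])
    return batches


queue = [Op("read", 0), Op("program", 0), Op("read", 1),
         Op("erase", 1), Op("program", 1), Op("read", 0)]
batches = pair_for_two_plane(queue)
for batch in batches:
    print(batch)
```

In this example six queued operations collapse into four array operations: the two mixed-plane reads and the two programs are paired, while the erase and the last read have no partner and run alone.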

 

Within a NAND flash chip, the truly independent concurrent unit is the die. Taking the MT29F128G as an example, a device contains two dies. Inside the chip, each die has its own operation registers and status signal line, while the external control and status signal lines are shared. For this arrangement the chip provides an interleave operation mode, in which the two dies can perform read, write, and erase operations fully concurrently. The following figure shows the timing of a concurrent read on two dies:

 

[Figure: 8.jpg, http://s3.51cto.com/wyfs02/M00/47/60/wKiom1P6AtXCQBcWAAByejRVDu4122.jpg]

 

The figure shows that each die has its own internal R/B# signal, and the external status signal is the logical AND of the internal status signals. The two dies operate independently and concurrently, but because they share the external interface, their data output must be serialized.
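
Both points can be captured in a toy model: the external ready signal as an AND over the dies, and a timeline in which array reads overlap while data output queues up on the shared bus. The function names and the timing values are hypothetical, and command and address time is ignored.

```python
from typing import Iterable


def external_ready(internal_ready: Iterable[bool]) -> bool:
    """External R/B# is high only when every die's internal R/B# is high."""
    return all(internal_ready)


def serial_read_time(n_dies: int, t_array_us: int, t_transfer_us: int) -> int:
    """Each die is read and drained before the next read is issued."""
    return n_dies * (t_array_us + t_transfer_us)


def interleaved_read_time(n_dies: int, t_array_us: int, t_transfer_us: int) -> int:
    """Array reads start together; data output is serialized on the shared bus."""
    bus_free_at = 0
    for _ in range(n_dies):
        data_ready_at = t_array_us                 # all dies started at time 0
        start = max(bus_free_at, data_ready_at)    # wait for both data and bus
        bus_free_at = start + t_transfer_us
    return bus_free_at


# Illustrative timings only: 50 us array read, 100 us to transfer one page.
print(external_ready([True, False]))
print(serial_read_time(2, 50, 100), interleaved_read_time(2, 50, 100))
```

With these assumed numbers the interleaved case saves one array-read time, because the second die's array read is hidden behind the first; the transfer time itself cannot be hidden on a shared interface.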

 

The following figure shows the timing of an interleaved concurrent write:

 

[Figure: 9.jpg, http://s3.51cto.com/wyfs02/M00/47/61/wKioL1P6BAWT1ZNeAAB5qnnX0O4853.jpg]

 

Similar to concurrent read operations, the two die can perform write operations independently and concurrently.

 

At the device level, concurrent operation is even less constrained. Different devices have completely independent external interfaces, so two devices can perform independent operations at the same time.

 

In summary, a NAND flash chip contains three kinds of concurrent execution unit: device, die, and plane. Planes have independent data registers, so multiple planes can execute the same operation concurrently. Dies have independent operation registers and independent internal status signal lines but share the external interface, so multiple dies can operate independently and concurrently, with data transfer serialized on the shared interface. Devices have independent control and data signal lines, so multiple devices can operate fully concurrently. Exploiting these concurrent units can greatly improve the I/O performance of flash storage.
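
One way software can exploit all three levels is to stripe consecutive logical pages across them. The mapping below is a hedged sketch, not a method from the original article: the function name `stripe` is hypothetical, and the default counts follow the MT29F128G layout described earlier (two devices, two dies per device, two planes per die). Planes vary fastest so that neighbouring logical pages land on both planes of one die and can be issued as a single two-plane operation.

```python
def stripe(logical_page: int, devices: int = 2, dies: int = 2, planes: int = 2):
    """Map a logical page number to (device, die, plane, row) by striping."""
    plane = logical_page % planes
    die = (logical_page // planes) % dies
    device = (logical_page // (planes * dies)) % devices
    row = logical_page // (planes * dies * devices)   # page index within each unit
    return device, die, plane, row


for page in range(9):
    print(page, stripe(page))
```

With the default counts, eight consecutive logical pages touch every plane of every die of both devices before any unit is used a second time.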

 
