DockOne WeChat Share (67): Flash Testing, Optimization, and Application in Internet Scenarios

[Editor's note] Flash storage has developed very quickly over the past few years and is used more and more widely. This share covers flash-related testing, optimization, and applications, with the aim of helping you use flash memory better.

Flash Application Scenarios

    • Database
    • NoSQL
    • Distributed storage
    • CDN
    • Public cloud storage

Taken together, these scenarios show that flash is best suited to workloads with high random IO and bandwidth requirements. Choosing the right scenario is what lets flash play to its strengths. Of the businesses listed above, public cloud storage is likely to grow fastest over the next few years. In one vendor's cloud disk comparison, the price of flash is already very close to that of mechanical hard disks, and judged purely by cost per IOPS, flash is the more cost-effective option.
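
The cost-per-IOPS comparison comes down to simple arithmetic. The sketch below uses made-up prices and IOPS figures (they are not taken from the vendor comparison mentioned above) purely to show the calculation; `cost_per_iops` and the disk names are hypothetical.

```python
def cost_per_iops(monthly_price, iops):
    """Return the monthly price paid for each unit of IOPS."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return monthly_price / iops


# Hypothetical cloud disk offerings; prices and IOPS are illustrative only.
disks = {
    "hdd_disk": {"monthly_price": 30.0, "iops": 300},
    "ssd_disk": {"monthly_price": 40.0, "iops": 20000},
}

for name, spec in disks.items():
    print(f"{name}: {cost_per_iops(**spec):.4f} per IOPS per month")
```

Even when the list prices are close, dividing by the IOPS delivered is what makes the gap visible.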

Flash Overview

Flash here mostly means solid-state drives (SSDs), although the term can be understood more broadly. Large-scale adoption in the Internet industry began around 2010. Performance and stability have since been verified in large production clusters, and the range of application scenarios is very wide. The IOPS of flash is, of course, several orders of magnitude higher than that of traditional mechanical drives, but the more fundamental advantage is lower latency.

It is worth noting how much flash improves access latency.

Any discussion of flash has to mention NAND, the most basic component of flash memory. There are now many types of NAND.


Why do we need to test?
    • Understand the products
    • Understand your own workload
    • Optimize your own systems
    • Optimize the cost model

Faced with so many vendors and products, testing efficiently becomes a very important problem. Although everyone is now moving to cloud services and has less direct contact with hardware, testing cloud vendors' offerings remains an important task.

Is testing low-end work?
    • Is testing simple?
    • Does it have no technical content?
    • Is testing boring?

(Figure: the storage technology stack we need to know.)

Test criteria:
    • Clear goals
    • Efficient
    • Complete
    • Quantifiable
    • Comparable
    • Produces output

Test process:
    • Clarify test requirements
    • Clarify test objectives
    • Select test tools and test models
    • Develop a test plan
    • Track the test process
    • Validate test data
    • Write the test report

Test tools:
    • IO level: fio, sysbench, Iometer, dd, etc.
    • OLTP: sysbench, TPC-C
    • Auxiliary tools: tcpcopy, tcprstat, pt-log-player
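
As an illustration of an IO-level test, here is a minimal sketch that assembles an fio command line for a time-based, direct-I/O random workload. The helper name `build_fio_command`, the job name `flash_test`, the placeholder path, and all default values are assumptions for this example, not settings from the talk. Note that pointing `--filename` at a raw block device overwrites its contents during write tests.

```python
import shlex


def build_fio_command(filename, rw="randread", block_size="4k",
                      iodepth=32, numjobs=4, runtime_s=600):
    """Build an fio command line for a time-based direct-I/O test."""
    return [
        "fio",
        "--name=flash_test",          # job name (arbitrary label)
        f"--filename={filename}",     # file or device under test
        "--ioengine=libaio",          # Linux asynchronous I/O engine
        "--direct=1",                 # bypass the page cache
        f"--rw={rw}",                 # e.g. randread, randwrite, randrw
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        f"--numjobs={numjobs}",
        f"--runtime={runtime_s}",
        "--time_based",               # run for the full runtime
        "--group_reporting",          # aggregate results across jobs
        "--output-format=json",       # machine-readable results
    ]


cmd = build_fio_command("/path/to/testfile", rw="randwrite")
print(shlex.join(cmd))
```

Building the command as a list keeps it easy to pass to `subprocess.run` later and easy to vary one parameter at a time across runs.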

sysbench:
    • An open-source, multithreaded performance testing tool
    • Supports CPU, IO, mutex, OLTP, and other tests
    • Test cases can be customized with Lua scripts
    • Three commonly used scenarios: insert, select, and OLTP
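
The three common scenarios can be sketched as sysbench command lines. This example assumes sysbench 1.0 or later, where the bundled Lua scripts are named `oltp_insert`, `oltp_point_select`, and `oltp_read_write`; older releases use different option names. The helper `build_sysbench_command`, the scenario keys, the connection values, and the defaults are all hypothetical, and the password option is deliberately left out.

```python
# Map the three common scenarios to scripts bundled with sysbench 1.0+.
SCENARIOS = {
    "insert": "oltp_insert",
    "select": "oltp_point_select",
    "oltp": "oltp_read_write",
}


def build_sysbench_command(phase, scenario="oltp", tables=16,
                           table_size=1_000_000, threads=32, time_s=600,
                           host="127.0.0.1", user="sbtest", db="sbtest"):
    """Build a sysbench command for one phase of a MySQL test."""
    if phase not in ("prepare", "run", "cleanup"):
        raise ValueError(f"unknown phase: {phase}")
    return [
        "sysbench",
        SCENARIOS[scenario],
        f"--mysql-host={host}",
        f"--mysql-user={user}",
        f"--mysql-db={db}",
        f"--tables={tables}",          # number of test tables
        f"--table-size={table_size}",  # rows per table
        f"--threads={threads}",
        f"--time={time_s}",            # run duration in seconds
        "--report-interval=10",        # print interim stats every 10 s
        phase,
    ]


for phase in ("prepare", "run", "cleanup"):
    print(" ".join(build_sysbench_command(phase, scenario="insert")))
```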

Test pain points:
    • A lot of repetitive work
    • Standards are not uniform
    • Very long test cycles
    • High labor costs
    • Exception handling during testing
    • Test data processing and test reports

The first step in solving these pain points is standardization, mainly in the following areas:
    • Standardized test objectives
    • Standardized testing tools
    • Standardized testing processes
    • Standardized test reports

Automated test flow:
    • An automated test framework
    • Based on Python
    • Contains the overall standard test process
    • Covers the mainstream test tools
    • Processes test data and generates reports
    • Supports custom test plans
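
The framework described in the talk is not published here, so the following is only a minimal sketch of how such a flow could be structured in Python: a plan holds test cases, a runner executes each case and parses its output, and a report is rendered from the collected metrics. All names (`BenchCase`, `BenchPlan`, `run_plan`, `render_report`, `fake_execute`) are hypothetical, and the executor is a stand-in so the sketch runs without any benchmark tool installed.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class BenchCase:
    name: str
    command: List[str]


@dataclass
class BenchPlan:
    title: str
    cases: List[BenchCase] = field(default_factory=list)


def run_plan(plan, execute, parse):
    """Run every case in the plan and collect parsed metrics by case name."""
    results = {}
    for case in plan.cases:
        raw_output = execute(case.command)
        results[case.name] = parse(raw_output)
    return results


def render_report(plan, results):
    """Render a plain-text report, one line per test case."""
    lines = [f"Test report: {plan.title}"]
    for name, metrics in results.items():
        cells = ", ".join(f"{k}={v}" for k, v in sorted(metrics.items()))
        lines.append(f"  {name}: {cells}")
    return "\n".join(lines)


def fake_execute(command):
    """Stand-in executor: returns fabricated JSON instead of running a tool."""
    return json.dumps({"iops": 1000 * len(command)})


plan = BenchPlan("demo", [
    BenchCase("randread", ["fio", "--rw=randread"]),
    BenchCase("randwrite", ["fio", "--rw=randwrite", "--direct=1"]),
])
results = run_plan(plan, fake_execute, json.loads)
print(render_report(plan, results))
```

In a real setup, `execute` could wrap `subprocess.run` and `parse` could extract the relevant fields from the tool's JSON output; keeping both injectable is what lets one framework cover several tools.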

(Figure: the test flowchart.)

The benefits of automation are also obvious:
    • Significant savings in manpower
    • Improved test efficiency
    • More complete tests
    • Energy freed up for more in-depth testing and optimization

A few things to keep in mind when testing flash:
    • The performance we care about is steady-state performance
    • Over-provisioning (OP)
    • NAND
    • Fill the drive completely
    • The test time should not be too short
    • Performance jitter
    • Monitoring
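
One way to decide whether a drive has reached steady state is to look at the last few measurement rounds and require that the values neither spread too widely nor drift in one direction. The sketch below is loosely modeled on the steady-state criterion in SNIA's Solid State Storage Performance Test Specification; the function name, the window of 5 rounds, and the 20% and 10% thresholds are assumptions for this example, not values given in the talk.

```python
from statistics import mean


def is_steady_state(samples, window=5, max_range=0.20, max_drift=0.10):
    """Check whether the last `window` samples look like steady state.

    Two conditions must hold, both relative to the window average:
      1. max - min is at most `max_range` of the average
      2. the least-squares trend line changes by at most `max_drift`
         of the average across the window
    """
    if len(samples) < window:
        return False
    y = samples[-window:]
    avg = mean(y)
    if avg <= 0:
        return False
    if max(y) - min(y) > max_range * avg:
        return False
    x = list(range(window))
    x_avg = mean(x)
    slope = (sum((xi - x_avg) * (yi - avg) for xi, yi in zip(x, y))
             / sum((xi - x_avg) ** 2 for xi in x))
    drift = abs(slope) * (window - 1)
    return drift <= max_drift * avg


# Hypothetical per-round IOPS: a fresh drive slowing down, then settling.
fresh_drive = [90000, 70000, 55000, 45000, 40000]
settled_drive = [40000, 40500, 39800, 40200, 40100]
print(is_steady_state(fresh_drive), is_steady_state(settled_drive))
```

A check like this is also a reason not to keep tests too short: with too few rounds there is no window to evaluate.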

Some points to watch in MySQL testing:
    • Test data set size (at least over a billion)
    • Ratio of data set size to memory buffer, to see how performance holds up with a small cache
    • Physical reads
    • Transaction complexity
    • Multi-table concurrency

Some points of attention at the system level:
    • File system: ext4, XFS
    • IO scheduling algorithm
    • IO CPU affinity
    • SCSI-MQ/blk-mq (new kernel features)
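
Before a test run it helps to record which IO scheduler is actually in effect. On Linux the file `/sys/block/<device>/queue/scheduler` lists the available schedulers with the active one in square brackets. The sketch below parses that format; the function names are hypothetical, and the device name passed to `read_active_scheduler` (for example `sda`) depends on the machine.

```python
from pathlib import Path


def parse_active_scheduler(text):
    """Return the scheduler shown in [brackets], or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_active_scheduler(device):
    """Read the active IO scheduler for a block device such as 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


# Sample contents from a legacy single-queue kernel and a blk-mq kernel.
print(parse_active_scheduler("noop deadline [cfq]"))
print(parse_active_scheduler("[mq-deadline] kyber bfq none"))
```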

Combining Testing with Optimization

InnoDB compression test:
    • InnoDB built-in compression
    • Based on the zlib library
    • In theory reaches a compression ratio of about 50%
    • But there is a performance loss
    • Trades CPU time for storage space
    • Benefits SSD lifespan
    • How to use it well?

From the testing process described earlier, we concluded that InnoDB compression achieves a compression ratio of about 50%, but the write performance loss is large, at about 70%. Based on this conclusion, we can decide whether InnoDB compression is appropriate for a given online business.
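
That decision can be framed as a quick calculation. The sketch below applies the two figures above (about 50% compression, about 70% write loss) to a hypothetical workload; the function names, the data size, and the throughput numbers are made up for illustration.

```python
def compression_tradeoff(data_size_gb, baseline_write_tps,
                         compression_ratio=0.50, write_loss=0.70):
    """Estimate storage and write throughput after enabling compression."""
    return {
        "size_after_gb": data_size_gb * (1 - compression_ratio),
        "write_tps_after": baseline_write_tps * (1 - write_loss),
    }


def compression_fits(required_write_tps, estimate):
    """True if the estimated write throughput still covers the requirement."""
    return estimate["write_tps_after"] >= required_write_tps


# Hypothetical workload: 1000 GB of data, 10000 write TPS without compression.
estimate = compression_tradeoff(1000, 10000)
print(estimate)
print(compression_fits(2000, estimate), compression_fits(5000, estimate))
```

A read-heavy business with modest write requirements may accept the trade, while a write-heavy one probably cannot.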

TokuDB:
    • A storage engine for MySQL that supports ACID transactions
    • Supports multi-version concurrency control (MVCC)
    • Based on Fractal Tree indexes, well suited to write-intensive scenarios
    • High compression ratio
    • Native support for online DDL
    • Supported by the mainstream MySQL branches
    • Free and open source

Our test results show that TokuDB offers a better compression ratio and more stable write performance. The cost, of course, is higher CPU consumption.


    • This is no longer an era in which performance is king
    • It is more important to really understand your needs
    • Tap flash performance through hardware and software integration
    • Expand the application space for flash
    • Do things that are truly valuable
    • How to combine hardware and software better (in fact, hardware is now somewhat ahead of software)

To close: don't just live in the present. Be brave in accepting new technology and have the courage to try and fail, provided the costs and benefits of trial and error are assessed and kept under control. In fact, once you really understand a technology, it often turns out not to be as "evil" as people say.

The above content is organized from the group share on the evening of July 5, 2016. The speaker, Yang Shanggang, is a DBA at Panda TV and a former senior database engineer at Sina, where he was responsible for optimizing the core database architecture of Sina Weibo and for database-related server and storage selection and design. DockOne organizes a technology share every week. Interested readers are welcome to add: Liyingjiesz to join the group, and to leave us a message about topics you would like to hear or share.