MySQL Performance Optimization · Discussion on the flush strategy of InnoDB buffer pool

Source: Internet
Author: User
Tags: mutex, percona, percona server




Background



We know that InnoDB uses the buffer pool to cache data pages read from disk in memory. The buffer pool usually consists of a number of memory blocks plus a set of control structures. The number of memory blocks depends on the number of buffer pool instances; starting with the 5.7 release, memory is allocated by default in 128MB (configurable) chunks, a unit introduced to support online resizing of the buffer pool.



Each chunk of buffer pool memory is allocated with mmap, so you will notice that virtual memory usage is very high right after the instance starts while resident (physical) memory stays low. These large chunks are then divided into 16KB frames to store data pages.



Although in most cases the buffer pool stores 16KB data pages, there is one exception: with compressed tables, both the compressed page and its uncompressed copy must be kept in memory, and a binary buddy allocator is used to allocate space for the compressed pages. For example, to read an 8KB compressed page we take a 16KB block from the free list, use 8KB of it, and put the remaining 8KB back on the free list; if a 4KB compressed page is read in right afterwards, 4KB can be split off from that 8KB block, and the remaining 4KB goes back on the free list.
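
Below is a minimal sketch of that binary-buddy splitting, assuming 16KB parent blocks and power-of-two sub-blocks. It is not the actual InnoDB allocator; block bookkeeping is reduced to lists of block ids.

    #include <cstddef>
    #include <iostream>
    #include <list>
    #include <map>

    class BuddyPool {
    public:
        // One free list per block size (1KB .. 16KB); blocks are just ids here.
        explicit BuddyPool(std::size_t n16k) {
            for (std::size_t i = 0; i < n16k; ++i) free_[16384].push_back(next_id_++);
        }

        // Allocate a block of `size` bytes, splitting larger blocks as needed.
        long alloc(std::size_t size) {
            std::size_t sz = 1024;
            while (sz < size) sz *= 2;                            // round up to a power of two
            std::size_t cur = sz;
            while (cur <= 16384 && free_[cur].empty()) cur *= 2;  // find a big enough free block
            if (cur > 16384) return -1;                           // nothing free
            long id = free_[cur].front();
            free_[cur].pop_front();
            while (cur > sz) {                                    // split down to the target size,
                cur /= 2;                                         // leaving one buddy free each time
                free_[cur].push_back(next_id_++);
            }
            return id;
        }

    private:
        std::map<std::size_t, std::list<long>> free_;             // size -> free block ids
        long next_id_ = 0;
    };

    int main() {
        BuddyPool pool(1);                  // start with one 16KB block
        long a = pool.alloc(8 * 1024);      // splits 16KB -> two 8KB, takes one
        long b = pool.alloc(4 * 1024);      // splits the free 8KB -> two 4KB, takes one
        std::cout << "8KB block id: " << a << ", 4KB block id: " << b << "\n";
    }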



To manage the buffer pool, each buffer pool instance maintains several linked lists (a simplified sketch of these structures follows the list below):


    • The LRU list contains all data pages that have been read into memory;
    • The flush list contains dirty pages that have been modified but not yet written back;
    • The unzip_LRU list contains the uncompressed copies of compressed pages;
    • The free list holds the blocks that are currently idle.
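
As a rough illustration, the per-instance bookkeeping described above might look like the sketch below; the field names are illustrative and do not match the real buf_pool_t definition.

    #include <cstdint>
    #include <list>
    #include <unordered_map>

    struct Page {
        uint32_t space_id   = 0;    // tablespace id
        uint32_t page_no    = 0;    // page number within the tablespace
        uint64_t newest_lsn = 0;    // LSN of the most recent modification (0 = clean)
    };

    struct BufferPoolInstance {
        std::list<Page*> lru_list;     // all data pages read into memory
        std::list<Page*> flush_list;   // dirty pages, ordered by oldest modification
        std::list<Page*> unzip_lru;    // uncompressed copies of compressed pages
        std::list<Page*> free_list;    // currently idle blocks
        // page hash: (space_id, page_no) packed into one key -> page
        std::unordered_map<uint64_t, Page*> page_hash;
    };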



In addition, to avoid scanning the LRU list when looking up a data page, each buffer pool instance maintains a page hash through which a page can be found directly by its space ID and page number.



In general, when we need to read a page, we first locate the buffer pool instance it belongs to based on the space ID and page number, and then look it up in that instance's page hash. If it is not in the page hash, the page has to be read from disk. Before reading from disk, we need to allocate a free block to hold the incoming page: if there is an idle block on the free list it is taken directly; otherwise a page has to be evicted from the unzip_LRU or LRU list.
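
A rough sketch of this lookup path, reusing the Page and BufferPoolInstance types from the sketch above; the helpers evict_one_page and read_from_disk are hypothetical placeholders (eviction is sketched after the rules below).

    #include <cstdint>

    Page* evict_one_page(BufferPoolInstance& bp);                      // hypothetical, sketched later
    void  read_from_disk(uint32_t space_id, uint32_t page_no, Page* block);  // hypothetical

    uint64_t page_key(uint32_t space_id, uint32_t page_no) {
        return (static_cast<uint64_t>(space_id) << 32) | page_no;      // page hash key
    }

    Page* get_page(BufferPoolInstance& bp, uint32_t space_id, uint32_t page_no) {
        // 1. Check the page hash first; a hit means no LRU scan and no disk read.
        auto it = bp.page_hash.find(page_key(space_id, page_no));
        if (it != bp.page_hash.end()) return it->second;

        // 2. Miss: a free block is needed before the page can be read from disk.
        Page* block = nullptr;
        if (!bp.free_list.empty()) {
            block = bp.free_list.front();            // take an idle block directly
            bp.free_list.pop_front();
        } else {
            block = evict_one_page(bp);              // evict from unzip_LRU / LRU
        }

        // 3. Read the page into the block and register it in the lists and the hash.
        read_from_disk(space_id, page_no, block);
        block->space_id = space_id;
        block->page_no  = page_no;
        bp.lru_list.push_front(block);
        bp.page_hash[page_key(space_id, page_no)] = block;
        return block;
    }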



There are certain principles to follow (see the function buf_LRU_scan_and_free_block, as of 5.7.5); a simplified sketch follows the list:


    1. First, try to evict an uncompressed copy of a page from the unzip_LRU list;
    2. If that fails, try to evict a page from the LRU list;
    3. If a free block still cannot be obtained from the LRU, the user thread takes part in flushing: it performs a single-page flush, removes that dirty page from the LRU, and then tries again.
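
Continuing the sketch, the eviction order above could be expressed as follows (modeled loosely on buf_LRU_scan_and_free_block; the helper functions are again hypothetical).

    #include <list>

    Page* take_replaceable(std::list<Page*>& lst);      // hypothetical: pop a clean, replaceable page
    Page* pick_dirty_page(std::list<Page*>& lst);       // hypothetical: choose a dirty page near the tail
    void  single_page_flush(Page* page);                // hypothetical: write + sync that single page
    void  remove_from_lru(BufferPoolInstance& bp, Page* page);

    Page* evict_one_page(BufferPoolInstance& bp) {
        // 1. Prefer freeing an uncompressed frame from the unzip_LRU list: the
        //    compressed copy stays in memory, so no disk read is lost.
        if (Page* p = take_replaceable(bp.unzip_lru)) return p;

        // 2. Otherwise try to evict a replaceable page from the LRU list.
        if (Page* p = take_replaceable(bp.lru_list)) return p;

        // 3. Still nothing: the user thread does a single-page flush, separates
        //    that dirty page from the LRU, and uses the freed block.
        Page* victim = pick_dirty_page(bp.lru_list);
        single_page_flush(victim);
        remove_from_lru(bp, victim);
        return victim;
    }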


A page modified in the buffer pool is not written to disk immediately; that is done later by a background thread. As in most database systems, writing out a dirty page follows the write-ahead-logging (WAL) principle: each block records the LSN of its most recent modification, and before a data page can be written we must ensure that the redo log has been written to the log file at least up to that LSN.
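
A minimal sketch of that WAL constraint, assuming a global counter for the redo already written to disk; log_flush_up_to and write_page_to_disk are hypothetical placeholders.

    #include <cstdint>

    extern uint64_t flushed_to_disk_lsn;        // redo durably written so far

    void log_flush_up_to(uint64_t lsn);         // hypothetical: write + sync redo up to lsn
    void write_page_to_disk(Page* page);        // hypothetical: write the 16KB page

    void flush_dirty_page(Page* page) {
        // WAL: the redo covering this page's latest change must be durable first.
        if (flushed_to_disk_lsn < page->newest_lsn) {
            log_flush_up_to(page->newest_lsn);
        }
        write_page_to_disk(page);
        page->newest_lsn = 0;                   // the page is clean again
    }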



However, a flushing strategy built on the WAL principle can cause a problem: when the write load on the database is very high, redo log is generated very quickly and may rapidly reach the synchronous checkpoint threshold, at which point dirty pages must be flushed to advance the checkpoint LSN. Because this behavior is triggered by user threads noticing that redo log space is running out, a large number of user threads can fall into this inefficient path, producing a noticeable performance inflection point.
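
The trigger can be pictured as a checkpoint-age check like the one below; the 7/8 threshold is purely illustrative and not the exact formula InnoDB uses.

    #include <cstdint>

    struct LogState {
        uint64_t current_lsn;          // LSN of the latest redo record generated
        uint64_t last_checkpoint_lsn;  // LSN of the last completed checkpoint
        uint64_t log_capacity;         // usable redo log space
    };

    bool need_sync_flush(const LogState& log) {
        uint64_t checkpoint_age = log.current_lsn - log.last_checkpoint_lsn;
        // If the age gets close to the usable log space, user threads must wait
        // while dirty pages up to some lsn_limit are flushed to advance the
        // checkpoint. The 7/8 factor here is illustrative only.
        return checkpoint_age > log.log_capacity / 8 * 7;
    }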




Page Cleaner Thread



In MySQL 5.6, a separate page cleaner thread was introduced to flush the LRU list and the flush list. It runs once per second by default, and the 5.6 release provides a whole set of parameters to control its flush behavior, including:


innodb_adaptive_flushing_lwm
innodb_max_dirty_pages_pct_lwm
innodb_flushing_avg_loops
innodb_io_capacity_max
innodb_lru_scan_depth


We will not go through them all here. In general, if you find the redo log advancing very quickly and want to keep user threads from being dragged into flushing, you can raise innodb_io_capacity_max, which caps the number of dirty pages flushed per second; increasing it increases the amount of work the page cleaner thread does each second. If you find that the free list in your system is often exhausted, so that obtaining an idle block always requires evicting pages, you can increase innodb_lru_scan_depth appropriately. This parameter controls how deep the page cleaner scans into the LRU list of each buffer pool instance; scanning deeper helps free more pages and keeps user threads from having to do single-page flushes.
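
The sketch below only illustrates the role of these two knobs in bounding the page cleaner's per-second work; the real adaptive-flushing heuristics are considerably more involved, and the names here are illustrative.

    #include <algorithm>
    #include <cstdint>

    struct CleanerConfig {
        uint64_t io_capacity_max;   // innodb_io_capacity_max: cap on pages flushed per second
        uint64_t lru_scan_depth;    // innodb_lru_scan_depth: LRU scan depth per instance
    };

    // However many pages adaptive flushing asks for, the per-second batch from
    // the flush list is capped by innodb_io_capacity_max.
    uint64_t flush_list_quota(const CleanerConfig& cfg, uint64_t pages_wanted) {
        return std::min(pages_wanted, cfg.io_capacity_max);
    }

    // Each buffer pool instance gets up to innodb_lru_scan_depth pages of its
    // LRU tail scanned (freed or flushed) per second.
    uint64_t lru_quota(const CleanerConfig& cfg) {
        return cfg.lru_scan_depth;
    }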



To improve scalability and flushing efficiency, multiple page cleaner threads were introduced in version 5.7.4 so that flushing can proceed in parallel. Page cleaner threads are currently not bound to particular buffer pool instances; the model is one coordinator thread plus several worker threads, and the coordinator itself also acts as a worker. So if innodb_page_cleaners is set to 4, there is one coordinator thread plus 3 worker threads, cooperating in a producer-consumer fashion. The work queue length equals the number of buffer pool instances and is represented by a global slot array.



After the coordinator thread decides how many pages to flush and the lsn_limit, it fills in the slot array: it sets the state of each slot to PAGE_CLEANER_STATE_REQUESTED and records the target page count and lsn_limit, then wakes up the worker threads (pc_request).



After a worker thread wakes up, it takes an unclaimed slot from the slot array, changes its state to mark it as dispatched, and then works on the buffer pool instance corresponding to that slot. Once all slots have been consumed, the next round begins. In this way multiple page cleaner threads flush the buffer pool concurrently, which improves the efficiency of flushing dirty pages and the LRU list.
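
A condensed sketch of this coordinator/worker handshake over the slot array; states and names are modeled loosely on the description above, not copied from the 5.7 source.

    #include <condition_variable>
    #include <cstddef>
    #include <cstdint>
    #include <mutex>
    #include <vector>

    enum class SlotState { NONE, REQUESTED, FLUSHING, FINISHED };

    struct Slot {
        SlotState state     = SlotState::NONE;
        uint64_t  lsn_limit = 0;     // flush dirty pages up to this LSN
        int       instance  = 0;     // which buffer pool instance this slot maps to
    };

    std::mutex mtx;
    std::condition_variable cv;
    std::vector<Slot> slots;         // one slot per buffer pool instance

    void flush_instance(int instance, uint64_t lsn_limit);   // hypothetical helper

    // Coordinator: fill every slot and wake the workers (roughly pc_request()).
    void pc_request(uint64_t lsn_limit) {
        std::lock_guard<std::mutex> lk(mtx);
        for (std::size_t i = 0; i < slots.size(); ++i) {
            slots[i].state     = SlotState::REQUESTED;
            slots[i].lsn_limit = lsn_limit;
            slots[i].instance  = static_cast<int>(i);
        }
        cv.notify_all();
    }

    // Worker (the coordinator also runs this): claim one unclaimed slot at a
    // time, flush the matching buffer pool instance, and repeat.
    void pc_worker() {
        for (;;) {
            Slot* my = nullptr;
            {
                std::unique_lock<std::mutex> lk(mtx);
                cv.wait(lk, [] {
                    for (const Slot& s : slots)
                        if (s.state == SlotState::REQUESTED) return true;
                    return false;
                });
                for (Slot& s : slots) {
                    if (s.state == SlotState::REQUESTED) {
                        s.state = SlotState::FLUSHING;   // mark as dispatched
                        my = &s;
                        break;
                    }
                }
            }
            flush_instance(my->instance, my->lsn_limit);
            std::lock_guard<std::mutex> lk(mtx);
            my->state = SlotState::FINISHED;
        }
    }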




InnoDB Flush Strategy Optimization in MySQL 5.7



In earlier versions, because multiple threads might be flushing pages from the same buffer pool (the buffer pool mutex is released while a page is being flushed), the scan had to backtrack to the tail of the list every time a page was flushed, so the worst-case time complexity of scanning the list was O(n*n).



Version 5.6 applied a partial fix to the flush list scan: a pointer records the page currently being flushed, and once the flush completes the thread checks whether that pointer was modified by another thread. If it was, the scan goes back to the tail of the list; otherwise no backtracking is needed. But this fix is incomplete, and in the worst case the time complexity is still not ideal.



This problem was therefore fully fixed in version 5.7 by introducing several pointers known as hazard pointers. A hazard pointer stores the next target page to be visited while a list is being scanned, and the pointers fall into several categories with different purposes:


    • flush_hp: used for batch flushing of the flush list;
    • lru_hp: used for batch flushing of the LRU list;
    • lru_scan_itr: used to evict a replaceable page from the LRU list, always starting from where the previous scan ended rather than from the LRU tail;
    • single_scan_itr: used when there is no free block in the buffer pool and a user thread evicts a replaceable page or flushes a single dirty page, again starting from where the previous scan ended rather than from the LRU tail.


The latter two kinds are used by user threads when they try to obtain a free block; such an iterator is reset to the LRU tail only after it advances past a page whose buf_page_t::old flag is set to true (roughly three-eighths of the LRU list length away from the tail).



These pointers are allocated when the buffer pool is initialized, and each buffer pool instance has its own set of hazard pointers. When a thread operates on a page in the buffer pool, for example removes it from the LRU list, and that page happens to be the one a hazard pointer currently points to, the hazard pointer is updated to the page just before it in the list. When the flush of the current page completes, the next page to process is taken directly from the hazard pointer, with no backtracking.
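
The idea can be sketched as follows, reusing the Page type from earlier; the names are illustrative and do not match the real buf0flu/buf0lru API.

    #include <algorithm>
    #include <iterator>
    #include <list>

    void flush_one_page(Page* page);                  // hypothetical: write one page out

    struct HazardPointer {
        Page* next = nullptr;                         // next page the scanner will visit

        // Called (before unlinking) by whoever removes `page` from the list: if
        // the scanner was about to visit it, move on to the page just before it.
        void adjust(std::list<Page*>& lst, Page* page) {
            if (next != page) return;
            auto it = std::find(lst.begin(), lst.end(), page);
            next = (it == lst.begin()) ? nullptr : *std::prev(it);
        }
    };

    // Scanner side: flush from the tail towards the head without ever backtracking.
    void flush_list_batch(std::list<Page*>& flush_list, HazardPointer& hp) {
        Page* page = flush_list.empty() ? nullptr : flush_list.back();
        while (page != nullptr) {
            auto it = std::find(flush_list.begin(), flush_list.end(), page);
            hp.next = (it == flush_list.begin()) ? nullptr : *std::prev(it);
            flush_one_page(page);    // the buffer pool mutex is released here in InnoDB,
                                     // so other threads may remove pages and call adjust()
            page = hp.next;          // resume directly from the hazard pointer
        }
    }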




Community Optimizations



As always, Percona Server has made a number of optimizations to buffer pool flushing in its 5.6 release; the major modifications are as follows:


    • Optimized the LRU flushing routine buf_flush_LRU_tail
      This function is called by the page cleaner thread.
      • Native logic: flush the LRU of each buffer pool instance in turn, with the scan depth per instance configured by innodb_lru_scan_depth, and within each instance the work split into multiple chunks;
      • Modified logic: each pass flushes only one chunk of one buffer pool instance's LRU, then moves to the next instance; after all instances have had one chunk flushed, it comes back to the first instance for the next chunk. In short, the concentrated flush is spread out, with the aim of dispersing the pressure and avoiding a long operation on a single instance, giving other threads more chances to access the buffer pool (see the sketch after this list).
    • Allows setting a timeout for flushing the LRU/flush list, to prevent a flush operation that runs too long from stalling other threads (such as a user thread attempting a single-page flush); when the timeout is reached, the page cleaner thread exits the flush.
    • Avoids user threads taking part in flushing the buffer pool
      When user threads take part in flushing, the number of threads involved is uncontrollable and the contention overhead is severe, for example single-page flushes when the free list is exhausted or dirty-page flushes when redo space is insufficient, which can hurt performance badly. Percona Server lets you choose to have the page cleaner thread do this work while user threads simply wait. For efficiency, you can also set the CPU scheduling priority of the page cleaner thread.
      In addition, once the page cleaner thread has been improved in this way, it can recognize that the system is in a synchronous-flush state and flush much more aggressively (furious flushing); in that situation, user threads joining in would probably only be counterproductive.
    • Allows setting the CPU scheduling priority of the page cleaner thread, the purge thread, the I/O threads, and the master thread, as well as their priority when acquiring InnoDB mutexes.
    • Uses a new, separate background thread to flush the LRU list of the buffer pool, stripping this part of the work from the page cleaner thread.
      In effect, the LRU flushing code was simply moved into the standalone thread. Version after version, Percona has kept strengthening the background threads so that user threads are less involved in time-consuming operations such as flushing and checkpointing.
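
A sketch of the round-robin chunking described in the first bullet, reusing the BufferPoolInstance type from earlier; names and chunk handling are illustrative.

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    void flush_lru_chunk(BufferPoolInstance* bp, uint64_t pages);   // hypothetical helper

    void buf_flush_lru_tail_sketch(std::vector<BufferPoolInstance*>& instances,
                                   uint64_t scan_depth, uint64_t chunk_size) {
        uint64_t done = 0;
        // Rotate over all instances, one chunk at a time, until every instance
        // has had `scan_depth` pages of its LRU tail scanned.
        while (done < scan_depth) {
            uint64_t chunk = std::min(chunk_size, scan_depth - done);
            for (BufferPoolInstance* bp : instances) {
                flush_lru_chunk(bp, chunk);   // one chunk for this instance, then move on
            }
            done += chunk;
        }
    }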

