[Reprint] Summary and reflection on the MySQL optimization of Sina Weibo

Last Update:2015-07-19 Source: Internet

Author: User

Tags percona


Original: http://mp.weixin.qq.com/s?__biz=MzA4Nzg5Nzc5OA==&mid=206762682&idx=1&sn= 1233ed1496d7fd059d247329f3d3a183&scene=1&key= C76941211a49ab587d35d0d840a84ff2e3948510bca7698783e134b95c3e8ad0a30d1f83897e9c764e289a1011c565db&ascene=0 &uin=mjk1odmyntyymg%3d%3d&devicetype=imac+macbookpro9%2c2+osx+osx+10.10.3+build (14D136) &version= 11020012&pass_ticket=gji1zstfd3ify8npism73eddcyhehf3pnzgmnpzerbyqm7llmmscbvsbramc%2fpt3

This article is compiled from a talk given in the "Efficient Operation and Maintenance" series of chat groups. The "Efficient Operation and Maintenance" public account is the only official account of the group series, and this article was first published there as an original, exclusive piece.


Editors

Gao Hao Miao-Beijing, Xu Wenhui @21v (article compilation)
Gao Hao Miao-Beijing (publication)

Guest Introduction

Yang Shanggang, formerly a senior DBA at Sina, is now responsible for databases at Meitu (Weibo: @zolker). He joined Sina in 2011. In the early period he was mainly responsible for the core database architecture design of Sina Weibo, and later for hardware and software optimization of the database platform.

Topic Introduction

Some strategies for MySQL database performance optimization

MySQL performance optimization

Common strategies for optimizing MySQL performance:

Read/write separation
Sharding
Parameter optimization
Index optimization
System optimization
Hardware optimization

Read/write separation

Read/write separation is a fairly conventional optimization strategy, and it is easy to understand and implement. Its main purpose is to isolate reads from writes so that they interfere with each other less. Most Internet workloads are read-heavy and write-light, so this strategy fits them well.
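As a rough illustration of the idea, the sketch below routes statements to a primary or to a replica. It is a minimal sketch, not how Weibo implemented it: the class name, the host names, and the rule "a statement starting with SELECT is a read" are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the primary and spread reads over replicas.

    Hypothetical helper for illustration only; host names are placeholders.
    """

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Treat statements that start with SELECT as reads; everything else
        # (INSERT, UPDATE, DELETE, DDL, ...) goes to the primary.
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._replicas)  # simple round-robin over replicas
        return self.primary


router = ReadWriteRouter(
    "db-primary:3306", ["db-replica-1:3306", "db-replica-2:3306"]
)
print(router.route("SELECT * FROM status WHERE uid = 42"))
print(router.route("INSERT INTO status (uid, body) VALUES (42, 'hi')"))
```

A real router would also have to send locking reads such as SELECT ... FOR UPDATE, and reads that must see their own writes, to the primary; this sketch ignores both cases.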

Sharding

Sharding redistributes data according to some rule. It mainly addresses excessive write pressure or excessive data volume on a single instance. But sharding brings problems of its own, such as higher operations and management costs and more complex data access for the business.

Sharding was used heavily in Weibo's early days, when the load was high, and we were splitting something almost every day. Later we found that splitting is tedious work, and on calm reflection many of those splits were not that urgent, or not necessary at all.

The problem with splitting every day is that it leaves no time for deeper optimization and automation work, so a split should be done only when the time is right.

Sharding has two dimensions, vertical and horizontal. A horizontal split redistributes data without changing the schema, while a vertical split decouples data at the business level. The two are generally used together.

Controlling shard granularity is therefore very important. On current hardware, keeping a single shard within 200 to 300 million rows is not a problem.

Weibo's largest single table at the time held more than 6 billion rows and was over a terabyte in size. DBAs sometimes want to be a little "lazy" too.
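A horizontal split can be sketched as a function that maps a key to a shard, plus a back-of-the-envelope estimate of how many shards a table needs. This is a minimal sketch under assumptions: hashing on a user id, the `status_NN` table naming, and the shard count of 4 are hypothetical, and the 200 million row budget is the rule of thumb quoted in this talk rather than a hard limit.

```python
import zlib

SHARD_COUNT = 4  # hypothetical shard count for the example


def shard_for(user_id, shard_count=SHARD_COUNT):
    """Map a user id to a shard index with a hash that is stable across runs."""
    return zlib.crc32(str(user_id).encode("ascii")) % shard_count


def table_for(user_id):
    """Name of the (hypothetical) per-shard table that holds this user's rows."""
    return f"status_{shard_for(user_id):02d}"


def shards_needed(total_rows, max_rows_per_shard):
    """Smallest number of shards that keeps every shard within the row budget."""
    return -(-total_rows // max_rows_per_shard)  # ceiling division


print(table_for(42))
# A 6-billion-row table with a 200-million-row budget per shard:
print(shards_needed(6_000_000_000, 200_000_000))  # 30
```

Plain modulo hashing like this makes later re-splitting expensive, because changing the shard count moves most keys. That cost is one reason to choose the granularity carefully up front.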

Master-Slave delay optimization

Another frequent criticism of MySQL is master-slave replication delay, which stems mainly from the single-threaded design of replication in early MySQL.

Delay has two main causes: either the write volume on the master is so high that the single SQL thread on the slave cannot keep up, or the read pressure on the slave is so high that it slows down the single SQL thread. MySQL 5.6 did introduce parallel replication, but its granularity is coarse: it parallelizes per database, so the improvement is limited.

Of these two causes, an I/O bottleneck is the dominant factor in most cases. The first choice for fixing delay is therefore high-performance I/O storage, followed by parallel replication, and then sharding.
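To see why per-database parallelism helps only a little, consider the simplified model below. It is not MySQL's actual scheduler, just a sketch of the idea: each database is pinned to one worker so that its events stay in order. The function name and the sample data are assumptions for the example.

```python
def assign_to_workers(events, worker_count):
    """Assign replicated events to workers, one database per worker queue.

    events: list of (database_name, statement) tuples in binlog order.
    Events for the same database always go to the same worker, so their
    relative order is preserved.
    """
    queues = {}
    db_to_worker = {}
    for db, stmt in events:
        if db not in db_to_worker:
            # First time this database is seen: pin it to the next worker.
            db_to_worker[db] = len(db_to_worker) % worker_count
        queues.setdefault(db_to_worker[db], []).append(stmt)
    return queues


# All writes hit one database: only one of the four workers has any work.
hot = assign_to_workers([("weibo", "w1"), ("weibo", "w2"), ("weibo", "w3")], 4)
print(len(hot))  # 1

# Writes spread over several databases can be applied in parallel.
spread = assign_to_workers([("a", "w1"), ("b", "w2"), ("c", "w3")], 4)
print(len(spread))  # 3
```

When most of the write load lands in a single database, which is common, this model degenerates to the single-threaded case.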

Schema optimization

Schema optimization is mainly about whether the column design and index design of each table are sound, and whether the table-splitting strategy is reasonable. Getting this right matters a great deal for later management and optimization.

Column types: good enough is best. The smaller and simpler the data type, the better.
Use integer types rather than strings wherever possible, for example when storing time values.
Avoid NULL.
Use an auto-increment primary key where possible. String primary keys are not recommended, especially with InnoDB.
Keep the number of indexes on a single table to five or fewer.
Design indexes on columns with good selectivity.
Avoid redundant indexes.
Character set: use utf8 where possible, and utf8mb4 when emoji must be stored.
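One common kind of redundant index is an index whose columns form a leftmost prefix of another index on the same table. The sketch below flags such indexes. It is an illustration only: the function and the index names are hypothetical, and it looks at column order alone.

```python
def redundant_indexes(indexes):
    """Return names of indexes that are a leftmost prefix of another index.

    indexes: dict mapping index name -> tuple of column names in index order.
    """
    redundant = []
    for name, cols in indexes.items():
        for other_name, other_cols in indexes.items():
            if name == other_name:
                continue
            is_prefix = other_cols[: len(cols)] == cols
            # For two identical column lists, flag only one of the pair.
            if is_prefix and (len(cols) < len(other_cols) or name > other_name):
                redundant.append(name)
                break
    return sorted(redundant)


print(redundant_indexes({
    "idx_uid": ("uid",),
    "idx_uid_ctime": ("uid", "created_at"),
    "idx_ctime": ("created_at",),
}))  # ['idx_uid']
```

A unique index or primary key enforces a constraint as well as speeding up lookups, so a prefix match alone does not make it safe to drop. This sketch does not make that distinction.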

Parameter optimization

Next are MySQL parameter optimizations. These are mainly InnoDB-level parameters; tuning MySQL server-layer parameters yields limited performance gains.

innodb_file_per_table
innodb_buffer_pool_size
innodb_flush_log_at_trx_commit = 0, 1, or 2 (related to data safety)
innodb_log_file_size
innodb_page_size
sync_binlog (related to data safety)

Pay particular attention to the innodb_flush_log_at_trx_commit and sync_binlog settings. If data safety requirements are high, setting both to 1 is recommended.
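The trade-off behind these two settings can be summarized in a small helper. The descriptions follow the documented behavior of the two variables as I understand it; the function names are hypothetical and the wording is a paraphrase, so check the manual for the MySQL version in use.

```python
def describe_durability(flush_log_at_trx_commit, sync_binlog):
    """Describe when the redo log and binary log reach disk for given settings."""
    redo = {
        0: "redo log written and flushed to disk about once per second",
        1: "redo log written and flushed to disk at every commit",
        2: "redo log written at every commit, flushed about once per second",
    }[flush_log_at_trx_commit]
    if sync_binlog == 0:
        binlog = "binary log flushing left to the operating system"
    elif sync_binlog == 1:
        binlog = "binary log synced to disk at every commit"
    else:
        # Counted in binlog writes or commit groups, depending on version.
        binlog = f"binary log synced to disk once per {sync_binlog} writes"
    return f"{redo}; {binlog}"


def is_safest(flush_log_at_trx_commit, sync_binlog):
    """True only for the combination recommended when data safety matters most."""
    return flush_log_at_trx_commit == 1 and sync_binlog == 1


print(describe_durability(1, 1))
print(is_safest(2, 0))  # False
```

Any combination other than 1 and 1 trades some durability for throughput: after a crash, up to about a second of transactions, or some binary log events, may be lost.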

Hardware optimization

Hardware optimization mainly concerns the use and tuning of NUMA, large memory, and SSDs.

In theory, NUMA optimization is valuable on today's multi-core architectures, but in actual tests the gains were limited, and it added management cost. MySQL branches such as Twitter's also recommend turning NUMA off. NUMA tuning has little effect mainly because cross-node memory access has only a limited impact on overall MySQL latency.

SSDs are a powerful tool for improving MySQL performance. A well-known verified user on Weibo (a "Big V") once said that "what has really changed database technology in the last decade is flash", and Alibaba's "de-IOE" effort (moving away from IBM, Oracle, and EMC) would also not have been possible without SSDs.

At Sina at the time, the typical setup was a 2U server with 10 SSDs in RAID 5, and performance was good.

Optimization of system parameters for SSDs:

echo noop > /sys/block/[device]/queue/scheduler (or deadline; this gives the most noticeable performance improvement)
echo 2 > /sys/block/[device]/queue/rq_affinity (CentOS 6.4+)
echo 0 > /sys/block/[device]/queue/add_random
Turn off file system barriers
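The sysfs settings for SSDs listed here can be collected into a small helper so that they are applied consistently across devices. This is a sketch under assumptions: the function names and the device name `sdb` are hypothetical, and actually writing the values requires root on a Linux host whose kernel exposes these files.

```python
from pathlib import Path, PurePosixPath


def ssd_queue_settings(device, scheduler="noop"):
    """Build the sysfs path -> value map for one block device."""
    if scheduler not in ("noop", "deadline"):
        raise ValueError("expected 'noop' or 'deadline'")
    queue = PurePosixPath("/sys/block") / device / "queue"
    return {
        str(queue / "scheduler"): scheduler,  # I/O scheduler for the device
        str(queue / "rq_affinity"): "2",      # complete I/O on the issuing CPU
        str(queue / "add_random"): "0",       # do not feed the entropy pool
    }


def apply_settings(settings):
    """Write each value to its sysfs file (needs root; not called here)."""
    for path, value in settings.items():
        Path(path).write_text(value + "\n")


for path, value in ssd_queue_settings("sdb").items():
    print(f"echo {value} > {path}")
```

On newer kernels that use the multi-queue block layer, the schedulers are named `none` and `mq-deadline` rather than `noop` and `deadline`, so the allowed values would need adjusting there.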

Optimization of MySQL parameters for SSDs:

In MySQL 5.5 and later, increase innodb_write_io_threads and innodb_read_io_threads.
Set innodb_io_capacity to a larger value.
Put log files and the redo log on mechanical disks, and the undo log on SSD.
With atomic writes, the doublewrite buffer is not needed.
InnoDB compression.
Reduce SSD wear.
Multiple instances per machine, plus cgroups.

FAQ

Question one: How difficult is it to implement something like Amazon's Aurora?
Judging from what Amazon has disclosed, it is quite difficult. First, the upper database layer differs greatly from stock MySQL and was probably rewritten, in part or in whole; this has been discussed a lot on Weibo. In addition, Amazon has deeply customized the file system and the storage hardware, in effect moving some of InnoDB's original fault-tolerance and data-safety mechanisms down into the file system and storage layer.

Question two: Do you use PCI-E flash cards now?
Meitu does not use them at present. Sina used a lot of them. Judged on IOPS alone they are much faster, but in real workloads the difference is small unless your read and write volume is very large.

Question three: Besides the faster I/O you just mentioned, what else do you suggest for reducing master-slave delay?
Parallel replication and sharding are the approaches that work. There are also schemes that prefetch on the slave by reading ahead in the relay log, but the gain is small.

Question four: Why is it advisable to put the redo log on a mechanical disk?
It comes down to read/write characteristics: the redo log is mostly sequential writes, while SSDs are better at random writes. In practice we also put it on SSD. SSD lifespan has turned out to be fine, so keeping everything together without separating them is no problem.

Question five: In what scenarios is splitting unnecessary?
It depends on how much data you have and how much traffic you serve. In general, keeping a single shard within 200 million rows is reasonable. Otherwise, look at your own scenario and decide whether the bottleneck can only be solved by sharding; if you can avoid it, do. Adding a column to an ordinary large table can be done with pt-online-schema-change. But for something like our 6-billion-row table, adding a column really was painful.

Question six: What were your main changes to stock MHA? Does anyone use Galera?
At the time we mainly rewrote its switching logic. Few people seem to use native Galera; most use Percona XtraDB Cluster, which is Percona's packaging of Galera. As far as I know, Sohu and Qunar use it. There are still many pitfalls, and you need a good command of the underlying code.

Question seven: Can you share the parts of MHA that you changed? Do you use MariaDB, which supports parallel replication? Do you use triggers?
The part we modified is mainly the core switching logic, which we rewrote in Python; log processing is still the native implementation. MariaDB does support parallel replication. We do not use MariaDB, and it is not widely used in production in China. Try not to use triggers in production.

Question eight: Can you talk about SQL optimization?
SQL optimization covers a lot of miscellaneous ground. MySQL does best with equality lookups. Use indexes wherever possible, and make use of EXPLAIN and pt-query-digest.

Question nine: What problems have you run into with MHA? If data is lost during an automatic MHA switchover, how do you recover it? How do you choose between PXC and MHA in production? You also mentioned that part of your process is manual and part automatic; why?
MHA only does its best to ensure that nothing is lost. If data is still missing after MHA has filled in what it can, all you can do is see whether it can be restored from business logs. PXC still has many limitations, such as cross-IDC deployment and the number of cluster nodes, and its operational cost is much higher than that of stock MySQL. The goal is full automation, but some important core businesses may need a person to make the call, so there is a tiering policy by port.

Question 10: What about the official MySQL Fabric?
It is still at the lab stage, some distance from production use.

Question 11: For read/write separation, do programmers write to different IPs, or is there a proxy?
DNS domain names. We also developed several versions of middleware at the time, but for various reasons they were not widely used in production. What Weibo actually used is similar to the TDDL approach, with the proxy logic encapsulated in the front end.

Question 12: For backups (cold backup, mysqldump, xtrabackup), which software do you use, and why? Any advice?
We generally use xtrabackup. Use mysqldump only when you need a logical backup.


Important: Unless otherwise authorized in advance, this article may be reprinted two days after its publication on the public account. Out of respect for the work, please reprint the full text and include this line and the QR code below.


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
