Greatly optimizing MySQL query performance

Looking back over the history of MySQL/InnoDB improvements, one thing is easy to see: MySQL had never been as fast on read-only (RO) workloads as it became in the stable 5.6 release. This is easy to understand, since 5.6 also scales well on read-only, and good RO scalability is what lets us expect a higher level on read+write (RW) workloads too, especially when reads are the database's primary work.

We were very pleased with the RO performance of MySQL 5.6, so in 5.7 the main focus moved to read+write (RW), where the handling of large data volumes had not yet met our expectations. But RW relies on RO, and RO could still be made faster. With continuous improvement, the InnoDB team pushed hard to raise the queries-per-second performance of 5.7.

Let me explain step by step.

In fact, a read-only workload in MySQL runs into two main kinds of internal contention:

    • With a single table: MDL, trx_sys and lock_sys (InnoDB)
    • With multiple tables: trx_sys and lock_sys (mainly InnoDB)

Any workload made of fast single-table range tests is limited mostly by MDL contention. Multi-table workloads are limited by InnoDB internals instead (different tables are protected by different MDL locks, so the MDL bottleneck is reduced). But again, it depends on the workload. A workload with heavier read-only queries scales better in MySQL 5.6 (such as Sysbench OLTP_RO), while a workload of shorter, faster queries (such as Sysbench point-selects, which fetch one record by primary key) makes all contention worse: it scales only up to 16 cores-HT and performs poorly on 32 cores. However, a workload like the point-select test exercises all MySQL internals together (from the SQL parser through to fetching the row value), so it shows the maximum possible performance. It also gives the maximum SQL queries per second (QPS) rate you can reach with a given MySQL version on a given hardware configuration.

The best we could get on MySQL 5.6 was 250,000 queries per second, which was the best result obtained with SQL queries on MySQL/InnoDB up to that time.

Of course, this speed can be reached only by using the "read-only transactions" feature (new in MySQL 5.6). In addition, you need to use autocommit=1; otherwise CPU time is easily wasted on starting and committing transactions, and you actually lose overall system performance.

Therefore, the first improvement introduced in MySQL 5.7 is automatic detection of read-only transactions (in effect, every InnoDB transaction is considered read-only until a DML statement appears in it). This greatly simplifies the read-only transaction feature and saves users and developers the time of managing it themselves. However, even with this feature you still cannot reach the best possible MySQL queries-per-second rate, because CPU time is still wasted on processing the beginning and end of each transaction.
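To make the two styles discussed above concrete, here is a minimal sketch that prints the SQL a client would send for one point-select, once relying on autocommit and once inside an explicit read-only transaction. The table and column names (`sbtest_10m_1`, `id`, `c`) and the helper name `point_select_sql` are assumptions for illustration; the script only prints SQL and does not connect to a server.

```shell
#!/bin/sh
# Print the SQL for one point-select in the two styles discussed above.
# Table/column names (sbtest_10m_1, id, c) are assumed for illustration.

point_select_sql() {
    mode=$1   # "autocommit" or "read-only-trx"
    id=$2     # primary-key value to look up
    case "$mode" in
        autocommit)
            # No explicit transaction: each SELECT commits on its own.
            echo "SET autocommit = 1;"
            echo "SELECT c FROM sbtest_10m_1 WHERE id = $id;"
            ;;
        read-only-trx)
            # Explicit read-only transaction (syntax available in MySQL 5.6).
            echo "START TRANSACTION READ ONLY;"
            echo "SELECT c FROM sbtest_10m_1 WHERE id = $id;"
            echo "COMMIT;"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

point_select_sql autocommit 42
point_select_sql read-only-trx 42
```

The second form sends three statements per lookup instead of one after the initial setting, which is one way to picture the extra begin/commit work described above.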

Meanwhile, Percona took a different approach to "transaction list" (trx-list) management and trx_sys mutex contention in InnoDB. Percona's solution holds up well under a high point-select load with transactions, and MySQL 5.7 holds up better still (I could not publish the 5.7 results at the time because its code was not yet public) ... So now, at least, I can make some comparisons:

Observations:

    • The same point-select-trx read-only test (with transactions) on 8 tables, run on MySQL 5.6, Percona Server 5.5 and MySQL 5.7 (results from May 2013).
    • As you can see, we are far from the 250,000 QPS peak on the same 16 cores-HT configuration.
    • MySQL 5.6 suffers from contention on the trx_sys mutex, and its QPS decreases from 64 users onward.
    • Percona Server 5.5 sustains the load for longer, and its QPS starts to decrease only at 512 users.
    • MySQL 5.7 sustains the load further still, with no QPS decrease (higher user counts are not shown on this graph) ...


However, it is clear that transactions should be avoided if you want to get the maximum possible queries-per-second rate out of MySQL.

Let's look at where our maximum queries-per-second rate stood in May 2013.

The same test on 8 tables, but without transactions, on MySQL 5.6:

Observations:

    • The test above ran MySQL 5.6 bound first to 16 cores, then to 16 cores-HT, 32 cores and 32 cores-HT.
    • As you can see, the maximum QPS is higher than expected: 275,000 QPS on MySQL 5.6.
    • The best result is reached on 16 cores-HT.
    • However, the result on 32 cores is not as good as on 16 cores-HT (because of contention, a configuration with 2 CPU threads on the same core manages thread contention better, so true concurrency stays at 16 threads rather than 32)


The same test on MySQL 5.7 looks very different, because the time spent on the lock_sys mutex is already very low in 5.7, and the trx_sys mutex code has received its first changes:

Observations:

    • First, you can see that 5.7 is better than 5.6 on the same 16 cores-HT configuration.
    • After that, there is no obvious gain on the 32-core configuration!
    • The maximum of 350,000 QPS is reached on the 32 cores-HT configuration!
    • From this particular (aggressive) read-only load test it is easy to see that we get a better result on 32 cores than on 16, even before Hyper-Threading comes into play (32 cores-HT) ... Cool! ;-)


On the other hand, there is clearly still room for improvement. Contention on trx_sys is still present, and we are not yet using the full CPU capacity for useful work (many CPU cycles are still spent spinning on locks) ... But the results were now much better than before, and much better than 5.6, so there was no reason to keep digging into this, and we focused instead on read+write workloads, where we had far more room for improvement.

By the end of May, at our performance meeting, Sunny added several new changes around the trx_sys mutex, after which the maximum QPS reached 375K! Isn't that enough of a performance improvement for 5.7? ;-)

At the same time, we kept exchanging views with the Percona team, who proposed another way of managing the trx list. Their solution looked very interesting, but on 5.5 that code did not show a higher QPS, and the maximum QPS on 5.6 (tested with Percona Server 5.6) was no higher than on MySQL 5.6. The discussion did raise an interesting question, however: what happens to read-only performance if some read+write load is running at the same time? The effect is very visible, even though the MySQL 5.7 code still runs better under the same test conditions (you can view my analysis here, but again, I could not show the 5.7 results at that time because its code had not yet been published; perhaps in a later article).

Since this also affects any pure read+write workload, Sunny had plenty of motivation to rework the entire trx-list code, something that had been wanted for a long time, and the experience was fascinating!

Day after day, we were pleased to see our QPS graphs climb higher and higher, until we reached 440K QPS on the same 32 cores-HT server!

Results of the 8-table select test on 5.7 Development Milestone Release 2 (DMR2):

No need to explain ...;-))

However, there was one slightly strange thing: Sunny and I tried to analyze all the bottlenecks and the impact of the code changes with different tools, and in some tests, to my surprise, Sunny observed a higher QPS than I did. This "strangeness" was related to the following factors:

    • Under high load, the 5.7 code now runs close to the hardware limits (mainly CPU), so every instruction matters!
    • Whether you use a UNIX socket or an IP port makes a very visible difference!
    • Sysbench itself uses 30% of the CPU time, but with the same test load an older version of Sysbench (with a shorter code path) uses only 20%, leaving the remaining 10% to the MySQL server.
    • So, with the same test load, using a UNIX socket instead of an IP port and Sysbench-0.4.8 instead of Sysbench-0.4.13, we get more than 500K QPS! Easy, right? ;-))
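As a rough sanity check on the CPU figures in the list above, here is a back-of-envelope sketch. It assumes the host is fully CPU-bound, that all CPU not used by Sysbench goes to mysqld, and that QPS grows in proportion to mysqld's CPU share; it takes the 440K QPS figure as the starting point and ignores the separate UNIX socket effect. None of these assumptions is confirmed by the measurements, so treat the result as an order-of-magnitude estimate only.

```shell
#!/bin/sh
# Back-of-envelope estimate under the assumptions stated above:
# CPU-bound host, QPS proportional to the CPU share left to mysqld.

qps_before=440000        # QPS reported before the client-side change
sysbench_cpu_before=30   # % CPU used by the newer Sysbench
sysbench_cpu_after=20    # % CPU used by the older Sysbench

mysql_cpu_before=$((100 - sysbench_cpu_before))
mysql_cpu_after=$((100 - sysbench_cpu_after))

# Integer arithmetic: scale QPS by the ratio of mysqld CPU shares.
qps_estimate=$((qps_before * mysql_cpu_after / mysql_cpu_before))

echo "mysqld CPU share: ${mysql_cpu_before}% -> ${mysql_cpu_after}%"
echo "estimated QPS: $qps_estimate"
```

Under these assumptions the estimate lands just above 500K, which is at least consistent with the figure quoted in the list.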

Let's compare "before" and "after":

Observations:

    • Sysbench's CPU usage is reduced.
    • More CPU is available to the MySQL server.
    • We reached 500,000 queries per second.

What else is there to say?

I can only say: kudos to Sunny and the entire MySQL development team!

Now let's look at the maximum QPS reached on the 8-table select workload by each of the following:

    • MySQL-5.7.2 (DMR2)
    • MySQL-5.6.14
    • MySQL-5.5.33
    • Percona Server 5.6.13-rc60.5
    • Percona Server 5.5.33-rel31.1
    • MariaDB-10.0.4
    • MariaDB-5.5.32

Each engine is tested under the following configurations:

    • CPU taskset: 8 cores-HT, 16 cores, 16 cores-HT, 32 cores, 32 cores-HT
    • Number of concurrent sessions: 8, 16, 32 ... 1024
    • InnoDB spin wait delay: 6, 96
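The test matrix above can be enumerated with a small loop. This is only a sketch: it assumes the session counts double at each step (8, 16, 32 ... 1024), uses made-up labels for the CPU sets, and prints the combinations without starting mysqld or Sysbench.

```shell
#!/bin/sh
# Enumerate the test matrix: CPU set x spin wait delay x concurrent sessions.
# Only prints the combinations; nothing is actually started.

cpu_sets="8cores-HT 16cores 16cores-HT 32cores 32cores-HT"
spin_delays="6 96"

list_runs() {
    for cpus in $cpu_sets; do
        for delay in $spin_delays; do
            sessions=8
            # Assumption: session counts double from 8 up to 1024.
            while [ "$sessions" -le 1024 ]; do
                echo "cpus=$cpus spin_wait_delay=$delay sessions=$sessions"
                sessions=$((sessions * 2))
            done
        done
    done
}

list_runs
echo "total runs per engine: $(list_runs | wc -l | tr -d ' ')"
```

With 5 CPU sets, 2 spin wait delays and 8 session counts, that is 80 runs for each engine under these assumptions.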

For each engine, the best result across all of these combinations is kept. Comparing the database engines this way gives the following chart, which I have already shown in a previous article.

Here are some comments:

    • The big gap opened by MySQL 5.7 needs little comment, because it is obvious.
    • Interestingly, none of the engines based on the MySQL 5.5 code base comes close to the MySQL 5.6 results.
    • This confirms that once it moved to the MySQL 5.6 code base, Percona Server reached the MySQL 5.6 level, while MariaDB-10 is still on its way there.
    • So, without a doubt, MySQL 5.6 is the cornerstone code base!
    • MySQL 5.7 is a further optimization built on top of MySQL 5.6.

What about scalability?

The answer is simple: MySQL 5.7 is the only one that scales here.

If you use an IP port and the heavier Sysbench-0.4.13, you get the following results:

The QPS is slightly lower, but the overall trend is exactly the same.

Scalability is also very similar:

Note: the picture is not as good when the whole workload is bound to a single table:

    • Reducing InnoDB contention makes other contention more visible.
    • When the load is bound to a single table, MDL contention becomes dominant.
    • It is expected that we will remain unchanged in the next DMRs.

There are still many challenges ahead of us.
For reference, the hardware configuration used for the tests above was as follows:

    • Server: 32 cores-HT (2 threads per core) Intel 2300 MHz, 128 GB RAM
    • OS: Oracle Linux 6.2
    • FS: EXT4 mounted with "noatime,nodiratime,nobarrier"


my.conf:

max_connections=4000
key_buffer_size=200m
low_priority_updates=1
table_open_cache = 8000
back_log=1500
query_cache_type=0
table_open_cache_instances=16

# files
innodb_file_per_table
innodb_log_file_size=1024m
innodb_log_files_in_group = 3
innodb_open_files=4000

# buffers
innodb_buffer_pool_size=32000m
innodb_buffer_pool_instances=32
innodb_additional_mem_pool_size=20m
innodb_log_buffer_size=64m
join_buffer_size=32k
sort_buffer_size=32k

# InnoDB
innodb_checksums=0
innodb_doublewrite=0
innodb_support_xa=0
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit=2
innodb_max_dirty_pages_pct=50
innodb_use_native_aio=1
innodb_stats_persistent = 1
# spin wait delay: tests were run with 6 and with 96
innodb_spin_wait_delay=6

# Perf Special
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 4
innodb_write_io_threads = 4
innodb_io_capacity = 4000
innodb_purge_threads=1
innodb_adaptive_hash_index=0

# Monitoring
innodb_monitor_enable = '%'
performance_schema=OFF


If you need them, the Linux binaries of Sysbench are available here:

    • Sysbench-0.4.13-lux86
    • Sysbench-0.4.8-lux86


The Sysbench command to run the point-selects test using a UNIX socket is as follows (8 processes are started in parallel):

LD_PRELOAD=/usr/lib64/libjemalloc.so /bmk/sysbench-0.4.8 --num-threads=$1 --test=oltp --oltp-table-size=10000000 \
--oltp-dist-type=uniform --oltp-table-name=sbtest_10m_$n \
--max-requests=0 --max-time=$2 --mysql-socket=/ssd_raid0/mysql.sock \
--mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
--mysql-table-engine=innodb --db-driver=mysql \
--oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
--oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on \
--oltp-read-only=on run > /tmp/test_$n.log &


The Sysbench command to run the Point-selects test using an IP port is as follows (8 processes are started in parallel):

LD_PRELOAD=/usr/lib64/libjemalloc.so /bmk/sysbench-0.4.13 --num-threads=$1 --test=oltp --oltp-table-size=10000000 \
--oltp-dist-type=uniform --oltp-table-name=sbtest_10m_$n \
--max-requests=0 --max-time=$2 --mysql-host=127.0.0.1 --mysql-port=5700 \
--mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
--mysql-table-engine=innodb --db-driver=mysql \
--oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
--oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on \
--oltp-read-only=on run > /tmp/test_$n.log &
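The commands above reference `$1`, `$2` and `$n`, which suggests they were run from a wrapper script that starts one process per table. A minimal sketch of such a wrapper might look like the following. The function name `run_one`, the `DRY_RUN` switch and the default values for threads and run time are all hypothetical and not taken from the original tests; with `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Hypothetical wrapper: start 8 Sysbench processes, one per table.
# Usage: ./point_selects.sh THREADS SECONDS
# With DRY_RUN=1 (the default here) the commands are only printed.

threads=${1:-32}      # threads per Sysbench process (default is an assumption)
seconds=${2:-300}     # run time in seconds (default is an assumption)
DRY_RUN=${DRY_RUN:-1}

run_one() {
    n=$1   # table / process number, 1..8
    # Build the command line as the function's positional parameters.
    set -- env LD_PRELOAD=/usr/lib64/libjemalloc.so /bmk/sysbench-0.4.8 \
        --num-threads="$threads" --test=oltp --oltp-table-size=10000000 \
        --oltp-dist-type=uniform --oltp-table-name="sbtest_10m_$n" \
        --max-requests=0 --max-time="$seconds" \
        --mysql-socket=/ssd_raid0/mysql.sock \
        --mysql-user=dim --mysql-password=dim --mysql-db=sysbench \
        --mysql-table-engine=innodb --db-driver=mysql \
        --oltp-point-selects=1 --oltp-simple-ranges=0 --oltp-sum-ranges=0 \
        --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-skip-trx=on \
        --oltp-read-only=on run
    if [ "$DRY_RUN" = 1 ]; then
        echo "$* > /tmp/test_$n.log &"
    else
        "$@" > "/tmp/test_$n.log" &
    fi
}

for i in 1 2 3 4 5 6 7 8; do
    run_one "$i"
done
wait   # returns once all background Sysbench processes have finished
```

This sketch uses the UNIX socket variant; for the IP port variant, the `--mysql-socket` option would be replaced by `--mysql-host` and `--mysql-port`, and the binary by sysbench-0.4.13, as in the second command above.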
