Using MySQL as a NoSQL, mysqlnosql
FROM:Using MySQL as a NoSQL-A story for exceeding 750,000 qps on a commodity server
Due to the limitations of MySQL, many sites adopt the MySQL + Memcached architecture. Some other sites use NoSQL instead of MySQL, such as TokyoCabinet/Tyrant. It is undeniable that NoSQL is much faster than MySQL in some simple queries (especially PK queries. In addition, the vast majority of queries on the website are such simple queries.
Like other large companies, DeNA faces similar problems. But in the end we use a completely different method, only MySQL is used. We still use Memcached for front-end caching (for example, preprocessing HTML, quantity/summary information), but at the backend, we neither use Memcached to cache any records nor use NoSQL. Why? Because our MySQL can achieve better performance than other NoSQL products. In our benchmark test, a common MySQL/InnoDB 5.1 server reaches 750,000 + QPS, and the performance in the production environment is certainly not exclusive. Maybe you can hardly believe this number, but it is a fact. I will share our experience below.
(Author's experience) 2010-08, I left Oracle and now I am working in DeNA, one of Japan's largest social game platform vendors.
How many PK queries are required per second? In DeNA applications, PK queries are often required. For example, if userinfo is extracted based on the user id and the log Content is retrieved Based on the diary id, Memcached and NoSQL are suitable for this requirement. In a simple multi-threaded "Memcached GET" benchmark, it is likely that 400,000 get operations are performed per second, even if the Memcached client is on different servers. On a server with a 2.5 GHz x 8-core CPU and Broadcom quad-port Gigabit Nic, the latest libMemcached and Memcached get operations can reach 420,000 times per second.
In MySQL, how many PK queries can be made per second? We can use sysbench, super-smack or mysqlsla for Benchmark Testing.
[matsunobu@host ~]$ mysqlslap --query="select user_name,.. from test.user where user_id=1" \\--number-of-queries=10000000 --concurrency=30 --host=xxx -uroot
Using the following command, we soon learned that InnoDB's QPS is about 100,000, almost only 1/4 of Memcached.
[matsunobu@host ~]$ mysqladmin extended-status -i 1 -r -uroot \\| grep -e "Com_select"
...| Com_select | 107069 || Com_select | 108873 || Com_select | 108921 || Com_select | 109511 || Com_select | 108084 || Com_select | 108483 || Com_select | 108115 |...
It seems that 100,000 + QPS is not too bad, but why is MySQL so much worse than Memcached? What is MySQL doing. According to the statistics of vmstat, the data of % user and % system is very high.
[matsunobu@host ~]$ vmstat 1
r b swpd free buff cache in cs us sy id wa st23 0 0 963004 224216 29937708 58242 163470 59 28 12 0 024 0 0 963312 224216 29937708 57725 164855 59 28 13 0 019 0 0 963232 224216 29937708 58127 164196 60 28 12 0 016 0 0 963260 224216 29937708 58021 165275 60 28 12 0 020 0 0 963308 224216 29937708 57865 165041 60 28 12 0 0
Looking at the Oprofile output, we can see the CPU consumption:
samples % app name symbol name259130 4.5199 mysqld MYSQLparse(void*)196841 3.4334 mysqld my_pthread_fastmutex_lock106439 1.8566 libc-2.5.so _int_malloc94583 1.6498 bnx2 /bnx284550 1.4748 ha_innodb_plugin.so.0.0.0 ut_delay67945 1.1851 mysqld _ZL20make_join_statisticsP4JOINP10TABLE_LISTP4ItemP16st_dynamic_array63435 1.1065 mysqld JOIN::optimize()55825 0.9737 vmlinux wakeup_stack_begin55054 0.9603 mysqld MYSQLlex(void*, void*)50833 0.8867 libpthread-2.5.so pthread_mutex_trylock49602 0.8652 ha_innodb_plugin.so.0.0.0 row_search_for_mysql47518 0.8288 libc-2.5.so memcpy46957 0.8190 vmlinux .text.elf_core_dump46499 0.8111 libc-2.5.so malloc
In the SQL parsing phase of MySQL, MYSQLparse () and MYSQLlex () are called; in the query optimization phase, make_join_statistics () and JOIN: optimize () are called (). Obviously, the main resource consumption is the SQL layer, rather than the InnoDB Storage layer. Compared with Memcached/NoSQL, MySQL requires additional work:
Parsing SQL statements Parsing SQL.
Opening and locking tables open and lock the table.
Making SQL execution plans SQL execution Plan.
Unlocking and closing tables unlock and close the table.In addition, MySQL must implement a large amount of concurrency control. For example, when sending/receiving network data packets, fcntl () will be called many times; Global mutexes such as LOCK_open, LOCK_thread_count is also frequently acquired/released. Therefore, my_pthread_fastmutex_lock () is the second place in the output of Oprofile (). And % system occupies a very high CPU (28% ).
In fact, the MySQL development team and external development teams have realized the impact of a large number of concurrency control on performance, and some problems have been solved in MySQL5.5. In the future, MySQL versions will get better and better.
Another big problem is that % user has reached 60%. Competition for mutex leads to an increase in % system, rather than % user. Even if the problem about mutex in MySQL is fixed, it is still difficult to reach the expected 300,000 QPS. some people may mention HANDLER, but because opening/closing table is still necessary during SQL parsing, it can only help to increase throughput.
If only a small amount of data enters the memory, the consumption caused by SQL is negligible. It is very simple because the I/0 operation on the disk consumes a lot. In this case, you do not need to think too much about the consumption of SQL.
However, in most hot MySQL servers, most of the data is loaded into the memory and changed only by the CPU limit. The results of Profiling are similar to those described above: the SQL layer consumes a lot of resources. Assume that a large number of PK queries are required (I. e. SELECT x FROM t WHERE id = ?) Or perform a LIMIT Range Query, even if 70-80% of them are all performing PK queries in the same table (only the values given in the query conditions are different, that is, the values are different ), mySQL still needs to do parse/open/lock/unlock/close each time, which has a great impact on efficiency.
Is there a good way to reduce the CPU resources/competition at the MySQL SQL layer? If you use MySQL Cluster, NDBAPI is a good solution. When I was a consultant for MySQL/Sun/Oracle, I have seen many customers feel very uncomfortable with SQL Node + NDB performance, but when they use the NDBAPI client, it is found that performance tuning has increased by N times. Of course, NDBAPI and SQL can be used in MySQL Cluster at the same time, but NDBAPI is recommended for frequent access modes, SQL + MySQL + NDB is used when ad-hoc or query is not frequent.
This is what we need to access the API at a fast speed, but we also want to use SQL in the case of ad_hoc or complex queries. for other web Services, DeNA uses InnoDB and converts it to NDB. This is not an easy task, because the built-in InnoDB does not support SQL or network-layer services.
The best way is to implement a NoSQL network service in the form of MySQL plugin in MySQL. It listens to a port to receive communication using the NoSQL protocol/API, and then directly accesses InnoDB through the storage engine API in MySQL. This method is similar to NDBAPI, but it can communicate with InnoDB.
This idea was first proposed by Kazuho Oku at Cybozu Labs last year. He wrote a MyCached UDF using Memcached protocols communication. My college student has implemented another plug-in, HandlerSocket,
It exists in the form of MySQL daemaon plugin, so MySQL can be used as NoSQL in applications. its biggest function is to achieve interaction with storage engines, such as InnoDB, without any SQL overhead. when accessing the MySQL table, of course, she also needs open/close table, but it does not go to open/close table every time, because it saves the previously accessed table for use, opening/closing tables is the most resource-consuming and easily leads to competition for mutex, it is very effective for improving performance. When the traffic changes, HandlerSocket will close tables, so it will not block administrative commands (DDL ).What is the difference between it and MySQL + Memcached? The differences between figure 1 and figure 2 are compared. Figure 2 shows the typical use of MySQL + Memecached. Because the get operation of Memcached is much faster than the primary key query in the memory/disk of MySQL, Memcached is used to cache database records. If the query speed of HandlerSocket is comparable to that of Memcached, we do not need to use Memcached to cache records.
For example, assume that there is a user table that uses user_id to obtain user information:
CREATE TABLE user(user_id INT UNSIGNED PRIMARY KEY,user_name VARCHAR(50),user_email VARCHAR(255),created DATETIME)ENGINE=InnoDB;
Use the SELECT statement to obtain user information
mysql> SELECT user_name, user_email, created FROM user WHERE user_id=101;+---------------+-----------------------+---------------------+| user_name | user_email | created| +---------------+-----------------------+---------------------+| Yukari Takeba | yukari.takeba@dena.jp | 2010-02-03 11:22:33| +---------------+-----------------------+---------------------+1 row in set (0.00 sec)
Next let's take a look at how to use HandlerSocket to accomplish the same thing.
Install HandlerSocketFor detailed installation steps of HandlerSocket, see here. The basic steps are as follows:
1 download HandlerSocket
2. Compile HandlerSocket (client and server)[root@localhost handlersocket]# ./configure --with-mysql-source=mysql-source-dir --with-mysql-bindir=mysql-server-bin-dir[root@localhost handlersocket]# make[root@localhost handlersocket]# make install
3. Install HandlerSocketmysql> INSTALL PLUGIN \'handlersocket\' soname \'handlersocket.so\';
Because HandlerSocket is a MySQL plug-in, you can use it like other plug-ins, such as InnoDB, Q4M, and Spider, that is, you do not need to modify the MySQL source code. MySQL is preferably 5.1 or later, mySQL source code and MySQL library are required for compiling HandlerSocket.
Write the HandlerSocket client codeCurrently, the client libraries for Calling C ++ and perl are provided. The following is the sample code called using perl:
#!/usr/bin/perl use strict;use warnings;use Net::HandlerSocket; #1. establishing a connectionmy $args = { host => \'ip_to_remote_host\', port => 9998 };my $hs = new Net::HandlerSocket($args); #2. initializing an index so that we can use in main logics.# MySQL tables will be opened here (if not opened)my $res = $hs->open_index(0, \'test\', \'user\', \'PRIMARY\', \'user_name,user_email,created\');die $hs->get_error() if $res != 0; #3. main logic#fetching rows by id#execute_single (index id, cond, cond value, max rows, offset)$res = $hs->execute_single(0, \'=\', [ \'101\' ], 1, 0);die $hs->get_error() if $res->[0] != 0;shift(@$res);for (my $row = 0; $row < 1; ++$row) { my $user_name= $res->[$row + 0]; my $user_email= $res->[$row + 1]; my $created= $res->[$row + 2]; print "$user_name\\t$user_email\\t$created\\n";} #4. closing the connection$hs->close()
#!/usr/bin/perl use strict;use warnings;use Net::HandlerSocket; #1. establishing a connectionmy $args = { host => \'ip_to_remote_host\', port => 9998 };my $hs = new Net::HandlerSocket($args); #2. initializing an index so that we can use in main logics.# MySQL tables will be opened here (if not opened)my $res = $hs->open_index(0, \'test\', \'user\', \'PRIMARY\', \'user_name,user_email,created\');die $hs->get_error() if $res != 0; #3. main logic#fetching rows by id#execute_single (index id, cond, cond value, max rows, offset)$res = $hs->execute_single(0, \'=\', [ \'101\' ], 1, 0);die $hs->get_error() if $res->[0] != 0;shift(@$res);for (my $row = 0; $row < 1; ++$row) { my $user_name= $res->[$row + 0]; my $user_email= $res->[$row + 1]; my $created= $res->[$row + 2]; print "$user_name\\t$user_email\\t$created\\n";} #4. closing the connection$hs->close();
The above Code uses the user_id = 101 condition to obtain the user_name, user_email, and created information of the user table. The obtained results should be the same as those previously queried on the MySQL client.
[matsunobu@host ~]$ perl sample.plYukari Takeba yukari.takeba@dena.jp 2010-02-03 11:22:33
For most Web applications, maintaining a lightweight HandlerSocket connection is a good practice (persistent connection), allowing a large number of requests to concentrate on the main logic (part #3 in the code above ).
The HandlerSocket protocol is a small text-based protocol similar to the Memcached text protocol. You can use telnet to obtain data through HandlerSocket.
[matsunobu@host ~]$ telnet 192.168.1.2 9998Trying 192.168.1.2...Connected to xxx.dena.jp (192.168.1.2).Escape character is \'^]\'.P 0 test user PRIMARY user_name,user_email,created0 10 = 1 1010 3 Yukari Takeba yukari.takeba@dena.jp 2010-02-03 11:22:33(Green lines are request packets, fields must be separated by TAB)
Green indicates the request data packet. fields must be separated by the Tab key.
Now it is time to display the benchmark test results. The user table above is used to test the number of times the primary key query operation is performed from a multi-thread remote client, all user data is loaded into the memory (I tested 1 million rows), and Memcached is tested with similar data (I used libMemcached and Memcached_get () to retrieve user data ), in the MySQL SQL test, I used the traditional SELECT statement: "SELECT user_name, user_email, created FROM user WHERE user_id =? ", Both Memcached and HandlerSocket client code are written in C/C ++. All client programs are located on the remote host and connected to MySQL/Memcached through TCP/IP. The maximum throughput is as follows:
approx qps server CPU utilMySQL via SQL 105,000 %us 60% %sy 28%Memcached 420,000 %us 8% %sy 88%MySQL via HandlerSocket 750,000 %us 45% %sy 53%
The throughput of HandlerSocket is 7.5 higher than that of traditional SQL, and % us is only 3/4 of that of traditional SQL. This shows that the SQL layer of MySQL is very resource-consuming, if you can skip this layer, the performance will be greatly improved. Interestingly, MySQL uses HandlerSocket 178% faster than Memcached, and Memcached consumes more % sy resources. Therefore, although Memcached is a good product, there is still room for optimization.
The following is the output content of oprofile, which is collected during the MySQL HandlerSocket test. In core operations, such as network packet processing, CPU resource consumption for data retrieval (bnx2 is a network device driver ).
samples % app name symbol name984785 5.9118 bnx2 /bnx2847486 5.0876 ha_innodb_plugin.so.0.0.0 ut_delay545303 3.2735 ha_innodb_plugin.so.0.0.0 btr_search_guess_on_hash317570 1.9064 ha_innodb_plugin.so.0.0.0 row_search_for_mysql298271 1.7906 vmlinux tcp_ack291739 1.7513 libc-2.5.so vfprintf264704 1.5891 vmlinux .text.super_90_sync 248546 1.4921 vmlinux blk_recount_segments244474 1.4676 libc-2.5.so _int_malloc226738 1.3611 ha_innodb_plugin.so.0.0.0 _ZL14build_template P19row_prebuilt_structP3THDP8st_tablej206057 1.2370 HandlerSocket.so dena::hstcpsvr_worker::run_one_ep()183330 1.1006 ha_innodb_plugin.so.0.0.0 mutex_spin_wait175738 1.0550 HandlerSocket.so dena::dbcontext:: cmd_find_internal(dena::dbcallback_i&, dena::prep_stmt const&, ha_rkey_function, dena::cmd_exec_args const&)169967 1.0203 ha_innodb_plugin.so.0.0.0 buf_page_get_known_nowait165337 0.9925 libc-2.5.so memcpy149611 0.8981 ha_innodb_plugin.so.0.0.0 row_sel_store_mysql_rec148967 0.8943 vmlinux generic_make_request
Because HandlerSocket runs inside MySQL and interacts directly with InnoDB, common SQL commands such as SHOW GLOBAL STATUS can be used to obtain statistics, it is worth noting that Innodb_rows_read has reached 750000 +.
$ mysqladmin extended-status -uroot -i 1 -r | grep "InnoDB_rows_read"...| Innodb_rows_read | 750192 || Innodb_rows_read | 751510 || Innodb_rows_read | 757558 || Innodb_rows_read | 747060 || Innodb_rows_read | 748474 || Innodb_rows_read | 759344 || Innodb_rows_read | 753081 || Innodb_rows_read | 754375 |...
The detailed information of the testing machine is as follows:
Model |
Dell PowerEdge R710 |
CPU |
Nehalem 8-core, E5540@2.53GHz |
Memory |
32 GB (all data is loaded into the buffer pool) |
MySQL |
5.1.50 InnoDB |
Memcached/libMemcached |
1.4.5 (Memcached), 0.44 (libMemcached) |
Network |
Boradcom NetXtreme II BCM5709 1000Base-T (four ports are built in, three of which are used) |
Memcached and HandlerSocket both impose network I/O restrictions. When I test a single port, the QPS of HandlerSocket is 260000, while that of Memcached is 220000.
As described below, HandlerSocket has its own characteristics and advantages, and some of them are really awesome for us.
Supports multiple query modesHandlerSocket currently supports primary key/uniqueness query, non-uniqueness index query, range scan, LIMIT and INSERT/UPDATE/DELETE, but does not support operations without any indexes. In addition, multi_get () (similar to in (1, 2, 3), only one round-trip network) can also obtain multiple rows of data. You can query the details here.
Process a large number of concurrent connectionsHandlerSocket connections are lightweight, because HandlerSocket uses epoll () and worker-thread/thread-pooling architectures, while the number of internal MySQL threads is limited (My. cnfHandlersocket_threads Parameter Control in), so even if tens of millions of network connections are established to the HandlerSocket, its stability will not be affected (it consumes too much memory, it will cause huge mutual exclusion competition and other problems, for example, bug #26590, bug #33948, bug #49169 ).
And its excellent performanceHandlerSocket, as described above, is not inferior to other NoSQL lineups. As a matter of fact, I have never seen any NoSQL product that can query more than 750000 times on a common server. It not only does not call SQL-related functions, but also optimizes network/concurrency problems.
Smaller network packetsCompared with the traditional MySQL protocol, the HandlerSocket protocol is shorter, so the network traffic is smaller.
Limited number of internal MySQL threadsRefer to the above content.
Group client requestsWhen a large number of concurrent requests arrive at HandlerSocket, each worker thread aggregates as many requests as possible and then executes the aggregated requests and returned results at the same time. In this way, the performance is greatly improved by sacrificing a little response time. For example, you can get the following benefits. If someone is interested, I will explain them in depth in future articles.
Reduces the number of fsync () calls.
Reduces replication latency.
No repeated CacheWhen Memcached is used to cache MySQL/InnoDB records, these records are cached in both Memcached and InnoD B buffer pools, so the efficiency is very low (the memory is still very expensive ). the HandlerSocket plug-in is used because it accesses the InnoDB Storage engine and records are cached in the InnoDB buffer pool. In this way, other SQL statements can reuse it.
No data inconsistencyBecause data is stored only in one place (in InnoDB), unlike Memcached, Data Consistency needs to be checked between Memcached and MySQL.
Crash SecurityThe backend storage is InnoDB, Which is transactional and crash-safe, even if innodb-flush-log-at-trx-commit is set! = 1. When the server crashes, only data in less than 1 s will be lost.
SQL can be used from the MySQL clientIn many cases, we still want to use SQL (such as production abstract reports), which is why we cannot use embedded InnoDB. Most NoSQL products do not support SQL interfaces, handlerSocket is just a MySQL plug-in that can send SQL statements from the MySQL client. However, HandlerSocket is recommended when high throughput is required.
Benefit from MySQLBecause HandlerSocket runs inside MySQL, all MySQL operations, such as SQL, online backup, replication, and monitoring through Nagios/EnterpriseMonitor are supported. HandlerSocket can be monitored through common MySQL commands, such as show global stauts, show engine innodb status, and show processlist.
No need to modify/recreate MySQLBecause HandlerSocket is a plug-in, it supports MySQL Community edition and Enterprise Server Edition, and can be used without any modifications to MySQL.
Independent from the storage engineAlthough we only tested the 5.1 and 5.5 InnoDB plug-ins, HandlerSocket can interact with any storage engine.
Need to learn HandlerSocket APIAlthough it is easy to use, we still need to learn how to interact with HandlerSocket. We provide C ++ API and Perl binding.
No security featuresSimilar to other NoSQL databases, HandlerSocket does not support security. The working thread of HandlerSocket runs with the privileges of system users. Therefore, applications can access all tables through the HandlerSocket protocol. Of course, you can use a firewall to filter data packets like other NoSQL products.
No advantage for HDD binding workloadsFor hdd I/O binding workloads, the database cannot perform thousands of queries per second, usually only 1-10% of CPU utilization. In this case, the SQL Execution layer will not become a bottleneck, therefore, HandlerSocket has no advantages. We only use HandlerSocket on servers where data is fully loaded into the memory.
We have used the HandlerSocket plug-in the production environment, and the effect is very obvious, because we have reduced a lot of Memcached and MySQL slave servers, and the overall network traffic is also decreasing. No performance problems (such as slow response time and latency) have been found ).
In my opinion, the NoSQL/Database community has completely underestimated MySQL. Compared with other products, it has a long history, and so far, my excellent former colleagues have also made many unique and great improvements. From the NDBAPI, we can see that MySQL has the potential to become NoSQL. Because the storage engine API and daemon interface are completely independent, it is possible for Akira and DeNA to develop HandlerSocket. As a former MySQL employee and a long-term understanding of MySQL, I want to see MySQL become better and more popular, not just as an RDBMS, but also as a member of the NoSQL camp.
I think the article is useful? Now: with friendsTotal learning progress!
This article draws on the http://jishu.zol.com.cn/205307.html
Copyright Disclaimer: you are welcome to reprint it. I hope you can add the original article address while reprinting it. Thank you for your cooperation.