Table_rows query optimization and table_rows Optimization

Source: Internet
Author: User

Table_rows query optimization and table_rows Optimization

In daily application O & M, Dev or db must calculate the number of rows in the table to serve as an information reference for the application or maintenance. Many may ignore the impact of SQL statements similar to select count (*) from table_name on database performance, but you can see that the execution has been performed thousands of times on the slow log platform, do you still remain indifferent when you execute a query for about 4 seconds? As a dba with the courage to challenge, you should say no. I think similar requirements are inevitable, but they should not be factors that affect database performance, even if this is not the case, what else can the company expect you to do. Based on the needs of the query, I divided the query into fuzzy query and precise query. The following three methods can be used to select the best choice. The following is an online log table with a minimum expression of about 6 GB.

1. The exact query knows the number of rows in the table. In this case, we need to use the count () function to calculate the number of rows in the table. In the innodb Storage engine, the count (*) function first reads data from the table in the memory to the memory buffer, and then scans the entire table to obtain the number of records. However, this method is too simple and violent. It is suitable for small table queries and does not apply to frequent large table queries. Especially in production, when the tables are large and there are no other non-clustered indexes (secondary indexes) in addition to clustered indexes (primary key indexes), it is undoubtedly a huge disaster.

mysql> select count(*) from operation_log;+----------+| count(*) |+----------+| 21049180 |+----------+1 row in set (10.92 sec)mysql> explain select count(*) from rule_ceshi.operation_log;+----+-------------+---------------+-------+---------------+----------+---------+------+----------+-------------+| id | select_type | table         | type  | possible_keys | key      | key_len | ref  | rows     | Extra       |+----+-------------+---------------+-------+---------------+----------+---------+------+----------+-------------+|  1 | SIMPLE      | operation_log | index | NULL          | user_key | 194     | NULL | 20660338 | Using index |+----+-------------+---------------+-------+---------------+----------+---------+------+----------+-------------+1 row in set (0.00 sec)mysql> show index from rule_ceshi.operation_log;+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+| Table         | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+| operation_log |          0 | PRIMARY  |            1 | id          | A         |    20660338 |     NULL | NULL   |      | BTREE      |         |               || operation_log |          1 | user_key |            1 | user_key    | A         |     2951476 |     NULL | NULL   |      | BTREE      |         |               |+---------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+2 rows in set (0.00 sec)

mysql> drop index user_key on rule_ceshi.operation_log;
Query OK, 0 rows affected (0.19 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> select count(*) from rule_ceshi.operation_log;
+----------+
| count(*) |
+----------+
| 21049180 |
+----------+
1 row in set (23.39 sec)

The above test results show that count (*) uses clustered indexes and non-clustered indexes for full index scanning, but the non-clustered index is faster than the clustered index to obtain the number of records. Why? We generally do not mean that the primary key index is the fastest. Is this principle not applicable here or there is a bug in the optimizer. When I raised this question, I also had such doubts. After several times of reading this question and FQ, I finally told you that the primary key index is indeed the fastest, only the primary key index query has the precondition. For the conditions, please refer to my next article on how count (*) performs Index analysis.

 

2. The preceding method is acceptable for a single query on a physical machine with sufficient configurations. However, the appearance of similar SQL many times is not necessary for database performance, because it does not have a deep significance for business development. We know that there is no way to optimize SQL statements such as select count (*) from table_name through indexes. Therefore, we can only optimize the SQL statements by rewriting them, this is also an essential skill to be proficient in SQL optimization.

If you want to precisely query the number of rows in the table and query the time as short as possible, we need to think of the max () and min () functions, generally, the maximum and minimum values are returned quickly.

 1 mysql> select ifnull(max(id),0)-ifnull(min(id),0)+1 as rows from rule_ceshi.operation_log; 2 +----------+ 3 | rows     | 4 +----------+ 5 | 21124162 | 6 +----------+ 7 1 row in set (0.02 sec) 8  9 mysql> explain select ifnull(max(id),0)-ifnull(min(id),0)+1 as rows from rule_ceshi.operation_log;10 +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+11 | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |12 +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+13 |  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |14 +----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+15 1 row in set (0.01 sec)

Of course, the premise of using this optimization rewrite is that your on-board table has a primary key and an integer type, and the primary key must be continuous, that is, you have not performed the delete from table where xxx = xxx row record operation in the production. Otherwise, the statistics are still inaccurate.

 

3. We know that MySQL comes with a statistical information. Our show commands and other statistical tables are usually from the database. What if Dev tells us that we only need to perform a fuzzy query to know the number of data rows in the table? In this case, you can use the statistical information of the information_schema.tables table that comes with MySQL to preliminarily determine the data row size of the table.

mysql> select table_schema,table_name,table_type,table_rows from information_schema.tables where table_schema='rule_ceshi' and table_name='operation_log';+--------------+---------------+------------+------------+| table_schema | table_name    | table_type | table_rows |+--------------+---------------+------------+------------+| rule_ceshi   | operation_log | BASE TABLE |   20660338 |+--------------+---------------+------------+------------+1 row in set (0.01 sec)mysql> explain select table_schema,table_name,table_type,table_rows from information_schema.tables where table_schema='rule_ceshi' and table_name='operation_log';+----+-------------+--------+------+---------------+-------------------------+---------+------+------+---------------------------------------------------+| id | select_type | table  | type | possible_keys | key                     | key_len | ref  | rows | Extra                                             |+----+-------------+--------+------+---------------+-------------------------+---------+------+------+---------------------------------------------------+|  1 | SIMPLE      | tables | ALL  | NULL          | TABLE_SCHEMA,TABLE_NAME | NULL    | NULL | NULL | Using where; Open_full_table; Scanned 0 databases |+----+-------------+--------+------+---------------+-------------------------+---------+------+------+---------------------------------------------------+1 row in set (0.00 sec)

This method is useful for DBAs daily maintenance and determining the number of rows in a table.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.