Mysql database/table sharding Performance Optimization Analysis

Source: Internet
Author: User
Tags database sharding
When mysql is used with a massive data volume of tens of millions, we will mostly consider Table sharding queries and database sharding queries. Next we will introduce the optimization of table sharding queries.

When mysql is used with a massive data volume of tens of millions, we will mostly consider Table sharding queries and database sharding queries. Next we will introduce the optimization of table sharding queries.

There are many unequal situations in our project. This article briefly analyzes how to optimize it.
The table sharding logic here is based on the number of user_name groups in the t_group table.
In this case, the index on the user_name field is a bad index. It does not have any obvious effect.

1. test PROCEDURE.

The Code is as follows:

DELIMITER $
Drop procedure't _ girl '. 'SP _ split_table' $
Create procedure't _ girl '. 'SP _ split_table '()
BEGIN
Declare done int default 0;
Declare v_user_name varchar (20) default '';
Declare v_table_name varchar (64) default '';
-- Get all users 'name.
Declare cur1 cursor for select user_name from t_group group by user_name;
-- Deal with error or warnings.
Declare continue handler for 1329 set done = 1;
-- Open cursor.
Open cur1;
While done <> 1
Do
Fetch cur1 into v_user_name;
If not done then
-- Get table name.
Set v_table_name = concat ('t_group _ ', v_user_name );
-- Create new extra table.
Set @ stmt = concat ('create table', v_table_name, 'like t_group ');
Prepare s1 from @ stmt;
Execute s1;
Drop prepare s1;
-- Load data into it.
Set @ stmt = concat ('insert into', v_table_name, 'select * from t_group where user_name = ''', v_user_name ,'''');
Prepare s1 from @ stmt;
Execute s1;
Drop prepare s1;
End if;
End while;
-- Close cursor.
Close cur1;
-- Free variable from memory.
Set @ stmt = NULL;
END $

DELIMITER;

2. test table.
We use a table with 10 million records for testing.

The Code is as follows:

Mysql> select count (*) from t_group;
+ ---------- +
| Count (*) |
+ ---------- +
| 1, 10388608 |
+ ---------- +
1 row in set (0.00 sec)

Table structure.
Mysql> desc t_group;
+ ------------- + ------------------ + ------ + ----- + ------------------- + ---------------- +
| Field | Type | Null | Key | Default | Extra |
+ ------------- + ------------------ + ------ + ----- + ------------------- + ---------------- +
| Id | int (10) unsigned | NO | PRI | NULL | auto_increment |
| Money | decimal () | NO |
| User_name | varchar (20) | NO | MUL |
| Create_time | timestamp | NO | CURRENT_TIMESTAMP |
+ ------------- + ------------------ + ------ + ----- + ------------------- + ---------------- +
4 rows in set (0.00 sec)

Index information.

Mysql> show index from t_group;
+ --------- + ------------ + ------------------ + -------------- + ------------- + ----------- + ------------- + ---------- + -------- + ------------ + ----------- +
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+ --------- + ------------ + ------------------ + -------------- + ------------- + ----------- + ------------- + ---------- + -------- + ------------ + ----------- +
| T_group | 0 | PRIMARY | 1 | id | A | 10388608 | NULL | BTREE |
| T_group | 1 | idx_user_name | 1 | user_name | A | 8 | NULL | BTREE |
| T_group | 1 | idx_combination1 | 1 | user_name | A | 8 | NULL | BTREE |
| T_group | 1 | idx_combination1 | 2 | money | A | 3776 | NULL | BTREE |
+ --------- + ------------ + ------------------ + -------------- + ------------- + ----------- + ------------- + ---------- + -------- + ------------ + ----------- +
4 rows in set (0.00 sec)

PS:
Idx_combination1 is a required index because user_name must be set to group. This is a loose index scan! Of course, you can kill her.
The index idx_user_name is used to accelerate the execution of queries of the constant type.
Table sharding is performed based on the user name.

The Code is as follows:

Mysql> select user_name from t_group where 1 group by user_name;
+ ----------- +
| User_name |
+ ----------- +
| David |
| Leo |
| Livia |
| Lucy |
| Sarah |
| Simon |
| Sony |
| Sunny |
+ ----------- +
8 rows in set (0.00 sec)

So the result table should be like this.

The Code is as follows:
Mysql> show tables like 't_ group _ % ';
+ ------------------------------ +
| Tables_in_t_girl (t_group _ %) |
+ ------------------------------ +
| T_group_david |
| T_group_leo |
| T_group_livia |
| T_group_lucy |
| T_group_sarah |
| T_group_simon |
| T_group_sony |
| T_group_sunny |
+ ------------------------------ +
8 rows in set (0.00 sec)

3. Comparison results.

The Code is as follows:


Mysql> select count (*) from t_group where user_name = 'David ';
+ ---------- +
| Count (*) |
+ ---------- +
| 1, 1298576 |
+ ---------- +
1 row in set (1.71 sec)

It took nearly 2 seconds.

Mysql> select count (*) from t_group_david;
+ ---------- +
| Count (*) |
+ ---------- +
| 1, 1298576 |
+ ---------- +
1 row in set (0.00 sec)
Almost instantly.

Mysql> select count (*) from t_group where user_name <> 'David ';
+ ---------- +
| Count (*) |
+ ---------- +
| 1, 9090032 |
+ ---------- +
1 row in set (9.26 sec)
After nearly 10 seconds of execution, we can imagine that this is intolerable in actual projects.
Mysql> select (select count (*) from t_group)-(select count (*) from t_group_david) as total;
+ --------- +
| Total |
+ --------- +
| 1, 9090032 |
+ --------- +
1 row in set (0.00 sec)
Almost instantly.


Let's look at the aggregate function.
Operations on the original table.

The Code is as follows:
Mysql> select min (money), max (money) from t_group where user_name = 'David ';
+ ------------ +
| Min (money) | max (money) |
+ ------------ +
|-6.41 | 500.59 |
+ ------------ +
1 row in set (0.00 sec)
Minimum. The maximum value is full index scan. So it is instantaneous.
Mysql> select sum (money), avg (money) from t_group where user_name = 'David ';
+ -------------- + ------------ +
| Sum (money) | avg (money) |
+ -------------- + ------------ +
| 319992383.84/246.417910 |
+ -------------- + ------------ +
1 row in set (2.15 sec)

The results of other Aggregate functions are not full index scan. It takes 2.15 seconds.

Small table operations.

The Code is as follows:
Mysql> select min (money), max (money) from t_group_david;
+ ------------ +
| Min (money) | max (money) |
+ ------------ +
|-6.41 | 500.59 |
+ ------------ +
1 row in set (1.50 sec)

The maximum and minimum values are full table scan, which takes 1.50 seconds and is not cost-effective. In this case.

The Code is as follows:
Mysql> select sum (money), avg (money) from t_group_david;
+ -------------- + ------------ +
| Sum (money) | avg (money) |
+ -------------- + ------------ +
| 319992383.84/246.417910 |
+ -------------- + ------------ +
1 row in set (1.68 sec)

It took two seconds to get these two results.

Let's take a look at the structure of this small table.

The Code is as follows:
Mysql> desc t_group_david;
+ ------------- + ------------------ + ------ + ----- + ------------------- + ---------------- +
| Field | Type | Null | Key | Default | Extra |
+ ------------- + ------------------ + ------ + ----- + ------------------- + ---------------- +
| Id | int (10) unsigned | NO | PRI | NULL | auto_increment |
| Money | decimal () | NO |
| User_name | varchar (20) | NO | MUL |
| Create_time | timestamp | NO | CURRENT_TIMESTAMP |
+ ------------- + ------------------ + ------ + ----- + ------------------- + ---------------- +
4 rows in set (0.00 sec)

Obviously, the user_name attribute is redundant. Then it will be killed.

The Code is as follows:
Mysql> alter table t_group_david drop user_name;
Query OK, 1298576 rows affected (7.58 sec)
Records: 1298576 Duplicates: 0 Warnings: 0

Now run the query on the small table again.

The Code is as follows:

Mysql> select min (money), max (money) from t_group_david;
+ ------------ +
| Min (money) | max (money) |
+ ------------ +
|-6.41 | 500.59 |
+ ------------ +
1 row in set (0.00 sec)

This is an instant.
Mysql> select sum (money), avg (money) from t_group_david;
+ -------------- + ------------ +
| Sum (money) | avg (money) |
+ -------------- + ------------ +
| 319992383.84/246.417910 |
+ -------------- + ------------ +
1 row in set (0.94 sec)

This time, it is within one second.

Mysql> Aborted

Summary: the smaller the table to be split, the better. Do it boldly.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.