Mysql Big Data Query Optimization experience (recommended) and mysql Experience Sharing

Source: Internet
Author: User

Mysql Big Data Query Optimization experience (recommended) and mysql Experience Sharing

Mysql optimized!

Mysql has a small amount of data and is not necessary for optimization. It has a large amount of data and is indispensable for optimization. If a query is not optimized for 10 seconds, the query can also be properly optimized for 10 milliseconds.

How painful it is!

Mysql optimization: For programmers, index optimization and where condition optimization.

Lab environment: MacBook Pro MJLQ2CH/A, mysql5.7, data volume: 2.12 million +

ONE:

 select * from article INNER JOIN ( SELECT id FROM article WHERE  length(content_url) > 0 and  (select status from source where id = article.source_id)=1 and  (select status from category where id = article.category_id)=1 and  status = 1 and id < 2164931 order by stick desc,pub_time desc limit 240,15 ) AS tUSING(id);

At first glance, the boss will definitely want to kill me. There is nothing to do with self-association, or inner join. On the XX floor, I took my pig knife. I want to slaughter the blogger !!!

To be honest, my head was not crowded when I went out in the morning, and I didn't want to do that either.

1. If the data volume is large, you need to perform paging queries with a large offset. The reason is that you must overwrite the entire table with the id in the join subtable to avoid full table scanning.

View my order by (Note: It's not an order by, but TM won't write it). replace this order by with the desc or explain field in your own table. Extra ---> filesort! Shit!

2. For order by of these conditions, we usually add index to the two fields respectively, but Extra ---> filesort will still be used. Another way is to add a joint index for all the conditions after order by. Note that the order must be consistent with your order by order. In this way, only the where clause is left.

Let's take a look.where,(select status from source where id = article.source_id)=1 and ...What about JB!

3. I thought about the join + index method and finally tested it. It is almost the same as this method. The production environment is written in this way, so that's it. There are two fewer indexes (source_id and category_id), and no one else can block the lazy disease. If it suffers a loss, it will come back to continue optimization.

4. this is what I got last night. The order in which the where condition is satisfied is that the last condition is satisfied first, from right to left. After deleting the index test, it is indeed effective, it can be reduced from 6 seconds to 4 seconds. After the index is optimized, we can test again to find that the order has almost negligible impact on time consumption. X ms.

TWO:

Select * from article inner join (SELECT id FROM article where instr (ifnull (title, ''), 'Wolf ')> 0 and status! = 9 order by pub_time desc limit 100,10) AS t USING (id );

Well -- it's an inner join .......

INSTR (ifnull (title, ''), 'Wolf ')> 0, why not like ......

1. Considering that this is a search on the management platform and does not go to the search engine, the search engine only synchronizes data once an hour, and the data is incomplete. When searching, the Administrator only needs the results. like % XX % cannot take the index, which is 5 times lower than instr and tests regexp '. * XX *. ', or a little longer than instr, simply .....

Desc or explain: filesort... add an index to pub_time, or filesort .....

2. There is another solution in this case,SELECT id FROM article force index(pub_time), Specify to use this index. But this writing method is too short of flexibility, OUT! Baidu, some people are confused: create a joint index for status and pub_time (the condition of pub_time_status and order is in front), so that the index will be automatically force when the where query is performed.

THREE:

Select * from article where status! = 9 order by pub_time desc limit 100000,25; desc or explain, or filesort ..... didn't we create a joint index for status and pub_time? tell me why ......

Okay, I don't know. I need to create another joint index for status and pub_time.status_pub_timeThis time, the where condition is in the front, the explicit statement does not have filesort, but the index is not used, and it hooks uppub_time_status. I don't understand.

At the same time, I explained two SQL statements, such:

Neither of the two can be deleted. If one is deleted, the SQL statement will be filesort!

FOUR:

SELECT * from follow where (((SELECT status FROM source WHERE id=follow.source_id)=1 and follow.type=1) or ((select status from topic WHERE id=follow.source_id)=1 and follow.type=2)) AND user_id=10054 ORDER BY sort limit 15,15; SELECT * from follow inner join( SELECT id from follow where (((SELECT status FROM source WHERE id=follow.source_id)=1 and follow.type=1) or ((select status from topic WHERE id=follow.source_id)=1 and follow.type=2)) AND user_id=10054 ORDER BY sort limit 15,15 ) as t using(id); (SELECT id, source_id, user_id, temporary, sort, follow_time, read_time,type from follow where (SELECT status FROM source WHERE id=follow.source_id)=1 and follow.type=1 and user_id=10054) union all (SELECT id, source_id, user_id, temporary, sort, follow_time, read_time,type from follow where (select status from topic WHERE id=follow.source_id)=1 and follow.type=2 and user_id=10054) ORDER BY sort limit 15,15;

Look at these three SQL statements, interesting, right!

For the sake of fairness, I have optimized the index. user_id_sort (user_id, sort) allows the where statement to force the index when user_id is used.

First sentence: 0.48 ms

Sentence 2: 0.42 ms

The third sentence: 6 ms. The reason for the long time is that union (queries two tables and merges them into sub-tables) cannot overwrite the sort of order by with index.

Sometimes union is not necessarily faster than or.

Summary

The above is the mysql Big Data Query Optimization experience shared by xiaobian. I hope it will help you. If you have any questions, please leave a message and I will reply to you in a timely manner. Thank you very much for your support for the help House website!

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.