"Original" How to quickly drop a 2 TB InnoDB table in MySQL

Source: Internet
Author: User
A short comic

First, a short comic to lighten the mood.

OK, to the point. Suppose you have a table named erp and you run the following command directly:

drop table erp

At this point all MySQL processing stalls until the drop finishes, and only then does MySQL resume. The reason is that during drop table, InnoDB holds a global lock, and the lock is released only when the drop completes.
This means that if you drop a big table during the day, when traffic is heavy, without taking any precautions, the whole of MySQL hangs. QPS drops severely for the duration of the drop, and then the product manager invites you for tea. Hence the scene in the comic: you can wait until midnight, in the dead of night, and drop the table then.
Of course, some people will not be convinced and may say: "You could write a stored procedure that drops the table and run it at night when there is no traffic."
I was taken aback, thought about it, and could only say: "Let's not argue. Hear me out on the common industry practice."

A premise

First of all, there is a premise: MySQL has independent tablespaces enabled, which is the default since MySQL 5.6.6.
That is, my.cnf contains the following setting (this is MySQL tuning knowledge that I will introduce later):

innodb_file_per_table = 1

To view the tablespace status, use the following command:

mysql> show variables like '%per_table';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | OFF   |
+-----------------------+-------+

If the value of innodb_file_per_table is OFF, a shared tablespace is in use.
If the value of innodb_file_per_table is ON, independent tablespaces are in use.
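
As a rough illustration of the ON/OFF rule above, the check can be wrapped in a small shell function. This is only a sketch: the function name tablespace_mode is made up for this example, and the input is assumed to be one line of "name<TAB>value", which is the shape the mysql client prints in batch mode. The sample line is hard-coded so the snippet runs without a server.

```shell
# Classify the tablespace mode from one line of "name<TAB>value" on stdin.
# tablespace_mode is a hypothetical helper, not part of MySQL.
tablespace_mode() {
  read -r name value            # default IFS splits on the tab
  case "$value" in
    ON|1)  echo "file-per-table" ;;
    OFF|0) echo "shared" ;;
    *)     echo "unknown" ;;
  esac
}

# Sample input standing in for real server output:
printf 'innodb_file_per_table\tOFF\n' | tablespace_mode
```

Against a real server, the input could come from something like mysql -N -B -e "SHOW VARIABLES LIKE 'innodb_file_per_table'" piped into the function, assuming the client can log in without prompting.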
So you will ask: what is the difference between an independent tablespace and a shared tablespace?
Shared tablespace: all table data and index files of a database are stored in a single file. By default the shared tablespace file lives in the data directory and is named ibdata1 (it can be extended to multiple files). Note that this mode is very inconvenient to operate: with all data in one file, maintaining a single table is awkward. In addition, delete operations leave many gaps in the file, and ibdata1 does not shrink automatically. In other words, with a shared tablespace the space cannot be released after drop table.

Independent tablespace: each table is stored on its own, with one .frm table description file and one .ibd file.
.frm file: stores the metadata of each table, including the definition of the table structure. This file is independent of the storage engine.
.ibd file: stores the data and indexes of each table.
Note that in this mode each table has its own tablespace, which makes it easy to move a single table between databases. In addition, the tablespace is reclaimed automatically when drop table is executed. After delete operations, you can run alter table TableName engine=innodb to defragment the table and reclaim part of the tablespace.

PS: datadir in my.cnf sets the data storage directory.

That was a lot of rambling, but I only want to make one point:

In the vast majority of cases, operations staff will choose independent tablespaces for MySQL, because they are far better in terms of both performance tuning and ease of operation.

So the premise I mentioned at the beginning, that MySQL has independent tablespaces enabled, holds in 90% of cases. If you really do find that your company's MySQL uses a shared tablespace, have a heart-to-heart with your operations team and ask why.

The right approach

Suppose datadir = /data/mysql/ and we have a database named mytest that contains a table named erp. Run the following command:

mysql> system ls -l /data/mysql/mytest/

This gives the following output (filtered):

-rw-r----- 1 mysql mysql          9023  8 18 05:21 erp.frm
-rw-r----- 1 mysql mysql 2356792000512  8 18 05:21 erp.ibd
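
For a sense of scale, the byte count in the listing can be converted to whole GiB with shell integer arithmetic. The variable names are made up for this sketch, and it assumes a shell with 64-bit arithmetic, which is the norm on 64-bit Linux.

```shell
# Convert the size of erp.ibd from bytes to whole GiB (1 GiB = 1024^3 bytes).
size_bytes=2356792000512
size_gib=$(( size_bytes / 1024 / 1024 / 1024 ))
echo "$size_gib"    # prints 2194
```

So the file is a little over 2194 GiB, which is the "2T" in the title.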

The roles of the frm and ibd files were described above. The erp.ibd file is now so large that the drop stalls.
How do we solve this?
You need Linux hard links to delete the file quickly. Let me borrow some content from "Bird's Private Cuisine".
A soft link can be understood by analogy with a shortcut in Windows. I will not say much about it and will focus on hard links.
As for hard links, I will explain them briefly rather than paste a wall of text, which would be tiring to read.
A file that is actually stored on disk has an inode index.

Then there is a file name that points to the inode index above.

A so-called hard link means that more than one file name points to the same inode index; there are as many hard links as there are file names pointing to it.
Suppose a second file name now also points to the inode index above.

Now, when you delete file name (1), Linux detects that file name (2) still points to the inode index, so it does not really delete the file. It only deletes the reference in step (2), and this is very fast because only a reference is removed.

Next, when you delete file name (2), Linux detects that no other file name points to that inode index and deletes the actual stored file. This step removes the real file, so it is slower.
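
The behaviour described above can be tried safely on a small scratch file. The following is a minimal sketch: the temporary directory, the link name erp.ibd.link, and the helper link_count are all made up for the demonstration, and the tiny file only stands in for a real data file.

```shell
# Demonstrate hard-link semantics on a small scratch file.
workdir=$(mktemp -d)
printf 'table data\n' > "$workdir/erp.ibd"       # stand-in for a real data file

# Print the link count (second column of ls -l) of a file.
link_count() {
  ls -l "$1" | awk '{print $2}'
}

ln "$workdir/erp.ibd" "$workdir/erp.ibd.link"    # second name for the same inode
echo "links after ln: $(link_count "$workdir/erp.ibd")"

rm "$workdir/erp.ibd"                            # removes one name only
echo "links after rm: $(link_count "$workdir/erp.ibd.link")"
cat "$workdir/erp.ibd.link"                      # the data is still readable

rm "$workdir/erp.ibd.link"                       # last name removed: file is freed
rmdir "$workdir"
```

The first rm returns immediately regardless of file size because it only drops a name. The real work happens when the last name is removed.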

OK, this is the principle we will use.
First, create a hard link to erp.ibd named erp.ibd.hdlk, using the ln command:

ln /data/mysql/mytest/erp.ibd /data/mysql/mytest/erp.ibd.hdlk

At this point, the directory listing is as follows:

-rw-r----- 1 mysql mysql          9023  8 18 05:21 erp.frm
-rw-r----- 2 mysql mysql 2356792000512  8 18 05:21 erp.ibd
-rw-r----- 2 mysql mysql 2356792000512

You will find that there is one more file, erp.ibd.hdlk, and that the link count (the second column) of both erp.ibd and erp.ibd.hdlk is now 2.
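
Before dropping the table it may be worth confirming that the hard link really exists, since without it the drop would be slow again. A minimal sketch of such a guard follows. The function name same_file is hypothetical, the -ef test is a widely supported shell extension rather than strict POSIX, and the demonstration uses scratch files instead of the real data directory.

```shell
# Succeed only if both paths exist and are names for the same file (same inode).
# same_file is a hypothetical helper for this sketch.
same_file() {
  [ "$1" -ef "$2" ]
}

demo=$(mktemp -d)
: > "$demo/erp.ibd"                              # empty stand-in file
ln "$demo/erp.ibd" "$demo/erp.ibd.hdlk"

if same_file "$demo/erp.ibd" "$demo/erp.ibd.hdlk"; then
  echo "hard link in place"
else
  echo "hard link missing" >&2
fi
rm -r "$demo"
```

On the real server the two arguments would be the paths of erp.ibd and erp.ibd.hdlk under the data directory.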
Now run drop table:

mysql> drop table erp;
Query OK, 0 rows affected (0.99 sec)

You will notice that the drop finishes in less than 1 second. This is because two file names (erp.ibd and erp.ibd.hdlk) point to the same inode, so the delete only removes a reference and is very fast.
At this point the table has been removed from MySQL, but the disk space has not been released, because one file, erp.ibd.hdlk, remains.
How do we delete erp.ibd.hdlk correctly?
If you have no experience, you will answer: use the rm command. Note, however, that in a production environment, deleting a large file directly with rm causes a spike in disk I/O and a high CPU load, which affects other running programs.
In such cases, use the truncate command, which is part of the coreutils toolset, to delete the file.
For details you can search Baidu. Some people have specifically tested rm against truncate and found that the truncate command has almost no impact on disk I/O or CPU load.
The deletion script is as follows:

TRUNCATE=/usr/local/bin/truncate
for i in `seq 2194 -10 10`; do
  sleep 2
  $TRUNCATE -s ${i}G /data/mysql/mytest/erp.ibd.hdlk
done
rm -rf /data/mysql/mytest/erp.ibd.hdlk;

Starting at 2194G, the script shrinks the file by 10G at a time and pauses for 2 seconds between steps. It continues until only about 10G is left, and then uses the rm command to delete the remainder.
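
The same idea can be written as a reusable function and tried on a small scratch file first. This is only a sketch under stated assumptions: the function name shrink_and_remove and its parameters are made up here, sizes are in bytes rather than G so the demo stays tiny, and it relies on the GNU coreutils truncate and stat commands being on the PATH.

```shell
# Shrink FILE by STEP bytes at a time, pausing PAUSE seconds between steps,
# then remove what is left. Hypothetical helper, not the original script.
shrink_and_remove() {
  file=$1
  step=$2
  pause=$3
  size=$(stat -c %s "$file")          # current size in bytes (GNU stat)
  while [ "$size" -gt "$step" ]; do
    size=$(( size - step ))
    truncate -s "$size" "$file"       # cut the tail off the file
    echo "truncated to $size bytes"
    sleep "$pause"
  done
  rm -f "$file"                       # remainder is at most STEP bytes
}

# Demo on a 1000-byte scratch file, 300 bytes per step, no pause.
scratch=$(mktemp)
truncate -s 1000 "$scratch"
shrink_and_remove "$scratch" 300 0
```

The demo prints three truncation steps (700, 400, and 100 bytes) and then removes the file. For the real file, the step and pause would be chosen to match the load the disk can tolerate, as the 10G and 2 seconds in the script above do.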

Other cases

This refers to what to do if the database is deployed on Windows. I am not really qualified to answer this, because I have never come across a production environment where MySQL runs on Windows. If you do run into one, Windows has a tool called mklink that creates hard links, so it should be able to do something similar.

Summary

The situation described in this article is one that developers at small and medium-sized companies are fairly likely to run into, because such companies often have no dedicated DBA and the developers have to do everything. I hope you got something out of it.
