MySQL (16): Suggestions for massive data volumes

Source: Internet
Author: User

The following are some important suggestions:
1. Select the correct storage engine
Taking MySQL as an example, there are two main storage engines, MyISAM and InnoDB, each with its own advantages and disadvantages.
MyISAM suits read-heavy applications, but it handles a large volume of writes poorly. Even if you only update one field, the entire table is locked, and no other process, not even a reader, can proceed until the write completes. On the other hand, MyISAM is extremely fast at calculations such as SELECT COUNT(*).
InnoDB is a much more complex storage engine, and for some small applications it will be slower than MyISAM. However, it supports row-level locking, so it performs better when there are many writes. It also supports more advanced features, such as transactions.
2. Optimize the field data type
Keep one principle in mind: the smaller the column, the faster it will be. For most database engines, disk I/O is likely the most significant bottleneck, so keeping your data compact helps a great deal because it reduces disk access.
If a table will only ever hold a few rows (such as a dictionary table or a configuration table), there is no reason to use INT as the primary key; MEDIUMINT, SMALLINT, or even TINYINT is more economical. If you do not need to record the time of day, DATE is a better choice than DATETIME. Of course, you also need to leave enough room for growth.
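To make the savings concrete, the arithmetic can be sketched in Python. The byte sizes are MySQL's documented storage sizes for its integer types; the row counts and the helper names `smallest_type_for` and `bytes_saved` are made up for illustration.

```python
# Storage size in bytes of MySQL integer types.
INT_TYPE_BYTES = {
    "TINYINT": 1,
    "SMALLINT": 2,
    "MEDIUMINT": 3,
    "INT": 4,
    "BIGINT": 8,
}

# Largest unsigned value each type can hold: 2^(8 * bytes) - 1.
MAX_UNSIGNED = {name: 2 ** (8 * size) - 1 for name, size in INT_TYPE_BYTES.items()}


def smallest_type_for(max_value):
    """Return the smallest unsigned integer type that can hold max_value."""
    for name, _size in sorted(INT_TYPE_BYTES.items(), key=lambda kv: kv[1]):
        if max_value <= MAX_UNSIGNED[name]:
            return name
    raise ValueError("value does not fit in BIGINT UNSIGNED")


def bytes_saved(rows, from_type, to_type):
    """Bytes saved in one column over `rows` rows when switching types."""
    return rows * (INT_TYPE_BYTES[from_type] - INT_TYPE_BYTES[to_type])


if __name__ == "__main__":
    # Hypothetical dictionary table that will never exceed 200 rows.
    print(smallest_type_for(200))
    # Hypothetical 10-million-row table with a column referencing that key.
    print(bytes_saved(10_000_000, "INT", "TINYINT"))
```

The saving applies to every index that contains the column as well, which is why the "leave room for growth" caveat matters: changing the type later means rebuilding the table.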
3. Add an index for a search field
An index does not have to be on a primary key or a unique field. If a field in your table is always used for searching, it is best to index it. If the field you want to search is a large text field, create a full-text index instead.
4. Avoid using SELECT *
The more data you read from the database, the slower the query. In addition, if your database server and web server are separate machines, it also increases network traffic. Even if you want every field in a table, avoid the * wildcard and list the fields you need. If your data-access layer provides a built-in way to exclude fields, making good use of it may be more convenient.
5. Use ENUM instead of VARCHAR
The ENUM type is extremely fast and compact. Internally it is stored as a small integer (like TINYINT), but it is displayed as a string. That makes it well suited to option lists. For example, fields such as gender, ethnicity, department, and status have a limited, fixed set of values, so you should use ENUM instead of VARCHAR for them.
6. Try to use NOT NULL
Unless you have a specific reason to use NULL, you should always declare your fields NOT NULL. NULL requires extra space, and comparisons involving NULL make your program more complicated. Of course, this does not mean you can never use NULL. Reality is complicated, and in some cases you will still need NULL values.
7. Fixed-length tables are faster
If every field in a table is fixed-length, the whole table is considered "static" or "fixed-length". That means the table has none of the following field types: VARCHAR, TEXT, BLOB. As soon as you include one of these fields, the table is no longer a fixed-length static table, and the MySQL engine handles it in a different way.
A fixed-length table improves performance because MySQL can search it faster: the offset of the next row is easy to calculate, so reads are naturally quick. If rows are not fixed-length, the engine has to consult the primary key index each time it looks up a row.
Fixed-length tables are also easier to cache and rebuild. The only side effect is that fixed-length fields waste some space, because each one is allocated its full size whether or not you use it.
Using vertical splitting, you can divide your table into two: one fixed-length and one variable-length.
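The offset argument can be illustrated with a toy model in Python. This is only a sketch of the idea, not MySQL's actual on-disk format: the 20-byte row layout, the length-prefixed variable format, and all function names are assumptions chosen for the example.

```python
import struct

# Hypothetical fixed-length row: 4-byte unsigned id + 16-byte padded name.
ROW = struct.Struct("<I16s")


def pack_fixed(rows):
    """Store every row in exactly ROW.size bytes, padding short names."""
    return b"".join(ROW.pack(row_id, name.encode()) for row_id, name in rows)


def read_fixed(data, n):
    """Jump straight to row n: its offset is simply n * ROW.size."""
    row_id, raw = ROW.unpack_from(data, n * ROW.size)
    return row_id, raw.rstrip(b"\x00").decode()


def pack_variable(rows):
    """Store each row as id + 1-byte name length + name, with no padding."""
    out = bytearray()
    for row_id, name in rows:
        encoded = name.encode()
        out += struct.pack("<IB", row_id, len(encoded)) + encoded
    return bytes(out)


def read_variable(data, n):
    """Without an index, reaching row n means walking over rows 0..n-1."""
    offset = 0
    for _ in range(n):
        _row_id, length = struct.unpack_from("<IB", data, offset)
        offset += 5 + length  # 5 = 4-byte id + 1-byte length prefix
    row_id, length = struct.unpack_from("<IB", data, offset)
    name = data[offset + 5: offset + 5 + length].decode()
    return row_id, name


if __name__ == "__main__":
    rows = [(1, "ann"), (2, "bob"), (3, "christopher")]
    fixed, variable = pack_fixed(rows), pack_variable(rows)
    print(len(fixed), read_fixed(fixed, 2))
    print(len(variable), read_variable(variable, 2))
```

The same three rows take more bytes in the fixed layout than in the variable one, which is the space-for-speed trade-off described above.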
8. Vertical splitting
Vertical splitting turns one database table into several tables by column, which reduces the table's complexity and number of fields and thereby improves performance.
For example, suppose the User table has a home address field, which is optional. Apart from the personal information page, you rarely need to read or rewrite this field. So why not put it in another table? Doing so gives your table better performance. Consider that, most of the time, only the user ID, user name, password, and user role in the user table are used frequently. Smaller tables tend to perform better.
Also make sure that you will not often need to JOIN the tables created by the split. Otherwise performance will be worse than without the split, and the decline can be severe.
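The user-table example above might look roughly like the following sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table and column names (`users_wide`, `users`, `user_addresses`, `password_hash`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    -- Original wide table (hypothetical schema).
    CREATE TABLE users_wide (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        password_hash TEXT NOT NULL,
        role TEXT NOT NULL,
        home_address TEXT
    );
    INSERT INTO users_wide VALUES
        (1, 'alice', 'h1', 'admin', '1 Main St'),
        (2, 'bob',   'h2', 'user',  NULL);

    -- Frequently used columns stay in a narrow table.
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        password_hash TEXT NOT NULL,
        role TEXT NOT NULL
    );
    -- The rarely used, optional column moves to its own table.
    CREATE TABLE user_addresses (
        user_id INTEGER PRIMARY KEY REFERENCES users(id),
        home_address TEXT NOT NULL
    );

    INSERT INTO users
        SELECT id, username, password_hash, role FROM users_wide;
    INSERT INTO user_addresses
        SELECT id, home_address FROM users_wide
        WHERE home_address IS NOT NULL;
""")

# The common path touches only the narrow table.
hot = conn.execute(
    "SELECT username, role FROM users WHERE id = 1").fetchone()
# The address is fetched separately, only when a page actually needs it.
addr = conn.execute(
    "SELECT home_address FROM user_addresses WHERE user_id = 1").fetchone()
print(hot, addr)
```

Because users without an address simply have no row in the second table, the optional column no longer needs to be nullable.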
9. EXPLAIN your SELECT queries
The EXPLAIN keyword shows you how MySQL processes your SQL statement, which helps you analyze performance bottlenecks in the query or the table structure. The EXPLAIN output also shows how your indexes and primary keys are used, how your tables are searched and sorted, and so on.
In general, you can put the keyword EXPLAIN in front of any complex SELECT statement, especially one involving multiple tables.
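The workflow of checking a plan before and after adding an index can be tried out with the sketch below. Note the substitution: it uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a runnable stand-in, and MySQL's EXPLAIN output has a different format (columns such as type, key, and rows). The table, index name, and `plan` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"user{i}") for i in range(1000)],
)

QUERY = "SELECT name FROM users WHERE email = ?"


def plan(connection, sql, params):
    """Return the plan's human-readable detail strings joined into one line."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return "; ".join(row[-1] for row in rows)


before = plan(conn, QUERY, ("user42@example.com",))
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(conn, QUERY, ("user42@example.com",))

print("before:", before)  # no index available, so the table is scanned
print("after: ", after)   # the plan now names idx_users_email
```

The habit is the same in MySQL: run the plan, add or adjust an index on the searched field, and run the plan again to confirm the index is actually chosen.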

 

Optimization of inserting large amounts of data:

1. Insert multiple rows per statement instead of one row at a time; this is much faster than single-row inserts. In addition, increase the value of bulk_insert_buffer_size, for example to 64 MB (this applies to the MyISAM engine).

INSERT INTO t VALUES (...), (...), (...);

2. LOAD DATA INFILE ... INTO TABLE ... is faster than running INSERT statements for the same number of rows.

3. For tables using the InnoDB storage engine, you can perform the inserts inside a single transaction, so that InnoDB flushes the changes once at the end of the transaction instead of after every INSERT statement. The same approach applies to updates.

4. If the table is not empty, run ALTER TABLE table_name DISABLE KEYS before LOAD DATA INFILE. After the data is imported, run:

ALTER TABLE table_name ENABLE KEYS;

If the table is empty, this step is not needed, because when loading into an empty MyISAM table the data is imported first and the indexes are built afterwards.
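Tips 1 and 3 combine naturally: send rows in multi-row INSERT statements and commit once at the end. The sketch below shows the pattern with Python's sqlite3 module as a stand-in for MySQL. The helper names and the batch size of 100 are assumptions; SQLite uses ? placeholders, while common MySQL drivers for Python use %s. The table and column names are interpolated into the SQL, so they must come from trusted code, never from user input.

```python
import sqlite3


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def multi_row_insert_sql(table, columns, row_count):
    """Build INSERT INTO t (a, b) VALUES (?, ?), (?, ?), ... for row_count rows."""
    one_row = "(" + ", ".join("?" for _ in columns) + ")"
    values = ", ".join(one_row for _ in range(row_count))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values}"


def bulk_insert(conn, table, columns, rows, batch_size=100):
    """Insert all rows with multi-row statements inside a single transaction.

    Returns the number of INSERT statements that were executed.
    """
    statements = 0
    with conn:  # commits once on success, rolls back if an error is raised
        for batch in chunked(rows, batch_size):
            sql = multi_row_insert_sql(table, columns, len(batch))
            flat = [value for row in batch for value in row]
            conn.execute(sql, flat)
            statements += 1
    return statements


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
    rows = [(i, str(i)) for i in range(250)]
    print(bulk_insert(conn, "t", ["a", "b"], rows))
```

Keeping the batch size moderate avoids running into limits on statement size or on the number of bound parameters.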

 

Example

The following code turns off autocommit and commits once after all the updates have finished. This can cut a job that originally took 10 hours down to 10 minutes. Here we read a file of more than 7 million lines, containing more than 3 million update records.

 

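use DBI;

# Assumes $database, $host, $db_user, $db_pass and the open file handle $fd are defined earlier.
# AutoCommit => 0 turns off per-statement commits.
my $db_handle = DBI->connect("DBI:mysql:database=$database;host=$host",
    $db_user, $db_pass, { RaiseError => 1, AutoCommit => 0 })
    || die "Could not connect to database: $DBI::errstr";

eval {
    # Each record is two lines: the cloud ID, then the CRC code.
    while (!eof($fd)) {
        my $CloudID  = <$fd>; chomp $CloudID;
        my $crc_code = <$fd>; chomp $crc_code;
        # Bind the values as placeholders instead of interpolating them into the SQL.
        my $affect_rows = $db_handle->do(
            'call `room_match`.`crcWriteCode`(?, ?)', undef, $CloudID, $crc_code);
    }
    $db_handle->commit();    # commit once, after all updates have run
};
if ($@) {
    warn "Update failed, rolling back: $@";
    $db_handle->rollback();
}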


At first, every SQL statement was committed as soon as it executed, which was very slow. After setting AutoCommit to 0 and committing once at the end, the speed improved greatly.

Copyright notice: you are welcome to reprint this article; please include the address of the original article when you do. Thank you for your cooperation!
