MySQL Optimization and MySQL Performance Optimization

For a data-centric application, the quality of the database directly affects the performance of the program, so database performance is crucial.

To ensure database efficiency, do the following:

① Database Design

② SQL statement Optimization

③ Database parameter configuration

④ Proper hardware resources and Operating Systems

The order above also reflects how much each of the four areas affects performance, from largest to smallest.

1. Database Design

A working understanding of the three normal forms is a great help in database design. To apply them well, a general understanding is enough (it need not be the most rigorous or precise one):

First normal form (1NF): an atomicity constraint on attributes. Each attribute must be atomic and cannot be decomposed further. (Any table in a relational database satisfies 1NF.)

Second normal form (2NF): a uniqueness constraint on records. Each record must have a unique identifier (a primary key), that is, each entity is unique. (Strictly, every non-key column must also depend on the whole primary key.)

Third normal form (3NF): a constraint on field redundancy. No field may be derivable from other fields, so the table holds no redundant data. (Information that belongs to another entity is usually reached through a foreign key instead of being copied.)

 

However, a database without any redundancy is not necessarily the best one. Sometimes, to improve efficiency, we must relax the normal forms and deliberately keep some redundant data.

The usual approach is to follow the third normal form when designing the conceptual data model, and to consider relaxing it when designing the physical data model. Relaxing a normal form means adding fields and allowing redundancy.

Appropriate denormalization:

Suppose we need both the number of clicks on each photo and the number of clicks on each album. Keeping a hits attribute only in the photos table conforms to the three normal forms,

but to get the clicks for an album we must find all the photos whose albumid is 1 and add up their hits. Running such a query every time is inefficient. It is better to keep a hits attribute in the album table as well.

This redundancy violates the third normal form, but the query becomes far more efficient.
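The trade-off described above can be sketched as follows. The table and column definitions are assumptions made for illustration (the article only names the photos and album tables, albumid, and hits), and the photo id 42 is arbitrary:

```sql
-- Hypothetical schema for the photo/album example.
CREATE TABLE album (
  id   INT UNSIGNED NOT NULL PRIMARY KEY,
  name VARCHAR(50)  NOT NULL,
  hits INT UNSIGNED NOT NULL DEFAULT 0      -- redundant total (denormalized)
);

CREATE TABLE photos (
  id      INT UNSIGNED NOT NULL PRIMARY KEY,
  albumid INT UNSIGNED NOT NULL,
  hits    INT UNSIGNED NOT NULL DEFAULT 0,
  KEY idx_albumid (albumid)
);

-- Normalized read: add up every photo in the album on each request.
SELECT SUM(hits) FROM photos WHERE albumid = 1;

-- Denormalized write: bump both counters when a photo is clicked ...
UPDATE photos SET hits = hits + 1 WHERE id = 42;
UPDATE album  SET hits = hits + 1 WHERE id = 1;

-- ... so the album total becomes a single-row primary key lookup.
SELECT hits FROM album WHERE id = 1;
```

The price is one extra update per click and the need to keep the two counters consistent, ideally by running both updates in one transaction on a transactional engine.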

 

Inappropriate denormalization:

 

SQL statement Optimization

 

Interview question: what types of SQL statements are there?

DDL (data definition language) [create, alter, drop]

DML (data manipulation language) [insert, delete, update]

DQL (data query language) [select]

DTL (data transaction language) [commit, rollback, savepoint]

DCL (data control language) [grant, revoke]

 

After the MySQL client connects successfully, the show [session | global] status command provides server status information.

session reports statistics for the current connection, and global reports statistics since the server was last started. The default is session.
For example:
show status like 'Com_%';
Com_XXX is the number of times the XXX statement has been executed.
Pay particular attention to Com_select, Com_insert, Com_update, and Com_delete.

From these counters it is easy to tell whether the application is dominated by inserts and updates or by queries, and roughly what the ratio between the statement types is.
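A minimal sketch of reading just these four counters, assuming a running server and the mysql command-line client (show status accepts a where clause as well as like):

```sql
-- Statement counters accumulated since the server started.
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Com_select', 'Com_insert', 'Com_update', 'Com_delete');

-- The same counter for the current connection only.
SHOW SESSION STATUS LIKE 'Com_select';
```

If Com_select is far larger than the other three combined, the workload is read-heavy, and indexing and read scaling are likely to pay off most.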

Several other common status variables give basic information about the database:
Connections: number of attempts to connect to the MySQL server
Uptime: how long the server has been running (in seconds)
Slow_queries: number of slow queries, that is, queries that took longer than long_query_time (10 seconds by default)

 

Next we optimize slow queries. The threshold is the long_query_time variable (10 seconds by default):

show variables like 'long_query_time';

 

For testing, we build a massive table (using MySQL stored procedures).

The purpose is to see how query speed can be improved on a massive table.

First, create the test tables.

The following stress-test SQL script generates 1,800,000 rows:

# Create the department table
create table dept (
    deptno mediumint unsigned not null default 0,
    dname  varchar(20) not null default "",
    loc    varchar(13) not null default ""
) engine = MyISAM default charset = utf8;

# Create the employee table
create table emp (
    empno    mediumint unsigned not null default 0,
    ename    varchar(20) not null default "",
    job      varchar(9) not null default "",
    mgr      mediumint unsigned not null default 0,
    hiredate date not null,
    sal      decimal(7,2) not null,
    comm     decimal(7,2) not null,
    deptno   mediumint unsigned not null default 0
) engine = MyISAM default charset = utf8;

# Create the salary grade table
create table salgrade (
    grade mediumint unsigned not null default 0,
    losal decimal(17,2) not null,
    hisal decimal(17,2) not null
) engine = MyISAM default charset = utf8;

# Salary grade rows (the values of the classic SCOTT sample schema)
insert into salgrade values (1, 700, 1200);
insert into salgrade values (2, 1201, 1400);
insert into salgrade values (3, 1401, 2000);
insert into salgrade values (4, 2001, 3000);
insert into salgrade values (5, 3001, 9999);

# Change the statement delimiter so that the ';' inside the routine
# bodies below does not end the create statement early.
delimiter $$

# Function: return a random string of n letters
create function rand_string(n int) returns varchar(255)
begin
    declare chars_str varchar(100) default 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    declare return_str varchar(255) default '';
    declare i int default 0;
    while i < n do
        set return_str = concat(return_str, substring(chars_str, floor(1 + rand() * 52), 1));
        set i = i + 1;
    end while;
    return return_str;
end $$

delimiter ;
select rand_string(6);

# Function: return a random department number
delimiter $$
create function rand_num() returns int(5)
begin
    declare i int default 0;
    set i = floor(10 + rand() * 500);
    return i;
end $$

delimiter ;
select rand_num();

# Procedure: insert max_num rows into emp, numbered from start + 1 (massive data)
delimiter $$
create procedure insert_emp(in start int(10), in max_num int(10))
begin
    declare i int default 0;
    set autocommit = 0;
    repeat
        set i = i + 1;
        insert into emp values ((start + i), rand_string(6), 'SALESMAN', 0001, curdate(), 2000, 400, rand_num());
    until i = max_num end repeat;
    commit;
end $$

delimiter ;
# Insert 1,800,000 rows, starting from 100001
call insert_emp(100001, 1800000);

# Procedure: insert rows into dept
delimiter $$
create procedure insert_dept(in start int(10), in max_num int(10))
begin
    declare i int default 0;
    set autocommit = 0;
    repeat
        set i = i + 1;
        insert into dept values ((start + i), rand_string(10), rand_string(8));
    until i = max_num end repeat;
    commit;
end $$

delimiter ;
call insert_dept(100, 10);

# Procedure: insert rows into salgrade
delimiter $$
create procedure insert_salgrade(in start int(10), in max_num int(10))
begin
    declare i int default 0;
    set autocommit = 0;
    alter table emp disable keys;
    repeat
        set i = i + 1;
        insert into salgrade values ((start + i), (start + i), (start + i));
    until i = max_num end repeat;
    commit;
end $$

delimiter ;
# call insert_salgrade(start, max_num);

Lower the slow query threshold to 0.2 s:

set long_query_time = 0.2;

(This setting applies only to the current connection. To make it permanent, set it in the configuration file: my.ini on Windows, my.cnf on Linux.)
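For a server-wide setting without a restart, something like the following sketch can be used. It assumes an account that is allowed to change global variables; the configuration lines in the trailing comment are the usual file equivalents:

```sql
-- Needs a privileged account (SUPER, or SYSTEM_VARIABLES_ADMIN on MySQL 8.0).
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0.2;   -- picked up by connections opened afterwards

-- Confirm the values and find where the log file is written.
SHOW GLOBAL VARIABLES LIKE 'long_query_time';
SHOW GLOBAL VARIABLES LIKE 'slow_query_log%';

-- To survive a restart, put the equivalent lines in the [mysqld]
-- section of my.ini / my.cnf:
--   slow_query_log  = 1
--   long_query_time = 0.2
```

Existing connections keep their old session value, so reconnect before testing.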

 

Run the query: select * from emp where empno = 123456; (it takes 0.281 s)

Check the slow query count: show status like 'Slow_queries';

The number of slow queries is now 1.

 

Search the query-related variables: show variables like '%quer%';

 

This lists the query-related settings. Here you can see that slow_query_log, the slow query log, is enabled. The log records the following information.

 

Now we optimize by adding an index (very cost-effective, and no extra memory is needed).

 

Indexing is the most effective and cheapest optimization. There is no need to add memory, change the program, or tune the SQL. A single correct 'create index'

can make a query many times faster, which is very tempting.

However, there is no free lunch. The faster queries are paid for by slower insert, update, and delete statements (because the index files must be maintained), and these writes incur a lot of extra I/O.

(MySQL has four common kinds of index: primary key, unique, ordinary, and full-text.)
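As an illustration of the four index kinds usually meant here (primary key, unique, ordinary, and full-text), the sketch below creates one of each. The posts table and all index names are hypothetical and are not part of the test schema:

```sql
-- Hypothetical table used only to show the syntax.
CREATE TABLE posts (
  id     INT UNSIGNED NOT NULL,
  slug   VARCHAR(100) NOT NULL,
  author VARCHAR(50)  NOT NULL,
  body   TEXT         NOT NULL
) ENGINE = MyISAM DEFAULT CHARSET = utf8;

ALTER TABLE posts ADD PRIMARY KEY (id);            -- primary key index
CREATE UNIQUE INDEX uk_slug ON posts (slug);       -- unique index
CREATE INDEX idx_author ON posts (author);         -- ordinary index
CREATE FULLTEXT INDEX ft_body ON posts (body);     -- full-text index

SHOW INDEX FROM posts;                             -- list what now exists
DROP INDEX idx_author ON posts;                    -- remove an index again
```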

 

Add an index on empno in the emp table:

alter table emp add primary key (empno);

Before creating the index:

 

After creating the index:

You can see that the .MYI index file has become much larger.

Run the query again:

It now takes about 0 s. The speedup is obvious.

 

How indexes work:

Why is a query faster with an index?

An index is a special file (for an InnoDB table, the index is an integral part of the tablespace) that holds reference pointers to all the records in the table.

Put simply, a database index is like the table of contents at the front of a book: it speeds up lookups. Without an index, the database scans all the data and picks out the rows that satisfy the condition.

With a suitable index, the database goes to the index first and finds the matching rows directly.

Index costs:

1. Disk usage (the index files)

2. Slower DML statements (update, delete, insert)

Although an index greatly speeds up queries, it slows down changes to the table such as INSERT, UPDATE, and DELETE. When a table is modified, MySQL must write the index as well as the data.

 

explain is a very important tool. It analyzes a SQL statement and shows how MySQL plans to execute it, which lets you estimate the statement's efficiency.

Its basic usage is:

explain <SQL statement>\G (\G prints each column on its own line, which is easier to read)

From the output you can tell whether the statement uses an index, how many rows must be examined (with the index in place, the rows column drops to just the matching records), and how the result is sorted.
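A sketch of reading the plan for the test query, assuming the emp table from the script above and the mysql command-line client (\G is a client feature, not part of SQL):

```sql
EXPLAIN SELECT * FROM emp WHERE empno = 123456\G

-- Columns worth reading in the output:
--   type  : const or ref means an index lookup; ALL means a full table scan
--   key   : the index actually chosen (NULL if none was used)
--   rows  : estimated number of rows MySQL must examine
--   Extra : notes such as "Using filesort" or "Using temporary"
```

Running the same statement before and after adding the primary key on empno shows type change from ALL to const and the rows estimate fall from roughly the table size to 1.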

After the index is deleted:

 

 

An index on a column only speeds up queries that filter on that column. Queries on other columns are no faster.

Which columns are suitable for an index?

① Columns that appear frequently in query conditions should be indexed.

② Columns with only a few distinct values (a gender flag, for example) are poor candidates.

③ Columns whose content changes frequently are poor candidates.

 

An index can be used in the following cases:
1. For a multi-column index, the index is generally used as long as the query condition includes the leftmost column.

A condition on a later column alone does not use the index; only the leftmost column (or a leading group of columns) does.

2. For a like query, the pattern '%aaa' does not use the index,

while 'aaa%' does.
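The leftmost-column rule and the like rule can be checked with explain. This sketch assumes the dept table from the test script; the index name idx_dname_loc is made up for the example, and on a table with only a handful of rows the optimizer may prefer a full scan regardless:

```sql
-- Hypothetical two-column index on the dept table.
ALTER TABLE dept ADD INDEX idx_dname_loc (dname, loc);

-- Can use the index: the leftmost column dname appears in the condition.
EXPLAIN SELECT * FROM dept WHERE dname = 'abc'\G
EXPLAIN SELECT * FROM dept WHERE dname = 'abc' AND loc = 'xyz'\G

-- Cannot use it for a lookup: only the second column is given.
EXPLAIN SELECT * FROM dept WHERE loc = 'xyz'\G

-- Prefix pattern: a range scan on the index is possible.
EXPLAIN SELECT * FROM dept WHERE dname LIKE 'ab%'\G

-- Leading wildcard: no usable starting point in the index.
EXPLAIN SELECT * FROM dept WHERE dname LIKE '%ab'\G
```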

 

An index is not used in the following cases:
1. The condition contains or and not every column in the or conditions is indexed. In that case no index is used, even for the columns that have one.
2. For a multi-column index, the condition does not include the leftmost column.
3. A like pattern starts with %.
4. The column is a string type but the value in the condition is not quoted. Always quote string values.
5. MySQL estimates that a full table scan is faster than using the index.
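Case 4 is easy to miss, so here is a small sketch of it. It assumes the emp table from the test script; the index name idx_ename is hypothetical:

```sql
-- Hypothetical index on the string column ename.
ALTER TABLE emp ADD INDEX idx_ename (ename);

-- Quoted string: the value is compared as a string and the index can be used.
EXPLAIN SELECT * FROM emp WHERE ename = 'abcdef'\G

-- Unquoted number: MySQL converts every ename to a number for the
-- comparison, so the index cannot be used for the lookup.
EXPLAIN SELECT * FROM emp WHERE ename = 123456\G
```

In the second plan the key column is NULL and type is ALL, even though idx_ename is listed under possible_keys.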

 

How to check whether your indexes are being used:

show status like 'Handler_read%';

Conclusion: the larger Handler_read_key is, the better. It counts row reads made through an index.

The smaller Handler_read_rnd_next is, the better. A large value means many table scans.
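To see the effect of a single query on these counters, they can be reset for the session first. A sketch, assuming the emp table from the test script:

```sql
-- Reset the session counters, run one query, then read the counters.
FLUSH STATUS;
SELECT * FROM emp WHERE empno = 123456;
SHOW SESSION STATUS LIKE 'Handler_read%';
```

With the primary key in place, the lookup shows up in Handler_read_key. Without it, Handler_read_rnd_next grows by about one per row in the table.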

 

Differences between MyISAM and InnoDB

  1. MyISAM does not support foreign keys; InnoDB does.
  2. MyISAM does not support transactions; InnoDB does.
  3. Data is stored differently. (A MyISAM table consists of three files: .frm, .MYD, and .MYI. An InnoDB table has only a *.frm file, and its data is stored in ibdata1, or in a per-table .ibd file when innodb_file_per_table is enabled.)
  • MyISAM: the default storage engine before MySQL 5.5. It suits applications dominated by reads and inserts, with few updates and deletes and modest requirements for transaction integrity and concurrency. Its advantage is fast access.
  • InnoDB: the default storage engine since MySQL 5.5. It provides transaction safety with commit, rollback, and crash recovery. Compared with MyISAM, however, writes are less efficient and it uses more disk space.

For a MyISAM table, deleting rows removes the content but does not release the disk space. The table needs to be defragmented regularly:

optimize table table_name;

The following are bulk-insert optimization strategies for MyISAM and InnoDB:

InnoDB is MySQL's main transactional engine, and by default everything a user does happens inside a transaction. A new MySQL connection starts in autocommit mode, so each SQL statement forms a transaction of its own: every insert statement is committed separately. InnoDB must flush the log to disk at each commit, so throughput is limited by disk write speed. You can turn autocommit off to avoid this. Indexes also slow down inserts. When inserting a large amount of data, you can disable the indexes first and rebuild them afterwards.

For MyISAM tables, the common optimizations are:

1. Disable indexes. When a record is inserted into a non-empty table, MySQL updates the table's indexes for that record. With a large amount of data, this index maintenance slows the inserts down. To avoid it, disable the indexes before inserting and enable them again afterwards.
   Disable: alter table table_name disable keys;
   Re-enable: alter table table_name enable keys;
   This is not needed when bulk-loading into an empty table, because MyISAM builds the indexes after the data has been loaded.

2. Disable uniqueness checks. When records are inserted, MySQL checks them for uniqueness, which also slows the inserts down. To reduce the impact on insert speed, disable the check before inserting and enable it again afterwards.
   Disable: set unique_checks = 0;
   Re-enable: set unique_checks = 1;

3. Insert in batches. Several records can be inserted with one insert statement each, or with a single multi-row insert statement.
   First form:
   insert into emp (id, name) values (1, 'suse');
   insert into emp (id, name) values (2, 'lily');
   insert into emp (id, name) values (3, 'Tom');
   Second form:
   insert into emp (id, name) values (1, 'suse'), (2, 'lily'), (3, 'Tom');
   The second form is faster than the first.

4. Use load data infile for bulk imports whenever you can. It imports data much faster than insert statements.

For InnoDB tables, the common optimizations are:

1. Disable uniqueness checks before inserting and enable them again afterwards, exactly as for MyISAM.
   Disable: set unique_checks = 0;
   Re-enable: set unique_checks = 1;

2. Disable foreign key checks before inserting and restore them afterwards.
   Disable: set foreign_key_checks = 0;
   Restore: set foreign_key_checks = 1;

3. Disable autocommit before inserting and restore it after the import, or wrap the inserts in an explicit transaction:
   use test;
   start transaction;
   insert into emp (name) values ('ming');
   insert into emp (name) values ('lily');
   commit;
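Put together, a bulk load along these lines might look like the sketch below. The file path /tmp/emp.csv is hypothetical, the file is assumed to hold the eight emp columns in table order, and the server must allow load data infile from that location (FILE privilege and the secure_file_priv setting):

```sql
-- MyISAM target (emp in this article): postpone non-unique index maintenance.
ALTER TABLE emp DISABLE KEYS;
LOAD DATA INFILE '/tmp/emp.csv'          -- hypothetical path
INTO TABLE emp
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
ALTER TABLE emp ENABLE KEYS;

-- InnoDB target: relax the per-row checks and commit once at the end.
SET autocommit = 0;
SET unique_checks = 0;
SET foreign_key_checks = 0;
-- ... LOAD DATA INFILE or multi-row INSERT statements go here ...
COMMIT;
SET foreign_key_checks = 1;
SET unique_checks = 1;
SET autocommit = 1;
```

Turning the checks off means the server trusts the input, so this is only safe when the data is already known to be free of duplicate keys and dangling references.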

Optimize group by statements
By default, MySQL 5.7 and earlier sort the result of every group by col1, col2, ... query.

The effect is the same as adding order by col1, col2, ... to the query. (MySQL 8.0 no longer sorts implicitly.)

If a query contains group by and you want to avoid the cost of sorting the result, add order by null to suppress the sort.
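The difference shows up in the Extra column of explain. A sketch, assuming the emp table from the test script and a server version that still sorts group by results implicitly (5.7 or earlier):

```sql
-- Implicit sort: Extra reports "Using temporary; Using filesort".
EXPLAIN SELECT deptno, COUNT(*) FROM emp GROUP BY deptno\G

-- Sort suppressed: "Using filesort" disappears from Extra.
EXPLAIN SELECT deptno, COUNT(*) FROM emp GROUP BY deptno ORDER BY NULL\G
```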

 

With order by null, filesort is no longer used:

In some cases, you can use a join to replace a subquery.

With a join, MySQL does not need to create a temporary table in memory for the subquery result.
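A sketch of such a rewrite on the test tables. The loc value is arbitrary, and the two queries are only guaranteed to return the same rows if deptno is unique in dept (otherwise the join repeats employees):

```sql
-- Subquery form: employees in departments at a given location.
SELECT *
FROM emp
WHERE deptno IN (SELECT deptno FROM dept WHERE loc = 'abcdefgh');

-- Join form of the same question.
SELECT e.*
FROM emp AS e
JOIN dept AS d ON d.deptno = e.deptno
WHERE d.loc = 'abcdefgh';
```

Newer optimizers often perform this transformation themselves, so compare the two plans with explain before rewriting by hand.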

To use indexes in a query that contains or, every column in the or conditions must be indexed. If one of them is not, consider adding an index or rewriting the query with union all.
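A sketch of the union all rewrite, assuming the emp table from the test script with its primary key on empno and an index on ename (the name idx_ename is hypothetical):

```sql
-- If ename is not indexed yet:
-- ALTER TABLE emp ADD INDEX idx_ename (ename);

-- Original form: one condition per column, joined by OR.
SELECT * FROM emp WHERE empno = 123456 OR ename = 'abcdef';

-- Rewritten form: each branch can use its own index. The extra
-- condition in the second branch keeps a row that matches both
-- conditions from being returned twice.
SELECT * FROM emp WHERE empno = 123456
UNION ALL
SELECT * FROM emp WHERE ename = 'abcdef' AND empno <> 123456;
```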

The UNION operator merges the result sets of two or more SELECT statements. The SELECT statements inside a UNION must have the same number of columns, the columns must have similar data types, and the columns must appear in the same order in each SELECT statement.

SELECT column_name(s) FROM table_name1
UNION
SELECT column_name(s) FROM table_name2

Note: by default, the UNION operator returns only distinct rows. To keep duplicate rows, use UNION ALL.

 

Choose appropriate data types:

In applications that require high precision (monetary amounts, for example), use decimal to store values so that the results are exact.

10000 million

create table sal (t1 float(10,2));

The value stored in the float column is visibly inaccurate. After switching to decimal:

create table sal2 (t1 decimal(10,2));
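The two types can also be compared side by side in one table. The table name sal_compare and the sample value are made up for this sketch:

```sql
-- Hypothetical table holding the same value in both types.
CREATE TABLE sal_compare (
  f FLOAT(10,2),
  d DECIMAL(10,2)
);

INSERT INTO sal_compare VALUES (1000000.32, 1000000.32);

-- f is a single-precision float with roughly 7 significant digits,
-- so it cannot hold this value exactly; d stores it exactly.
SELECT f, d FROM sal_compare;
```

The error in the float column also accumulates in aggregates such as sum, which is why decimal is the safer choice for money.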

 

 

Splitting tables

Horizontal splitting:

If a table holds too many records, tens of millions for example, and is searched frequently, it is worth breaking it up into smaller tables.

If a table of 10 million records is split into 100 tables, each one holds only 100,000 records. The data must of course be divisible along some logical line.

A good splitting criterion keeps the program simple and makes full use of the benefits of horizontal splitting. For example, if the system interface only offers queries by month, split the table into 12 by month so that each query touches only one table.

If the table were split by region instead, a query would still have to combine all the tables no matter how small each one is, and it would be better not to split at all. A good splitting criterion is therefore the most important thing. (Queries across the split tables use UNION.)
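A sketch of a split by month. The access_log tables, their columns, and the user id are hypothetical, and the application is assumed to pick the table name from the month being queried:

```sql
-- Hypothetical monthly tables with identical structure.
CREATE TABLE access_log_01 (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_id    INT UNSIGNED NOT NULL,
  visited_at DATETIME     NOT NULL,
  KEY idx_user (user_id)
);
CREATE TABLE access_log_02 LIKE access_log_01;
-- ... and so on, one table per month, through access_log_12

-- A monthly query touches exactly one small table.
SELECT COUNT(*) FROM access_log_02 WHERE user_id = 1001;

-- A query that spans months has to stitch the tables back together.
SELECT user_id, visited_at FROM access_log_01 WHERE user_id = 1001
UNION ALL
SELECT user_id, visited_at FROM access_log_02 WHERE user_id = 1001;
```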

 

Vertical splitting:

Some tables have few records, perhaps 20,000 to 30,000, but very long fields. The table takes up a lot of space, and reading it requires a lot of I/O, which seriously hurts performance.

In this case, move the large fields into a separate table that has a one-to-one relationship with the original table. (Queries that need the large fields use JOIN.)
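A sketch of a vertical split, using a hypothetical article table whose long body column is moved into a companion table:

```sql
-- Narrow table: the columns that list pages and searches need.
CREATE TABLE article (
  id         INT UNSIGNED NOT NULL PRIMARY KEY,
  title      VARCHAR(200) NOT NULL,
  created_at DATETIME     NOT NULL
);

-- Companion table: one row per article, holding only the long field.
CREATE TABLE article_body (
  article_id INT UNSIGNED NOT NULL PRIMARY KEY,   -- same value as article.id
  body       MEDIUMTEXT   NOT NULL
);

-- Listing reads only the narrow table.
SELECT id, title FROM article ORDER BY created_at DESC LIMIT 20;

-- The detail page joins the two halves back together.
SELECT a.title, b.body
FROM article AS a
JOIN article_body AS b ON b.article_id = a.id
WHERE a.id = 1;
```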

UNION queries and JOIN queries correspond to these two methods: UNION recombines horizontally split tables, and JOIN recombines vertically split ones.

 

Database parameter configuration

The most important parameters are the memory settings. We mainly use the InnoDB engine, so the following two parameters matter most:

innodb_additional_mem_pool_size = 64M
innodb_buffer_pool_size = 1G

(innodb_additional_mem_pool_size was deprecated in MySQL 5.6 and removed in 5.7. On current versions, set only innodb_buffer_pool_size.)

For MyISAM, adjust key_buffer_size. The right values depend on the workload: use the show status statement to view the current state and decide which parameters to adjust.
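A sketch of the status counters that are usually consulted before changing these buffers. How to interpret the ratios is a rule of thumb, not a fixed threshold:

```sql
-- Current settings.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size';

-- InnoDB: Innodb_buffer_pool_reads (reads that had to go to disk)
-- compared with Innodb_buffer_pool_read_requests (all logical reads).
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- MyISAM: Key_reads (index blocks read from disk) compared with
-- Key_read_requests (all index block requests).
SHOW GLOBAL STATUS LIKE 'Key_read%';
```

If the disk-read counter is a large fraction of the request counter after the server has warmed up, the corresponding buffer is probably too small for the working set.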

 

Read/write splitting

If the database is under heavy load that a single machine cannot handle, you can use MySQL replication to keep several machines in sync and spread the load.

The basic principle is to let the master database handle transactional queries (writes) and let the slave databases handle SELECT queries. Replication propagates the changes made by the transactional queries to the slaves in the cluster.

 

Why database read/write splitting can improve performance

 
