MySQL Learning notes--database paradigm and MySQL optimization overall idea

Last Update:2014-10-29 Source: Internet

Author: User

Tags time 0

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

first, the database paradigm

in order to build a database with small redundancy and reasonable structure, we must follow certain rules when designing database. In a relational database, this rule is called a paradigm. A paradigm is a summary of a design requirement. In order to design a relational database with reasonable structure, a certain paradigm must be met.

1.1, the first paradigm (1NF: Each column cannot contain multiple values)
The so-called First paradigm (1NF) refers to the fact that each column of a database table is an indivisible basic data item and cannot have multiple values in the same column, that is, an attribute in an entity cannot have multiple values or cannot have duplicate properties. If duplicate attributes are present, you may need to define a new entity, which is composed of duplicate attributes, and a one-to-many relationship between the new entity and the original entity. In the first normal form (1NF), each row of a table contains only one instance of information.
In any relational database, the first paradigm (1NF) is the basic requirement for relational schemas, and a database that does not meet the first normal form (1NF) is not a relational database.

1.2, the second paradigm (2NF: The non-main attribute part depends on the primary key)
The second paradigm (2NF) is established on the basis of the first paradigm (1NF), i.e. satisfying the second normal form (2NF) must first satisfy the first paradigm (1NF). The second normal form (2NF) requires that each instance or row in a database table must be divided by a unique region. For the implementation of the distinction, it is common to add a column to the table to store unique identities for each instance.
The second normal form (2NF) requires that the attributes of an entity depend entirely on the primary key. The so-called full dependency is the inability to have a property that depends only on the primary key, and if so, this part of the property and the primary key should be separated to form a new entity, and the new entity is a one-to-many relationship with the original entity. For the implementation of the distinction, it is common to add a column to the table to store unique identities for each instance. In short, the second paradigm is that the non-master attribute part relies on the primary key.

1.3. The third paradigm (3NF: attributes are not dependent on other non-principal attributes)
Satisfying the third normal form (3NF) must first satisfy the second normal form (2NF). In short, the third paradigm (3NF) requires that a database table not contain non-primary key information already contained in other tables. For example, there is a departmental information table where each department has information such as department number, department name, department profile, and so on. After the department number is listed in the Employee Information table, the department name, department profile and other department related information can no longer be added to the Employee Information table. If there is no departmental information table, it should be built according to the third paradigm (3NF), otherwise there will be a lot of data redundancy. In short, the third paradigm is that properties do not depend on other non-principal properties.

1.4, anti-three paradigm (anti-3NF: For performance, increase redundancy)

The purpose of 3NF is to reduce redundancy and reduce unnecessary storage, which is necessary for the expensive past of storage devices, but with the reduction of storage devices and the increasing of people's performance, there are some anti-three paradigms.

The so-called anti-three paradigm is for performance, increase redundancy. Take the Department information table as an example, each department has department number, department name, department profile and other information, in accordance with the requirements of 3NF, in order to avoid redundancy, we in the employee table should not join the department name, department profile and other departments related information, the cost is to query two times each time the database. The anti-three paradigm allows us to redundant important information into the employee table, such as the department name, so that we can take the employee information every time we get the department name directly, do not need to query two database, improve program performance.

Second, MySQL optimization of the overall thinking

MySQL optimization should first locate the problem, may lead to MySQL low performance reasons are: too many business logic query, table structure is unreasonable, SQL statement optimization and hardware optimization, from the perspective of optimization effect, The optimization effects of these four optimization points are reduced in turn: Clearing the business logic can help us avoid unnecessary queries, and a reasonable design of the table structure can also help us to query the database less. For SQL statement optimization, we can first use the slow query log (slow query log usage: http://www.cnblogs.com/timlearn/p/4052523.html) to locate the slow query and then optimize for that query. The most common and effective optimization paradigm is to add a reasonable index, which has been explained in the previous blog http://www.cnblogs.com/timlearn/p/4055512.html, and this blog will explain other optimizations or points of attention.

2.1. Use Text/blob type carefully

when the column type is text or blob, we should pay particular attention to the fact that the memory table cannot be created when the selected field has a Text/blob type. You can only create a temporary disk table, and the performance of the hard disk temporary table is worse than the performance of the memory table, so if you want to use the Text/blob type, you should build the table separately, and do not mix the Text/blob type with the core attribute in a single table. To do this, do an experiment as follows:

Create a data table T1 (num int,intro text (1000));//Insert data inserts into T1 values (3, "This is USA"), (4, ' China ');//Query temporary table creation /note, here created_tmp_disk_tables=4,created_tmp_tables=10mysql> show status like '%tmp% '; +------------------------- +-------+| variable_name | Value |+-------------------------+-------+| Created_tmp_disk_tables | 4 | | Created_tmp_files | 9 | | Created_tmp_tables | |+-------------------------+-------+//use GROUP by to query data mysql> SELECT * from t1 GROUP by num;+------+-------------+| num |    Intro |+------+-------------+| 3 |    This is USA | | 4 | China |+------+-------------+2 rows in Set (0.05 sec)//query temporary table creation again//Now, created_tmp_disk_tables=5,created_tmp_tables= 11mysql> Show status like '%tmp% '; +-------------------------+-------+| variable_name | Value |+-------------------------+-------+| Created_tmp_disk_tables | 5 | | Created_tmp_files | 9 | | Created_tmp_tables | |+-------------------------+-------+

2.2, cautious use of sub-query

Almost all subqueries can be rewritten as connection queries, and sometimes the efficiency of the connection query is higher than the subquery, so it is a good idea to rewrite the query into a connection query. If a SELECT statement that uses a subquery takes too long to execute, try rewriting it as a connection query to see if he performs better.

In the following example, we use the subquery time consuming 5s, after emptying the cache, use a connection query instead, only need 0.02s:

Subquery time consuming 5smysql> SELECT * from emp where ename in (select Ename from ename) +--------+--------+----------+-----+------- -----+---------+--------+--------+| Empno | ename | Job | Mgr | HireDate | Sal | Comm |      Deptno |+--------+--------+----------+-----+------------+---------+--------+--------+| 2 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 |      466 | | 3 | CAUVTJ |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 |      155 | | 6 | Ooeekl |   Salesman | 1 | 2014-10-29 | 2000.00 |     400.00 |      13 | | 9 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 225 | | 103219 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 251 | | 318098 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 480 | | 333225 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 466 | | 443919 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 446 | | 458077 | Ooeekl |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 266 || 473649 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 448 | | 769138 | CAUVTJ |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 252 | | 826307 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 242 |+--------+--------+----------+-----+------------+---------+--------+--------+12 rows in Set (5.04 sec)// Empty cache Reset Query cache;//connection query time 0.02smysql> select emp.* from emp inner join ename on emp.ename=ename.ename;+--------+-- ------+----------+-----+------------+---------+--------+--------+| Empno | ename | Job | Mgr | HireDate | Sal | Comm |      Deptno |+--------+--------+----------+-----+------------+---------+--------+--------+| 6 | Ooeekl |   Salesman | 1 | 2014-10-29 | 2000.00 |     400.00 | 13 | | 458077 | Ooeekl |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 |      266 | | 9 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 225 | | 103219 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 251 | | 318098 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 480 | | 826307 | Mfpixn |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 |      242 | | 3 | CAUVTJ |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 155 | | 769138 | CAUVTJ |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 |      252 | | 2 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 466 | | 333225 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 466 | | 443919 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 446 | | 473649 | Asoqnr |   Salesman | 1 | 2014-10-29 | 2000.00 |    400.00 | 448 |+--------+--------+----------+-----+------------+---------+--------+--------+12 rows in Set (0.02 sec)

2.3. Increase the index on the distinct column

If you do not increase the index on the distinct column, MySQL will create a temporary table at query time, and if we increase the index on that column, you can avoid using temporary tables:

CREATE TABLE message (user_id int,group_id int);//insert 10 data inserts into message values (24,67), (15,76), (134,986), (6,98), ( 46,988), (13,2), (12,89), (17,34), (63,19), (92,74);//not indexed, extra with using temporarymysql> explain select distinct User_ ID from message where group_id=2\g;*************************** 1.  Row *************************** id:1 select_type:simple table:message Type:allpossible_keys: NULL key:null key_len:null ref:null rows:12 extra:using where; Using temporary1 Row in Set (0.01 sec)//Add index CREATE INDEX key_gid_uid on message (group_id, user_id);//Add index, finally do not create temporary table Mys ql> explain select distinct user_id from message where group_id=2\g;*************************** 1.  Row *************************** id:1 select_type:simple table:message Type:refpossible_keys: Key_gid_uid key:key_gid_uid key_len:5 ref:const rows:1 extra:using where; Using IndeX1 Row in Set (0.00 sec)

2.4. Add order BY null after GROUP by

When a group by group query is used, the default grouping is also sorted and may slow down. If sorting is not required, you can add order by null after the group by to avoid sorting after grouping:

Note that there is a using filesortmysql> explain select * from the EMP GROUP by deptno\g;************************ when no ORDER by NULL is added to the extra. 1. Row ***************************           id:1  select_type:simple        table:emp         Type:ALLpossible_keys:NULL          key:null      key_len:null          ref:null         rows:4296846        extra:using temporary; With the using filesort//added order by null, there is no using filesort in extra mysql> explain select * from EMP Group BY DEPTNO ORDER by null \g;*************************** 1. Row ***************************           id:1  select_type:simple        table:emp         Type:ALLpossible_keys:NULL          key:null      key_len:null          ref:null         rows:4354494        extra:using temporary1 row in Set (0.00 sec)

MySQL Learning notes--database paradigm and MySQL optimization overall idea

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More