MySQL Database knowledge Summary (see the data summarized)

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Graduation to now calculate to do more than 3 years of service development, after graduation very little time to think in the school as can spare some spare time to a summary of knowledge, to now is also a summary of some key knowledge, today take advantage of more time, First to the use of more than 3 years of open source relational database MySQL to summarize, collated a bit of knowledge can be divided into the following points:

First, the basic knowledge

Ii. SQL optimization and indexing

Third, the database specification recommendations

Iv. Database Design

V. Database architecture

First, the basic knowledge

Knowledge points mainly include: data type common function character set TRANSACTION ISOLATION level lock mechanism

(1), data type

Numeric types--TINYINT, SMALLINT, Mediumint, INT, BIGINT, DECIMAL

Time Type--date, DATETIME, TIMESTAMP

String type –char, VARCHAR, TEXT

Unique types-ENUM, SET

Strings are case-sensitive--VARBINARY

Note:

* All variable-length strings use VARCHAR types, similar to state, finite categories of fields, but also use a string that can be used to make a significant representation of the actual meaning, instead of using numbers such as int;

* Fixed-length strings using the char type, all single characters using the char type, instead of using the varchar type;

* When the number of characters can be more than 20,000, you can use the TEXT type to hold character class data. All fields that use the TEXT type must be separated from the original table, and the primary key of the original table is composed of a separate table for storage;

* fields that need to be accurate to the time (month, day, and minute) can use DateTime or timestamp, but be aware of the scope of each expression and whether you need to use the timestamp feature, as far as possible timestamp Type replaces datetime to reduce the consumption of data storage space;

* Only fields that need to be accurate to days are all using the DATE type and should not use timestamp or datetime;

* The field of the auto-increment sequence attribute can only use INT or BIGINT, must be explicitly identified as an unsigned type (keyword: UNSIGNED), unless there is a negative number, only if the value of the field exceeds 4.2 billion, the BIGINT type is used, and the self-increment field must be part of the primary key or primary key;

(2) Common functions

Includes: Mathematical function string function date manipulation function type conversion function

* Math function:

ABS (Value): Returns the absolute value, as an example: ABS (-101), return: 101;

Floor (value): minus the decimal number of value, example: Floor (2013.8), return: 2013;

MoD (N,M): Returns the remainder of N divided by M, example: MoD (2013,10), return: 3;

ROUND (value): Returns the rounded value of values, for example: Floor (2013.8), return: 2014;

ROUND (Value,num): Retains the rounding value of num decimal value, example: Floor (2013.867,1), return: 2013.9;

RAND (): Returns a random number;

* String function

String Join function: The string connection method in MySQL, using the CONCAT () or concat_ WS () function, the syntax is as follows: CONCAT (String1,string2,...) Concat_ WS (separator,string1,string2,..)

* String Length statistics

LENGTH (String) #返回string所占的字节数

Char_length (String) #返回string中的字符个数

Statistics of the number of characters, do not distinguish between Chinese characters or letters or numbers, also with the character set is not related, if the statistics are the number of bytes, the characters are Chinese characters, letters or number types, and the character set together.

Note:

All of our MySQL databases will be UTF8 encoded, so a kanji account is 3 bytes, and the English input method has a byte or number of letters.

* String function

String Judgment function

if (EXP1,EXP2,EXP3): If EXP1 is true, return to EXP2, if EXP1 is false, return to EXP3;

Ifnull (EXP1,EXP2): If EXP1 is not NULL, return EXP1, otherwise return EXP2;

Nullif (EXP1,EXP2): If EXP1=EXP2, return null, otherwise return EXP1;

* String substitution function

LTRIM (EXP1): Remove the space at the beginning of the string in EXP1

RTRIM (EXP1): Remove the space at the end of the string in EXP1

TRIM (EXP1): Remove the opening and closing spaces in the EXP1

TRIM (EXP2,EXP1): Remove strings that exist in Exp1 EXP2

* String Lookup function

Substring_index (Exp1,delim,count): Exp1 is a string, Delim is a split symbol, and count represents the first few style symbols, for example: Substring_index (' yto56.com.cn ', '. ', 1), return: yto56

SUBSTRING (Exp1,pos,len): Exp1 is a string, pos is position, Len is length, for example: SUBSTRING (' yto56.com.cn ', 1,9), return: yto56.com

LOCAL (SUBSTR,STR): Find substr in str first position, example lcoal (yto56.com.cn ', '. '), return: 6

* Other String functions

UPPER (EXP1): Converts the string exp1 to uppercase;

LOWER (EXP1): Converts the string exp1 to lowercase;

* Date manipulation function

Date format Conversion Functions:

Converts a string to a date, date_format () or str_to_date (),

The two functions are formatted as follows:

Date_format (Expr1,format)

Str_to_date (EXPR1, format)

The usual date format YYYY-MM-DD HH:MM:SS corresponds with format%y-%m-%d%h:%i:%s

* Date arithmetic function

Curdate (): Return date of day, format: YYYY-MM-DD

Now (): Returns the present datetime, format: Yyyy-mm-dd HH:MM:SS

Date_add (Date,interval expr unit): expr expression, unit, for example: Date_add (' 2013-08-20 ', INTERVAL-1 Day), return: 2013-08-19

* Date arithmetic function

MONTY (date): Returns the number of months of date, such as MONTY (' 2013-08-20 '), returning: 8

Day: Returns the number of days of date, such as day (' 2013-08-20 '), which returns: 20

Year (date): Returns the number of years of date, for example (' 2013-08-20 '), returning: 2013

To_days (date): date corresponding to the number of days, for example to_days (' 2008-10-07 '), return: 733687

WEEK (date): Returns the date corresponding to the week, for example: WEEK (' 2013-08-20 '), return: 2

DATEDIFF (DATE1,DATE2): Returns the number of days difference of date1-date2, for example: DATEDIFF (' 2013-08-20 ', ' 2013-08-18 13:30:14 '), return: 2

* Type conversion function

Cast (expr as type): expr converts to a type

Convert (expr, type): expr converts to a type

Convert (expr USING transcoding_name): expr character set converted to a character set

* Character Set

For wireless application user-selectable portions, it is recommended to use the UTF8 variant, which is UTF8MB4, four bytes storage. In other cases, the default uniform uses the character set as UTF8, and the character set collation rule is utf8_general_ci:

A Chinese character occupies 3 bytes

A symbol, character, and number in the Chinese input method, the storage space is 3 bytes

A symbol, character, and number in the English input method, the storage space is 1 bytes

Case sensitivity is not recommended to set the checksum rule utf8_bin, but the data type VARBINARY

n Interpretation in CHAR (n) or varchar (n)

MySQL database these two types of string definitions when the length of n, not the meaning of the number of bytes, but the meaning of the number of characters, for example: CREATE tabel gl_user (username VARCHAR (40)); The username can store a maximum of 40 characters.

(3), transaction ISOLATION LEVEL

Transactions have four characteristics: atomicity (atomiocity), consistency (consistency), isolation (isolation), and persistence (durability), which are referred to as the ACID properties of the four properties.

* An atomic transaction is a logical unit of work for a database, all operations included in a transaction are either done, or none of the results of a consistent transaction execution must be to change the database from one consistent state to another consistent state

* Isolation the execution of a transaction cannot be disturbed by other transactions. That is, the operations within one transaction and the data used are isolated to other transactions, and the transactions performed concurrently do not interfere with each other.

* Persistent when a transaction is successfully committed, the modification of the data in the database is persistent. Next other operations or failures should not have any effect on their execution results

(4), transaction ISOLATION LEVEL

* Read Uncommited Select allows dirty reads, that is, select reads other transaction modifications without committing the data

* Read commited Select can not be repeated reading, that is, the same transaction two times the same query, if the time between the first and second query period, the other transaction just modified its query data and submitted, two read the data inconsistent

* Repeatable Read Select can be repeated, that is, the same transaction executed two times the same query statement, the resulting data is always consistent

* The only difference between SERIALIZABLE and repeatable reads is that by default the normal SELECT statement is changed to select .... LOCK in SHARE MODE. That is, the data involved in the query statement plus shared locks, blocking other transactions to modify the real data

(5) Lock mechanism

MySQL database, currently mainly 2 kinds of transaction engine: NDB distributed storage engine, InnoDB storage engine

NDB Distributed Storage Engine

Only transaction ISOLATION LEVEL supported: Read commited, row lock available

InnoDB Storage Engine

Supports four isolation levels, where the default isolation level is repeatable READ

The implementation of the lock is implemented by an index, and an update or delete without an index can be upgraded to a table lock

Isolation level, also known as data visibility, is implemented through MVCC

Ii. SQL optimization and indexing

Knowledge points mainly include: index of the principle of the function of the index optimization of order subquery optimization paging optimization left connection right and wrong index overlay Technology index creation guidelines

(1), the principle of the index

In fact, it is similar to a dictionary, the Directory of the dictionary is an analogy to the index, through the directory query specific content analogy to the record in the table;

(2), the role of the index:

* Go to Heavy

* Speed up Find location

* Avoid sorting

* Overlay Index

(3), the Order of optimization

Business optimization first, next design and program optimization, final database and SQL optimization, etc.

SQL, index, and database-side optimizations are processes that need to be rebalanced as data grows and data is distributed

(4), sub-query optimization

Original SQL statement

Select ID from t_exp_op_record_taking_success WHERE ORDERID not in (SELECT ORDERID from T_exp_op_record_taking_success WHE RE Aux_op_code in (' NEW ', ' UPDATE ', ' DELETE ') and Exp_type = Ten and Op_code in (310,311)) LIMIT 10000;

Optimized SQL statement SELECT ID from t_exp_op_record_taking_success WHERE aux_op_code not in (' NEW ', ' UPDATE ', ' DELETE ') OR exp_type! = Ten OR Op_code not in (310,311) LIMIT 10000;

Summary reduction of one full table scan reduces back to table read data 10,000 times only need to scan the table, the record value satisfies the condition reaches 10,000 to stop scanning

(5), the left join of the answer

Correct wording: SELECT m.columnname......,n.* columnname ..... From left_table m left JOIN right_table N in M. Columnname_join=n. Columnname_join and N. columnname=xxx WHERE M.columnnam E=xxx and M. ...

Interpretation: The ON clause joins the condition in M. Columnname_join=n. Columnname_join, which is used to connect the equivalent records in two tables, but does not affect the number of recordsets. If a record in table left_table cannot be found in the table right_table, the record is still displayed in the result set, and the table right_table the values of the columns that need to be displayed in the query result set are substituted with null;

The ON clause joins the table right_table in the condition. Columnname=xxx is used to control whether the Right_table table has the required column values or is displayed in the query column in null substitution, without affecting the number of recordsets;

The WHERE clause controls whether the record conforms to the query request, does not conform and then filters out; one of the common mistakes: SELECT m.columnname......,n.* columnname .... From left_table m left JOIN right_table N in M. Columnname_join=n. Columnname_join and N. Columnname=xxx and M.columnname= XXX;

(6), Index coverage technology

Index overrides the condition of the SQL statement where the condition is part of the index the SQL statement's order or group field is part of the index. The value returned by the SQL statement or the computed metadata column is part of the index. After the index is overwritten, all operations are indexed by index completion, You do not need to access the metadata sample idx_uid_gmtcreate (uid,gmt_create) SELECT uid, gmt_create from tablename WHERE uid>100;

(7), Index creation criteria

The number of indexes in a single table does not exceed 5

The number of fields in a single index does not exceed 5

Use prefix index for string, prefix index length not exceeding 10 characters

Index structure and statistical information maintenance are costly to maximize index reuse

Filter rate for each field in the composite index

The sequential information of the compound index field appearing in all SQL, mainly refers to the where, GROUP by, and ORDER BY clauses

The pre-and post-order of each field in a composite index is extremely important

Third, the database specification recommendations

Knowledge points include: Naming specification, field definition specification , Application account permissions, usage specification of SELECT *, adding function usage specifications on fields, paging query specification, data type conversion specification, DML statement Availability index, product development and on-line process recommendations

(1), naming specification

* Index Name:

MySQL index name capitalization causes performance problems when the optimizer makes an index selection, the index naming convention is all uppercase or all lowercase, the index name begins with Idx_ or IDX_, the end is _u (or _u), or _n (or _n), where: _u or _u represents a uniqueness index, _n or _ n represents the normal index, the default recommended index name lowercase format

* Object naming and capitalization:

Improve the efficiency of developers, reduce the cost of research and development, avoid the failure point of the program, database objects: Database name, table name, stored procedure name, custom function name, trigger name, view name, and task schedule name are set by default to case-insensitive, but it is recommended to use the uniform lowercase

The library name, table name, field name are recommended lowercase letters and are split with underscores

The library name, table name, field name are forbidden for more than 40 characters. Must be known as a name

Library name, table name, field name prohibit use of MySQL reserved word temporary library, table name must be prefixed with TMP and date suffix

Backup library, table must be prefixed with bak and date suffix

* Field Definition Specification

Field properties are forced to be defined as not NULL

The default value for a field property of a string type is a space, or '

field property of numeric type default value is 0 Time Type field property default value is ' 1970-01-01 08:00:01 '

If the date of birth is a useful field, the default value is a space, or the data constraint of the field is unique except for uniqueness, other constraints are guaranteed by the application

(2), Application account permissions

Normal permissions in order to reduce the risk of data security, the application account is granted only four database permissions: Select, INSERT, UPDATE, DELETE for read-only types of applications, the permissions that should be granted to read-only accounts are: select;

Special permissions Given that we have an event, PROCEDURE, and may need to authorize the relevant permissions, the permissions granted are: EXECUTE;

(3), SELECT * The use of the Code

Basic principle: Use "*" to list all fields in an unnecessary query, and when a group by or order by is present, it is forbidden to use SELECT * To remove all fields at once. For JOIN statements with table joins, SELECT * is forbidden to query. A table with a text field that is also forbidden to query with SELECT * when it is not necessary to remove the text field.

Detailed Description:

1> when a group by or order by is not allowed to use SELECT * is to ensure that MySQL is able to use the latest optimized sorting algorithm

The 2> JOIN statement does not allow SELECT * to prevent queries that simply need to be indexed to be returned to the table

3> has the Text field table and does not allow SELECT * when it is not necessary to remove the text field, because the text field is stored in a physical location that is not the same as the normal record, resulting in a large number of IO operations

4> SELECT * Increases CPU, IO overhead, and consumes more network bandwidth, and also affects performance

5> Avoid program bugs caused by adding or deleting fields without modifying related SQL and related program code, and disable SELECT *

(4), add Function usage specification on field

Basic principle: Prohibit the conversion of a type or format using any function on a filter field that appears in the Where condition, and it is a good practice to convert the value of the incoming comparison into a column type.

Wrong wording: SELECT uid, gmt_create from Gl_user WHERE UPPER (username) =upper (' yto56.com ');

Correct wording: SELECT uid, gmt_create from Gl_user WHERE username= UPPER (' yto56.com ');

(5), paging query specification

Basic principle: Paging query statements all need to be sorted, unless the business party explicitly requires that no sorting be used to randomly present the data. Detailed Description:

1> Regular Paging statements (start: Number of start Records, Page_offset: Records per page): SELECT id,username from Gl_user WHERE username like '%@163.com ' ORDER by M. Gmt_create LIMIT start, page_offset;

2> a paginated statement for a multi-table join, if the filter condition is on a single table, it needs to be paged first, then join low-performance notation: SELECT m.username,p.rolename from gl_user m INNER Join gl_role P on M. Id=p.userid WHERE username like '%@163.com ' ORDER by M.gmt_create LIMIT start, page_offset; High-performance notation: Select M.username,p.rolename from (select Id,username from Gl_user WHERE username like '%@163.com ' ORDER by M.gmt_c reate LIMIT Start, page_offset) m,gl_role P WHERE M.id=p.userid;

This is written on the premise that record one by one corresponds to the associated table, otherwise the number of records that may be returned is less than or more than the Page_offset value.

(6), data type conversion specification

Basic principle: In all Query's where condition must use and filter the field exactly consistent data type, eliminates any implicit type conversion, avoids causes the query execution plan error because of the data type mismatch, causes the performance problem.

Detailed Description:

1> the use of a function for type conversion is not allowed on all fields in the Where condition, and if you need to convert the type, you can only convert the filter value, not the converted field.

2> the most common implicit type conversions are common between time types and string types, and it is recommended that all time type fields be passed in ibatis with a time type, or as a string, and then convert the string to a valid time format by a time function, as follows: SELECT * from member WHERE gmt_create=date_formate (' 2009010101:02:03 ', '%y-%m-%d%h:%i:%s ');

3> in table join Query, if the data types at both ends of the join condition are inconsistent, you must ensure that the connection condition data type of the driver table is converted to a data type that is consistent with the driven table.

(7), DML statements have indexes available

The *DML statement refers to: INSERT, update, and delete, this article refers to the following 2 types of operations

(2) InnoDB storage Engine locks with index, no index causes upgrade to table lock

(3) DML is upgraded to a table-level lock, and an exclusive lock is added, which can clog other business of the same table

(8), product development and on-line process recommendations

Iv. Database Design

Knowledge points include: field type Storage engine primary key selection main sub-table Design large section separation hot and cold data separation

(1), field type

The value of the field is as digital as possible, and the data type is integer

The length of the storage space of the field, using the data type that is short and meets the requirements

Time type use timestamp as much as possible, except for date of birth using datetime

Try not to use data types: TEXT and BLOBs

String type try to use varchar instead of char

Replace the float and double types with the decimal type

(2), storage engine

* InnoDB Storage Engine:

Support Transactions support row and table lock data security strong BP support simultaneous cache metadata and index data support MVCC support B + Tree, only adaptive index for hash support full-text search;

* MyISAM Storage Engine:

KB only cached index data supports only table locks and does not support transactional support for full-text retrieval of geodata storage;

* Memory Storage Engine:

Metadata and indexes are cached in memory that do not support transactions, only table locks are supported support for data persistence supports B + tree and hash index;

* Sphinx supports full-text indexing:

Support for Chinese word segmentation support to create full-scale indexes and incremental indexes to support distributed deployment;

(3), primary key selection

(4), main sub-table design

Example of receipt data for a front-facing database

T_exp_op_record_sgnature

T_exp_op_record_sgnature_deal

T_exp_op_record_sgnature_fail

T_exp_op_record_sgnature_success

T_exp_op_record_sgnature_success_02

T_exp_op_record_sgnature_success_03

T_exp_op_record_sgnature_unsuccess

T_exp_op_record_sgnature_success1

T_exp_op_record_sgnature_update

The number of summary fields is up to 56, the minimum is 54 fields, the same number of fields is 54 for each table field value: 52 The same field value exists in one table, and the secondary table records 8 types of expanded data

(5), large print segment separation

varchar Type field is less than 255 bytes in length and 1 bytes marked

varchar Type field is longer than 255 bytes, 2 bytes marked

Text Type field, 1-4 byte mark, 8 bytes as pointer

The value of the text Type field is stored in the Invisible column

Summary: Large print segments are generally stored separately in a single table, and as much as possible with varchar instead of text

(6), hot and cold data separation

Suitable for hot and cold separation scenarios:

Table data fields have a large number of infrequently accessed and partially frequently accessed fields

The table's data fields have frequently changed fields and lots of very few modified fields

The table's data fields are displayed together in a small amount or low frequency

How to Detach: store data with low frequent access and access in 2 tables, associated with a table's primary key ID

Store frequently modified fields and rarely modified fields in different tables

Increase data access speed and data modification speed to reduce IO operations

Note: The hot and cold separation of the scene must be combined with business characteristics and actual data manipulation situation

V. Database architecture

Knowledge points include: Dual master replication architecture a master multi-slave architecture distributed database pseudo-distributed database

(1), Dual master replication architecture

(2), one master multi-slave architecture

(3), distributed database

(4), pseudo-distributed database architecture

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

MySQL Database knowledge Summary (see the data summarized)

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

MySQL Database knowledge Summary (see the data summarized)

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support