Graduation to now calculate to do more than 3 years of service development, after graduation very little time to think in the school as can spare some spare time to a summary of knowledge, to now is also a summary of some key knowledge, today take advantage of more time, First to the use of more than 3 years of open source relational database MySQL to summarize, collated a bit of knowledge can be divided into the following points:
First, the basic knowledge
Ii. SQL optimization and indexing
Third, the database specification recommendations
Iv. Database Design
V. Database architecture
First, the basic knowledge
Knowledge points mainly include: data type common function character set TRANSACTION ISOLATION level lock mechanism
(1), data type
Numeric types--TINYINT, SMALLINT, Mediumint, INT, BIGINT, DECIMAL
Time Type--date, DATETIME, TIMESTAMP
String type –char, VARCHAR, TEXT
Unique types-ENUM, SET
Strings are case-sensitive--VARBINARY
Note:
* All variable-length strings use VARCHAR types, similar to state, finite categories of fields, but also use a string that can be used to make a significant representation of the actual meaning, instead of using numbers such as int;
* Fixed-length strings using the char type, all single characters using the char type, instead of using the varchar type;
* When the number of characters can be more than 20,000, you can use the TEXT type to hold character class data. All fields that use the TEXT type must be separated from the original table, and the primary key of the original table is composed of a separate table for storage;
* fields that need to be accurate to the time (month, day, and minute) can use DateTime or timestamp, but be aware of the scope of each expression and whether you need to use the timestamp feature, as far as possible timestamp Type replaces datetime to reduce the consumption of data storage space;
* Only fields that need to be accurate to days are all using the DATE type and should not use timestamp or datetime;
* The field of the auto-increment sequence attribute can only use INT or BIGINT, must be explicitly identified as an unsigned type (keyword: UNSIGNED), unless there is a negative number, only if the value of the field exceeds 4.2 billion, the BIGINT type is used, and the self-increment field must be part of the primary key or primary key;
(2) Common functions
Includes: Mathematical function string function date manipulation function type conversion function
* Math function:
ABS (Value): Returns the absolute value, as an example: ABS (-101), return: 101;
Floor (value): minus the decimal number of value, example: Floor (2013.8), return: 2013;
MoD (N,M): Returns the remainder of N divided by M, example: MoD (2013,10), return: 3;
ROUND (value): Returns the rounded value of values, for example: Floor (2013.8), return: 2014;
ROUND (Value,num): Retains the rounding value of num decimal value, example: Floor (2013.867,1), return: 2013.9;
RAND (): Returns a random number;
* String function
String Join function: The string connection method in MySQL, using the CONCAT () or concat_ WS () function, the syntax is as follows: CONCAT (String1,string2,...) Concat_ WS (separator,string1,string2,..)
* String Length statistics
LENGTH (String) #返回string所占的字节数
Char_length (String) #返回string中的字符个数
Statistics of the number of characters, do not distinguish between Chinese characters or letters or numbers, also with the character set is not related, if the statistics are the number of bytes, the characters are Chinese characters, letters or number types, and the character set together.
Note:
All of our MySQL databases will be UTF8 encoded, so a kanji account is 3 bytes, and the English input method has a byte or number of letters.
* String function
String Judgment function
if (EXP1,EXP2,EXP3): If EXP1 is true, return to EXP2, if EXP1 is false, return to EXP3;
Ifnull (EXP1,EXP2): If EXP1 is not NULL, return EXP1, otherwise return EXP2;
Nullif (EXP1,EXP2): If EXP1=EXP2, return null, otherwise return EXP1;
* String substitution function
LTRIM (EXP1): Remove the space at the beginning of the string in EXP1
RTRIM (EXP1): Remove the space at the end of the string in EXP1
TRIM (EXP1): Remove the opening and closing spaces in the EXP1
TRIM (EXP2,EXP1): Remove strings that exist in Exp1 EXP2
* String Lookup function
Substring_index (Exp1,delim,count): Exp1 is a string, Delim is a split symbol, and count represents the first few style symbols, for example: Substring_index (' yto56.com.cn ', '. ', 1), return: yto56
SUBSTRING (Exp1,pos,len): Exp1 is a string, pos is position, Len is length, for example: SUBSTRING (' yto56.com.cn ', 1,9), return: yto56.com
LOCAL (SUBSTR,STR): Find substr in str first position, example lcoal (yto56.com.cn ', '. '), return: 6
* Other String functions
UPPER (EXP1): Converts the string exp1 to uppercase;
LOWER (EXP1): Converts the string exp1 to lowercase;
* Date manipulation function
Date format Conversion Functions:
Converts a string to a date, date_format () or str_to_date (),
The two functions are formatted as follows:
Date_format (Expr1,format)
Str_to_date (EXPR1, format)
The usual date format YYYY-MM-DD HH:MM:SS corresponds with format%y-%m-%d%h:%i:%s
* Date arithmetic function
Curdate (): Return date of day, format: YYYY-MM-DD
Now (): Returns the present datetime, format: Yyyy-mm-dd HH:MM:SS
Date_add (Date,interval expr unit): expr expression, unit, for example: Date_add (' 2013-08-20 ', INTERVAL-1 Day), return: 2013-08-19
* Date arithmetic function
MONTY (date): Returns the number of months of date, such as MONTY (' 2013-08-20 '), returning: 8
Day: Returns the number of days of date, such as day (' 2013-08-20 '), which returns: 20
Year (date): Returns the number of years of date, for example (' 2013-08-20 '), returning: 2013
To_days (date): date corresponding to the number of days, for example to_days (' 2008-10-07 '), return: 733687
WEEK (date): Returns the date corresponding to the week, for example: WEEK (' 2013-08-20 '), return: 2
DATEDIFF (DATE1,DATE2): Returns the number of days difference of date1-date2, for example: DATEDIFF (' 2013-08-20 ', ' 2013-08-18 13:30:14 '), return: 2
* Type conversion function
Cast (expr as type): expr converts to a type
Convert (expr, type): expr converts to a type
Convert (expr USING transcoding_name): expr character set converted to a character set
* Character Set
For wireless application user-selectable portions, it is recommended to use the UTF8 variant, which is UTF8MB4, four bytes storage. In other cases, the default uniform uses the character set as UTF8, and the character set collation rule is utf8_general_ci:
A Chinese character occupies 3 bytes
A symbol, character, and number in the Chinese input method, the storage space is 3 bytes
A symbol, character, and number in the English input method, the storage space is 1 bytes
Case sensitivity is not recommended to set the checksum rule utf8_bin, but the data type VARBINARY
n Interpretation in CHAR (n) or varchar (n)
MySQL database these two types of string definitions when the length of n, not the meaning of the number of bytes, but the meaning of the number of characters, for example: CREATE tabel gl_user (username VARCHAR (40)); The username can store a maximum of 40 characters.
(3), transaction ISOLATION LEVEL
Transactions have four characteristics: atomicity (atomiocity), consistency (consistency), isolation (isolation), and persistence (durability), which are referred to as the ACID properties of the four properties.
* An atomic transaction is a logical unit of work for a database, all operations included in a transaction are either done, or none of the results of a consistent transaction execution must be to change the database from one consistent state to another consistent state
* Isolation the execution of a transaction cannot be disturbed by other transactions. That is, the operations within one transaction and the data used are isolated to other transactions, and the transactions performed concurrently do not interfere with each other.
* Persistent when a transaction is successfully committed, the modification of the data in the database is persistent. Next other operations or failures should not have any effect on their execution results
(4), transaction ISOLATION LEVEL
* Read Uncommited Select allows dirty reads, that is, select reads other transaction modifications without committing the data
* Read commited Select can not be repeated reading, that is, the same transaction two times the same query, if the time between the first and second query period, the other transaction just modified its query data and submitted, two read the data inconsistent
* Repeatable Read Select can be repeated, that is, the same transaction executed two times the same query statement, the resulting data is always consistent
* The only difference between SERIALIZABLE and repeatable reads is that by default the normal SELECT statement is changed to select .... LOCK in SHARE MODE. That is, the data involved in the query statement plus shared locks, blocking other transactions to modify the real data
(5) Lock mechanism
MySQL database, currently mainly 2 kinds of transaction engine: NDB distributed storage engine, InnoDB storage engine
NDB Distributed Storage Engine
Only transaction ISOLATION LEVEL supported: Read commited, row lock available
InnoDB Storage Engine
Supports four isolation levels, where the default isolation level is repeatable READ
The implementation of the lock is implemented by an index, and an update or delete without an index can be upgraded to a table lock
Isolation level, also known as data visibility, is implemented through MVCC
Ii. SQL optimization and indexing
Knowledge points mainly include: index of the principle of the function of the index optimization of order subquery optimization paging optimization left connection right and wrong index overlay Technology index creation guidelines
(1), the principle of the index
In fact, it is similar to a dictionary, the Directory of the dictionary is an analogy to the index, through the directory query specific content analogy to the record in the table;
(2), the role of the index:
* Go to Heavy
* Speed up Find location
* Avoid sorting
* Overlay Index
(3), the Order of optimization
Business optimization first, next design and program optimization, final database and SQL optimization, etc.
SQL, index, and database-side optimizations are processes that need to be rebalanced as data grows and data is distributed
(4), sub-query optimization
Original SQL statement
Select ID from t_exp_op_record_taking_success WHERE ORDERID not in (SELECT ORDERID from T_exp_op_record_taking_success WHE RE Aux_op_code in (' NEW ', ' UPDATE ', ' DELETE ') and Exp_type = Ten and Op_code in (310,311)) LIMIT 10000;
Optimized SQL statement SELECT ID from t_exp_op_record_taking_success WHERE aux_op_code not in (' NEW ', ' UPDATE ', ' DELETE ') OR exp_type! = Ten OR Op_code not in (310,311) LIMIT 10000;
Summary reduction of one full table scan reduces back to table read data 10,000 times only need to scan the table, the record value satisfies the condition reaches 10,000 to stop scanning
(5), the left join of the answer
Correct wording: SELECT m.columnname......,n.* columnname ..... From left_table m left JOIN right_table N in M. Columnname_join=n. Columnname_join and N. columnname=xxx WHERE M.columnnam E=xxx and M. ...
Interpretation: The ON clause joins the condition in M. Columnname_join=n. Columnname_join, which is used to connect the equivalent records in two tables, but does not affect the number of recordsets. If a record in table left_table cannot be found in the table right_table, the record is still displayed in the result set, and the table right_table the values of the columns that need to be displayed in the query result set are substituted with null;
The ON clause joins the table right_table in the condition. Columnname=xxx is used to control whether the Right_table table has the required column values or is displayed in the query column in null substitution, without affecting the number of recordsets;
The WHERE clause controls whether the record conforms to the query request, does not conform and then filters out; one of the common mistakes: SELECT m.columnname......,n.* columnname .... From left_table m left JOIN right_table N in M. Columnname_join=n. Columnname_join and N. Columnname=xxx and M.columnname= XXX;
(6), Index coverage technology
Index overrides the condition of the SQL statement where the condition is part of the index the SQL statement's order or group field is part of the index. The value returned by the SQL statement or the computed metadata column is part of the index. After the index is overwritten, all operations are indexed by index completion, You do not need to access the metadata sample idx_uid_gmtcreate (uid,gmt_create) SELECT uid, gmt_create from tablename WHERE uid>100;
(7), Index creation criteria
The number of indexes in a single table does not exceed 5
The number of fields in a single index does not exceed 5
Use prefix index for string, prefix index length not exceeding 10 characters
Index structure and statistical information maintenance are costly to maximize index reuse
Filter rate for each field in the composite index
The sequential information of the compound index field appearing in all SQL, mainly refers to the where, GROUP by, and ORDER BY clauses
The pre-and post-order of each field in a composite index is extremely important
Third, the database specification recommendations
Knowledge points include: Naming specification, field definition specification , Application account permissions, usage specification of SELECT *, adding function usage specifications on fields, paging query specification, data type conversion specification, DML statement Availability index, product development and on-line process recommendations
(1), naming specification
* Index Name:
MySQL index name capitalization causes performance problems when the optimizer makes an index selection, the index naming convention is all uppercase or all lowercase, the index name begins with Idx_ or IDX_, the end is _u (or _u), or _n (or _n), where: _u or _u represents a uniqueness index, _n or _ n represents the normal index, the default recommended index name lowercase format
* Object naming and capitalization:
Improve the efficiency of developers, reduce the cost of research and development, avoid the failure point of the program, database objects: Database name, table name, stored procedure name, custom function name, trigger name, view name, and task schedule name are set by default to case-insensitive, but it is recommended to use the uniform lowercase
The library name, table name, field name are recommended lowercase letters and are split with underscores
The library name, table name, field name are forbidden for more than 40 characters. Must be known as a name
Library name, table name, field name prohibit use of MySQL reserved word temporary library, table name must be prefixed with TMP and date suffix
Backup library, table must be prefixed with bak and date suffix
* Field Definition Specification
Field properties are forced to be defined as not NULL
The default value for a field property of a string type is a space, or '
field property of numeric type default value is 0 Time Type field property default value is ' 1970-01-01 08:00:01 '
If the date of birth is a useful field, the default value is a space, or the data constraint of the field is unique except for uniqueness, other constraints are guaranteed by the application
(2), Application account permissions
Normal permissions in order to reduce the risk of data security, the application account is granted only four database permissions: Select, INSERT, UPDATE, DELETE for read-only types of applications, the permissions that should be granted to read-only accounts are: select;
Special permissions Given that we have an event, PROCEDURE, and may need to authorize the relevant permissions, the permissions granted are: EXECUTE;
(3), SELECT * The use of the Code
Basic principle: Use "*" to list all fields in an unnecessary query, and when a group by or order by is present, it is forbidden to use SELECT * To remove all fields at once. For JOIN statements with table joins, SELECT * is forbidden to query. A table with a text field that is also forbidden to query with SELECT * when it is not necessary to remove the text field.
Detailed Description:
1> when a group by or order by is not allowed to use SELECT * is to ensure that MySQL is able to use the latest optimized sorting algorithm
The 2> JOIN statement does not allow SELECT * to prevent queries that simply need to be indexed to be returned to the table
3> has the Text field table and does not allow SELECT * when it is not necessary to remove the text field, because the text field is stored in a physical location that is not the same as the normal record, resulting in a large number of IO operations
4> SELECT * Increases CPU, IO overhead, and consumes more network bandwidth, and also affects performance
5> Avoid program bugs caused by adding or deleting fields without modifying related SQL and related program code, and disable SELECT *
(4), add Function usage specification on field
Basic principle: Prohibit the conversion of a type or format using any function on a filter field that appears in the Where condition, and it is a good practice to convert the value of the incoming comparison into a column type.
Wrong wording: SELECT uid, gmt_create from Gl_user WHERE UPPER (username) =upper (' yto56.com ');
Correct wording: SELECT uid, gmt_create from Gl_user WHERE username= UPPER (' yto56.com ');
(5), paging query specification
Basic principle: Paging query statements all need to be sorted, unless the business party explicitly requires that no sorting be used to randomly present the data. Detailed Description:
1> Regular Paging statements (start: Number of start Records, Page_offset: Records per page): SELECT id,username from Gl_user WHERE username like '%@163.com ' ORDER by M. Gmt_create LIMIT start, page_offset;
2> a paginated statement for a multi-table join, if the filter condition is on a single table, it needs to be paged first, then join low-performance notation: SELECT m.username,p.rolename from gl_user m INNER Join gl_role P on M. Id=p.userid WHERE username like '%@163.com ' ORDER by M.gmt_create LIMIT start, page_offset; High-performance notation: Select M.username,p.rolename from (select Id,username from Gl_user WHERE username like '%@163.com ' ORDER by M.gmt_c reate LIMIT Start, page_offset) m,gl_role P WHERE M.id=p.userid;
This is written on the premise that record one by one corresponds to the associated table, otherwise the number of records that may be returned is less than or more than the Page_offset value.
(6), data type conversion specification
Basic principle: In all Query's where condition must use and filter the field exactly consistent data type, eliminates any implicit type conversion, avoids causes the query execution plan error because of the data type mismatch, causes the performance problem.
Detailed Description:
1> the use of a function for type conversion is not allowed on all fields in the Where condition, and if you need to convert the type, you can only convert the filter value, not the converted field.
2> the most common implicit type conversions are common between time types and string types, and it is recommended that all time type fields be passed in ibatis with a time type, or as a string, and then convert the string to a valid time format by a time function, as follows: SELECT * from member WHERE gmt_create=date_formate (' 2009010101:02:03 ', '%y-%m-%d%h:%i:%s ');
3> in table join Query, if the data types at both ends of the join condition are inconsistent, you must ensure that the connection condition data type of the driver table is converted to a data type that is consistent with the driven table.
(7), DML statements have indexes available
The *DML statement refers to: INSERT, update, and delete, this article refers to the following 2 types of operations
(2) InnoDB storage Engine locks with index, no index causes upgrade to table lock
(3) DML is upgraded to a table-level lock, and an exclusive lock is added, which can clog other business of the same table
(8), product development and on-line process recommendations
Iv. Database Design
Knowledge points include: field type Storage engine primary key selection main sub-table Design large section separation hot and cold data separation
(1), field type
The value of the field is as digital as possible, and the data type is integer
The length of the storage space of the field, using the data type that is short and meets the requirements
Time type use timestamp as much as possible, except for date of birth using datetime
Try not to use data types: TEXT and BLOBs
String type try to use varchar instead of char
Replace the float and double types with the decimal type
(2), storage engine
* InnoDB Storage Engine:
Support Transactions support row and table lock data security strong BP support simultaneous cache metadata and index data support MVCC support B + Tree, only adaptive index for hash support full-text search;
* MyISAM Storage Engine:
KB only cached index data supports only table locks and does not support transactional support for full-text retrieval of geodata storage;
* Memory Storage Engine:
Metadata and indexes are cached in memory that do not support transactions, only table locks are supported support for data persistence supports B + tree and hash index;
* Sphinx supports full-text indexing:
Support for Chinese word segmentation support to create full-scale indexes and incremental indexes to support distributed deployment;
(3), primary key selection
(4), main sub-table design
Example of receipt data for a front-facing database
T_exp_op_record_sgnature
T_exp_op_record_sgnature_deal
T_exp_op_record_sgnature_fail
T_exp_op_record_sgnature_success
T_exp_op_record_sgnature_success_02
T_exp_op_record_sgnature_success_03
T_exp_op_record_sgnature_unsuccess
T_exp_op_record_sgnature_success1
T_exp_op_record_sgnature_update
The number of summary fields is up to 56, the minimum is 54 fields, the same number of fields is 54 for each table field value: 52 The same field value exists in one table, and the secondary table records 8 types of expanded data
(5), large print segment separation
varchar Type field is less than 255 bytes in length and 1 bytes marked
varchar Type field is longer than 255 bytes, 2 bytes marked
Text Type field, 1-4 byte mark, 8 bytes as pointer
The value of the text Type field is stored in the Invisible column
Summary: Large print segments are generally stored separately in a single table, and as much as possible with varchar instead of text
(6), hot and cold data separation
Suitable for hot and cold separation scenarios:
Table data fields have a large number of infrequently accessed and partially frequently accessed fields
The table's data fields have frequently changed fields and lots of very few modified fields
The table's data fields are displayed together in a small amount or low frequency
How to Detach: store data with low frequent access and access in 2 tables, associated with a table's primary key ID
Store frequently modified fields and rarely modified fields in different tables
Increase data access speed and data modification speed to reduce IO operations
Note: The hot and cold separation of the scene must be combined with business characteristics and actual data manipulation situation
V. Database architecture
Knowledge points include: Dual master replication architecture a master multi-slave architecture distributed database pseudo-distributed database
(1), Dual master replication architecture
(2), one master multi-slave architecture
(3), distributed database
(4), pseudo-distributed database architecture