I. Problems with MySQL
Optimizer does not support complex SQL
Poor support for SQL standard
Large-scale cluster scheme is immature, mainly refers to middleware
ID generator, global self-increment ID
Asynchronous logical replication, data security issues
Online DDL
Ha scenario is not perfect
Backup and recovery scenarios are more complex and need to rely on external components
Show too little information to the user, difficult to troubleshoot problems
Many branches that make it difficult to choose.
Ii. Introduction of the database environment
Generally speaking, the databases of each Internet company are divided into 5 database environments:
Dev : Development environment, development of read-write, can modify table structure; 163 of the commonly used database tables; Developers can modify the table structure, you can modify the data therein; But you need to ensure that other development colleagues are not affected
QA : Test environment, development readable, developer can modify table structure by tool
SIM: Simulation environment, development can read and write, through the Web platform, when initiating an on-line request, will first in this environment pre-execution, this environment is also available for deployment on-line drill or stress test use can read and write
Real: Production database from the library (quasi-real-time synchronization), read-only environment, does not allow the modification of data, does not allow the modification of table structure; For online problem finding, data query and other use
Online: Online environment; developers are not allowed to perform database operations directly on-line, and if required, they must be operated on and logged by the DBA
These environments of the machine, must be clear, read and write account separation, and identification, can distinguish the specific business. For example, user name W_wap, R_wap can see, read and write account is WAP application
Third, the database development code
The development specification itself also contains several parts: basic naming and constraint specification, field design specification, index specification, usage specification, etc.
Normative existence meaning
Guaranteed Online Database Schema specification
Reduce the probability of a problem
Easy and automated management
Specifications need to persist for a long time, for development and DBA is a winning thing
Constraint specification
Table Character Set selection UTF8, if need to store emoj emoticons, need to use UTF8MB4 (MySQL 5.5.3 later support)
The storage engine uses InnoDB
Variable length string use varchar VARBINARY as much as possible
Do not store pictures, files in the database
- You need to add comments when designing a table
Single-table data volume control under 100 million, single table physical size not more than 10GB, line average length of not more than 8KB
Prohibit online database pressure test
Prohibit direct database from test, development environment
Create DATABASE Test_crm default character Set=utf8;
Basic Naming conventions
Library name, table name, field name prohibit use of reserved words
The library name, table name, field name, index name use lowercase letters, the following lines are split, need to see the name of the understanding
Library name, table name, field name, index name do not design too long, prohibit more than 32 characters, as far as possible with a minimum number of characters to express the purpose of the table
Temporary library, temporary table name must be prefixed with TMP and date suffix
- Backup library, table must be prefixed with bak and date suffix
Library name, table name, field name, index name use noun as database name, and only in English, no Chinese pinyin
The library name uses the English alphabet, all lowercase, controls within 3-7 letters
Library name if you have more than one word, use an underscore to separate it, not the hump named
Specification of sub-table
Prohibit the use of partitioned tables
Split large and low-frequency fields, separating hot and cold data
Hash table using hash, table name suffix using decimal number, subscript starting from 0
According to the period of time table should conform to YYYY[MM][DD][HH] format
Using the appropriate sub-database table strategy
Field specification
All fields are defined as NOT NULL, unless you really want to save null, but I don't think I need null.
The smaller the field type is in satisfying the requirement, the better, the use of unsigned to store nonnegative integers, the actual use of the storage negative number of scenes are not many
With timestamp storage time, use unsigned int to store IPV4 addresses instead of char (15), which only stores IPv4 and cannot store IPV6
Using varchar to store variable-length strings, it is important to note that the M in varchar (m) refers to the number of characters is not the number of bytes;
Use decimal instead of float and double to store exact floating-point numbers
As much as possible without BLOB TEXT
Use tinyint instead of enum type to convert characters to numbers
Prohibit storing plaintext passwords in the database
Use varbinary to store case-sensitive variable strings
Index specification
The number of single indexed fields does not exceed 5, the number of single-table indexes does not exceed 5, and the index design follows the principle of the leftmost prefix matching B + Tree index
Select a high-sensitivity column as an index, with a high degree of sensitivity placed in front
- Use prefix index for string, prefix index length not exceeding 8 characters
It is recommended that you prioritize the prefix index, add pseudo-columns if necessary, and make an index
The established index can cover 80% main queries, not perfection, and solve the major contradiction of the problem
DML and order by and group by fields to establish an appropriate index
Avoid an implicit conversion of an index
Avoid redundant indexing
- About PRIMARY key: Table must have primary key; Do not use updated columns; Do not select string columns; no UUID MD5 hash; default use of non-null unique keys, we recommend that you choose the self-increment or the generator
Important SQL must be indexed: the Where Condition column of the UPDATE, delete statement; fields for ORDER by, GROUP by, distinct; fields for multi-table joins
Core SQL overrides the overriding index
Do not index on low cardinality columns, such as "gender"
Do not perform mathematical operations and function operations on indexed columns
Try not to make it. Foreign keys, which are used to protect referential integrity, can be implemented on the business side, interacting with fathers and child tables, and reducing usability; InnoDB itself limits online DDL
A query, such as "%ab", that does not make the% leading
Do not use negative queries, such as not in/like "cannot use index, resulting in full table scan
Implicit conversion example, the field is defined as varchar, but the incoming value is an int, which results in a full table scan, requiring the terminal to do type checking
Field: Remark varchar (not Null)
Mysql>select ID, gift_code from gift WHERE deal_id = 640 and Remark=115127;1 row in Set (0.14 sec) Mysql>select ID, g Ift_code from Pool_gift wheredeal_id = 640 and remark= ' 115127 '; 1 row in Set (0.005 sec)
SQL class Specification
-
Make precompiled statement, pass parameters only, more efficient than pass SQL statements, reduce SQL injection probability
-
Make full use of prefix index
-
Try not to use stored procedures, triggers, functions, etc. Let the database do what's best for you
-
Avoid using large table Join,mysql optimizer for join optimization policy too simple
-
Avoid math and other computational tasks in the database
-
SQL Merge, mainly refers to DML when multiple value merges, decreases and database interaction
-
Reasonable paging, especially for the Oita page
-
UPDATE, DELETE statements do not use limit and are prone to master-slave inconsistencies
-
Use in instead of or,in for a value of less than 1000
-
Prohibit the use of order by rand ()
-
SQL statements avoid using temporary tables
-
Use UNION ALL instead of union
-
Program should have a handling mechanism for catching SQL exceptions
-
Prevents single SQL statements from updating multiple tables at the same time
- When reading data, select only the required columns, do not select * Every time, avoid serious random read problems, especially read some text/blob columns
-
Typically, subqueries have poor performance and suggest a change to join notation
-
Multi-table join query, the associated field type is as consistent as possible and must have an index
-
Multi-table join query, a table with a small result set (note that this refers to the filtered result set, not necessarily a small amount of data in the whole table) as the driver table
-
When a multiple table join is sorted, the sort field must be the driver table, otherwise the column list cannot be used for the index
-
Multi-use composite indexes, less multiple independent indexes, especially if some cardinality (cardinality) is too small (for example, the column has a total of less than 255 unique values) do not create a separate index
-
A paging-like SQL that is highly efficient by first associating with a primary key and then returning a result set
Iv. DBA Specification Version selection
MySQL Community Edition, the largest user group
MySQL Enterprise Edition, fee
Percona Server Edition, many new features
MARIADB version, not many domestic users
Recommended selection priority is: MySQL Community edition > Percona Server > MariaDB > MySQL Enterprise Edition
Main content
SQL Audit, DDL Audit and operation time, especially ONLINEDDL
High-risk operation check, before drop Do data backup
Log analysis, mainly refers to the MySQL slow log and error log
Data Backup Scenarios
Online DDL
Native MySQL requires a lock table when executing the DDL, and the business is unable to write data during the lock table, the service has a large impact on the services, MySQL support is relatively poor
Recommended use of Pt-online-schema-change
The advantages of using Pt-online-schema-change are:
The limitations of using Pt-online-schema-change are:
Change table time will be longer (compared to direct ALTER TABLE)
The modified table requires a unique key or primary key
Concurrent modifications on the same port are not too many
MySQL Cluster scenario
Priority recommendation MHA: can adopt a master multi-slave , or dual master multi-slave mode, this mode can be used MHA or MMM to manage the entire cluster, the latest MHA has also supported the MySQL 5.6 gtid mode
The advantages of MHA are obvious:
Open source, with Perl development, code structure clear, two times development easy;
When the scheme is mature and failover, MHA will make more stringent judgments, minimize data loss and ensure data consistency.
Provide a general framework to customize the development according to your own situation, especially the judgment and switch operation steps;
Support for Binlog server can increase binlog transfer efficiency and further reduce the risk of data loss.
But MHA also has some limitations:
The need to get SSH trust across nodes is a challenge for some corporate security systems, because if a node is compromised by hackers, other nodes will suffer as well;
The script that comes with it also needs to be supplemented, of course, the general use is sufficient.
Split problem
Troubleshooting single-machine write pressure and capacity issues
There are two ways of vertical splitting and horizontal splitting
Split to moderate, do not over-split
It is best to have a middle-tier control split logic, otherwise the cost of splitting the administration is high
Data backup
Full-scale backup VS incremental backup
Hot standby VS Cold standby
Physical Backup VS Logical Backup
Delayed backup
Full-Volume Binlog backup
Suggested Way:
Hot standby + Physical backup
Core business: Delayed backup + logical backup
Full-Volume Binlog backup
The main points to do:
Centralized scheduling management for backup policies
Xtrabackup Hot Spare
Statistical analysis of Backup results
Backup Data Consistency check
Storage backup with Distributed File system
The backup system uses Distributed File system reasons:
Troubleshoot storage allocation issues
Troubleshooting Storage NFS Backup inefficiencies
Storage Centralized Management
Better data reliability
Some of the specifications used by the MySQL database