I have a table w1000, there are 10 million data, this table structure is as follows:
CREATE TABLE ' w1000 ' (
' ID ' varchar ($) Not NULL,
' Name ' varchar () DEFAULT NULL,
' Age ' int (3) DEFAULT NULL,
' Money ' double (8,2) DEFAULT NULL,
' Address ' varchar DEFAULT NULL,
' Create_date ' datetime (3) DEFAULT NULL,
' Modify_date ' datetime (3) DEFAULT NULL,
PRIMARY KEY (' id '),
KEY ' index_modify_date ' (' modify_date ') USING BTREE
) Engine=innodb DEFAULT Charset=utf8;
There is nothing to say in table fields, but to choose a variety of different field types.
The highlights are in Engine=innodb, so the table is also available for which engine to choose, not only for database creation. So here's the question:
One, the table why choose InnoDB Engine
Please refer to: http://blog.csdn.net/com360/article/details/7692117
1 ISAM
ISAM is a well-defined and time-tested form of data management that, at design time, takes into account that the number of times the database is queried is much larger than the number of updates. As a result, ISAM performs read operations quickly and does not consume large amounts of memory and storage resources. The two main disadvantages of ISAM are that it does not support transactional processing or fault tolerance: If your hard drive crashes, the data file cannot be recovered. If you are using ISAM in mission-critical applications, you must always back up all of your real-time data, and with its replication features, MySQL can support such a backup application.
2 InnoDB
It provides transaction control capabilities that ensure that a set of commands are executed successfully, or that the results of all commands are rolled back when there is an error in any command, and it is important to imagine the transaction control capability in an electronic bank. Supports commit, rollback, and other transactional features. The latest version of MySQL is already planning to remove support for BDB and to develop InnoDB in full.
PS: That is, when you choose ISAM as the data engine for MySQL, the spring database thing management configuration in your app will fail completely because it doesn't support things at all!
See which engines support things
Show engines; In the MySQL5.7, there is support for many engines, but there is only one engines to support things:
Show variables like '%storage_engine% '; View the MySQL default storage engine.
The importance of things, the acid properties of things
The transaction has an acid attribute (ref. http://www.cnblogs.com/jerryxing/archive/2012/04/24/2468986.html):
o atomicity (Atomic): A transaction is composed of one or more behaviors, as if it were a separate unit of work. Atomicity ensures that all operations in the transaction either occur or do not occur.
o Consistency (consistent): Once a transaction has ended (whether successful or not), the state of the system is consistent with its business rules. That data should not be compromised.
o Isolation (Isolated): Transactions should allow multiple users to manipulate the same data, and one user's actions will not be confused with the actions of other users.
o Persistence (Durable): Once the transaction is complete, the result of the transaction should be persisted.
The ACID properties of a transaction are implemented by a relational database management system (RDBMS).
o The database management system uses logs to ensure the atomicity, consistency, and durability of transactions. The log records the updates that the transaction makes to the database, and if a transaction has an error during execution, it can undo the updates that the transaction has made to the database based on the log, returning the database to the initial state before the transaction was executed.
o The database management system adopts the lock mechanism to realize the isolation of the transaction. When multiple transactions update the same data for the database at the same time, only the transaction holding the lock can update the data, and the other transaction must wait until the previous transaction releases the lock, and the other transaction has the opportunity to update the data.
Second, modify the My.ini, optimize the MySQL database configuration
My local installation directory C:\ProgramData\MySQL\MySQL Server 5.7
Find the My.ini in the directory, the parameters are optimized, the following = The parameters below are the parameters I have modified, that is, locally optimized.
datadir= Modify the actual MySQL Data directory storage address, in order to avoid the default to save to C drive phenomenon
max_connections=1500 increase the number of database connections. Changed from the original 151 to 1500
query_cache_size=2m increase the query cache. If you query the same SQL many times, then execute SQL again directly from the cache, if your table changes frequently or SQL changes frequently, it may slow down the query speed, use caution.
tmp_table_size=200m The maximum value of this value is the memory value, and if a table data is greater than this value, it is automatically converted to disk-based tables. The limit for this parameter is for the table.
key_buffer_size=0m This value is only for the table engine when MyISAM. In general, we build tables and databases are for things to support, so the default is InnoDB.
The innodb_flush_log_at_trx_commit=1 has a value of three 0,1,2.
When the value is 0 o'clock, the log is written to the file about every second, and the file is flushed to disk, that is, the log file is saved.
When the value is 1 o'clock, innodb each commit operation flushes the thing log to disk, that is, log persistence, such as a log file.
A value of 2 o'clock indicates that the log file is written to each commit, but the log flush to disk operation is performed approximately every second.
innodb_log_buffer_size=5m InnoDB log data buffer size, if the InnoDB is set to Flash to disk every second, then this value is not very significant, even if it is a long thing.
Cond......
third, optimize SQL before
Let's take a look at the non-optimized sql,--followed by this SQL execution time
SELECT * from w1000 limit 0,1550; --0.015s
SELECT * from w1000 where name= ' Orient Genshi '; --35.355s First time query
SELECT * from w1000 where name= ' Orient Genshi '; --The query cache (query_cache_size) takes effect when the second query is 0.001s
SELECT * from w1000 where id = ' 0000c51e-0ede-4dc2-871a-78c1eebd0041 '; --0.003s query condition as primary key field
SELECT * from w1000 where age = n or age = 20; --32.234s
SELECT * from w1000 where age! = 20; --75.222s access violation at address 00007ffe6fcc0bd0 in module ' Scilexer.dll '.
SELECT * from w1000 where modify_date are NOT null; --73.387SS access violation at address 00007ffe6fcc0bd0 in module ' Scilexer.dll '.
SELECT * from w1000 where name is like ' Zhang% '; --25.519s
SELECT * from w1000 where age in (20,21); --34.47s
SELECT * from w1000 where age/2 = 50; --30.352s
SELECT * from w1000 where SUBSTRING (name from 1 for 1) = ' Zhang '; --26.091s
Red is navicat error after execution, because the query result set data is too large, has exceeded the navicat memory use.
In the above SQL, I have not defined any table indexes, nor changed the database isolation level, only defined the table engine as InnoDB. All others are default.
Notice what we need to optimize now? Is the query speed, not the insertion speed, nor the update speed.
so all of the following solutions are just for how to make queries faster.
Iv. Selecting the appropriate database isolation level
At the time of the recent interview, I was often asked what is the isolation level of the database?
Answer reference: http://blog.csdn.net/gaopu12345/article/details/50868501 The following to reprint the article:
(0) No concurrency control
The database certainly supports no isolation level, which is the fastest query.
But there is a problem: the update is lost.
The workaround is the following "readable unacknowledged".
(i) Readable non-acknowledgement (READ UNCOMMITTED)
Write transactions block other write transactions, avoiding updates being lost. But no other read transactions are blocked.
Existing problem: dirty read. The incorrect data is read, because another transaction may not have submitted the final data, and the read transaction reads the halfway data, which may be incorrect.
The solution is the following "readable acknowledgment".
(ii) readable acknowledgement (Read Committed)
Write transactions block other read and write transactions. Read transactions do not block any other transactions.
Problem: Non-repeatable read. That is, between a transaction, two reads, but the results are different, it may be the first time ID 1 person called "lie Triple", the second reading of the ID 1 is called "John Doe". Because the read operation does not block other transactions.
The solution is the following "repeatable read".
(iii) REPEATABLE reading (REPEATABLE read)
Read transactions block other write transactions, but do not block other read transactions.
Existing problem: Phantom reading. Repeatable read-blocked write transactions include update and delete (lock only on existing tables), but do not include insert (new row does not exist, so there is no way to locking), so a transaction may read 10 records in the first read, but the second may read to 11, which is the Phantom read.
The solution is the following "serialization".
(iv) serializable (Serializable)
Read plus share lock, write plus exclusive lock. So the read transaction can be concurrent, but read and write, writes the transaction is mutually exclusive, basically is executes the transaction, therefore is called the serialization.
Summary: What if we need the database to support things and the database isolation level to switch to the fastest query mode?
It appears that you can select READ UNCOMMITTED, but it is not recommended.
The default isolation level of the database (repeatable read) is generally maintained.
View and set the database isolation level
In the MySQL5.7 version, the default database isolation level is repeatable-read, which is generally not changed in application development.
SELECT @ @tx_isolation; --View the current session database isolation level
SELECT @ @global. tx_isolation; --View the global database isolation level
--Set Current session thing isolation level
Set session transaction ISOLATION level READ uncommitted;
Set session transaction ISOLATION level READ COMMITTED;
Set session transaction ISOLATION level repeatable READ;
Set session transaction isolation level SERIALIZABLE;
--Set the global object isolation level
Set global transaction ISOLATION level READ uncommitted;
Set global transaction ISOLATION level READ COMMITTED;
Set global transaction ISOLATION level repeatable READ;
Set global transaction isolation level SERIALIZABLE;
The row lock inside the table-pessimistic lock of the database: relationship between for update and autocommit
row lock-pessimistic lock using for update
use for update to do a row lock that matches a condition, not a table lock, which makes the query faster.
A row lock, such as this: SELECT * from w1000 where name = ' Zhangsan ' for update; --You want to query all row data for name = ' Zhangsan ' and add row locks to those rows.
But when you execute the next UPDATE statement: Update w1000 set age = WHERE name = ' Zhangsan ' finds no execution succeeded.
Why? Because your autocommit is turned on by default! So once the select * from w1000 where name = ' Zhangsan ' for update, this sentence is executed, wait until the following update is executed, it is finished!
So what? Change autocommit to OFF. Here's how the autocommit is set up(0 is off, 1 is turned on)
Set autocommit
SHOW VARIABLES like '%autocommit% '; --View the autocommit status in the current session
SET @ @autocommit = 0; --set Auttocommit off status in current session off
SET @ @autocommit = 1; --Set Auttocommit open status on in current session
SHOW GLOBAL VARIABLES like '%autocommit% '; --View global autocommit status
Set GLOBAL init_connect= ' Set autocommit=0 '; --Set global auttocommit off state off
Set GLOBAL init_connect= ' Set autocommit=1 '; --Set global auttocommit open state on
Complete Row Lock Example(Refer to http://blog.csdn.net/qq_36617521/article/details/53379846)
Set autocommit=0; --Turn off automatic submission of things, must explicitly commit to commit things
Begin /begin work; /start transaction; --Start the transaction (choose one of the three to do it)
Select status from w1000 where id= ' XXX ' for update; The rows that match the query criteria are locked, and if the whole table is the same, then the table lock, so the SELECT * from table_name is meaningless.
Update w1000 Set age = where id = ' XXX '; --to lock the line update, or to take the lock row ID to do other inserts can
Commit;/commit work; --because we shut things down, we need to explicitly commit things manually.
Set autocommit=1; --Open things automatically submit
PS: This section of SQL can generally be written in the stored procedure, if the frequency is very high, it is easy to execute such a series of commands through Java, it is time-consuming
Vi. Adding an index
It is already a common understanding that the index of a column can increase the speed of SQL queries that are listed as query criteria.
To optimize queries, avoid full table scans, and first consider indexing on the columns involved in where and order by.
MySQL indexing requires four items to be specified:
1, index name (custom index name)
2, index column (specifies which column to index on)
3, index type (normal normal index, unique value does not allow duplicate index, fulltext full-text index, for longer columns, in mysql5.7 version, support InnoDB and MyISAM, official reference https://dev.mysql.com/ doc/refman/5.7/en/fulltext-search.html)
4, the index method (divided into btree and hash two, btree for use in the comparison operators such as =,>,>=,<,<= and between and not the beginning of the fuzzy matching query, hash can be used on the character column)
MySQL automatically establishes the primary key index :
I set up the index_modify_date for w1000 manually, but at show index from w1000; When you view the full index of a table, you find that the primary key has been automatically created.
Information description for columns in index information(Refer to http://blog.51yip.com/mysql/1222.html)
Table: The names of the tables.
Non_unique: 0 If the index cannot include a repeating word. 1 if it is possible.
Key_name: The name of the index.
Seq_in_index: The column sequence number in the index, starting at 1.
COLUMN_NAME: Column name.
Collation: How the columns are stored in the index. In the Mysqlshow index syntax, there is a value of ' A ' (ascending) or null (no classification).
Cardinality: An estimate of the number of unique values in the index. You can update by running analyze table or myisamchk-a. The cardinality is counted according to the statistics stored as integers, so even for small tables, this value is not necessarily accurate. The larger the cardinality, the greater the chance that MySQL will use the index when it is federated.
Sub_part: The number of characters that are indexed if the column is only partially indexed. Null if the entire column is indexed.
Packed: Indicates how the keyword is compressed. Null if it is not compressed.
Null: Contains Yes if the column contains null. If not, it is empty.
Index_type: Storage Index data structure method (BTREE, HASH)
Optimize Table Usage
If the data is heavily altered, and the table is not optimized, you can use optimize table table_name; To optimize the table's index ordering.
The default does not support InnoDB, need to be modified, I did not verify whether the modified support InnoDB.
Reference http://blog.csdn.net/e421083458/article/details/39522513
Index Build Example(Refer to http://www.cnblogs.com/bk7788/p/6050184.html)
1. Add primary key (primary key index)
Mysql>alter TABLE ' table_name ' ADD PRIMARY KEY (' column ')
2. Add unique (unique index)
Mysql>alter TABLE ' table_name ' ADD UNIQUE (' column ')
3. Add index (normal index)
Mysql>alter TABLE ' table_name ' ADD INDEX index_name (' column ')
4. Add fulltext (full-text index)
Mysql>alter TABLE ' table_name ' ADD fulltext (' column ')
5. Add multi-column indexes, that is, aggregate indexes
Mysql>alter TABLE ' table_name ' ADD INDEX index_name (' column1 ', ' column2 '); It is recommended that these columns be indexed only if they are used frequently as query criteria, such as a date range.
Vii. use of indexes in SQL
We have added an index, but we cannot assume that as long as the indexed column is used in the Where condition, the index must be used. The following conditions, even if the query condition field after the where is defined as an indexed field, is not used for indexing, let's review the previous SQL before those optimizations:
1) Try to avoid querying for redundant fields, as long as the query with the field can be
SELECT name,age from w1000 where id = ' 0000c51e-0ede-4dc2-871a-78c1eebd0041 '; --0.003s query condition as primary key field
2) try to avoid using or in the WHERE clause to join the condition, or it will cause the engine to abandon using the index for a full table scan
SELECT * from w1000 where age = n or age = 20; --32.234s
The or statement can be changed to
Select ID from w1000 where age=19
UNION ALL
Select ID from w1000 where age=20
3) Try to avoid using the! = or <> operator in the WHERE clause, or discard the engine for full table scanning using the index
SELECT * from w1000 where age! = 20; --75.222s
4) Try to avoid null values for the field in the WHERE clause, otherwise it will cause the engine to abandon using the index for a full table scan, and you can set the default value.
SELECT * from w1000 where modify_date are NOT null; --73.387SS
5) Try to avoid matching wildcard characters in the like condition, which can be indexed at the end of a wildcard.
SELECT * from w1000 the where name like '% Zhang% '; --25.519s
6) in and not in also use caution, otherwise it will cause a full table scan
SELECT * from w1000 where age in (20,21); --34.47s
A lot of times it's a good choice to replace in with exists:
Select age from w1000 where exists (...)
7) Try to avoid the index column on the left of the = number to do the operation or use the function, otherwise the index on the index column is invalidated
SELECT * from w1000 where age/2 = 50; --30.352s
SELECT * from w1000 where SUBSTRING (name from 1 for 1) = ' Zhang '; --26.091s
8) You can force the index to be used when the argument is a variable
Select ID from t where [email protected]
You can force the query to use the index instead:
Select ID from T with (index name) where [email protected]
Note that if a large number of data in an indexed column is duplicated, then even indexing on that column does not make a lot of sense for query efficiency.
The above arguments refer to the MySQL SQL Millions database optimization scheme (http://www.cnblogs.com/huangye-dream/archive/2013/05/21/3091906.html)
Viii. explain viewing index usage in SQL
Since the index is important, but there is no index in the SQL execution process, this can be viewed using explain, using the method: Explain SELECT * FROM table_name; is to add explain before the SELECT statement.
My local test explain SELECT * from w1000 where id = ' 0000c51e-0ede-4dc2-871a-78c1eebd0041 ';
We can see that the type value is const, the query has been optimized, and the key column actually uses the index primary.
Refer to "MySQL Query statement analysis explain usage" http://www.cnblogs.com/lijiasnong/p/5486111.html
explanation of the explain column
Table: Shows which table the data for this row is about
Type: This is an important column that shows what type of connection is used. The best to worst connection types are const, EQ_REG, ref, range, Indexhe, and all
Possible_keys: Displays the indexes that may be applied to this table. If it is empty, there is no possible index. You can select an appropriate statement from the where statement for the related domain
Key: The actual index used. If NULL, the index is not used. In rare cases, MySQL chooses an index that is poorly optimized. In this case, use Index (indexname) can be used in the SELECT statement to force an index or use ignore index (indexname) to force MySQL to ignore the index
Key_len: The length of the index used. The shorter the length the better, without loss of accuracy
Ref: Shows which column of the index is being used and, if possible, a constant
Rows:mysql the number of rows that must be checked to return the requested data
Extra: Additional information on how MySQL resolves queries. It will be discussed in Table 4.3, but the bad examples you can see here are the using temporary and using filesort, meaning that MySQL simply cannot use the index, and the result is that the retrieval will be slow
The meaning of the description returned by the extra column
Distinct: Once MySQL finds a row that matches a row, it no longer searches for
Not Exists:mysql optimizes the left join, and once it finds a row that matches the left join standard, it no longer searches for
Range checked for each record (index map:#): No ideal index was found, so for every combination of rows from the preceding table, MySQL examines which index to use and uses it to return rows from the table. This is one of the slowest connections to use the index
Using Filesort: When you see this, the query needs to be optimized. MySQL requires additional steps to find out how to sort the rows that are returned. It sorts all rows based on the connection type and the row pointers for all rows that store the sort key values and matching criteria.
Using index: Column data is returned from a table that uses only the information in the index and does not read the actual action, which occurs when all the request columns of the table are part of the same index
Using temporary when you see this, the query needs to be optimized. Here, MySQL needs to create a temporary table to store the results, which usually occurs on an order by on a different set of columns, rather than on the group by
Where used uses a WHERE clause to restrict which rows will match the next table or return to the user. If you do not want to return all rows in the table, and the connection type all or index, this occurs, or the query has a problem different connection types of interpretation (in order of efficiency)
The system table has only one row: the system table. This is a special case of the const connection type
Const: The maximum value of a record in a table can match this query (the index can be a primary key or a unique index). Because there is only one row, this value is actually a constant, because MySQL reads the value first and treats it as a constant.
Eq_ref: In a connection, when MySQL queries, from the previous table, the union of each record reads a record from the table, which is used when the query uses the index as the primary key or the unique key.
Ref: This connection type occurs only if the query uses a key that is not a unique or primary key or is part of these types (for example, using the leftmost prefix). For each row union of the previous table, all records are read from the table. This type is heavily dependent on how many records are matched against the index-the less the better
Range: This connection type uses an index to return rows in a range, such as what happens when you use > or < to find something
Index: This connection type is fully scanned for each record in the previous table (better than all because the index is generally less than the table data)
All: This connection type is fully scanned for each previous record, which is generally bad and should be avoided as much as possible.
Nine, after indexing query comparison
Before the third chapter of SQL optimization, we saw that SQL was executing for a long time. The Where field for these SQL queries is age and name, and we now build the index on these two columns:
ALTER TABLE ' w1000 ' ADD Normal (' age '); --204.562s to perform this effect on a table of level 10 million the change table statement for the full table is slow
ALTER TABLE ' w1000 ' ADD Normal (' name '); --222.041s
SQL that was previously slow to query, not only because there is no index, but even if an index is established, the query methods in these WHERE clauses invalidate the index.
After the index is set up, the SQL is pulled out again, of course, the value of the query condition to change, otherwise, SQL will directly return the query cache data.
Note that only select the fields we actually use, not select *
comparison results:
--Not before optimization
SELECT Id,name from w1000 where age = 40; --9.893s
SELECT Id,name from w1000 where age = Limit 0,100; --0.008s
SELECT id,name from w1000 where name like ' Ancient% '; --11.182s
SELECT id,name from w1000 where name= ' Song Angan '; --13.384s
SELECT id,name from w1000 where age=25 limit 8000000,1550; --10.168s
SELECT Id,name from w1000 where age =11 limit 0,100; --0.027s
--After optimization
SELECT Id,name from w1000 where age = 18; --14.802s
SELECT Id,name from w1000 where age = 0,100 limit; --0.011s
SELECT id,name from w1000 where name like ' History% '; --0.070s
SELECT id,name from w1000 where name= ' Zi '; --0.001s
SELECT id,name from w1000 limit 8000000,1550; --8.643s
SELECT Id,name from w1000 where age =14 limit 0,100; --0.013s
Summarize
As we can see from the results above, after the index is indexed and the index is used correctly, the varchar column does greatly improve the query efficiency, but the int type seems to be slower, and does not understand why.
However, in the case of not indexing, some such as the gender field, only the male and female values, should be the type int (0 or 1) faster.
X. Viewing current connection Information
Mysql> Show Processlist\g
or execute it directly in the Navicat tool.
Show Processlist (or show full processlist)
This command allows you to see the information that is currently connected to the database, so that you can easily observe the deadlock situation. If the state is empty or null, the representation has been executed, otherwise it will display updating, sending data, etc.
For example, if a database connection is hung, you can see which SQL is blocked.
MySQL optimization-database optimization, SQL optimization