MySQL storage engines, indexes, and optimization

Source: Internet
Author: User

I. Storage Engine

A storage engine determines how MySQL stores data in files (or in memory). Each engine uses its own storage mechanism, indexing techniques, and locking level, and so offers a different set of features and capabilities. Choosing the right engine can give you extra speed or functionality and improve the application as a whole. InnoDB has been the default storage engine since MySQL 5.5. It is the preferred engine for transactional databases and supports ACID transactions and row-level locking. MyISAM is another common storage engine; it offers fast inserts and queries but does not support transactions. As a rule of thumb, use MyISAM when inserts are infrequent, queries are very frequent, and transactions are not needed; use InnoDB when reliability requirements are high, tables are updated frequently, and transactions are common.

// # View the storage engines provided by the local MySQL server
// show ENGINES;
// # View the current default storage engine
// show variables like '%storage_engine%';
// # View the storage engine used by a table (shown at the end of the DDL)
// show create table idc_work_order_main;

// # Modify the storage engine of the current table
// Alter table idc_work_order_main ENGINE = 'myisam'

The official MySQL documentation describes InnoDB as follows. InnoDB provides MySQL with a transaction-safe storage engine with commit, rollback, and crash-recovery capabilities. InnoDB is designed for maximum performance when processing large volumes of data, and its CPU efficiency is probably unmatched by any other disk-based relational database engine. The InnoDB storage engine is fully integrated with the MySQL server and maintains its own buffer pool for caching data and indexes in main memory.

The main features of InnoDB are transactions and row-level locking, so in principle table locks should not occur. In practice they still can, and the consequences can be serious; that is the problem discussed here. According to the MySQL documentation, although InnoDB uses row-level locks, it uses a special table-level lock, AUTO-INC, when inserting into tables with an auto-increment column. Put simply, InnoDB keeps an in-memory counter that records the auto_increment value; when data is inserted, a table lock protects the counter until the insert statement finishes. A single insert statement is not a problem, but under high concurrency the lock causes SQL statements to block.

Solutions:

1. Do not use an auto-increment column; generate primary keys yourself. With this approach the choice of key-generation strategy is very important, and simplicity has to be weighed against efficiency. A uuid, for example, is simple to generate, but it makes the primary key inefficient (in InnoDB the primary key is a special index, and all other indexes reference it).

2. Upgrade to MySQL 5.1.22 or later, where the auto-increment locking behavior can be configured through innodb_autoinc_lock_mode.

// # "traditional" lock mode: every insert takes the table-level AUTO-INC lock. This was the only behavior before MySQL 5.1.22, and concurrency is poor.
// innodb_autoinc_lock_mode = 0
// # "consecutive" lock mode (recommended): concurrency is good, and the auto_increment IDs assigned within a single insert statement are guaranteed to be consecutive.
// innodb_autoinc_lock_mode = 1
// # "interleaved" lock mode: IDs are allocated one at a time and the table is never locked; only the ID allocation step itself is locked. Unlike mode 1, IDs are not pre-allocated in batches. This mode has the highest concurrency.
// # It is not safe for statement-based replication (SBR, binlog_format = STATEMENT): because IDs are allocated one at a time, a bulk insert that runs concurrently with other INSERT statements can have its IDs interleaved with theirs. The slave can then produce a different result from the master, because the binlog records only the first insert ID of each statement.
// innodb_autoinc_lock_mode = 2

When copying data, the common table engines are MyISAM and InnoDB. A MyISAM table consists of the files .frm (table structure), .myd (data), and .myi (index); indexes and data are stored separately, and the files can be copied directly. An InnoDB table consists of .frm (table structure) and .ibd (data) files; indexes and data are stored together (in the .ibd file or in the shared ibdata* tablespace), so the files cannot be copied directly. Instead, export the data and then import it. After copying, do not forget to restart the database service.

Since the topic is storage engines, consider how a database is physically stored. A block is the unit of disk storage, and a page is the unit of memory. In the first figure, a txt file containing only the character 1 is created; its properties show 4 KB of occupied space, which is the size of one block. The figure on the right shows how to view the memory page size.

Disks are accessed in blocks. The data pages of a table are linked together as a linked list; rows are stored in blocks and read block by block. To execute an SQL statement, the server parses the command, determines an execution plan, and then carries out the insert, delete, update, or query. The resulting disk I/O is a major source of performance problems. How can the number of disk I/O operations be reduced?

// 1. Keep the amount of data read to a reasonable size
// 2. Make sure the accessed data can be read sequentially
// 3. Reduce the amount of data that has to be scanned

The way to achieve these goals is to use an index. A dense index, also called a full index, improves query efficiency by persistently storing an extra index entry on disk for every record. A sparse index combines the advantages of a sequential file and a dense index file: it stores entries for only some of the keys, which is enough to support a fast binary search for a record while further reducing the disk I/O required.
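The difference between the two index kinds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block size, the record layout, and the names `dense_index`, `sparse_keys`, and `sparse_lookup` are all hypothetical, and real storage engines organize their pages differently.

```python
from bisect import bisect_right

# Hypothetical data file: sorted records split into fixed-size blocks.
BLOCK_SIZE = 4
records = [(key, f"row-{key}") for key in range(0, 40, 2)]  # keys 0, 2, ..., 38
blocks = [records[i:i + BLOCK_SIZE] for i in range(0, len(records), BLOCK_SIZE)]

# Dense index: one entry per record (key -> block number).
dense_index = {key: block_no
               for block_no, block in enumerate(blocks)
               for key, _ in block}

# Sparse index: one entry per block, holding only the first key of that block.
sparse_keys = [block[0][0] for block in blocks]


def sparse_lookup(key):
    """Binary-search the sparse index, then scan the single candidate block."""
    block_no = bisect_right(sparse_keys, key) - 1
    if block_no < 0:
        return None  # key is smaller than every indexed key
    for record_key, value in blocks[block_no]:
        if record_key == key:
            return value
    return None
```

In this sketch the dense index holds 20 entries and the sparse index only 5, yet a sparse lookup still reads just one block.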

II. Indexing

An index is a data structure that helps MySQL retrieve data efficiently. This article first introduces index structures and the principles behind them, and then looks at how to optimize indexes. The index structures covered are B-tree indexes and related tree indexes, hash indexes, bitmap indexes, and skip lists.

// # View the indexes on a table (Index_type in the result: BTREE)
// show index from idc_work_order_main;

This command shows the indexes that currently exist on a table. The result shows that the index structure used by this table is BTREE. Indexes can be created and dropped as follows:

// # View the indexes currently on the table (a unique index is created automatically for the primary key)
// show index from idc_work_order_main;
// # Create an ordinary index
// CREATE index aaa ON idc_work_order_main (remark);
// drop index aaa ON idc_work_order_main;
// # Create a unique index (no two rows may have the same index value; otherwise creation fails)
// create unique index aaa ON idc_work_order_main (id);
// drop index aaa ON idc_work_order_main;
// # Create a composite index
// create index aaa ON idc_work_order_main (id, remark);
// drop index aaa ON idc_work_order_main;

Indexes should not be created arbitrarily. In the following situations an index is generally of little benefit:

1. Tables with too few records. If a table has only five records and they are accessed through an index, the index must be read first and the data table second. The index and the data are generally not in the same data block, so at least two block reads are needed. Without an index, the database (Oracle, in this example) reads all the data in a single pass, which is clearly faster than going through the index.

2. Tables that are frequently inserted into, deleted from, and modified. For heavily updated business tables, keep the number of indexes to the minimum the queries allow.

3. Columns with heavily duplicated, evenly distributed values. Suppose a table has 100,000 rows and column A holds only the values T and F, each with a probability of about 50%. Creating an index on column A generally does not speed up queries.
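A rough way to see why the T/F column in item 3 benefits so little from an index is to compare how selective the columns are. The sketch below is illustrative only; `selectivity` and `rows_matched` are hypothetical helper names, and the `order_ids` column is an invented contrast case.

```python
def selectivity(values):
    """Fraction of distinct values: close to 1.0 is selective, close to 0 is not."""
    return len(set(values)) / len(values)


def rows_matched(values, target):
    """Number of rows an equality filter on `target` would still have to fetch."""
    return sum(1 for v in values if v == target)


flags = ["T", "F"] * 50_000        # 100,000 rows, two evenly distributed values
order_ids = list(range(100_000))   # 100,000 rows, every value distinct
```

Filtering `flags` on "T" still leaves 50,000 rows to read, so the index saves almost nothing, whereas an equality filter on `order_ids` narrows the result to a single row.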

Indexes are usually created on tables holding more than about a million records, in order to improve performance. Before creating an index, consider the following. First, check that the tablespace and disk space are sufficient: an index is itself data and can occupy a large amount of tablespace, so capacity comes first when indexing a large table. Second, creating an index locks the table, so do it while the business is idle. Third, consider disk I/O: physically, try to place indexes and data on different disks; logically, separate the data tablespace from the index tablespace. These are the basic principles to follow when creating an index.

// LIKE is generally discouraged. If it must be used, how it is used matters: like "%aaa%" cannot use an index, but like "aaa%" can.
// Neither NOT IN nor the <> operator generally uses an index; both scan the entire table. NOT IN can be replaced with NOT EXISTS, and id <> 3 can be rewritten as id > 3 or id < 3.
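Why a leading wildcard defeats an ordered index can be sketched with a sorted list standing in for the index keys. This is a simplified model, not how MySQL implements LIKE; the names `index_keys`, `like_prefix`, and `like_contains` are hypothetical.

```python
from bisect import bisect_left

# A sorted list of strings stands in for the ordered keys of a B-tree index.
index_keys = sorted(["aaa1", "aab", "abc", "baaa", "caaab", "aaa", "xaaa"])


def like_prefix(prefix):
    """LIKE 'prefix%': binary-search to the first candidate, then read a contiguous run."""
    start = bisect_left(index_keys, prefix)
    matches = []
    for key in index_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches


def like_contains(fragment):
    """LIKE '%fragment%': ordering does not help, so every key must be examined."""
    return [key for key in index_keys if fragment in key]
```

The prefix search touches only the matching keys plus one more, while the contains search has to visit all seven keys.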

Index structures. MySQL currently offers two general-purpose index structures, B-tree and hash, with B-tree as the default. Oracle and some other databases also provide bitmap indexes. The following describes three index structures: B-tree indexes, hash indexes, and bitmap indexes.

B-tree indexes. The underlying data structure is a multiway search tree. It is a common data structure and is often used for database indexes. A B-tree significantly reduces the number of intermediate steps needed to locate a record, which speeds up access.

// Scenarios where a balanced-tree index is not suitable:
// 1. The column of a single-column index should not contain null records, and the columns of a composite index should not all be null in the same record; otherwise the entire table is scanned. (This restriction applies to Oracle, which does not index such rows; MySQL B-tree indexes do store null values.)
// 2. Columns with few distinct key values (many duplicates, such as an is_deleted column holding only "y" and "n").
// 3. Fuzzy queries with a leading wildcard (like '%xx' or like '%xx%') cannot use the index.

The figure shows how the balanced tree behind a B-tree index is restructured when keys are inserted and deleted.

Hash indexes are built on a hash algorithm. Exact-match lookups (=, <=>, and IN) are very fast: a hash index can locate an entry in a single step, whereas a B-tree index has to descend from the root to a leaf, so for such lookups a hash index can be considerably more efficient. However, the special nature of hashing also brings a number of restrictions and drawbacks, which are listed below.

// Scenarios where hash indexes are not suitable:
// 1. Fuzzy queries and range queries (including like, >, <, and so on). A hash index compares hash values, so it can be used only for equality filtering, not for range-based filtering: the ordering of the hash values produced by the hash algorithm is not guaranteed to match the ordering of the original keys.
// 2. Sorting. The database cannot use the index to speed up sorting, again because the ordering of hash values is undefined.
// 3. Querying by part of a composite index. A hash index combines all the index keys and computes one hash value for the combination rather than hashing each key separately, so a query on only one or some of the keys of a composite index cannot use the hash index.
// 4. Columns with few distinct key values (many duplicates).
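The restrictions above can be illustrated with a Python dict as a stand-in for a hash index and a sorted list as a stand-in for an ordered (B-tree-like) index. This is a conceptual sketch only; the sample rows and the function names are hypothetical, and a real hash index works on hashed buckets rather than a Python dict.

```python
from bisect import bisect_left, bisect_right

rows = {101: "alice", 205: "bob", 150: "carol", 330: "dave", 120: "erin"}

# Hash index: a dict gives direct equality lookups but keeps no key ordering.
hash_index = dict(rows)

# Ordered index: sorted keys, standing in for the leaf level of a B-tree.
ordered_keys = sorted(rows)


def hash_equal(key):
    """Equality lookup: a single probe."""
    return hash_index.get(key)


def ordered_range(low, high):
    """Keys with low <= key <= high, found by two binary searches."""
    lo = bisect_left(ordered_keys, low)
    hi = bisect_right(ordered_keys, high)
    return ordered_keys[lo:hi]


def hash_range(low, high):
    """A hash index cannot narrow a range; every entry has to be checked."""
    return sorted(k for k in hash_index if low <= k <= high)


# Composite key: hashed as one unit, so the first column alone finds nothing.
composite_hash = {("cn", 101): "alice", ("us", 205): "bob"}
```

Both range functions return the same keys, but `hash_range` inspects every entry to get there, and probing `composite_hash` with only `"cn"` finds nothing because the hash was computed over the full key.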

A bitmap index represents the index as bitmaps: one bitmap is created for each key value of a column. Compared with a B-tree index it occupies very little space and is very fast to create and use, because it stores only the start and end rowids and the bitmap for each key value.

For example, suppose the test table has a state column and its 10 rows hold the following values:

10 20 30 20 10 30 10 30 20 30

Three bitmaps are then created, one per key value:

BLOCK1 KEY = 10   1 0 0 0 1 0 1 0 0 0
BLOCK2 KEY = 20   0 1 0 1 0 0 0 0 1 0
BLOCK3 KEY = 30   0 0 1 0 0 1 0 1 0 1

// Scenarios where bitmap indexes are suitable:
// 1. Decision support systems.
// 2. select count(XX): the statistic can be obtained quickly by reading a bitmap in the index directly.
// 3. Queries that combine key values with and, or, or in (x, y, ...): the bitmaps in the index are combined with bitwise and/or operations to obtain the result rows quickly.
// Scenarios where bitmap indexes are not suitable:
// 1. Columns with many distinct key values (few duplicates).
// 2. Columns that are frequently updated, inserted into, or deleted from.
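The idea can be sketched in Python with integers used as bitmaps. The ten state values below are chosen for illustration, and `build_bitmaps`, `count_rows`, and `matching_rows` are hypothetical helper names; a production bitmap index would also compress the bitmaps and map bit positions to rowids.

```python
# Illustrative ten-row column with three distinct key values.
states = [10, 20, 30, 20, 10, 30, 10, 30, 20, 30]


def build_bitmaps(column):
    """One integer bitmap per key value; bit i is set when row i holds that value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def count_rows(bitmap):
    """COUNT over a key value is just the number of set bits."""
    return bin(bitmap).count("1")


def matching_rows(bitmap, total_rows):
    """Row positions whose bit is set."""
    return [row for row in range(total_rows) if (bitmap >> row) & 1]


bitmaps = build_bitmaps(states)

# state IN (10, 20) becomes a bitwise OR of two bitmaps.
in_10_or_20 = bitmaps[10] | bitmaps[20]
```

Counting and IN-style filtering touch only the bitmaps, never the rows themselves, which is why the structure suits read-heavy analytical queries on low-cardinality columns.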

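A skip list (the "jump table") is built on linked lists. Besides the pointer to the next node, some nodes also hold pointers that jump ahead over several nodes, so a search can skip large parts of the list instead of visiting every node.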
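A minimal skip list can be sketched as follows. This is a teaching sketch under stated assumptions, not a description of any particular database's implementation: the maximum level, the 0.5 promotion probability, and the fixed random seed are arbitrary choices, and duplicate keys are not treated specially.

```python
import random


class Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # forward[i] is the next node on level i


class SkipList:
    MAX_LEVEL = 8

    def __init__(self, seed=0):
        self.head = Node(None, self.MAX_LEVEL)  # sentinel, holds no key
        self.level = 1
        self.rng = random.Random(seed)

    def _random_level(self):
        # Each extra level is granted with probability 0.5.
        level = 1
        while level < self.MAX_LEVEL and self.rng.random() < 0.5:
            level += 1
        return level

    def insert(self, key):
        # update[i] is the last node on level i whose key is smaller than `key`.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self.level = max(self.level, level)
        new_node = Node(key, level)
        for i in range(level):
            new_node.forward[i] = update[i].forward[i]
            update[i].forward[i] = new_node

    def contains(self, key):
        # Start on the top level and drop down whenever the next key is too large.
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def keys(self):
        # Level 0 is an ordinary sorted linked list of every key.
        out, node = [], self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out
```

The higher levels act as express lanes over the bottom list, which is what gives a skip list its expected logarithmic search cost while keeping keys in sorted order.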

III. MySQL Optimization

1. Optimizing paging with limit start, count. The query time of a limit statement grows in proportion to the start position. It is advisable to add an index and let MySQL satisfy the scan from the index itself (a covering index), so that it reads index entries instead of following each one to the row data.

2. Table data types. Use the smallest type that is sufficient; for example, use char(1) rather than varchar where one character is enough. Avoid null where possible: count(column) does not count rows whose value is null, and null values make indexing less effective.

3. char has a fixed size, while varchar stores variable-length data. Prefer tinyint and smallint where they suffice, and use int and bigint only when needed.

4. For the same value range, the floating-point type float usually uses less space than decimal. A float column uses 4 bytes; double requires 8 bytes and has higher precision and a larger value range. In older MySQL versions, decimal values were converted to double for calculations.
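The trade-off between compact binary floating point and exact decimal values can be sketched in Python. Here Python's `struct` and `decimal` modules merely stand in for the MySQL types (an assumption made for illustration); MySQL's own storage format for decimal is different.

```python
import struct
from decimal import Decimal

# Storage sizes of IEEE 754 single and double precision values.
float_bytes = struct.calcsize("<f")    # single precision
double_bytes = struct.calcsize("<d")   # double precision


def roundtrip_single(x):
    """Store x as a 4-byte float and read it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


binary_sum = 0.1 + 0.2                         # double-precision binary arithmetic
decimal_sum = Decimal("0.1") + Decimal("0.2")  # exact decimal arithmetic
```

The 4-byte value cannot hold 0.1 exactly, and even the 8-byte sum of 0.1 and 0.2 differs from 0.3, while the decimal sum is exact. That is the usual reason to accept the extra space of decimal for values such as money.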

 

 

Appendix:

1. Full-text indexing. Full-text indexing is a key technology of search engines. Searching for a word in a 1 MB file may take several seconds, a much larger file may take tens of seconds, and the overhead keeps growing with file size. That cost is impractical, and full-text indexing emerged to address it. In MySQL, InnoDB did not support full-text indexes before version 5.6, while MyISAM supports them with good performance. Full-text indexes are generally created on CHAR, VARCHAR, or TEXT columns.

2. Clustered index. The logical order of the key values in a clustered index determines the physical order of the corresponding rows in the table. In MySQL, MyISAM tables have no clustered index, while InnoDB tables do (the primary key is the clustered index). Clustered indexes are covered in more detail in the discussion of the InnoDB structure.

 
