Chapter 2 Introduction to MySQL storage engine

3.1 Overview of MySQL storage engines

Later, MySQL realized that its architecture needed to change: the front-end business logic and the back-end data storage were split into clearly separated layers, and at the same time ISAM was extended and its code restructured. This is the origin of the MyISAM storage engine.

MySQL's pluggable storage engines mainly include MyISAM, InnoDB, NDB Cluster, Maria, Falcon, Memory, Archive, Merge, Federated, and others. The best known and most widely used are MyISAM and InnoDB. MyISAM is the upgraded version of ISAM, MySQL's earliest storage engine, and is also MySQL's default storage engine. InnoDB was in fact not developed by MySQL but by a third-party software company, Innobase (acquired by Oracle in 2005). Its most important feature is transaction control, which is why it is also very widely used.

The other storage engines are used somewhat less often, each in certain specific scenarios. NDB Cluster, for example, also supports transactions, but it is mainly used in distributed environments and is a shared-nothing distributed database storage engine. Maria is the newest storage engine developed by MySQL (the final GA version has not yet been released) as an upgrade of MyISAM. Falcon is a storage engine with advanced features such as transactions, developed in-house by MySQL to replace the current InnoDB storage engine; it is still under development. The Memory storage engine keeps all data and indexes in memory, so it is mainly used for temporary tables, or in specific scenarios where performance requirements are extremely high but losing data when the system crashes is acceptable. Archive is a storage engine that stores data in highly compressed form; it is mainly used for expired, rarely accessed historical information and does not support indexes. Strictly speaking, Merge and Federated cannot be regarded as storage engines. The Merge storage engine is mainly used to merge several base tables and present them as a single table; the base tables can be based on several other storage engines. Federated does something similar to Oracle's dblink: it is mainly used to access data on other MySQL servers remotely.

3.2 Introduction to MyISAM storage engine

Each MyISAM table is stored in the database directory as three physical files named after the table. First there is the .frm file, which no storage engine can do without and which holds the table structure definition. Then there are the .MYD and .MYI files, which store the table data (.MYD) and the index data (.MYI) respectively. Every MyISAM table is stored as exactly these three files; no matter how many indexes the table has, they are all stored in the same .MYI file.

MyISAM supports three types of indexes:

1. B-Tree indexes
B-Tree indexes, as the name implies, store all index nodes in a balanced-tree data structure; all indexed data nodes are in the leaf nodes.

2. R-Tree indexes
R-Tree indexes are stored somewhat differently from B-Tree indexes. They are designed mainly for indexing spatial and multi-dimensional data, so at present MySQL only supports R-Tree indexes on fields of geometry types.

3. Full-text indexes
The Full-text index is the full-text index we so often talk about. Its storage structure is also a B-Tree, and it exists mainly to address the inefficiency of LIKE queries.

Of the three MyISAM index types above, B-Tree indexes are by far the most frequently used, Full-text indexes are used occasionally, and R-Tree indexes are rarely used in typical systems. In addition, MyISAM B-Tree indexes have one significant limitation: the combined length of all fields in one index cannot exceed 1000 bytes.
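
As a rough illustration of the 1000-byte limit, the sketch below adds up the worst-case byte length of the character columns in an index. The function names, the per-character byte widths, and the example columns are assumptions made for this illustration; real key sizes also include small per-column overheads that are ignored here.

```python
# Worst-case bytes per character for two character sets (assumed values for
# this illustration; check the character sets actually in use).
BYTES_PER_CHAR = {"latin1": 1, "utf8": 3}

MYISAM_MAX_KEY_BYTES = 1000  # the limit quoted in the text


def index_key_bytes(columns):
    """Sum the worst-case byte length of (char_length, charset) columns."""
    return sum(length * BYTES_PER_CHAR[charset] for length, charset in columns)


def fits_myisam_key(columns):
    """Return True if the combined key length is within the 1000-byte limit."""
    return index_key_bytes(columns) <= MYISAM_MAX_KEY_BYTES


# One 255-character utf8 column: 255 * 3 = 765 bytes, within the limit.
print(fits_myisam_key([(255, "utf8")]))
# Two such columns: 1530 bytes, over the limit.
print(fits_myisam_key([(255, "utf8"), (255, "utf8")]))
```

Under these assumptions a composite index over two long utf8 columns already exceeds the limit, which is why such indexes usually have to be built on column prefixes.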

Although every MyISAM table stores its data in a file with the same .MYD suffix, the storage format of each file may differ, because MyISAM data can be stored in one of three formats: static (FIXED) fixed-length, dynamic (DYNAMIC) variable-length, and compressed (COMPRESSED). Of these, whether to compress is our own choice: it can be specified with ROW_FORMAT {COMPRESSED | DEFAULT} when the table is created, or the data can be compressed with the myisampack tool. By default, data is not compressed. For uncompressed tables, whether the format is static or dynamic depends on the field definitions of the table. As soon as the table contains a variable-length field, it must be in DYNAMIC format; if there is no variable-length field, it is in FIXED format. You can of course use the ALTER TABLE command to force a DYNAMIC table with VARCHAR fields to FIXED, but as a result the original VARCHAR fields are automatically converted to CHAR. Conversely, if a FIXED table is converted to DYNAMIC, its CHAR fields are converted to VARCHAR. Be cautious with such forced conversions.
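
The rule for uncompressed tables can be sketched as a small function. This is only an illustration of the rule as stated in the text: the function name is hypothetical, and the set of types treated as variable-length is an assumption and is not exhaustive.

```python
# Column types treated as variable-length in this sketch (an assumption for
# illustration; this is not an exhaustive list of MySQL types).
VARIABLE_LENGTH_TYPES = {"VARCHAR", "VARBINARY", "BLOB", "TEXT"}


def default_row_format(column_types):
    """Return the uncompressed MyISAM row format implied by the column types."""
    for col_type in column_types:
        # Strip a length such as "(64)" and normalise the case.
        base = col_type.split("(")[0].strip().upper()
        if base in VARIABLE_LENGTH_TYPES:
            return "DYNAMIC"  # one variable-length field is enough
    return "FIXED"


print(default_row_format(["INT", "CHAR(10)"]))
print(default_row_format(["INT", "VARCHAR(64)"]))
```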

Are MyISAM tables reliable? The MySQL User Reference Manual lists the following situations in which table files may be damaged:
1. mysqld is killed, or otherwise terminates abnormally, while it is performing a write;
2. the host crashes;
3. a disk hardware fault occurs;
4. there is a bug in the MyISAM storage engine.

If a table file of a MyISAM table is damaged, only that table is affected; other tables, and other databases, are not. If a MyISAM table turns out to have a problem while the database is running, you can verify it online with the CHECK TABLE command and try to repair it with the REPAIR TABLE command. When the database is shut down, you can also check or repair tables with the myisamchk tool. However, it is strongly recommended that you do not repair a table unless you really have to, and that you make whatever backups you can before repairing, to avoid unnecessary consequences.
In addition, the table files of the MyISAM storage engine can be used by several database instances at the same time. We do not recommend this, however, and the official MySQL user manual likewise advises against sharing MyISAM table files between multiple mysqld processes.

3.3 Introduction to the InnoDB storage engine

In MySQL, the most widely used storage engine other than MyISAM is InnoDB. InnoDB is a storage engine developed by a third-party company and is released under the same open-source license as MySQL.
InnoDB is so favored because of its many features:
1. Transaction safety
The most important aspect of InnoDB's functionality is its support for transaction safety, which is undoubtedly a major reason why InnoDB has become one of MySQL's most popular storage engines. It also implements all four isolation levels defined by the SQL92 standard (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE). Transaction-safety support has led many users who previously had to give up MySQL because of special business requirements to turn to MySQL, and has greatly increased the preference for MySQL among users who were still undecided about their choice of database.

2. Multi-version data reading
Alongside its transaction support, InnoDB implements multi-version reads, based on undo information, to guarantee data consistency while preserving performance under concurrency.

3. Improved locking mechanism
InnoDB changed MyISAM's locking mechanism and implemented row-level locks. Although InnoDB's row locking is implemented through indexes, after all, by the author's estimate 99% of the SQL statements in a database use indexes to retrieve data. The row locking mechanism therefore undoubtedly strengthens InnoDB's competitiveness in highly concurrent environments.

4. Foreign keys
InnoDB implements foreign keys, an important database feature, making it possible for the database itself to enforce some data integrity. Although many database tuning experts do not recommend doing so, for many users enforcing integrity with foreign keys in the database may still be the lowest-cost option.

Besides the functional highlights above, InnoDB has many other features that often pleasantly surprise its users, and that have also brought more customers to MySQL.

InnoDB also stores its data differently from MyISAM. There are likewise .frm files storing the metadata for the table structure definition, but table data and index data are stored together. Whether each table is stored separately or all tables are stored together is decided by the user (through configuration); symbolic links are also supported.

The physical structure of InnoDB consists of two main parts:
1. Data files (table data and index data)
These store the data of the tables and all index data, including primary keys and other ordinary indexes. InnoDB has a concept of tablespaces, but it is quite different from tablespaces in Oracle. First, InnoDB tablespaces come in two forms. One is the shared tablespace, in which all table and index data is stored in the same tablespace (one or more data files) specified by innodb_data_file_path; adding a data file requires shutting down and restarting the server. The other is the exclusive tablespace, in which the data and indexes of each table are stored in a separate .ibd file.

Although we can choose to store our tables in either the shared tablespace or exclusive tablespaces, the shared tablespace must always exist, because InnoDB's undo information and other metadata are stored in it. The data files of the shared tablespace can be set to a fixed size or to auto-extend; for an auto-extending file you can set the maximum file size and the amount by which it grows each time. When creating auto-extending data files, we recommend setting the maximum size. One reason is that the file system itself has a size limit (which InnoDB does not know about); another is ease of maintenance. In addition, InnoDB can use not only the file system but also raw block devices, which we often call raw devices.

When the tablespace is about to run out of room, we must add data files to it; of course, only the shared tablespace has this operation. Adding a data file to the shared tablespace is simple: append the path and attributes of the new file, in the standard format, to the innodb_data_file_path parameter. One thing to note is that InnoDB does not create directories when it creates a new data file; if the specified directory does not exist, an error is reported and the server cannot start. Another headache is that after a data file is added to the shared tablespace, the database must be restarted for it to take effect, and if a raw device is used, two restarts are required. This is one of the reasons why I have never liked the shared tablespace and choose exclusive tablespaces instead.
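
To make the format of innodb_data_file_path concrete, here is a minimal parsing sketch. It assumes entries of the form file_name:file_size[:autoextend[:max:max_file_size]] separated by semicolons; the helper names are hypothetical, and raw-device entries and paths that themselves contain a colon are deliberately not handled.

```python
def parse_size(text):
    """Convert a size such as '10M' or '2G' to a number of bytes."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1]) * units[suffix]
    return int(text)


def parse_data_file_path(value):
    """Parse 'name:size[:autoextend[:max:size]]' entries separated by ';'."""
    files = []
    for entry in value.split(";"):
        parts = entry.strip().split(":")
        spec = {
            "path": parts[0],
            "size": parse_size(parts[1]),
            "autoextend": False,
            "max_size": None,  # None means no maximum was given
        }
        rest = [p.lower() for p in parts[2:]]
        if rest and rest[0] == "autoextend":
            spec["autoextend"] = True
            if len(rest) == 3 and rest[1] == "max":
                spec["max_size"] = parse_size(rest[2])
        files.append(spec)
    return files


for spec in parse_data_file_path("ibdata1:10M;ibdata2:10M:autoextend:max:500M"):
    print(spec)
```

In this example value the second file is auto-extending with a maximum size, which follows the recommendation above to always cap auto-extending files.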

2. Log files
InnoDB's log files are similar to Oracle's redo logs. You can likewise configure multiple log groups (at least 2), and older versions even had a log archiving feature like Oracle's.
If you have created InnoDB tables in your database, never delete the InnoDB log files, because doing so may cause your database to crash, fail to start, or lose data.

Because InnoDB is a transaction-safe storage engine, a system crash does not cause it serious loss. Thanks to the redo log and the protection of the checkpoint mechanism, InnoDB can use the redo log to recover transactions that had completed at the moment of the crash but whose data had not yet been written to disk. It can also roll back transactions that were only partially complete but had already written data to disk, and so restore the data.
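
The two halves of crash recovery described above, redoing committed work and rolling back unfinished work, can be illustrated with a toy model. This is not InnoDB's actual implementation: the log record layout and the function name are hypothetical, and the old and new values are kept in a single log only to keep the sketch short.

```python
def recover(disk, log):
    """Toy crash recovery over a write-ahead log.

    disk: dict of key -> value as found on disk after the crash.
    log:  list of records, either ("write", txid, key, old, new)
          or ("commit", txid), in the order they were logged.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    data = dict(disk)

    # Redo phase: replay the writes of committed transactions in log order,
    # so changes that never reached the disk are restored.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, _, new = rec
            data[key] = new

    # Undo phase: walk the log backwards and put back the old value for
    # every write made by a transaction that never committed.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, key, old, _ = rec
            data[key] = old
    return data


# T1 committed but its change to "a" was not flushed; T2 never committed
# but its change to "b" had already reached the disk.
disk_after_crash = {"a": 1, "b": 9}
log = [("write", "T1", "a", 1, 2), ("commit", "T1"), ("write", "T2", "b", 5, 9)]
print(recover(disk_after_crash, log))
```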

InnoDB differs from the MyISAM storage engine not only in features; it is also configured separately. In the MySQL startup parameter file, essentially all InnoDB parameters are prefixed with "innodb_", whether they relate to InnoDB data and logs or to performance, transactions, and so on. Just like the InnoDB system variables, all InnoDB status values are prefixed with "Innodb_". We can also disable the InnoDB storage engine in MySQL with a single parameter (skip-innodb), so that even if InnoDB was compiled in at installation time, users cannot create InnoDB tables.

3.4 Introduction to the NDB Cluster storage engine

The NDB storage engine, also called the NDB Cluster storage engine, is mainly used in the MySQL Cluster distributed cluster environment. Cluster is a new feature provided since MySQL 5.0. This section does not cover only the NDB storage engine, because outside the MySQL Cluster environment as a whole the NDB storage engine loses much of its meaning. This section therefore mainly introduces MySQL Cluster.

Simply put, MySQL Cluster is an in-memory database cluster environment implemented without shared storage devices, mainly through the NDB Cluster (NDB) storage engine.

Generally, a MySQL Cluster environment consists of the following three parts:
A) The management node host, which manages all nodes:
The management node is responsible for managing all nodes in the cluster, including cluster configuration, starting and stopping each node, and backing up and restoring data. The management node obtains status and error information from every node in the cluster and feeds the information of each node back to all other nodes in the cluster. Because the management node maintains the configuration of the whole cluster environment and provides the basic communication for the nodes in the cluster, it must be the first node to be started.

B) SQL server nodes at the SQL layer (hereinafter SQL nodes), that is, what we usually call MySQL Server:
These are mainly responsible for everything a database does above the storage layer, such as connection management, query optimization and response, and cache management; only the storage-layer work is handed over to the NDB data nodes. In other words, an SQL node in a pure MySQL Cluster environment can be thought of as a MySQL server that needs no storage engine of its own, because the NDB nodes in the cluster serve as its storage engine. For this reason, starting an SQL-layer MySQL server differs somewhat from starting an ordinary MySQL server: the ndbcluster option must be added, either in the my.cnf configuration file or on the startup command line.

C) NDB data nodes at the storage layer, that is, the NDB Cluster mentioned above:
NDB is a memory-based storage engine that loads all data and index data into memory, while also persisting the data to storage devices. The latest versions, however, let users choose not to load all data into memory, which is good news for users whose data is too large, whose memory is insufficient, or who for cost reasons do not want to keep all data in memory.

NDB nodes mainly implement the underlying data storage and hold the cluster's data. Each NDB node stores part of the complete data (or a complete copy, depending on the number of nodes and the configuration), which MySQL Cluster calls a fragment. Each fragment normally has one (or more) identical mirror copies on other nodes. All of this is done through configuration, so as long as the configuration is right, MySQL Cluster has no single point of failure at the storage layer. NDB nodes are generally organized into NDB groups; an NDB group is simply a set of NDB nodes holding exactly the same physical data.

As mentioned above, each NDB node may store all of the data or only part of it, depending mainly on the number of nodes and on parameters. First, the MySQL Cluster main configuration file (on the management node, usually config.ini) has a very important parameter called NoOfReplicas, which specifies the number of redundant copies of each piece of data that are stored on different nodes. This parameter should be set to at least 2, and normally 2 is enough, since the probability that two mutually redundant nodes fail at the same time is very small; of course, with enough machines and memory it can be increased further. Whether a node stores all of the data or only part of it is also constrained by the number of storage nodes. The NDB storage engine first uses storage nodes to satisfy the redundancy required by NoOfReplicas, and then partitions the data across the remaining NDB nodes according to the number of nodes. The number of partitions is the total number of nodes divided by NoOfReplicas.
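
The arithmetic in the last sentence can be sketched as follows. The function name is hypothetical, and grouping nodes by consecutive node id is an assumption made for this illustration rather than something stated in the text.

```python
def node_groups(data_nodes, no_of_replicas):
    """Split data node ids into groups of NoOfReplicas nodes each."""
    if len(data_nodes) % no_of_replicas != 0:
        raise ValueError("node count must be a multiple of NoOfReplicas")
    return [
        data_nodes[i:i + no_of_replicas]
        for i in range(0, len(data_nodes), no_of_replicas)
    ]


# 4 data nodes with NoOfReplicas = 2 give 4 / 2 = 2 groups; the nodes in one
# group hold the same data, and each group holds a different partition.
groups = node_groups([1, 2, 3, 4], 2)
print(len(groups), groups)
```

With 2 nodes and NoOfReplicas = 2 there is a single group, so each node holds a complete copy of the data; adding 2 more nodes splits the data into two partitions.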

MySQL Cluster covers a great deal of ground, and for reasons of space it is not introduced in depth here. The high-availability design chapter in the architecture design part of this book gives a more detailed introduction and implementation details. You can also learn more from the official MySQL documentation.

3.5 Introduction to other storage engines

3.5.1 Merge storage engine:

The MERGE storage engine is also called the MRG_MyISAM engine in the MySQL user manual. Why? Because the MERGE storage engine can be understood simply as a special wrapper around MyISAM tables of identical structure that presents a single access point, reducing the complexity of the application. To create a MERGE table, the base tables must have exactly the same structure, including the order of fields, and their indexes must also be exactly the same.
The MERGE table itself stores no data; it only provides a single access point to multiple base tables. When a MERGE table is created, MySQL therefore generates only two small files: a .frm structure definition file and a .MRG file that stores the names of the tables participating in the MERGE (including the schema they belong to). The schema is included because a MERGE table can merge not only tables in the same database but also tables in different databases, as long as permissions allow and the tables are under the same mysqld. After a MERGE table is created, its underlying base tables can still be changed with the relevant commands.

MERGE tables provide not only read service but also write service. For a MERGE table to support INSERT, you must specify at creation time, through the INSERT_METHOD parameter, which base table the inserted data is written to. If this parameter is not specified, any attempt to INSERT into the MERGE table results in an error. In addition, full-text indexes on the base tables cannot be used directly through the MERGE table; to use a full-text index, you must access the base table itself.
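
The read and write behaviour described above can be modelled with a small toy class. This is only an illustration, not how MySQL implements MERGE: the class and method names are hypothetical and plain Python lists stand in for the base tables, while NO, FIRST and LAST mirror the values that INSERT_METHOD accepts.

```python
class MergeTable:
    """Toy model of a MERGE table over in-memory base tables (lists)."""

    def __init__(self, base_tables, insert_method="NO"):
        if insert_method not in ("NO", "FIRST", "LAST"):
            raise ValueError("insert_method must be NO, FIRST or LAST")
        self.base_tables = base_tables
        self.insert_method = insert_method

    def select_all(self):
        # Reads see the rows of every base table, in order.
        return [row for table in self.base_tables for row in table]

    def insert(self, row):
        # The merge table stores nothing itself; the row goes to a base table.
        if self.insert_method == "NO":
            raise RuntimeError("INSERT is not allowed on this MERGE table")
        index = 0 if self.insert_method == "FIRST" else -1
        self.base_tables[index].append(row)


log_2007, log_2008 = [("a",)], [("b",)]
merged = MergeTable([log_2007, log_2008], insert_method="LAST")
merged.insert(("c",))
print(merged.select_all())  # rows from both base tables
print(log_2008)             # the new row landed in the last base table
```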

3.5.2 Memory storage engine:

As its name suggests, the Memory storage engine is a storage engine that keeps data in memory. It stores no data on disk; only a .frm file holding the table structure is kept on disk. Therefore, after a MySQL crash or a host crash, only the structure of a Memory table remains. Memory tables support indexes, in both Hash and B-Tree formats. Because the data is held in memory, Memory stores rows in fixed-length space and does not support BLOB and TEXT fields. The Memory storage engine uses table-level locking.
Since all the data is stored in memory, memory consumption is an obvious concern. The MySQL user manual gives the following formula for the memory needed by each row of a Memory table:
SUM_OVER_ALL_BTREE_KEYS(max_length_of_key + sizeof(char*) * 4)
+ SUM_OVER_ALL_HASH_KEYS(sizeof(char*) * 2)
+ ALIGN(length_of_row + 1, sizeof(char*))
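
As a worked example, the formula can be evaluated in a few lines. The sketch assumes that sizeof(char*) is 8 bytes (a typical 64-bit build), that ALIGN rounds up to the next multiple of its second argument, and that the expression gives the memory for one row; the function names are hypothetical.

```python
POINTER_SIZE = 8  # sizeof(char*) on a typical 64-bit build (assumption)


def align(value, boundary):
    """Round value up to the next multiple of boundary."""
    return (value + boundary - 1) // boundary * boundary


def memory_row_bytes(btree_key_lengths, hash_key_count, row_length,
                     pointer_size=POINTER_SIZE):
    """Evaluate the formula above for a single row."""
    btree = sum(length + pointer_size * 4 for length in btree_key_lengths)
    hashed = hash_key_count * pointer_size * 2
    row = align(row_length + 1, pointer_size)
    return btree + hashed + row


# One B-Tree key of at most 4 bytes, one hash key, 20-byte rows:
# (4 + 32) + 16 + ALIGN(21, 8) = 36 + 16 + 24 = 76 bytes per row.
print(memory_row_bytes([4], 1, 20))
```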

3.5.3 BDB storage engine

The full name of the BDB storage engine is the BerkeleyDB storage engine. Like InnoDB, it was not developed by MySQL itself but is provided by Sleepycat Software. It is also an open-source storage engine, and it also supports transaction safety.
The BDB storage engine stores each table in two physical files, a .frm file and a .db file; data and index information are both stored in the .db file. BDB also has its own redo log to guarantee transaction safety and, as with InnoDB, the location of the log files can be set with a parameter. In terms of locking, BDB implements page-level locking.
Because the BDB storage engine implements transaction safety, it must also have its own checkpoint mechanism. Each time it starts, BDB performs a checkpoint and clears all earlier redo logs. While it is running, we can also trigger a BDB checkpoint manually by executing FLUSH LOGS.

3.5.4 FEDERATED storage engine:

The FEDERATED storage engine provides functionality similar to Oracle's DBLINK: it is mainly used to provide an interface for accessing data on remote MySQL servers. If MySQL is installed by compiling from source, the FEDERATED storage engine must be enabled explicitly by hand, because MySQL does not enable this storage engine by default.
When we create a FEDERATED table, only a structure definition file for the table is created locally; all data is fetched in real time from the database on the remote MySQL server.
When we operate on a FEDERATED table through SQL, the process is basically as follows:
A. The SQL call is issued locally
B. MySQL handler API (data is in handler format)
C. MySQL client API (data is converted into SQL calls)
D. Remote database -> MySQL client API
E. The result set (if any) is converted to handler format
F. Handler API -> result rows or the affected-row count are returned locally

3.5.5 ARCHIVE storage engine:
The ARCHIVE storage engine is mainly used to store expired, rarely accessed historical data in a small amount of storage space. ARCHIVE tables do not support indexes. Each table consists of a .frm structure definition file, a .ARZ compressed data file, and a .ARM meta-information file. Because of the special nature of the data it stores, ARCHIVE tables do not support deleting or modifying rows; only insert and query operations are supported. The locking mechanism is row-level locking.

3.5.6 BLACKHOLE storage engine:
The BLACKHOLE storage engine is a very interesting storage engine. True to its name, it is a "black hole": like the /dev/null device on Unix systems, whatever we write to it never comes back.
So what use is the BLACKHOLE storage engine to us? I had the same question when I first came into contact with MySQL: why would MySQL provide such a storage engine? Later, however, during a data migration, BLACKHOLE proved very useful to me. During that migration the data had to pass through an intermediate MySQL server, where some conversions were performed, before being replicated to the new server. At the time I did not have enough space to support that intermediate server, and this is where BLACKHOLE showed its value: it stores no data, yet all SQL statements are still recorded in the binlog, and those statements are eventually used by replication and applied on the final slave.
The MySQL user manual also describes the following uses of the BLACKHOLE storage engine:
A. Verifying the syntax of SQL files.
B. Measuring the overhead of binary logging, by comparing the performance of BLACKHOLE with binary logging enabled against BLACKHOLE with binary logging disabled.
C. Because BLACKHOLE is essentially a "no-op" storage engine, it can be used to find performance bottlenecks that are not related to the storage engine itself.
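
The relay scenario from the migration story can be pictured with a toy model. This is purely illustrative and not MySQL code: the class name is hypothetical, and a plain Python list stands in for the binary log that replication would later read.

```python
class BlackholeTable:
    """Toy model: rows are discarded, statements are still logged."""

    def __init__(self, binlog):
        self.binlog = binlog  # shared list standing in for the binary log

    def insert(self, statement):
        # The statement is recorded for replication, but no row is stored.
        self.binlog.append(statement)

    def select_all(self):
        # A black hole never returns data.
        return []


binlog = []
relay = BlackholeTable(binlog)
relay.insert("INSERT INTO t VALUES (1)")
relay.insert("INSERT INTO t VALUES (2)")

print(relay.select_all())  # nothing is stored on the relay server
print(binlog)              # but a slave can still replay every statement
```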

3.5.7 CSV storage engine:
The CSV storage engine actually operates on a standard CSV file and does not support indexes. Its main use is when data in the database needs to be exported as a report file: CSV is a standard format supported by a great deal of software, so we can first create a CSV table in the database and then insert the generated report data into it to obtain a CSV report file.
