MySQL storage engine Introduction

Last Update:2018-12-05 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Document directory

3.5.1 merge storage engine:
3.5.2 memory storage engine:
3.5.3 bdb storage engine:
3.5.5 archive storage engine:
3.5.6 blackhole storage engine:
3.5.7 CSV storage engine:

MySQL storage engine Introduction

Source: http://simpleframework.net/blog/v/35130.html

3.1 MySQL storage engine Overview

The MyISAM storage engine is the default storage engine for MySQL and one of the most widely used storage engines for MySQL. His predecessor is the isam we mentioned in the MySQL development process. It is an upgraded version of isam. At the beginning of MySQL release, it was the isam storage engine. In fact, at the beginning, MySQL did not even have the concept of storage introduction. MySQL does not have a clear hierarchy of the SQL layer and storage engine layer as it is now. At that time, no matter the code or system architecture, it is a very painful thing for developers. Later, MySQL realized the need to change the architecture and split the front-end business logic and backend data storage with a clear hierarchy, extended functions and code Refactoring for isam, which is the origin of the MyISAM storage engine.

In versions earlier than MySQL 5.1 (excluded), the storage engine must be compiled and installed together with MySQL during MySQL installation. That is to say, in versions earlier than 5.1, although the coupling between the storage engine layer and the SQL layer is very small, it basically achieves interaction through interfaces, however, the two layers cannot be separated, even during installation.

However, since mysql5.1, MySQL AB has made great changes to its structural system and introduced a new concept: plug-in-type storage engine architecture. During architecture transformation, MySQL AB makes the storage engine layer and SQL layer more independent and less coupled. It can even load messages online, that is, you can load a new storage engine to a running MySQL without affecting the normal operation of MySQL. The plug-in storage engine architecture makes loading and removal of storage engines more flexible and convenient, and makes it easier to develop a storage engine. At this point, no database management system is available.

MySQL plug-in storage engines mainly include MyISAM, InnoDB, NDB cluster, Maria, Falcon, memory, archive, merge, and federated, among them, the most famous and widely used MyISAM and InnoDB Storage engines. MyISAM is the upgraded version of MySQL's earliest isam storage engine and the default storage engine of MySQL. InnoDB is not actually developed by mysq, but by a third-party software company, Innobase (acquired by Oracle in 2005). The biggest feature of InnoDB is that it provides features such as transaction control, therefore, users are also very extensive.

Some other storage engines are relatively less used in some specific scenarios. Although NDB cluster supports transactions, it is mainly used in distributed environments, it is a distributed database storage engine with share nothing. Maria is an upgraded storage engine for MyISAM developed by MySQL (the final ga version has not yet been released, falcon is a database storage engine developed by MySQL to replace the current InnoDB Storage engine with advanced features such as transactions. It is currently under development. All data and indexes of the memory storage engine are stored in the memory. Therefore, it is mainly used for some temporary tables or has high performance requirements, however, in specific scenarios where data is lost during the time of xioh, he or she crash. Archive is a storage engine that stores data in a high proportion of compression. It is mainly used to store historical information that has expired and is rarely accessed, and does not support indexing. In a strict sense, merge and federated cannot be regarded as a storage engine. Because the merge storage engine is mainly used to bring several base tables together and serve as an external table, the base table can be based on several other storage engines. Federated actually does something similar to Oracle's dblink, which is mainly used to remotely access data on other MySQL servers.

3.2 introduction to MyISAM storage engine

MyISAM storage engine tables are stored in the database. Each table is stored as three physical files named after the table. First, there will certainly be any storage engine that is indispensable for storing the table structure definition information. FRM file. MYD and. the myi file stores the table data (. MYD) and index data (. myi ). Each table has only three such files for the storage of MyISAM tables. That is to say, no matter how many indexes the table has, it is stored in the same. myi file.

MyISAM supports the following three types of indexes:

1. B-tree indexes

B-tree indexes, as the name suggests, mean that all index nodes are stored according to the balance tree data structure, and all index data nodes are on the leaf nodes.

2. R-tree indexes

The storage method of the R-tree index is different from that of the B-tree index. It is mainly designed to index the fields of the storage space and multi-dimensional data. Therefore, in the current MySQL version, only geometry fields are supported for indexing.

3. Full-text index

The full-text index is the full-text index we have long mentioned. Its storage structure is also B-tree. This is mainly to solve the problem of inefficient like queries.

Among the three index types above MyISAM, B-tree indexes are most frequently used and Fulltext is occasionally used. However, R-tree indexes are rarely used in general systems. In addition, MyISAM's B-tree index has a large limit, that is, the total length of all fields involved in an index cannot exceed 1000 bytes.

Although each MyISAM table is stored in the same suffix. in the MYD file, the storage format of each file may not be exactly the same, because the data storage format of MyISAM is divided into static (fixed) fixed length, dynamic (dynamic) variable Length and compressed format. Of course, we can choose whether to compress the three formats. You can specify {compressed | default} by using row_format when creating the table, or use myisampack to compress the table, by default, data is not compressed. In the case of non-compression, whether it is static or dynamic is related to the definition of fields in our table. As long as a field of Variable Length exists in the table, the table must be in dynamic format. If no variable length field exists, it is in fixed format. Of course, you can also use the alter table command to forcibly convert a dynamic table with varchar fields to fixed, however, the result is that the original varchar field type is automatically converted to the char type. On the contrary, if fixed is converted to dynamic, char fields are also converted to varchar fields. Therefore, you must be cautious when manually converting them.

Is the MyISAM storage engine table reliable? In the MySQL user reference manual, table File Corruption may occur in the following situations:

When mysqld is being written, it is killed or aborted due to exceptions in other situations;
Host crash;
Disk hardware faults;
Bug in MyISAM storage engine?

If an error occurs in a table file of the MyISAM storage engine, only the table is affected, but other tables are not affected, and other databases are not affected. If an error occurs in a MyISAM table during running, you can use the check table command online to verify it, you can also use the repair table command to try to fix the problem. When the database is closed, we can also use the myisamchk tool to detect or repair a (or some) table in the database. However, it is strongly recommended that you do not perform repair operations on the table as much as possible before the repair, so as to avoid unnecessary consequences.

In addition, the tables of the MyISAM storage engine can be used and operated simultaneously by multiple database instances. However, we do not recommend that you do this, and the MySQL official User Manual also mentions that, we recommend that you do not share MyISAM files among multiple mysqld files.

3.3 introduction to the InnoDB Storage Engine

In MySQL, the most widely used except MyISAM is not InnoDB. As a storage engine developed by a third-party company, InnoDB complies with the same open-source license protocol as MySQL. The reason why InnoDB is so favored is that it has many features:

1. Supports transaction Installation

The most important aspect of InnoDB's functionality is its support for transaction security, which is undoubtedly a very important reason for InnoDB to become one of MySQL's most popular storage engines. In addition, all four levels defined by the sql92 standard are implemented (read uncommitted, Read committed, Repeatable read and serializable ). The support for transaction security has undoubtedly led many users who had to give up using MySQL due to special business requirements to support MySQL and those who had a wait-and-see attitude towards database selection, it also greatly increased the favor of MySQL.

2. Multi-version Data Reading

While InnoDB supports transactions, in order to ensure the performance when data consistency has been concurrent, it implements Multi-version Data Reading through undo information.

3. Improvement of Locking Mechanism

InnoDB changed the lock mechanism of MyISAM and implemented the row lock. Although the implementation of InnoDB's row lock mechanism is achieved through indexes, after all, 99% of SQL statements in the database use indexes for data retrieval. Therefore, the row locking mechanism has undoubtedly increased the competitiveness of InnoDB in a highly concurrent environment.

4. Implement Foreign keys

InnoDB implements the important feature of the foreign key reference database, making it possible to control the integrity of some data on the Database End. Although many database system tuning experts do not recommend this, adding foreign key control on the Database End may still be the lowest cost for many users.

In addition to the highlights of the above features, InnoDB also has many other features that often bring great surprises to users, while also bringing more customers to MySQL.

The InnoDB Storage engine is not the same as MyISAM, although it also has. FRM files are used to store table structure definition-related metadata, but table data and index data are stored together. As a result, each table is stored separately or all tables are stored together, which is determined by the user (through specific configuration) and supports symbolic links.

The physical structure of InnoDB is divided into two parts:

1. Data Files (table data and index data)

Store data in a data table and all index data, including primary keys and other common indexes. In InnoDB, the table space concept is stored, but it is quite different from that in Oracle. First, there are two types of InnoDB tablespaces. One is the shared tablespace, that is, all tables and index data are stored in the same tablespace (one or more data files) and specified through innodb_data_file_path, data files need to be shut down and restarted. The other is exclusive tablespace, that is, the data and indexes of each table are stored in a separate. IBD file.

Although we can use shared or exclusive tablespaces to store our tables on our own, empty shared tables must exist, because InnoDB's undo information and other metadata information are stored in the shared tablespace. Data files in the shared tablespace can be set to fixed size or automatically scalable. Files in the automatic scaling mode can be set to the maximum file size and the amount of each extension. We recommend that you add the maximum size attribute when creating automatically Scalable Data Files. One reason is that the file system itself has a certain size limit (but InnoDB does not know ), another reason is the convenience of self-maintenance. In addition, InnoDB can use not only the file system, but also the original block devices, which we often call bare devices.

When our file tablespace is about to run out, we must add data files for it. Of course, only the shared tablespace has this operation. It is easy to add data files to the shared tablespace. You only need to set the file path and related attributes following the innodb_data_file_path parameter in the standard format. However, note that, innoDB does not create directories when creating new data files. If the specified directory does not exist, an error is reported and cannot be started. Another headache is that after InnoDB adds data files to the shared tablespace, it must restart the database system to take effect. If it is a bare device, it needs to be restarted twice. This is one of the reasons why I have never liked to use a shared tablespace and select a dedicated tablespace.

2. Log Files

The log files of InnoDB are similar to the redo logs of Oracle. You can also set multiple log groups (at least two) and write data in sequence using the round robin policy, even the old version has the same log archiving feature as oracle. If you have created an InnoDB table in your database, do not delete all the InnoDB log files, because it is likely to cause crash in your database to fail to start or lose data.

As InnoDB is a transaction-safe storage engine, system crash cannot cause very serious losses to it. Due to the existence of redo logs, it is protected by the checkpoint mechanism, innoDB can use the redo log to restore the transaction that has completed the database crash but has not yet completed writing data to the disk, it can also roll back unfinished transactions that have been partially completed and written to the disk and restore data.

InnoDB is not only significantly different from the MyISAM storage engine in terms of features, but also independently processed in configuration. In the MySQL startup parameter file settings, all InnoDB parameters basically carry the prefix "InnoDB _", not to mention InnoDB data and log-related, or other performance, transaction and other related parameters are the same. Like all InnoDB-related system variables, all InnoDB-related system status values are also prefixed with "InnoDB. Of course, we can use only one parameter (skip-InnoDB) to shield the InnoDB Storage engine in MySQL, so that even if we install the InnoDB Storage engine during compilation, you cannot create InnoDB tables either.

3.4 NDB Cluster Storage engine Introduction

The NDB storage engine is also called the NDB Cluster Storage engine, which is mainly used in the MySQL cluster distributed cluster environment. Cluster is a new function provided by MySQL from version 5.0. This part may not only introduce the NDB storage engine, because the NDB storage engine will lose too much meaning after leaving the entire MySQL cluster environment. So this section mainly introduces the related content of MySQL cluster.

Simply put, MySQL cluster is actually a memory database cluster environment without shared storage devices. It is mainly implemented through NDB cluster (NDB for short) storage engine. Generally, a MySQL cluster environment consists of the following three parts:

A) manages the manage node hosts of each node:

The management node is responsible for managing all nodes in the cluster, including cluster configuration, enabling and disabling nodes, and implementing data backup and recovery. The management node obtains the status and error information of each node in the cluster environment, and sends the information of each node in each cluster to all other nodes in the cluster. The management node maintains the configuration of the entire cluster environment and serves as the basic communication for each node in the cluster. Therefore, it must be the first node to be started.

B) SQL server nodes at the SQL layer (hereinafter referred to as SQL nodes), which is also known as MySQL Server: mainly responsible for implementing all the things of a database on the storage layer, for example, connection management, query optimization, response, and Cache Management, only the work at the storage layer is handed over to the NDB data node for processing. That is to say, the SQL node in the pure MySQL cluster environment can be considered as a MySQL server that does not need to provide any storage engine, because his storage engine has NDB nodes in the cluster environment. Therefore, the startup of MySQL server at the SQL layer is different from that of normal MySQL server. You must add the ndbcluster item, which can be added to my. in the CNF configuration file, you can also specify it by starting the command line.

C) NDB data nodes in the storage layer, that is, the NDB cluster mentioned above:

NDB is a memory-based storage engine that loads all the data and index data into the memory, but it also persists the data to the storage device. However, in the latest version, you can choose not to load all data to the memory, this is indeed good news for users who have too much data or have insufficient memory space to store all data based on costs.

NDB nodes are mainly used to store underlying data and store cluster data. Each NDB node stores a part of the complete data (or a complete data, depending on the number of nodes and configuration). It is called a fragment in the MySQL cluster. In normal cases, each fragment will have an identical image (or multiple points) on other hosts. These are all done through configuration, so as long as the configuration is proper, MySQL cluster will not have a single point of failure in the storage layer. Generally, NDB nodes are organized into one NDB group. An NDB group is actually a group of NDB nodes with identical physical data.

As mentioned above, the data organization of each NDB node may have all the data in each node, or only a portion of the data may be saved, mainly controlled by the number of nodes and parameters. First, in the MySQL cluster main configuration file (on the management node, usually config. INI), there is a very important parameter named noofreplicas, this parameter specifies the number of copies of each piece of data that are stored in redundancy on different nodes, this parameter should be set to 2 or 2 at least. Normally, the failure probability of two redundant nodes at the same time is still very small. Of course, if there are enough machines and memory, it can continue to increase. Whether a node stores all data or a part of data is also limited by the number of storage nodes. The NDB storage engine first ensures that the noofreplicas parameter configuration requires data redundancy to use the storage node, and then uses the data segment based on the number of nodes to continue using multiple NDB nodes, the number of segments is the total number of nodes divided by noofreplicas.

MySQL cluster contains a lot of content. For the sake of space, we will not introduce it in depth for the moment, the High Availability design section in the architectural design section of this book will provide a more detailed introduction and implementation details. You can also learn some details through the MySQL official documentation.

3.5 introduction to other storage engines 3.5.1 merge storage engines:

The merge storage engine, also mentioned in the MySQL user manual, is also known as the mrg_myisam engine. Why? The merge storage engine can be simply understood as its function is to implement a MyISAM table with the same structure and provide a single access portal through some special packaging, to reduce the complexity of the application. To create a MERGE table, not only must the structure of the Base table be completely consistent, including the order of fields, but also the index of the base table be completely consistent. The MERGE table does not store data, but only provides an agreed storage portal for multiple base tables. Therefore, when creating a MERGE table, MySQL generates only two small files, one of which is. the structure definition file of frm. MRG file, used to store the name of the table involved in Merge (including the schema of the database ). The database schema is required because the MERGE table not only supports the tables in the same database but also the tables in different databases of merge, as long as the permission permits, and in the same mysqld, you can perform merge. After a MERGE table is created, you can use related commands to change the underlying base table.

Merge tables not only provide read services, but also write services. To enable the Insert Service for a MERGE table, you must specify the base table to which the insert data will be written when the table is created. You can use the insert_method parameter to control the Insert Service. If this parameter is not specified, any attempt to insert data to the MERGE table will result in an error. In addition, you cannot directly use the full-text index on the base table through the MERGE table. To use full-text indexes, you must access the base table.

3.5.2 memory storage engine:

Memory storage engine is a storage engine that stores data in the memory. The memory storage engine does not store any data on the disk. It only stores a. frm file with table structure information on the disk. Therefore, after MySQL crash or host crash, only one structure is left in the memory table. Memory tables support indexes in both hash and B-tree formats. Because it is stored in the memory, memory stores data based on a fixed length of space, and does not support blob and text fields. The memory storage engine locks pages.

Since all the data is stored in the memory, the memory consumption can be imagined. In the MySQL user manual, there is such a formula to calculate the actual memory size consumed by the memory table:

Aggregate (max_length_of_key + sizeof (char *) * 4) + sum_over_all_hash_keys (sizeof (char *) * 2) + align (length_of_row + 1, sizeof (char *))

3.5.3 bdb storage engine:

The bdb storage engine is called the berkeleydb storage engine. Like InnoDB, it is not a storage engine developed and implemented by Mysql itself, but provided by sleepycat software. Of course, it is also an open source storage engine, transaction Security is also supported.

The data storage of the bdb storage engine is also two physical files for each table, one. frm file and one. DB file. The data and index information are stored in the. DB file. In addition, bdb also has its own redo logs for transaction security. Like InnoDB, you can also specify the location where the log files are stored through parameters. In terms of the locking mechanism, bdb and the memory storage engine implement page-level locking.

Because the bdb storage engine implements transaction security, it must also have its own check point mechanism. Bdb performs a Check Point every time it starts and clears all the previous redo logs. During the running process, we can also manually perform the Check Point Operation on bdb by executing flush logs.

3.5.4 federated storage engine:

The functions implemented by the federated storage engine are similar to those of Oracle dblink, which is mainly used to provide an excuse for accessing the data on the remote MySQL server. If we use source code compilation to install MySQL, you must manually specify to enable it.

Federated storage engine, because MySQL cannot use this storage engine by default. When we create a federated table, we only create a schema definition file for the table locally. All data is taken from the database on the remote MySQL server in real time.

When we operate a federated table through SQL, the implementation process is basically as follows:

SQL call is locally released
MySQL processor API (data is in Processor Format)
Mysql client API (data is converted into SQL calls)
Remote database> MYSQL client API
Convert the result package (if any) to the Processor Format
Processor API-> result line or line-affected local count

3.5.5 archive storage engine:

The archive storage engine is mainly used to store expired historical data that is rarely accessed through a small storage space. Archive tables do not support indexing. Through a. frm structure definition file, A. Arz Data Compression file also has a. Arm meta information file. Because of the special nature of the stored data, archive tables do not support deletion and modification operations. They only support insert and query operations. The locking mechanism is Row-level locking.

3.5.6 blackhole storage engine:

The blackhole storage engine is a very interesting storage engine. Its function is just like its name, and it is a "black hole ". Like the "/dev/null" device in our Unix system, no response is returned no matter what information we write. So what is the use of the blackhole storage engine for us? I had the same question when I first came into contact with MySQL. I don't know why MySQL provided such a storage engine for us? But in the next data migration process, it was blackhole that brought me great benefits. During the data migration process, data needs to be converted through a transit MySQL server, and then copied and transplanted to the new server. But at that time, I didn't have enough space to support the operation of this Transit server. At this time, it shows the effectiveness of blackhole. He will not record any data, but will record all SQL statements in BINLOG. These SQL statements will eventually be used by replication and implemented to the final slave end.

The user manual of MySQL also describes the usage of the blackhole storage engine as follows:

SQL File Syntax verification.
Measure the overhead of binary logs by comparing the performance of blackhole that allows binary logs and the performance of blackhole that disables binary logs.
Because blackhole is essentially a "no-op" storage engine, it may be used to find performance bottlenecks unrelated to the storage engine itself.

3.5.7 CSV storage engine:

The CSV storage engine is actually operating on a Standard CSV file, which does not support indexing. In some cases, you may need to export data from the database into a report file. CSV files are a standard format supported by many software, therefore, you can create a CVS table in the database and insert the generated report information to the table to obtain a CSV report file.

Conclusion 3.6

Multi-storage engine is the biggest feature of MySQL, which is different from other database management software. Different storage engines have different features and can cope with different application scenarios, this allows us to select the most favorable storage engine based on different application characteristics in actual applications, giving us sufficient flexibility. Through this chapter, I have a preliminary understanding of the various storage engines of MySQL. I think you may have some knowledge about the main storage engines of MySQL, some common storage engines will be further introduced in the subsequent chapter.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

MySQL storage engine Introduction

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

MySQL storage engine Introduction

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support