Chapter 3. Introduction to MySQL Storage Engines

Last Update:2014-11-27 Source: Internet

Author: User



Objective

3.1 MySQL Storage Engine Overview

MyISAM is MySQL's default storage engine in the versions covered here (InnoDB became the default in MySQL 5.5) and one of the most widely used. Its predecessor is ISAM, mentioned earlier in the history of MySQL's development; MyISAM is an upgraded version of ISAM. When MySQL was first released it used ISAM, and at that point it did not even have the concept of a storage engine. Neither the code nor the system architecture drew a clear line between the SQL layer and the storage layer, which was painful for developers. MySQL later recognized the need to change the architecture and separated the front-end business logic from the back-end data storage with a clear layering. ISAM was extended and refactored as part of that work, which is the origin of the MyISAM storage engine.

In MySQL versions before 5.1, storage engines had to be compiled and installed together with MySQL itself. In other words, although before 5.1 the storage engine layer and the SQL layer were only loosely coupled and interacted almost entirely through interfaces, the two layers still could not be separated, not even at installation time.

Starting with MySQL 5.1, MySQL AB reworked the architecture and introduced a new concept: the pluggable storage engine architecture. The redesign made the storage engine layer and the SQL layer more independent and less coupled, to the point that a storage engine can be loaded online: a new storage engine can be loaded into a running MySQL server without affecting normal operation. The pluggable architecture makes loading and removing storage engines more flexible and convenient, and makes it easier to develop your own storage engine. At the time of writing, no other database management system offers this.

MySQL's pluggable storage engines mainly include MyISAM, InnoDB, NDB Cluster, Maria, Falcon, Memory, Archive, Merge, and Federated. The best known and most widely used are MyISAM and InnoDB. MyISAM is the upgraded version of ISAM, MySQL's oldest storage engine, and is MySQL's default storage engine. InnoDB is not actually a MySQL product: it comes from a third-party software company, Innobase (acquired by Oracle in 2005). Its most important feature is transaction support, so it is also very widely used.

The other storage engines fit narrower use cases. NDB Cluster also supports transactions, but it is mainly used in distributed environments and is a shared-nothing distributed database storage engine. Maria is MySQL's newest storage engine, developed as an upgrade to MyISAM (no final GA version has been released yet). Falcon is a transactional storage engine with advanced features that MySQL is developing to replace InnoDB, and it is still in the development stage. The Memory storage engine keeps all data and indexes in memory, so it is mainly used for temporary tables or for workloads that demand very high performance and can tolerate losing data in a crash. Archive stores data highly compressed, is used mainly for outdated, rarely accessed historical information, and does not support indexes. Merge and Federated are, strictly speaking, not storage engines in their own right. Merge combines several base tables and serves them as a single table, while the data itself stays in the base tables, which use another storage engine. Federated works somewhat like Oracle's DBLINK and is mainly used to access data on other MySQL servers remotely.

3.2 Introduction to the MyISAM Storage Engine

Each table that uses the MyISAM storage engine is stored as three physical files named after the table. First, there is a .frm file holding the table structure definition, which every storage engine needs. In addition there are .MYD and .MYI files, which store the table's data (.MYD) and index data (.MYI) respectively. Each MyISAM table has exactly these three files, which means that however many indexes the table has, they are all stored in the same .MYI file.
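As a small illustration of the three-file layout, the following Python sketch lists the files one would expect for a MyISAM table. The function name, the example data directory, and the assumption that each database is a subdirectory of the data directory are illustrative and not taken from the text above.

```python
from pathlib import PurePosixPath

# The three per-table files described above: definition, data, indexes.
MYISAM_SUFFIXES = (".frm", ".MYD", ".MYI")


def myisam_table_files(datadir, schema, table):
    """Return the expected file paths for one MyISAM table.

    Assumes (hypothetically) a layout of <datadir>/<schema>/<table><suffix>.
    """
    base = PurePosixPath(datadir) / schema
    return [str(base / (table + suffix)) for suffix in MYISAM_SUFFIXES]


if __name__ == "__main__":
    for path in myisam_table_files("/var/lib/mysql", "shop", "orders"):
        print(path)
```

However many indexes the table has, the list never grows: all of them share the single .MYI file.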

MyISAM supports the following three types of indexes:

1. B-tree Index

As the name implies, a B-tree index stores all index nodes in a balanced tree structure, and all index data nodes are in the leaf nodes.

2. R-tree Index

The R-tree index is stored somewhat differently from the B-tree index. It is designed mainly for indexing fields that hold spatial and multidimensional data, so current MySQL versions support it only on fields of geometry types.

3. Full-text Index

The full-text index is what we usually call a full-text search index. Its storage structure is also a B-tree. It mainly addresses the inefficiency of queries that would otherwise need LIKE.

Of these three index types, B-tree is by far the most commonly used, full-text is used occasionally, and R-tree is rarely used in typical systems. In addition, MyISAM's B-tree index has one significant limit: the combined length of all fields in an index cannot exceed 1000 bytes.

Although every MyISAM table stores its data in a file with the same .MYD suffix, the format of each file is not necessarily the same, because MyISAM has three data storage formats: static (FIXED), dynamic (DYNAMIC), and compressed (COMPRESSED). Compression is entirely optional: it can be specified with ROW_FORMAT {COMPRESSED | DEFAULT} when creating the table or applied with the myisampack tool, and by default tables are not compressed. Without compression, whether the table is static or dynamic depends on the field definitions. If the table contains any variable-length field, it must be in DYNAMIC format; if it has no variable-length fields, it is FIXED. You can force a DYNAMIC table with VARCHAR fields to FIXED with the ALTER TABLE command, but the VARCHAR fields are then automatically converted to CHAR. Conversely, converting a FIXED table to DYNAMIC converts its CHAR fields to VARCHAR, so be cautious about forcing conversions manually.
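The rule for uncompressed tables can be sketched in a few lines of Python. This is a simplified model, not MySQL's own logic: the function name is hypothetical, and the set of types treated as variable-length (VARCHAR, VARBINARY, and anything ending in BLOB or TEXT) is an assumption made for the example.

```python
def default_row_format(column_types):
    """Guess the uncompressed MyISAM row format from column type names.

    Simplified assumption: VARCHAR, VARBINARY and every *BLOB / *TEXT type
    count as variable-length; everything else counts as fixed-length.
    """
    for column_type in column_types:
        # Strip a length such as "(64)" and normalise the case.
        base = column_type.split("(")[0].strip().upper()
        if base in {"VARCHAR", "VARBINARY"} or base.endswith(("BLOB", "TEXT")):
            return "DYNAMIC"
    return "FIXED"


if __name__ == "__main__":
    print(default_row_format(["INT", "CHAR(10)", "DATETIME"]))
    print(default_row_format(["INT", "VARCHAR(64)"]))
```

A single variable-length column is enough to make the whole table DYNAMIC, which is why the forced conversion described above has to change the column types.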

Are MyISAM tables reliable enough? The MySQL reference manual lists the following situations in which table files can become corrupted:

mysqld is killed, or otherwise terminates abnormally, in the middle of a write;
The host crashes;
A disk hardware failure occurs;
A bug in the MyISAM storage engine.

When a file of a MyISAM table is damaged, only that table is affected; other tables and other databases are not. If a MyISAM table has a problem while the database is running, you can verify it online with the CHECK TABLE command and try to fix it with the REPAIR TABLE command. With the database shut down, you can also check or repair one or more tables with the myisamchk tool. However, do not repair tables unless you have to, and make a backup before any repair where possible, to avoid unwanted consequences.

In addition, MyISAM table files can in theory be used by several database instances at the same time. We do not recommend this, and the official MySQL manual likewise advises against sharing MyISAM storage files among multiple mysqld processes.

3.3 Introduction to the InnoDB Storage Engine

Apart from MyISAM, the most widely used storage engine in MySQL is InnoDB. Although InnoDB is developed by a third-party company, it is released under the same open source license as MySQL.

InnoDB is so popular mainly because of its many features:

Transaction support

The most important InnoDB feature is its support for transactions, which is undoubtedly one of the main reasons InnoDB became one of MySQL's most popular storage engines. It implements all four isolation levels defined by the SQL-92 standard (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE). Transaction support brought back many users who had previously given up on MySQL because of particular business requirements, and it greatly improved MySQL's standing among users who had been undecided when choosing a database.
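For readers unfamiliar with the four levels, the Python sketch below encodes the read phenomena that the SQL-92 standard still permits at each level and picks the least strict level that rules out a given set of phenomena. It reflects the standard's definitions only; how InnoDB behaves in practice at each level is not modeled here, and the function name is made up for the example.

```python
# Read phenomena that each SQL-92 isolation level still permits.
PERMITTED = {
    "READ UNCOMMITTED": {"dirty read", "non-repeatable read", "phantom read"},
    "READ COMMITTED": {"non-repeatable read", "phantom read"},
    "REPEATABLE READ": {"phantom read"},
    "SERIALIZABLE": set(),
}

# Levels ordered from least strict to most strict.
LEVELS = ("READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE")


def weakest_level_preventing(phenomena):
    """Return the least strict level that permits none of the given phenomena."""
    unwanted = set(phenomena)
    for level in LEVELS:
        if not PERMITTED[level] & unwanted:
            return level
    # Not reachable: SERIALIZABLE permits nothing, so the loop always returns.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    print(weakest_level_preventing(["dirty read"]))
    print(weakest_level_preventing(["dirty read", "non-repeatable read"]))
```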

Multi-version reads

To preserve data consistency and performance under concurrency while supporting transactions, InnoDB uses undo information to implement multi-version reads of data.

Improved locking

InnoDB departs from MyISAM's locking mechanism and implements row locks. InnoDB's row locking is implemented through indexes, but 99% of the SQL statements in a database retrieve data through an index anyway. Row locking therefore clearly makes InnoDB more competitive in environments under high concurrency pressure.

Foreign keys

InnoDB implements foreign keys, an important database feature that makes it possible to enforce some data integrity on the database side. Many database tuning experts advise against this, but for many users, enforcing constraints such as foreign keys in the database may still be the lowest-cost option.

Beyond these highlights, InnoDB has many other features that often pleasantly surprise users and that have won MySQL more customers.

In physical storage, InnoDB differs from MyISAM. It also uses .frm files for the table structure definition, but table data and index data are stored together. Whether each table is stored separately or all tables are stored together is up to the user (through configuration), and symbolic links are supported as well.

The physical structure of InnoDB is divided into two parts:

1. Data files (table data and index data)

Data files hold the table data and all index data, including primary keys and ordinary indexes. InnoDB has a tablespace concept, but it differs considerably from Oracle's. InnoDB tablespaces come in two forms. One is the shared tablespace: all table and index data is stored in a single tablespace (one or more data files) specified by innodb_data_file_path, and adding data files requires a restart. The other is the per-table (exclusive) tablespace, in which each table's data and indexes are stored in a separate .ibd file.

Although we can choose whether our tables use the shared tablespace or per-table tablespaces, the shared tablespace must always exist, because InnoDB stores its undo information and some other metadata there. A shared tablespace data file can be either fixed-size or auto-extending; for an auto-extending file you can set a maximum size and the size of each increment. When creating an auto-extending data file, it is advisable to set the maximum size. One reason is that the file system has its own size limit (which InnoDB does not know about); another is ease of maintenance. In addition, InnoDB can use not only the file system but also raw block devices, commonly called raw devices.

When the tablespace is about to run out of space, we have to add data files to it; this applies only to the shared tablespace. Adding a data file is simple: append the file path and its attributes to the innodb_data_file_path parameter in the standard format. Note, however, that InnoDB does not create directories when it creates a new data file; if the specified directory does not exist, InnoDB reports an error and fails to start. More troublesome is that after adding a data file to the shared tablespace, the database must be restarted for the change to take effect, and with raw devices it must be restarted twice.

This is one of the reasons I have always preferred per-table tablespaces over the shared tablespace.
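To make the "standard format" of innodb_data_file_path concrete, here is a hedged Python sketch that assembles such a setting from individual file specifications, using the name:size[:autoextend[:max:size]] form with semicolons between files. The helper names and the file names and sizes in the example are illustrative; the check that only the last file may auto-extend mirrors the restriction in the MySQL manual.

```python
def data_file_spec(name, size, autoextend=False, max_size=None):
    """Build one entry of the form name:size[:autoextend[:max:max_size]]."""
    spec = f"{name}:{size}"
    if autoextend:
        spec += ":autoextend"
        if max_size is not None:
            spec += f":max:{max_size}"
    elif max_size is not None:
        raise ValueError("a maximum size only applies to an auto-extending file")
    return spec


def data_file_path(specs):
    """Join entries into one innodb_data_file_path line for the option file."""
    if not specs:
        raise ValueError("at least one data file is required")
    for spec in specs[:-1]:
        if ":autoextend" in spec:
            raise ValueError("only the last data file may be auto-extending")
    return "innodb_data_file_path = " + ";".join(specs)


if __name__ == "__main__":
    print(data_file_path([
        data_file_spec("ibdata1", "1G"),
        data_file_spec("ibdata2", "512M", autoextend=True, max_size="4G"),
    ]))
```

Adding a file later means appending another entry to this line and restarting the server, as described above.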

2. Log files

InnoDB's log files are similar to Oracle's redo logs. Multiple log files can be configured (at least 2) and are written sequentially in round-robin fashion; older versions even had a log archiving feature like Oracle's. If your database contains InnoDB tables, never delete the InnoDB log files, because doing so is likely to make the database crash, fail to start, or lose data.

Because InnoDB is a transaction-safe storage engine, a system crash does not cause it serious loss. Thanks to the redo log and the checkpoint mechanism, InnoDB can use the redo log to recover data that was committed before the crash but had not yet been written to the data files, and it can roll back all incomplete transactions whose changes had already been written to disk.

InnoDB differs from MyISAM not only in features but also in how it is configured. In the MySQL startup parameter file, essentially all InnoDB parameters are prefixed with "innodb_", whether they concern InnoDB data and logs or performance, transactions, and so on. Like the system variables, all InnoDB status values are also prefixed with "innodb_". We can also disable the InnoDB storage engine entirely with a single parameter (skip-innodb); users then cannot create InnoDB tables even if InnoDB was included when MySQL was built and installed.

3.4 Introduction to the NDB Cluster Storage Engine

The NDB storage engine, also called the NDB Cluster storage engine, is mainly used in MySQL Cluster distributed environments. Cluster is a feature MySQL has offered only since version 5.0. This section does not cover the NDB storage engine alone, because outside the MySQL Cluster environment the NDB storage engine loses much of its meaning. So this section is mainly about MySQL Cluster.

Put simply, MySQL Cluster is an in-memory database cluster built without shared storage devices (a shared-nothing architecture), implemented mainly through the NDB Cluster (NDB for short) storage engine.

In general, a MySQL Cluster environment consists of the following three parts:

a) Management node hosts, responsible for managing the other nodes:

The management node manages all nodes in the cluster, including configuring the cluster, starting and stopping nodes, and performing data backup and recovery. It collects status and error information from every node in the cluster and feeds each node's information back to all other nodes. Because the management node holds the configuration of the whole cluster and handles the basic communication among nodes, it must be the first node started.

b) SQL layer SQL server nodes (hereafter SQL nodes), which are what we usually call MySQL Server:

An SQL node handles everything a database does above the storage layer, such as connection management, query optimization and response, and cache management; only the storage-layer work is handed to the NDB data nodes. In other words, an SQL node in a pure MySQL Cluster environment can be seen as a MySQL server that needs no storage engine of its own, because the NDB nodes in the cluster act as its storage engine. Starting the SQL-layer MySQL server therefore differs slightly from a normal MySQL startup: you must add the ndbcluster option, either in the my.cnf configuration file or on the startup command line.

c) Storage layer NDB data nodes, the NDB Cluster mentioned above:

NDB is an in-memory storage engine: it loads all table data and index data into memory, while also persisting the data to storage devices. The latest versions, however, let users choose to keep part of the data out of memory, which is good news for users whose data is too large, or whose budget is too small, to hold everything in memory.

NDB nodes implement the underlying data storage and hold the cluster's data. Each NDB node holds part of the complete data set (or all of it, depending on the number of nodes and the configuration); such a part is called a fragment in MySQL Cluster. Under normal circumstances each fragment has one or more identical replicas on other hosts. This is set through configuration, so a properly configured MySQL Cluster has no single point of failure at the storage layer. NDB nodes are generally organized into NDB groups, and an NDB group is a set of NDB nodes holding exactly the same physical data.

As mentioned above, whether each NDB node holds all of the data or only part of it is controlled mainly by the number of nodes and by configuration parameters. The main MySQL Cluster configuration file (on the management node, usually config.ini) has a very important parameter, NoOfReplicas, which specifies how many redundant copies of each piece of data are stored on different nodes. It should generally be set to at least 2, and 2 is usually enough, because the probability of two redundant nodes failing at the same time is very small; if you have enough machines and memory you can set it higher. Whether a node holds all or part of the data then depends on the number of storage nodes. The NDB storage engine first uses storage nodes to satisfy the redundancy that NoOfReplicas requires, and then fragments the data across the remaining NDB nodes. The number of fragments is the total number of nodes divided by NoOfReplicas.
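The arithmetic in the last sentence can be sketched in Python. The function below splits a list of data node IDs into groups of NoOfReplicas nodes each, so the number of groups equals the node count divided by NoOfReplicas. The function name is hypothetical, and grouping nodes in ascending ID order is an assumption made for the example rather than something stated in the text.

```python
def node_groups(data_node_ids, no_of_replicas):
    """Split data nodes into groups that each hold one slice of the data.

    Every group contains no_of_replicas nodes with identical data, so the
    number of groups is len(data_node_ids) / no_of_replicas.
    """
    if no_of_replicas < 1:
        raise ValueError("NoOfReplicas must be at least 1")
    if not data_node_ids or len(data_node_ids) % no_of_replicas != 0:
        raise ValueError("node count must be a positive multiple of NoOfReplicas")
    ids = sorted(data_node_ids)  # assumption: group by ascending node ID
    return [ids[i:i + no_of_replicas] for i in range(0, len(ids), no_of_replicas)]


if __name__ == "__main__":
    # Four data nodes with two replicas: two groups, each holding half the data.
    print(node_groups([2, 3, 4, 5], 2))
    # Two data nodes with two replicas: one group, each node holds everything.
    print(node_groups([2, 3], 2))
```

With two nodes and NoOfReplicas set to 2, each node holds the complete data set; with four nodes, each node holds half of it and has one identical partner.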

MySQL Cluster covers a great deal of ground, and for reasons of space it is not explored in depth here. The chapter on high-availability design later in this book gives a more detailed introduction and implementation details, and the official MySQL documentation offers further detail as well.

3.5 Introduction to Other Storage Engines

3.5.1 MERGE Storage Engine:

The MERGE storage engine is also called the MRG_MyISAM engine in the MySQL user manual.

Why? Because the MERGE storage engine can be understood simply as a special wrapper around MyISAM tables with the same structure that provides a single access point, with the aim of reducing application complexity. To create a MERGE table, the base tables must have exactly the same structure, including the order of the fields, and their indexes must be exactly the same as well.

The MERGE table itself stores no data; it only provides a unified access point for several base tables. When a MERGE table is created, MySQL therefore generates only two small files: a .frm structure definition file and a .MRG file that holds the names of the tables participating in the MERGE, including the database schema they belong to. The schema is needed because a MERGE table can combine not only tables in its own database but also tables in other databases, as long as permissions allow and the tables are under the same mysqld.

After a MERGE table is created, its underlying base tables can still be changed with the relevant commands.

A MERGE table serves writes as well as reads. For a MERGE table to accept inserts, you must specify at table creation which base table inserted data is written to, using the INSERT_METHOD parameter. If this parameter is not specified, any attempt to INSERT into the MERGE table fails with an error. In addition, full-text indexes on the base tables cannot be used through the MERGE table; they must be accessed through the base tables themselves.
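The routing rule can be modeled with a short Python sketch. It relies on the documented INSERT_METHOD values NO, FIRST, and LAST, with NO as the behavior when the option is omitted; the function name, the table names, and the use of RuntimeError to stand in for the server's error are illustrative choices.

```python
def insert_target(base_tables, insert_method="NO"):
    """Return the base table that receives rows inserted into a MERGE table.

    insert_method models the INSERT_METHOD table option: FIRST and LAST pick
    the first or last base table; NO (also the default) rejects inserts.
    """
    if not base_tables:
        raise ValueError("a MERGE table needs at least one base table")
    method = insert_method.upper()
    if method == "FIRST":
        return base_tables[0]
    if method == "LAST":
        return base_tables[-1]
    if method == "NO":
        raise RuntimeError("inserts are not allowed: INSERT_METHOD is NO")
    raise ValueError(f"unknown INSERT_METHOD: {insert_method}")


if __name__ == "__main__":
    yearly_logs = ["log_2013", "log_2014"]  # hypothetical base tables
    print(insert_target(yearly_logs, "LAST"))
```

Choosing LAST suits the common case of one base table per period, where new rows belong in the newest table.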

3.5.2 Memory Storage Engine:

As its name suggests, the Memory storage engine stores data in memory. It stores no data on disk; only the .frm file holding the table structure is on disk. So after a MySQL crash or host crash, only the structure of a Memory table remains. Memory tables support indexes, in both hash and B-tree formats. Because data is kept in memory in fixed-length rows, BLOB and TEXT fields are not supported. The Memory storage engine uses table-level locking.

Since all data is stored in memory, the engine's memory consumption can be considerable. The MySQL user manual gives a formula for the memory actually needed per row:

SUM_OVER_ALL_BTREE_KEYS(max_length_of_key + sizeof(char*) * 4)
+ SUM_OVER_ALL_HASH_KEYS(sizeof(char*) * 2)
+ ALIGN(length_of_row+1, sizeof(char*))
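The formula can be evaluated with a short Python sketch. It assumes that ALIGN(x, n) rounds x up to the next multiple of n and that sizeof(char*) is 8 bytes, as on a typical 64-bit build; the function names and the sample key and row lengths are made up for the example.

```python
def align(value, boundary):
    """Round value up to the next multiple of boundary."""
    return -(-value // boundary) * boundary


def memory_row_bytes(btree_key_lengths, hash_key_count, row_length, pointer_size=8):
    """Estimate the bytes needed for one row of a Memory table.

    btree_key_lengths: maximum key length of each B-tree index, in bytes.
    hash_key_count:    number of hash indexes on the table.
    row_length:        length of one row, in bytes.
    pointer_size:      sizeof(char*); 8 is an assumption for 64-bit systems.
    """
    btree_part = sum(length + pointer_size * 4 for length in btree_key_lengths)
    hash_part = hash_key_count * pointer_size * 2
    row_part = align(row_length + 1, pointer_size)
    return btree_part + hash_part + row_part


if __name__ == "__main__":
    per_row = memory_row_bytes(btree_key_lengths=[10], hash_key_count=1, row_length=20)
    print(per_row, "bytes per row")
    print(per_row * 1_000_000, "bytes for one million rows")
```

Multiplying the per-row figure by the expected row count gives a rough lower bound on the memory a table will need.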

3.5.3 BDB Storage Engine:

The BDB storage engine, short for BerkeleyDB storage engine, is like InnoDB not developed by MySQL itself; it is provided by Sleepycat Software. It is also open source and also supports transactions.

Each BDB table is stored in two physical files, a .frm file and a .db file; data and index information are both stored in the .db file. BDB also has its own redo log for transaction safety and, like InnoDB, lets you specify the log file location with a parameter. In terms of locking, BDB implements page-level locking.

Because the BDB storage engine is transaction-safe, it needs its own checkpoint mechanism. BDB performs a checkpoint each time it starts and clears all earlier redo logs. While it is running, we can also trigger a BDB checkpoint manually by executing FLUSH LOGS.

3.5.4 Federated Storage Engine:

The Federated storage engine works much like Oracle's DBLINK: it is used mainly to provide access to data on a remote MySQL server. If you install MySQL by compiling from source, you must explicitly enable the Federated storage engine, because MySQL does not enable it by default.

When we create a Federated table, only a table structure definition file is created locally; all data is fetched in real time from the database on the remote MySQL server.

When we operate on a Federated table through SQL, the process is basically as follows:

a) An SQL call is issued locally
b) MySQL handler API (data in handler format)
c) MySQL client API (data converted to SQL calls)
d) Remote database -> MySQL client API
e) Result set (if any) converted to handler format
f) Handler API -> result rows or affected-row count returned locally

3.5.5 ARCHIVE Storage Engine:

The ARCHIVE storage engine is used mainly to store outdated, rarely accessed historical data in a small amount of storage space. ARCHIVE tables do not support indexes. Each table consists of a .frm structure definition file, a .ARZ compressed data file, and a .ARM meta-information file. Because of the nature of the data stored, ARCHIVE tables do not support delete or update operations, only insert and query. Locking is row-level.

3.5.6 BLACKHOLE Storage Engine:

The BLACKHOLE storage engine is an interesting one. As its name says, it is a "black hole": like the /dev/null device on UNIX systems, whatever we write to it, nothing comes back. So what use is it? I had the same question when I first encountered MySQL and could not see why MySQL would provide such an engine. Later, during a data migration, BLACKHOLE proved very useful. In that migration the data had to pass through a relay MySQL server for some conversion work and then be replicated to the new server, but I did not have enough space for the relay server to store the data. This is where BLACKHOLE shows its value: it records no data, yet every SQL statement is still written to the binlog, and those statements are then picked up by replication and applied on the final slave.

The MySQL user manual also describes several other uses of the BLACKHOLE storage engine:

a) Validating the syntax of an SQL file.
b) Measuring the overhead of binary logging, by comparing the performance of BLACKHOLE with binary logging enabled against BLACKHOLE with binary logging disabled.
c) Finding performance bottlenecks that are unrelated to the storage engine itself, since BLACKHOLE is essentially a "no-op" storage engine.

3.5.7 CSV Storage Engine:

The CSV storage engine operates on a standard CSV file and does not support indexes. Its main use is exporting data from the database as a report file: CSV is a fairly standard format that many programs support, so we can create a CSV table in the database, insert the generated report data into it, and obtain a CSV report file.
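To show the kind of file such a report ends up as, the Python sketch below writes a few rows as comma-separated text with the standard csv module. It only illustrates the file format and does not talk to MySQL; the sample rows are invented, and quoting every non-numeric value is an assumption about what a typical report file looks like, not a statement of the engine's exact output.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text, quoting every non-numeric value."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    report = [
        (1, "widgets", 9.5),
        (2, "gadgets, large", 20),
    ]
    print(rows_to_csv(report), end="")
```

Because the result is plain text, spreadsheet programs and other tools can open the report without any database access.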

3.6 Summary

Support for multiple storage engines is a feature that sets MySQL apart from other database management software. Different storage engines have different characteristics and suit different scenarios, so we can choose the most suitable storage engine for each application, which gives us a great deal of flexibility. After this chapter's first look at MySQL's storage engines, readers should have a basic understanding of the main ones; later chapters examine some of the commonly used storage engines in more depth.


This article is an English version of an article originally published in Chinese on aliyun.com.
