MySQL Carpet-Style Learning (I): An Introduction to the Overall MySQL Architecture
Recently I have felt that my skills have grown only alongside my work, and that I have never truly mastered certain topics. So I plan to study systematically, recording notes from each session as blog posts, based on the MySQL Authoritative Guide and High Performance MySQL. The goals are, first, to deepen my own understanding; second, to build better learning habits; and third, to share with everyone.
1. Logical Architecture
Layer 1: not unique to MySQL. It consists mostly of services common to network-based client/server tools, such as connection handling, authorization, authentication, and security.
Layer 2: the core service layer, including query parsing, analysis, optimization, caching, and all built-in functions. Everything that operates across storage engines also lives here: stored procedures, triggers, and views.
Layer 3: the storage engines, responsible for storing and retrieving the data in MySQL.
2. MySQL Concurrency Control
MySQL implements concurrency control at both the server layer and the storage engine layer.
2.1 Read/Write Locks
When handling concurrent reads and writes, the problem can be solved with a lock system consisting of two types of locks. These are called shared locks and exclusive locks, also known as read locks and write locks.
The core concept: read locks are shared, while write locks are exclusive.
In real database workloads, locks are being acquired constantly; MySQL manages its locking internally and transparently.
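The shared/exclusive behavior described above can be sketched with a small reader-writer lock in Python. This is only an illustration of the concept, not MySQL's implementation:

```python
import threading

# Minimal reader-writer lock sketch: many readers may hold the shared
# (read) lock at once, but the exclusive (write) lock waits until all
# readers have left, and blocks new readers while it is held.
class RWLock:
    def __init__(self):
        self._readers = 0
        self._count_lock = threading.Lock()  # guards the reader count
        self._writer = threading.Lock()      # held while readers or a writer are active

    def acquire_read(self):
        with self._count_lock:
            self._readers += 1
            if self._readers == 1:
                self._writer.acquire()       # first reader blocks writers

    def release_read(self):
        with self._count_lock:
            self._readers -= 1
            if self._readers == 0:
                self._writer.release()       # last reader lets writers in

    def acquire_write(self):
        self._writer.acquire()

    def release_write(self):
        self._writer.release()

rw = RWLock()
results = []

def reader(name):
    rw.acquire_read()                        # shared: readers do not block each other
    results.append(name)
    rw.release_read()

threads = [threading.Thread(target=reader, args=(f"r{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

rw.acquire_write()                           # exclusive: would block while any reader is active
results.append("writer")
rw.release_write()

print(results)  # three reader entries, then 'writer'
```

Real engines add fairness policies (for example, preventing writer starvation), which this sketch omits.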
2.2 Lock Granularity
One way to improve the concurrency of a shared resource is to make locking more selective: try to lock only the portion of the data being modified, ideally locking exactly the pieces being changed. The smaller the amount of data locked, the higher the concurrency.
Locking itself also consumes resources: acquiring a lock, checking whether a lock has been released, and releasing a lock all have overhead.
A lock strategy is a balance between lock overhead and data safety, and this trade-off also affects performance.
MySQL offers multiple choices here: each storage engine can implement its own lock strategy and lock granularity. Two important lock strategies are described below:
1. Table locks
The most basic MySQL locking strategy, with the lowest overhead: the entire table is locked at once.
2. Row-level locks
Row-level locks support concurrency to the greatest extent but also carry the greatest locking overhead. They are implemented only in the storage engine layer, not in the MySQL server layer; the server layer is completely unaware of how the storage engine implements its locks.
Each storage engine implements its own locking mechanism in its own way.
3. Transactions
The ACID properties
Atomicity, Consistency, Isolation, and Durability.
1. Isolation levels
Read uncommitted: a transaction may read data that other transactions have not yet committed; reading such data is called a dirty read.
Read committed: a transaction's changes are invisible to other transactions until it commits. This solves the dirty read problem.
Repeatable read: solves the non-repeatable read problem. This level guarantees that reading the same records multiple times within one transaction yields consistent results. In theory, however, it cannot prevent phantom reads: while one transaction is reading records in some range, another transaction inserts a new record into that range, and a phantom row appears when the first transaction reads the range again. MySQL's InnoDB and XtraDB storage engines solve the phantom read problem through multi-version concurrency control (MVCC).
Serializable: transactions execute serially. This is the highest isolation level; it suffers from heavy timeouts and lock contention and is rarely used.
2. Deadlocks
A deadlock occurs when two or more transactions hold locks on the same resources and each requests locks that the other already holds, creating a circular dependency. Deadlocks can arise when transactions try to lock resources in different orders, or when multiple transactions lock the same resources at the same time.
For example, consider the following two transactions running concurrently:
Transaction 1:
START TRANSACTION;
UPDATE goods SET price = 20 WHERE goods_id = 4 AND date = '2017-04-22';
UPDATE goods SET price = 30 WHERE goods_id = 3 AND date = '2017-04-22';
COMMIT;
Transaction 2:
START TRANSACTION;
UPDATE goods SET price = 40 WHERE goods_id = 3 AND date = '2017-04-22';
UPDATE goods SET price = 50 WHERE goods_id = 4 AND date = '2017-04-22';
COMMIT;
If each transaction executes its first UPDATE and then attempts its second, neither can proceed: each is waiting for a row lock the other holds.
To deal with this, database systems implement various deadlock detection and deadlock timeout mechanisms. More sophisticated systems, such as the InnoDB storage engine, can detect circular dependencies and return an error immediately.
Another approach is to give up the lock request once a query exceeds the lock wait timeout. InnoDB currently handles deadlocks by rolling back the transaction that holds the fewest row-level exclusive locks.
Lock behavior and lock order depend on the storage engine: with statements executed in the same order, some storage engines will deadlock and some will not. Deadlocks have two causes: genuine data conflicts, and the way the storage engine is implemented.
3. Transaction logs
Transaction logs improve transaction efficiency. If a modification has been recorded in the transaction log and persisted, but the data itself has not yet been written back to disk when the system crashes, the storage engine can automatically recover the modified data when it restarts. The exact recovery method depends on the storage engine.
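The recovery idea behind a transaction log can be sketched as a toy write-ahead log. This is illustrative only (not InnoDB's actual log format): changes are appended and synced to the log before the data file would be updated, so a crash loses nothing that was logged:

```python
import json
import os
import tempfile

# Toy write-ahead log: each change is appended and fsynced to the log
# file BEFORE the in-memory "table" would ever be flushed to a data file.
log_path = os.path.join(tempfile.mkdtemp(), "redo.log")

def log_change(key, value):
    with open(log_path, "a") as f:
        f.write(json.dumps({"key": key, "value": value}) + "\n")
        f.flush()
        os.fsync(f.fileno())  # persist the log record before acknowledging

def replay():
    # Crash recovery: rebuild the table state by replaying the log in order.
    table = {}
    with open(log_path) as f:
        for line in f:
            rec = json.loads(line)
            table[rec["key"]] = rec["value"]
    return table

table = {}
for k, v in [("a", 1), ("b", 2), ("a", 3)]:
    log_change(k, v)   # sequential append: fast
    table[k] = v       # the data-file write could happen much later

# Simulate a crash: the in-memory state is lost before being written back.
del table

recovered = replay()
print(recovered)  # {'a': 3, 'b': 2}
```

The efficiency win the text describes comes from the log being a sequential append, which is much cheaper than the random I/O of updating data files in place.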
4. Transactions in MySQL
Autocommit
MySQL uses autocommit mode by default. You can enable or disable autocommit by setting the autocommit variable:
SHOW VARIABLES LIKE 'autocommit';
A value of 1 or ON means enabled; 0 or OFF means disabled.
MySQL can set the isolation level with the SET TRANSACTION ISOLATION LEVEL command; the new level takes effect at the start of the next transaction.
Example: SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
Mixing storage engines within a transaction
If a transaction uses both transactional and non-transactional tables (for example, InnoDB and MyISAM), a normal commit causes no problems, but if the transaction is rolled back, the changes to the non-transactional table cannot be undone, leaving the database in an inconsistent state.
Implicit and explicit locking
InnoDB can acquire locks at any point during transaction execution, but the locks are released only on COMMIT or ROLLBACK, and all locks are released at the same time. This is implicit locking: InnoDB locks automatically as required by the isolation level.
In addition, InnoDB supports explicit locking through specific statements:
SELECT ... LOCK IN SHARE MODE; -- shared lock
SELECT ... FOR UPDATE; -- exclusive lock
MySQL also supports the LOCK TABLES and UNLOCK TABLES statements; these are implemented in the server layer and have nothing to do with the storage engine.
5. Multi-version Concurrency Control
Most of MySQL's transactional storage engines do not implement simple row-level locking. To improve concurrency, they generally implement multi-version concurrency control (MVCC) as well, as do Oracle and PostgreSQL, though the implementations differ.
MVCC can be thought of as a variant of row-level locking: it avoids locking operations in many cases and therefore has lower overhead. Although the mechanisms vary, most implementations provide non-blocking reads, while writes lock only the rows that are necessary.
MVCC works by keeping a snapshot of the data as of a certain point in time. That is, no matter how long a transaction runs, the data it sees is consistent.
MVCC comes in optimistic and pessimistic concurrency control variants. The following describes how InnoDB's implementation works:
InnoDB implements MVCC by storing two hidden columns with each row: one holds the row's creation time and the other its expiration (deletion) time. Rather than actual timestamps, each column stores a system version number. The system version number is incremented every time a new transaction starts, and the version number at the start of a transaction serves as that transaction's version number, which is compared against the version numbers of each row it touches. Under the repeatable read isolation level, MVCC operations work as follows:
SELECT
InnoDB checks each row against two conditions:
a. InnoDB returns only rows whose creation version number is no later than the current transaction's version number. This ensures the rows a transaction reads either existed before the transaction started or were inserted or modified by the transaction itself.
b. The row's deletion version number must be either undefined or greater than the current transaction's version number. This ensures the rows a transaction reads were not deleted before the transaction started.
Only rows satisfying both conditions are returned.
INSERT
InnoDB records the current system version number as the row version number of each newly inserted row.
DELETE
InnoDB records the current system version number as the deletion identifier of each deleted row.
UPDATE
InnoDB inserts a new row stamped with the current system version number as its row version number, and records the current system version number in the original row as its deletion identifier.
Storing these two extra version numbers means most read operations never need to acquire locks: data access is simple and fast, and only rows that satisfy the visibility rules are read. The drawbacks are that every row needs extra storage space, and more row checks and some additional maintenance work are required.
MVCC works only at the REPEATABLE READ and READ COMMITTED isolation levels.
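The two SELECT visibility rules above can be sketched as a toy model. This is illustrative only; real InnoDB stores transaction IDs and rollback pointers rather than plain integers:

```python
from dataclasses import dataclass
from typing import Optional

# Each row carries a creation version and an optional deletion version,
# mirroring InnoDB's two hidden columns.
@dataclass
class Row:
    value: str
    created: int                   # system version when the row was inserted
    deleted: Optional[int] = None  # system version when the row was deleted

def visible(row: Row, txn_version: int) -> bool:
    # Rule a: the row was created at or before this transaction's version.
    # Rule b: the row is undeleted, or was deleted after this version.
    return row.created <= txn_version and (
        row.deleted is None or row.deleted > txn_version
    )

rows = [
    Row("old", created=1),               # existed before all transactions below
    Row("gone", created=1, deleted=3),   # deleted at system version 3
    Row("new", created=5),               # inserted at system version 5
]

seen_by_2 = [r.value for r in rows if visible(r, 2)]  # delete at 3 not yet visible
seen_by_4 = [r.value for r in rows if visible(r, 4)]  # sees the delete, not the insert
seen_by_5 = [r.value for r in rows if visible(r, 5)]  # sees the insert at version 5

print(seen_by_2, seen_by_4, seen_by_5)
```

Note how a transaction at version 2 still sees the row deleted at version 3, which is exactly the snapshot consistency the text describes.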
6. MySQL Storage Engines
6.1 InnoDB Storage Engine
InnoDB stores its data in tablespaces. A tablespace is a black box managed by InnoDB, made up of a series of data files. Since MySQL 4.1, InnoDB can store each table's data and indexes in separate files.
InnoDB uses MVCC to support high concurrency. Its default isolation level is REPEATABLE READ, and it uses a next-key locking strategy to prevent phantom reads: InnoDB locks not only the rows involved in a query but also gaps in the index, preventing phantom rows from being inserted.
InnoDB tables are built on clustered indexes, which makes primary key lookups very fast. However, its secondary indexes (non-primary-key indexes) must contain the primary key columns, so if the primary key is large, all the other indexes will be large as well. Therefore, on a table with many indexes, the primary key should be kept as small as possible.
6.2 MyISAM Storage Engine
MyISAM stores each table in two files: a data file and an index file. The number of rows that can be stored is generally limited by available disk space or the operating system's maximum size for a single file.
Features:
1. Locking and concurrency
MyISAM locks the entire table: a read takes a shared lock on all the tables being read, and a write takes an exclusive lock on the table.
2. Repair
MySQL can check and repair MyISAM tables manually or automatically, but this is different from transactional recovery or crash recovery. Repairing a table may cause some data loss, and the repair operation is very slow.
3. Index features
Even long fields such as BLOB and TEXT can be indexed, based on their first 500 characters. Full-text indexes, which are built on word segmentation, are also supported.
Large data volumes
InnoDB systems holding 3-5 TB or more of data on a single, unsharded machine can run well. If the data volume keeps growing toward 10 TB or beyond, you may need to build a data warehouse instead; Infobright is the most successful MySQL data warehouse solution.
Converting a table's storage engine:
1. ALTER TABLE mytable ENGINE = InnoDB; this takes a long time to execute and is I/O-intensive. It also locks the original table, and all features specific to the original engine are lost. For example, converting an InnoDB table to MyISAM loses all the foreign keys defined on the InnoDB table.
2. Export and Import
Use the mysqldump tool to export the data to a file, edit the storage engine option of the CREATE TABLE statement in the file, and change the table name at the same time, because the same database cannot contain two tables with the same name, even under different storage engines.
3. Create and select
This approach balances the efficiency and safety of the previous two methods. Instead of exporting the entire table's data, first create a new table with the target storage engine, then populate it with the INSERT ... SELECT syntax:
CREATE TABLE innodb_table LIKE myisam_table;
ALTER TABLE innodb_table ENGINE = InnoDB;
INSERT INTO innodb_table SELECT * FROM myisam_table;
For large data volumes, process the copy in batches:
START TRANSACTION;
INSERT INTO innodb_table SELECT * FROM myisam_table WHERE id BETWEEN x AND y;
COMMIT;
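When driving the batched copy from a script, the id ranges can be generated as below. This is a sketch: the table and column names follow the example above, and the batch size is an arbitrary illustrative choice.

```python
# Split an id range into fixed-size chunks so that each
# INSERT ... SELECT runs inside its own small transaction.
def batch_statements(min_id, max_id, batch_size):
    stmts = []
    for start in range(min_id, max_id + 1, batch_size):
        end = min(start + batch_size - 1, max_id)
        stmts.append(
            "START TRANSACTION; "
            f"INSERT INTO innodb_table SELECT * FROM myisam_table "
            f"WHERE id BETWEEN {start} AND {end}; COMMIT;"
        )
    return stmts

for stmt in batch_statements(1, 25, 10):
    print(stmt)
```

Small per-batch transactions keep undo log growth and lock hold times bounded while the copy runs.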