MySQL Basics: Getting Started with the MySQL Architecture


MySQL Architecture:
MySQL is composed of: connection pool components, management services and utility components, the SQL interface, the query parser, the query optimizer, buffer and cache components, pluggable storage engines, and the physical files.
MySQL has a unique pluggable storage engine architecture, and each storage engine has its own characteristics.

Overview of MySQL storage engines:
InnoDB storage engine: designed for OLTP (online transaction processing). Row-level locking, foreign key support, non-locking reads. The default isolation level is REPEATABLE READ, where phantom reads are avoided with the next-key locking strategy; other features include the insert buffer, doublewrite, the adaptive hash index, and read-ahead.
MyISAM storage engine: no transaction support, table-level locking, full-text indexing; suitable for OLAP (online analytical processing). The .MYD file holds the data and the .MYI file holds the indexes.
NDB storage engine: a share-nothing cluster storage engine that improves availability.
Memory storage engine: data is stored in RAM; table-level locking and poor concurrency; hash indexes are used by default.
Archive storage engine: supports only INSERT and SELECT; compresses data with the zlib algorithm (roughly 1:10), making it suitable for archived data such as logs; row-level locking.
Maria storage engine: intended to replace MyISAM; caches data and indexes, row-level locking, MVCC.

InnoDB Features:
Main architecture: by default there are 7 background threads: 4 I/O threads (insert buffer, log, read, write), 1 master thread (highest priority), 1 lock monitoring thread, and 1 error monitoring thread. They can be observed with SHOW ENGINE INNODB STATUS. In newer versions the default number of read threads and write threads has grown to 4 each, which can be checked with SHOW VARIABLES LIKE 'innodb_%io_threads'.
The storage engine's memory consists of a buffer pool, a redo log buffer, and an additional memory pool. Their sizes can be viewed with SHOW VARIABLES LIKE 'innodb_buffer_pool_size', SHOW VARIABLES LIKE 'innodb_log_buffer_size', and SHOW VARIABLES LIKE 'innodb_additional_mem_pool_size'.
Buffer pool: the largest block of memory, used to cache all kinds of data: index pages, data pages, undo pages, the insert buffer, the adaptive hash index, InnoDB lock information, data dictionary information, and so on. InnoDB always reads database files into the buffer pool page by page (16 KB per page) and then keeps cached pages in the pool according to a least recently used (LRU) algorithm. When a database file needs to be modified, the page is always modified in the buffer pool first (making it a dirty page), and dirty pages are then flushed to the file at a certain frequency. The pool can be inspected with SHOW ENGINE INNODB STATUS.
Log buffer: redo log records are first placed in this buffer and then flushed to the redo log files at a certain frequency.
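
A minimal sketch of inspecting these memory areas from the mysql client (note that innodb_additional_mem_pool_size exists only in older versions; it was removed in MySQL 5.7):

    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
    SHOW VARIABLES LIKE 'innodb_log_buffer_size';
    SHOW VARIABLES LIKE 'innodb_additional_mem_pool_size';
    -- The BUFFER POOL AND MEMORY section of the status output shows usage:
    SHOW ENGINE INNODB STATUS\G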

Master thread:
The main loop runs the following operations once per second:
Flush the log buffer to disk, even if the transaction has not yet committed (always executed; this is why even large transactions commit quickly).
Merge the insert buffer (executed only if the number of I/O operations in the last second was fewer than 5).
Flush up to 100 dirty pages from the buffer pool to disk (executed when the proportion of dirty pages in the buffer pool exceeds the threshold set in the configuration file by innodb_max_dirty_pages_pct; the default is 90, 75 in newer versions, and Google suggests 80).
If there is currently no user activity, switch to the background loop.

The main loop runs the following operations every 10 seconds:
Flush 100 dirty pages to disk (executed only if there were fewer than 200 I/O operations in the last 10 seconds).
Merge at most 5 insert buffers (always).
Flush the log buffer to disk (always).
Delete useless undo pages (always).
Flush 100 or 10 dirty pages to disk (if more than 70% of pages are dirty, flush 100 dirty pages; otherwise flush 10).
Produce a checkpoint.

Background loop: if there is currently no user activity (the database is idle) or the database is shutting down, the master thread switches to this loop:
Delete useless undo pages (always).
Merge 20 insert buffers (always).
Jump back to the main loop (always).
Continuously flush 100 pages until the condition is met (this may be completed in the flush loop).

If there is nothing to do in the flush loop, the InnoDB storage engine switches to the suspend loop, suspending the master thread until an event occurs. If the InnoDB storage engine is enabled but no table uses it, the master thread is always in this suspended state.

Insert buffer: not a part of the buffer pool but a component of physical pages, the insert buffer improves InnoDB performance. Because of the characteristics of the B+ tree (discussed below), inserts on the primary key index are sequential and do not cause random database reads, but for nonclustered indexes (that is, secondary indexes) leaf node inserts are no longer sequential, so the index pages must be accessed discretely and insert performance degrades. InnoDB therefore introduces the insert buffer: it first determines whether the nonclustered index page is in the buffer pool; if so, it inserts directly; if not, the insert is first placed in the insert buffer. Then, as described for the master thread above, insert buffer merges are performed at a certain frequency. One restriction: the secondary index cannot be unique, because inserting into the buffer does not look up the index page; otherwise random reads would still occur and the insert buffer would lose its meaning. The insert buffer may take up memory in the buffer pool, by default up to 1/2, so this value can be turned down to 1/3. It is set by IBUF_POOL_SIZE_PER_MAX_SIZE, where 2 means 1/2 and 3 means 1/3.
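
Insert buffer activity can be observed in the engine status output; a minimal sketch:

    -- The INSERT BUFFER AND ADAPTIVE HASH INDEX section of the output
    -- reports the insert buffer's size, free list length, and merge counts:
    SHOW ENGINE INNODB STATUS\G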

Doublewrite: it brings reliability to InnoDB data. If a write fails, the page can be recovered through the redo log, but the redo log records the physical operations on a page; if the page itself is already corrupted, replaying the redo log is meaningless. Therefore, before the redo log is applied, a copy of the page is needed, so that when a write failure occurs, the page is first restored from its copy and the redo log is then replayed. This is doublewrite.
Recovered data = page copy + redo log

Adaptive hash index: the InnoDB storage engine provides an adaptive hash index. It monitors lookups on the table's indexes, and if it observes that building a hash index would yield a speed improvement, it builds one; hence "adaptive". The adaptive hash index can only be used for equality lookups, such as SELECT * FROM table WHERE index_col = 'xxx'. The adaptive hashing is controlled by the InnoDB storage engine itself; we can only disable or enable it through innodb_adaptive_hash_index, which is on by default.
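
A minimal sketch of checking and toggling the feature (requires a privileged account):

    SHOW VARIABLES LIKE 'innodb_adaptive_hash_index';
    SET GLOBAL innodb_adaptive_hash_index = OFF;  -- disable
    SET GLOBAL innodb_adaptive_hash_index = ON;   -- re-enable (the default)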

MySQL files
Parameter file: tells the MySQL instance where to find the database files when it starts, and specifies initialization parameters that define settings such as the sizes of memory structures. The file is stored as editable text, and (unlike some other databases) the instance cannot start successfully if the file cannot be loaded at startup. Parameters are divided into dynamic and static ones: static parameters are effectively read-only, while dynamic ones can be set at runtime. If we look up a key and its value with SHOW VARIABLES LIKE 'xxx', a dynamic parameter can be changed directly with SET key = value. Modifications also have scope, session-level or global; prefix the key with session or global accordingly, for example SELECT @@session.read_buffer_size or SET @@global.read_buffer_size.
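
A minimal sketch of reading and setting a dynamic parameter at both scopes (the value is illustrative):

    SHOW VARIABLES LIKE 'read_buffer_size';
    SELECT @@session.read_buffer_size;        -- value for the current session
    SELECT @@global.read_buffer_size;         -- server-wide default
    SET @@session.read_buffer_size = 262144;  -- affects only this session
    SET @@global.read_buffer_size = 262144;   -- affects new sessions; needs SUPER
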
Log files: files written to record how the MySQL instance responds to certain conditions, such as the error log, binary log, slow query log, and query log.
Error log: view the error log location with SHOW VARIABLES LIKE 'log_error'.
Slow query log: use SHOW VARIABLES LIKE '%long%' to view the threshold above which queries are logged as slow (set to 0.05 in newer setups); SHOW VARIABLES LIKE 'log_slow_queries' to see whether the log is enabled (off by default); and SHOW VARIABLES LIKE 'log_queries_not_using_indexes' to see whether queries that use no index are logged to the slow log. MySQL also ships the mysqldumpslow command for viewing slow logs directly.
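
A minimal sketch (variable names follow the older versions this article describes; newer versions use slow_query_log, and the log path is illustrative):

    SHOW VARIABLES LIKE 'long_query_time';
    SHOW VARIABLES LIKE 'log_slow_queries';
    SHOW VARIABLES LIKE 'log_queries_not_using_indexes';
    -- From the shell, summarize the slow log:
    --   mysqldumpslow /var/lib/mysql/hostname-slow.log
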
Binary log: does not record queries, only all modification operations on the database. Its purposes are recovery (point-in-time recovery) and replication. View the storage path with SHOW VARIABLES LIKE 'datadir'. The binary log supports three formats, STATEMENT, ROW, and MIXED, set through the binlog_format parameter. It is usually set to ROW, which gives better reliability for database recovery and replication, but increases the size of the binary files and adds network overhead during replication. The contents of a binary log file can be viewed with the mysqlbinlog tool shipped with MySQL.
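
A minimal sketch of inspecting the binary log (the file name is illustrative):

    SHOW VARIABLES LIKE 'log_bin';         -- is binary logging enabled?
    SHOW VARIABLES LIKE 'binlog_format';   -- STATEMENT, ROW, or MIXED
    SHOW BINARY LOGS;                      -- list the current binary log files
    -- From the shell, decode a binlog file:
    --   mysqlbinlog mysql-bin.000001
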
Socket file: The file you need when you connect using a UNIX domain socket.
PID file: The process ID file for the MySQL instance.
MySQL table structure files: store the MySQL table structure definitions. Because of MySQL's pluggable storage engine architecture, every table has its own definition file, ending with the .frm suffix.
Storage engine files: each storage engine has its own files to hold its data; these are what actually store the data, indexes, and so on. The following describes the tablespace files and redo log files of the InnoDB storage engine.
Tablespace files: InnoDB's default tablespace file is ibdata1. Use SHOW VARIABLES LIKE 'innodb_file_per_table' to check whether each table gets its own .ibd tablespace file. Note that a per-table tablespace file stores only that table's data, indexes, insert buffer, and similar information; the remaining information is still stored in the default tablespace.

Redo log files: on instance or media failure, redo log files come in handy. For example, if the database loses power, the InnoDB storage engine uses the redo logs to restore the data to the moment before the power loss, ensuring data integrity. The parameter innodb_log_file_size specifies the size of each redo log file; innodb_log_files_in_group specifies the number of redo log files in a log file group, defaulting to 2; innodb_mirrored_log_groups specifies the number of mirrored log file groups, defaulting to 1, meaning a single log file group with no mirroring; and innodb_log_group_home_dir specifies the path of the log file group, defaulting to the database path.
Differences between the binary log and the redo log: first, the binary log records all MySQL-level changes, including those of other storage engines such as InnoDB, MyISAM, and HEAP, while the InnoDB redo log stores only the transaction log of InnoDB itself. Second, the content differs: whatever its format (STATEMENT, ROW, or MIXED), the binary log records the logical operations of a transaction, while the InnoDB redo log files record the physical changes to each page. Third, the write timing differs: the binary log is written once, when the transaction commits, while redo log entries are continuously written to the redo log files as the transaction proceeds.

MySQL InnoDB tables
Tablespace: the tablespace can be seen as the highest level of the InnoDB storage engine's logical structure.
Segment: a tablespace consists of segments; common segments include data segments, index segments, and rollback segments.
Extent: consists of 64 contiguous pages; each page is 16 KB, so each extent is 1 MB.
Page: 16 KB per page, and this cannot be changed. Common page types are: data pages, undo pages, system pages, transaction system pages, insert buffer bitmap pages, insert buffer free list pages, uncompressed binary large object (BLOB) pages, and compressed BLOB pages.
Row: the InnoDB storage engine is row-oriented; each page holds at most 7992 rows of data.
Row record formats: the two common row record formats are compact and redundant; since MySQL 5.1, compact is the main row record format. Under compact, NULL values occupy no storage space whether the column is CHAR or VARCHAR; under redundant, a NULL VARCHAR occupies no space but a NULL CHAR does occupy storage space.
The length limit of the VARCHAR type is 65535 bytes, but in practice this cannot be reached; there is other overhead, so the usable limit is usually around 65530, and it also depends on the chosen character set. Moreover, this limit applies to the whole row. For example, CREATE TABLE test (a VARCHAR(22000), b VARCHAR(22000), c VARCHAR(22000)) CHARSET=latin1 ENGINE=InnoDB also produces an error.
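
The failing example as a runnable sketch (the exact error text varies by version):

    CREATE TABLE test (
        a VARCHAR(22000),
        b VARCHAR(22000),
        c VARCHAR(22000)
    ) CHARSET=latin1 ENGINE=InnoDB;
    -- ERROR 1118 (42000): Row size too large. The maximum row size for the
    -- used table type, not counting BLOBs, is 65535. ...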

For BLOB-type data, the data page saves only the first 768 bytes of prefix data (as with VARCHAR(65535)), followed by an offset pointing to the row-overflow page, namely the uncompressed BLOB page. The newer InnoDB Plugin introduces a file format called Barracuda with two new row record formats, compressed and dynamic. Both fully off-load BLOB fields: the data page stores only a 20-byte pointer, and the actual data is stored in BLOB pages.

Data page structure: an InnoDB data page consists of the following 7 parts:
File header: records some header information of the page, such as page offset, previous page, next page, and page type; fixed length of 38 bytes.
Page header: records the page's status information, such as the number of records in the heap, the pointer to the free list, the number of bytes of deleted records, and the position of the last insert; fixed length of 56 bytes.
Infimum + supremum records: in the InnoDB storage engine, every data page has two virtual row records used to delimit the boundaries of the records. The infimum record has a value lower than any primary key value in the page, and the supremum record has a value higher than any possible value. These records are created when the page is created and are never deleted. Under the compact and redundant row formats, the two records occupy different numbers of bytes.

User records: the actual row records stored in the page. Once again, InnoDB tables are always organized as B+ tree indexes.
Free space: the unused space in the page, organized as a linked list. When a record is deleted, its space is added to the free list.
Page directory: the page directory holds the relative positions of records (not offsets); these entries are sometimes called slots. InnoDB does not give every record its own slot; the slots form a sparse directory, that is, one slot may cover several records, at least 4 and at most 8. Keep in mind that the B+ tree index itself cannot locate a specific record; it can only find the page on which the record resides. The database loads the page into memory and then performs a binary search through the page directory. The time complexity of binary search is low and in-memory search is fast, so the time spent on this part of the lookup can usually be ignored.
File trailer: fixed length of 8 bytes, used to verify that the page was written to disk completely (guarding against, for example, disk corruption or a machine crash during the write).

Views: views in MySQL are always virtual tables; materialized views are not natively supported. However, with a few tricks (such as triggers), simple materialized-view functionality can be implemented.

Partitioning: the MySQL database supports RANGE, LIST, HASH, KEY, and COLUMNS partitioning, and can subpartition with HASH or KEY.
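
A minimal sketch of a RANGE-partitioned table with HASH subpartitions (the table and columns are hypothetical):

    CREATE TABLE sales (
        id INT NOT NULL,
        sale_date DATE NOT NULL
    ) ENGINE=InnoDB
    PARTITION BY RANGE (YEAR(sale_date))
    SUBPARTITION BY HASH (id) SUBPARTITIONS 2 (
        PARTITION p2011 VALUES LESS THAN (2012),
        PARTITION pmax VALUES LESS THAN MAXVALUE
    );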

Common MySQL InnoDB indexes and algorithms:
B+ tree index: the B+ tree data structure is relatively complex. The B stands for "balance": it evolved from the balanced binary tree, but a B+ tree is not a binary tree. A more detailed introduction can be found in this article: http://blog.csdn.net/v_JULY_v/article/details/6530142. Because of the high fan-out of a B+ tree index, the height of a B+ tree in a database is generally two or three levels, so finding the row record for a given key value takes at most 2 to 3 I/Os. A typical disk can do at least 100 I/Os per second, so 2 to 3 I/Os means the query takes only 0.02 to 0.03 seconds.
B+ tree indexes in the database can be divided into clustered indexes and secondary indexes; internally both are B+ trees: height-balanced, with the data held in the leaf nodes.
Clustered index: because the clustered index is organized by the primary key, each table can have only one clustered index. The data pages are connected by a doubly linked list, and each leaf node holds an entire row, so the query optimizer tends to prefer the clustered index. Additionally, the storage of the clustered index is logically contiguous. The clustered index is therefore very fast for primary key sorts and range lookups.
Secondary index: also called a nonclustered index. Its leaf nodes do not hold the full row data; they mainly store the key value plus a bookmark (in InnoDB, the clustered index key) telling InnoDB where to find the row corresponding to the index entry. For example, with a secondary index of height 3 and a clustered index of height 3, querying a row record through the secondary index takes a total of 6 I/Os. A table can also have multiple secondary indexes.

Index usage principle: indexes pay off on highly selective columns, where queries fetch a small portion of the table's rows (the extreme case is a unique index). Generally, if the amount of data fetched exceeds about 20% of the table, the optimizer will not use the index and will perform a full table scan instead. Indexing a field such as gender is therefore meaningless.
Composite indexes: also called multi-column indexes, they are indexes built on multiple columns (two or more). A composite index in InnoDB is also a B+ tree. The indexed data consists of multiple columns (col1, col2, col3, ...), sorted in the index by col1, then col2, then col3, for example (1, 2), (1, 3), (2, 0), ... Using a composite index relies on the leftmost-prefix principle, which, as the name implies, gives the leftmost columns priority. For example, with CREATE INDEX ind_col1_col2 ON t (col1, col2), queries with WHERE col1 = xxx AND col2 = xxx, or with WHERE col1 = xxx, can use the ind_col1_col2 index, but WHERE col2 = xxx cannot. When creating a multi-column index, put the column that is used most often and filters best in the WHERE clause on the leftmost side, subject to the needs of the business. A sketch follows below.
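
A minimal sketch of the leftmost-prefix rule (the table is hypothetical):

    CREATE TABLE t (col1 INT, col2 INT, col3 INT) ENGINE=InnoDB;
    CREATE INDEX ind_col1_col2 ON t (col1, col2);

    EXPLAIN SELECT * FROM t WHERE col1 = 1 AND col2 = 2;  -- can use ind_col1_col2
    EXPLAIN SELECT * FROM t WHERE col1 = 1;               -- can use ind_col1_col2
    EXPLAIN SELECT * FROM t WHERE col2 = 2;               -- cannot: col2 alone is not a leftmost prefix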

Hash index: hashing is a common technique; MySQL InnoDB uses the common chaining (linked-list) method to resolve hash collisions. As mentioned above, hashing in InnoDB is adaptive: when to use it is decided by the system and cannot be set manually.
Binary search: this algorithm is common enough that little needs to be said here. In InnoDB, the slots in the page directory are stored in primary key order, so a specific record is found by binary search over the page directory.

The lock in MySQL InnoDB
The InnoDB storage engine's lock implementation is very similar to Oracle's: it provides consistent non-locking reads and row-level lock support, and row-level locks come with no associated overhead, achieving concurrency and consistency at the same time.
The InnoDB storage engine implements the following two standard row-level locks:
Shared Lock (S Lock): Allows a transaction to read a row of data;
Exclusive lock (X Lock): Allows a transaction to delete or update a row of data.
If a transaction has acquired a shared lock on row r, another transaction can immediately acquire a shared lock on row r as well, because reading does not change row r's data; this is called lock compatibility. However, if a transaction wants an exclusive lock on row r, it must wait for the other transactions to release their shared locks on row r; this is called lock incompatibility.

Before the InnoDB Plugin, you could only view the current database requests through commands such as SHOW FULL PROCESSLIST and SHOW ENGINE INNODB STATUS, and then infer the locks held by the current transactions. The newer InnoDB Plugin adds the tables INNODB_TRX, INNODB_LOCKS, and INNODB_LOCK_WAITS under the INFORMATION_SCHEMA schema. With these three tables, you can more easily monitor current transactions and analyze possible lock problems.
The INNODB_TRX table consists of 8 fields:
trx_id: the unique transaction ID inside the InnoDB storage engine.
trx_state: the state of the current transaction.
trx_started: the start time of the transaction.
trx_requested_lock_id: the ID of the lock the transaction is waiting for. If trx_state is LOCK WAIT, this value is the ID of the lock resource the transaction is currently waiting to acquire; otherwise it is NULL.
trx_wait_started: the time at which the transaction began waiting.
trx_weight: the weight of the transaction, reflecting the number of rows it has modified and locked. When a deadlock must be resolved by rolling back, the InnoDB storage engine chooses the transaction with the smallest weight to roll back.
trx_mysql_thread_id: the MySQL thread ID, as displayed in the results of SHOW PROCESSLIST.
trx_query: the SQL statement the transaction is running.
View it with: SELECT * FROM information_schema.INNODB_TRX;

The INNODB_LOCKS table consists of the following fields:
lock_id: the ID of the lock.
lock_trx_id: the transaction ID.
lock_mode: the mode of the lock.
lock_type: the type of the lock: table lock or row lock.
lock_table: the table being locked.
lock_index: the index being locked.
lock_space: the ID of the InnoDB storage engine tablespace.
lock_page: the number of the locked page; NULL if the lock is a table lock.
lock_rec: the number of the locked row; NULL if the lock is a table lock.
lock_data: the primary key value of the locked row; NULL if the lock is a table lock.
View it with: SELECT * FROM information_schema.INNODB_LOCKS;

The INNODB_LOCK_WAITS table consists of 4 fields:
requesting_trx_id: the transaction ID of the transaction requesting the lock resource.
requested_lock_id: the ID of the lock being waited for.
blocking_trx_id: the ID of the blocking transaction.
blocking_lock_id: the ID of the lock held by the blocking transaction.
View it with: SELECT * FROM information_schema.INNODB_LOCK_WAITS;
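
A sketch joining the three tables to show which transaction blocks which (this uses the InnoDB Plugin / MySQL 5.x versions of the tables; MySQL 8.0 replaced the lock tables):

    SELECT r.trx_id              AS waiting_trx,
           r.trx_mysql_thread_id AS waiting_thread,
           r.trx_query           AS waiting_query,
           b.trx_id              AS blocking_trx,
           b.trx_mysql_thread_id AS blocking_thread,
           b.trx_query           AS blocking_query
    FROM information_schema.INNODB_LOCK_WAITS w
    JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id
    JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id;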

Consistent non-locking read: the InnoDB storage engine uses multi-versioning to read the data of a row as of the current point in time. If the row being read is under a DELETE or UPDATE operation, the read does not wait for the row lock to be released; instead, the InnoDB storage engine reads a snapshot of the row. The snapshot is the data of a previous version of the row, implemented through the undo segment. Since undo is needed anyway to roll back data within a transaction, the snapshot itself carries no additional overhead. Moreover, reading a snapshot requires no locking, because no one needs to modify historical data. A row may have more than one snapshot version, so this technique is called row multi-versioning, and the concurrency control it brings is called multi-version concurrency control (MVCC).
Transaction isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE. Under READ COMMITTED and REPEATABLE READ, the InnoDB storage engine uses non-locking consistent reads, but the snapshot is defined differently. Under the READ COMMITTED isolation level, the non-locking read always reads the latest committed snapshot of the locked row. Under the REPEATABLE READ isolation level, the non-locking read always reads the version of the row as of the start of the transaction.

Lock algorithms:
Record lock: a lock on a single row record.
Gap lock: a lock on a range that does not include the record itself.
Next-key lock: gap lock + record lock; it locks a range and also locks the record itself. A more detailed introduction can be found in this blog post: http://www.db110.com/?p=1848

Lock problems:
Lost updates: a classic database problem that occurs when two or more transactions select the same row and then update it based on the originally selected value. Each transaction is unaware of the others' existence, so the last update overwrites the updates made by the other transactions, and data is lost.
Example:
Transaction A and transaction B modify the value of the same row at about the same time.
1. Transaction A changes the value to 1 and commits.
2. Transaction B changes the value to 2 and commits.
The value is now 2, and the update made by transaction A has been lost.
Workaround: turn the parallel transactions into serial ones by taking an exclusive lock for the update; a sketch follows below.
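
A minimal sketch of serializing the read-modify-write with an exclusive lock (the counters table is hypothetical):

    -- Both sessions run the same block; whichever acquires the X lock first
    -- forces the other to wait, so neither update is lost.
    START TRANSACTION;
    SELECT value FROM counters WHERE id = 1 FOR UPDATE;  -- takes an X lock on the row
    UPDATE counters SET value = value + 1 WHERE id = 1;
    COMMIT;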

Dirty reads: one transaction reads the uncommitted updates of another transaction, that is, it reads dirty data.
Example:
1. Mary's original salary is 1000; a finance employee changes Mary's salary to 8000 (but does not commit the transaction).
2. Mary reads her own salary, finds it has changed to 8000, and is overjoyed!
3. Finance then discovers the mistake and rolls back the transaction, and Mary's salary returns to 1000. The salary of 8000 that Mary read was dirty data.
Workaround: dirty reads occur only when the transaction isolation level is READ UNCOMMITTED. InnoDB's default isolation level is REPEATABLE READ, so dirty reads do not occur in production environments.

Non-repeatable reads: within the same transaction, the same data is read multiple times with different results. In other words, a later read can see updated data committed by another transaction. "Repeatable read", by contrast, guarantees that reading the same data multiple times within one transaction yields the same result, i.e., later reads cannot see updates committed by other transactions. The key difference from dirty reads: a dirty read sees uncommitted data, while a non-repeatable read sees committed data.
Example:
1. In transaction 1, Mary reads her own salary as 1000; the transaction is not yet finished.
2. In transaction 2, the finance employee changes Mary's salary to 2000 and commits the transaction.
3. In transaction 1, when Mary reads her salary again, it has become 2000.
Workaround: reading committed data is acceptable to most databases, so the transaction isolation level is often set to READ COMMITTED. MySQL InnoDB avoids non-repeatable reads with the next-key locking algorithm, and its default isolation level is REPEATABLE READ. A two-session sketch follows below.
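
A two-session sketch of the scenario above (the employees table is hypothetical):

    -- Session 1:
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    START TRANSACTION;
    SELECT salary FROM employees WHERE name = 'Mary';  -- returns 1000

    -- Session 2, meanwhile:
    --   UPDATE employees SET salary = 2000 WHERE name = 'Mary';
    --   COMMIT;

    -- Session 1, same transaction:
    SELECT salary FROM employees WHERE name = 'Mary';  -- now returns 2000: a non-repeatable read
    COMMIT;
    -- Under REPEATABLE READ, the second SELECT would still return 1000.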


Transactions in MySQL InnoDB
The four properties of a transaction: atomicity, consistency, isolation, durability.
Isolation is achieved through locks; atomicity, consistency, and durability are achieved through the database's redo and undo.
The redo log records the behavior of the transaction; redo guarantees the durability and integrity of the transaction. But sometimes a transaction needs to be undone, and that is where undo comes in. Undo is the opposite of redo: when the database is modified, it generates not only redo but also undo, so that if the transaction or statement fails for some reason, or a ROLLBACK statement requests a rollback, the undo information can be used to roll the data back to what it was before the modification. Unlike redo, which is stored in the redo log files, undo is stored in a special segment inside the database called the undo segment, and undo segments live in the shared tablespace. Also important: undo records the logical operation opposite to the transaction's operation; for example, an INSERT is undone by recording a DELETE. So undo only restores the database logically to what it was before the transaction began. For example, inserting 100,000 rows may grow the tablespace, and the tablespace will not shrink back after a rollback.

Reposted from http://javabkb.iteye.com/blog/1441197
