Mysql database interview summary and mysql database interview

Last Update:2017-05-03 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Mysql database interview summary and mysql database interview
Database optimization table creation optimization 1) database paradigm

L 1NF: It emphasizes the atomicity of the column, that is, the column cannot be further divided into several other columns.

For example, the telephone column can be split-home phone number, company phone number

L 2NF: 1NF is the first and contains two parts. One is that the table must have a primary key, and the other is that the columns not included in the primary key must be completely dependent on the primary key, instead of relying only on a portion of the primary key.

L 3NF: 2NF first. In addition, non-primary key columns must directly depend on the primary key and cannot transmit dependencies.

For example, Student table (Student ID, name, age, gender, school, school address, school phone number)

Such a table structure has the preceding relationship. Student ID --> school --> (School Address, school phone number)

The following table structure should be split.

(Student ID, name, age, gender, school) -- (school, school address, school phone number)

Databases that meet these specifications are concise and have clear structures. At the same time, no insert, delete, or update operation exceptions will occur.

2) Data Type Selection

L numeric type

Float and double options (select float whenever possible)

Separate TINYINT, INT, and BIGINT fields. We recommend that you add the unsigned definition.

Use numeric fields instead of string fields.

L character type

Select char, varchar, and TEXT: Do not use the TEXT data type, set the length of the field, it is recommended to use the CHAR type (fill in space), regardless of the length of the field to use VARCHAR (automatically adapt to the length, beyond the stage), and only set the appropriate maximum length

L time type

Sort DATE (accurate to day), TIMESTAMP, DATETIME (accurate to time) by selected priority)

L ENUM

You can use ENUM to store status fields.

L avoid using NULL fields, it is difficult to query and optimize and occupy additional index space

3) character encoding

The same content uses different character sets to indicate that the space occupied is significantly different. Therefore, using appropriate character sets can help us minimize the amount of data and reduce the number of I/O operations.

1. Content that can be expressed only by latin1 characters

2. UTF-8 is optional for Chinese characters.

3. mySQL data types can be accurate to fields, so when we need to store multi-byte data in large databases, you can use different data types for different fields in different tables to greatly reduce the data storage capacity, reduce the number of IO operations, and increase the cache hit rate.

SQL Optimization

1) only the required data is returned.

A) do not write the SELECT * Statement.

B) write the WHERE clause reasonably. Do not write SQL statements without WHERE.

2) Try to do less repetitive work

Some SQL statements can be merged.

3) create an appropriate index (not the more the better), but the following points will scan the entire table.

A) left fuzzy query '% ...'

B) the unequal operator is used! =

C) Or improper use, or both sides must have an index.

D) In, not in

E) The Where clause performs expression operations on fields.

F) for the created composite index (from the leftmost combination), the columns used by the query conditions must start from the left and cannot be separated. Otherwise, the compound index structure is similar to that of the phone book.

G) full-text index: When a dictionary-based index is created for the file (the file's large full-text index is better than fuzzy match)

Full-text indexes can be created on char, varchar, and text columns.

MySQL 5.6 Innodb engine can also perform full-text indexing

Search Syntax: MATCH (column name 1, column name 2 ,...) AGAINST (search string [search modifier])

If the column type is a string, but a numeric constant is assigned a value to the name of a column in the numeric type during the query, although there is an index in the name column, it is not used.

4) use join instead of subquery

5) use union instead of manual creation of temporary tables

INDEX OPTIMIZATION

I. Create an index. The following situations are not suitable for creating an index:

L too few table records

L frequently inserted, deleted, and modified tables

L duplicate and evenly distributed table fields

Ii. Composite Index

If multiple fields are always displayed when data in a table is queried, these fields can be used as composite indexes.

Index

An index is a structure that sorts the values of one or more columns in a database table.

Advantages:

L greatly speed up data retrieval

L create a unique index to ensure the uniqueness of each row of data in the database table

L accelerates the connection between tables.

Disadvantages:

L The index must occupy physical space.

L when the data in the table is added, deleted, and modified, the index must also be dynamically maintained,

This reduces the Data Maintenance speed.

Index category:

L normal index

Create index zjj_temp_index_1 on zjj_temp_1 (first_name );

Drop index zjj_temp_index_1;

L unique index. The index column value must be unique, but null values are allowed.

Create unique index zjj_temp_1 on zjj_temp_1 (id );

L primary key index, which is a special unique index and does not allow null values.

L Composite Index

Transactions

Database Transaction refers to a series of operations performed as a single logical unit of work, either completely or completely.

Four features:

(1) atomicity

A transaction must be an atomic unit of work. modifications to its data must either be performed in all or not.

(2) Consistency

Transaction consistency means that the database must be consistent before and after a transaction is executed. The result of transaction execution must be that the database changes from one consistent state to another consistent state.

(3) isolation (the transaction isolation database provides multiple isolation levels)

The execution of one transaction cannot interfere with other transactions. That is to say, the operations and data used within a transaction are isolated from other concurrent transactions, and each transaction that is executed concurrently cannot interfere with each other.

(4) Persistence

After the transaction is completed, it changes the data in the database permanently. This modification changes

Keep straight.

Before introducing the various isolation levels provided by the database, let's take a look at the several problems that will occur if the isolation of transactions is not taken into account:

L dirty read

Dirty reading refers to reading data from another uncommitted transaction during the transaction processing process.

L non-repeated read

L phantom read

Phantom read and non-repeated read another committed transaction. The key to non-repeated read is update and delete, while phantom read focuses on insert.

In Repeatable read, after the SQL statement reads data for the first time, it locks the data and other transactions cannot modify the data to achieve repeated read. However, this method cannot lock the insert data. Therefore, when transaction A previously reads data or modifies all the data, transaction B can still commit the insert data, at this time, transaction A will find that there is an additional piece of data that is not available before, which is Phantom read and cannot be avoided through row locks. Serializable isolation level is required. Read locks are used for reading and write locks. Read locks and write locks are mutually exclusive. This can effectively avoid phantom read, non-repeated read, dirty read, and other problems, but it will greatly reduce the concurrency of the database.

Now let's take a look at the four isolation levels provided by the MySQL database:

① Serializable: it can avoid dirty reads, repeated reads, and Phantom reads.

② Repeatable read: This prevents dirty reads and repeated reads.

③ Read committed (Read committed): prevents dirty reads.

④ Read uncommitted (Read uncommitted): the lowest level, which cannot be guaranteed in any situation.

In MySQL databases, the default isolation level is Repeatable read (Repeatable read ).

Lock modes include:

L shared lock: (read) the lock created by the Operation. Other users can read data concurrently, but nothing can get the exclusive lock on the data until all the shared locks are released.

L exclusive lock (X lock): After an exclusive lock is applied to data a, other transactions cannot block Aany type. Transactions authorized to exclusive locks can read and modify data.

L update lock:
Update (U) locks can prevent normal deadlocks. If the two transactions obtain the Shared Mode Lock on the resource and attempt to update the data at the same time, both transactions need to convert the shared lock to the (X) Lock, and each transaction waits for another transaction to release the share mode lock, so a deadlock occurs.
To avoid this potential deadlock, use the update (U) Lock. Only one transaction can obtain the resource Update (U) Lock at a time. If the transaction modifies the resource, the update (U) Lock is converted to the row (X) Lock. Otherwise, the lock is converted to a shared lock.

The lock granularity mainly includes the following types:

L row lock: Minimum granularity and highest concurrency

L page lock: one page is locked at a time. The 25 row locks can be upgraded to one page lock.

L table lock: large granularity and low concurrency

L database lock: controls the entire database operation

Optimistic lock: Compared with pessimistic locks, optimistic locks assume that data generally does not cause conflicts. Therefore, when the data is submitted for update, the system will officially detect whether data conflicts exist, if a conflict is found, the user error information will be returned, and the user will decide how to do it. The general method to implement optimistic locks is to record the data version.

Pessimistic lock: As the name implies, it is very pessimistic. Every time you get the data, you think that others will modify it. So every time you get the data, you will lock it, in this way, others will block the data until it gets the lock. In traditional relational databases, many of these locks are used, such as row locks, table locks, read locks, and write locks.

Index

Differences between MyISAM and InnoDB

L The MyISAM type does not support advanced processing such as transaction processing, whereas the InnoDB type does.

L MyISAM table does not support foreign keys, InnoDB supports

L the granularity of MyISAM locks is table-level, while InnoDB supports row-level locks.

L MyISAM supports full-text indexes, while InnoDB does not. (Innodb supports full-text indexing after mysql 5.6)

MyISAM is relatively simple, so it is more efficient than InnoDB. For small applications, you can consider using MyISAM. When your database has a large number of write and update operations but few queries or high data integrity requirements

Then select the innodb table. When your database is mainly used for queries, there are fewer updates and writes in comparison, and the business data integrity requirements are not that strict, you can select the mysiam table.

Implementation of MyISAM and InnoDB indexes:

MyISAM index implementation

The MyISAM index file and data file are separated. The index file only stores the data record address.

L Primary Index

The MyISAM engine uses B + Tree as the index structure. The data domain of the leaf node stores the data record address.

L secondary index

In MyISAM, the primary index and Secondary index (Secondary key) have no difference in structure, but the primary index requires that the key is unique, and the Secondary index key can be repeated.

The index search algorithm in MyISAM first searches for indexes based on the B + Tree search algorithm. If the specified Key exists, the value of its data domain is taken out, and the address is the value of the data domain, read the corresponding data records.

The index method of MyISAM is also called "non-clustered". The reason for this is to distinguish it from the clustered index of InnoDB.

InnoDB Index implementation

InnoDB also uses B + Tree as the index structure, but the implementation method is different from that of MyISAM.

L Primary Index

The InnoDB table data file itself is the primary index.

The InnoDB primary index (also a data file) shows that the leaf node contains a complete data record. This index is called a clustered index. Because the data files in InnoDB need to be clustered by the primary key, InnoDB requires that the table have a primary key (MyISAM may not). If it is not explicitly specified, mySQL automatically selects a column that uniquely identifies a data record as the primary key. If this column does not exist, MySQL automatically generates an implicit field for the InnoDB table as the primary key, this field is 6 bytes in length and its type is long integer.

L secondary index

All secondary indexes of InnoDB reference the primary key as the data field.

Clustered index makes the search by primary key very efficient, but secondary index search requires two indexes: first, retrieve the secondary index to obtain the primary key, then, use the primary key to search for the record in the primary index.

The index implementation methods of different storage engines are very helpful for correct use and optimization of indexes. For example, after knowing the index Implementation of InnoDB, it is easy to understand

1. Why not use too long fields as the primary key? Because all secondary indexes reference the primary index, too long primary index will make the secondary index too large. For example,

2. It is not a good idea to use non-monotonous fields as the primary key in InnoDB, because the InnoDB data file itself is a B + Tree, non-monotonous primary keys will cause frequent split and adjustment of data files to maintain the features of B + Tree during the insertion of new records, which is very inefficient, using an auto-increment field as the primary key is a good choice.

Differences between InnoDB indexes and MyISAM indexes:

L The first is the difference between the primary index and the InnoDB data file itself is the index file. The indexes and data of MyISAM are separated.

L second, the difference between secondary indexes: the secondary index data domain of InnoDB stores the value of the primary key of the corresponding record instead of the address. The secondary index of MyISAM is not much different from the primary index.

Red/black tree B + Tree B-tree

Binary Search Tree (BST ):

The binary sorting tree is either an empty tree or a binary tree of the following nature:

L if the left subtree is not empty, the value of all nodes on the left subtree is smaller than the value of its root node.

L if the right subtree is not empty, the value of all nodes on the right subtree is greater than the value of its root node.

L left and right subtree are also Binary Decision Trees.

L no node with the same key value (therefore, it must be a leaf node during insertion ).

Delete An Algorithm

L The node to be deleted is a leaf node.

L The node to be deleted has only one child (Left or Right child). In this case, you only need to connect the child to the parent node of the current node.

L The node to be deleted has two children, and the algorithm at this time is relatively complicated (compared to the case where there is only one child ). First, we need to find the maximum node on the left subtree of the node to be deleted, or the minimum node on the right subtree, and then exchange the parameter values of the node with the parameter values of the node to be deleted, finally, delete the node. In this way, the parameters to be deleted are deleted from the binary tree.

Red/black tree:

The Red Black Tree is a self-balancing Binary Search Tree:

L each node is either black or red.

L The root node is black.

L each leaf node is black.

L if a node is red, its child nodes must be black.

L all paths from a node to its child nodes contain the same number of black nodes.

The time complexity of various operations on the red/black tree is O (log2N ).

Red/black tree vs AVL

The query performance of the red/black tree is slightly inferior to that of the AVL Tree, because it is slightly more unbalanced than the avl Tree, that is to say, the query performance of the red/black tree is only the same as that of the avl tree with the same content, however, if the red and black trees are inserted and deleted, avl trees are exploding. Each time they are inserted and deleted, a large amount of balance is calculated, compared with avl Tree, the overhead of red-black transformation and rotation is much smaller than that of avl Tree to maintain balance.

Insert operation

Red parent
If the parent node of the new node is red, a series of operations are required to ensure that the entire tree is red and black. As shown in, because the parent node is red, it can be determined that the grandfather node must be black. In this case, you need to determine the operation based on the color of the uncle node. The blue node indicates that the color is unknown. Since it may be necessary to perform multiple rotation operations on the path from the root node to the new point, the starting point of each imbalance judgment (we can regard it as the new point) is different. So here we use a blue arrow to point to this starting point, which is called a judgment point.

L hongshu
When the uncle node is red, as shown in, you do not need to rotate it. You only need to change the father and uncle nodes to black and change the grandfather node to red. However, because the parent node of the grandfather node may be red, it violates the red-black tree nature. In this case, the grandfather node must be used as the new judgment point to continue the upward (iterative) Balancing operation.

Note that no matter whether the "parent node" is on the left or right of the "Uncle node", whether the "new node" is the Left or Right child of the "parent node, their operations are the same (in fact, this situation includes four types, you only need to adjust the color, do not need to rotate the tree ).

L uncle Hei
When the uncle node is black, it needs to be rotated to show the possibility of all rotation:
Case 1:

Case 2:

Case 3:

Case 4:

B and B-trees:

B-tree: B, B, Balanced. Because the original English name of B-tree is called B-tree, many people in China prefer to translate B-tree into B-tree. In fact, this is a very bad literal translation and can be easily misunderstood. For example, people may think that B-tree is a tree, and B-tree is another tree. In fact, B-tree refers to B-tree.

The m-Level B tree is a balanced m-path search tree. It is an empty tree or a tree that meets the following requirements:

L The number of sons at the root node is [2, M].

L The number of non-leaf nodes except the root node is [M/2, M]; (M/2 rounded up)

L each node should store at least M/2-1 (rounded up) and at most M-1 keywords;

L number of keywords for non-leaf nodes = number of pointers to Son-1;

L non-leaf node keywords: K [1], K [2],…, K [X-1]; and K [I] <K [I + 1];

L non-leaf node pointer: P [1], P [2],…, P [X]; where P [1] points to

Subtree, P [X] pointing to a subtree with a keyword greater than K [X-1], other P [I] pointing to a keyword belonging to (K [I-1], K [I]) child tree;

L all leaf nodes are on the same layer;

B + tree:

The B + tree is a variant of the B-tree and also a multi-path Search Tree:

Its definition is basically the same as that of B-tree,:

L The number of subtree pointers and keywords for non-leaf nodes is the same;

L non-leaf node subtree pointer P [I], pointing to a subtree whose key value belongs to [K [I], K [I + 1 ])

(B-the tree is an open interval );

L adds a pointer chain to all leaf nodes;

L all keywords appear at the leaf node;

Basic SQL operations

L SELECT * FROM table order by field DESC; (ASC | DESC)

Select distinct field from table where range

L insert into table_name (column1, column2, column3 ,...)
VALUES (value1, value2, value3 ,...);

L UPDATE table_name SET column1 = value1, column2 = value2 ,...
WHERE some_column = some_value;

L delete from table_name WHERE some_column = some_value;

LIKE Operator

SELECT column_name (s) FROM table_name WHERE column_name LIKE pattern;

IN Operator

SELECT column_name (s) FROM table_name WHERE column_name

IN (value1, value2 ,...);

BETWEEN Operator

SELECT column_name (s) FROM table_name WHERE column_name

JOIN

Left connection, right connection, internal connection

Left join: returns records that include all records in the left table and join fields in the right table.

Right join: returns all records in the right table that are equal to the join fields in the left table.

Inner join: returns only rows with equal join fields in two tables. (Default)

UNION operator

The UNION operator is used to merge the result sets of two or more SELECT statements.

SELECT country, name FROM Websites WHERE country = 'cn'
UNION
SELECT country, app_name FROM apps WHERE country = 'cn'
Order by country;

Create View

Create view view_name as select column_name (s) FROM table_name WHERE condition

SQL Functions

Avg () Count () Max () Min () Sum ()

Group ():

The group by statement is used in combination with Aggregate functions to GROUP result sets based on one or more columns.

SELECT column_name, aggregate_function (column_name)
FROM table_name WHERE column_name operator value
Group by column_name;

HAVING clause allows us to filter the data of each group after grouping.

SELECT column_name, aggregate_function (column_name)
FROM table_name WHERE column_name operator value
Group by column_name

HAVING aggregate_function (column_name) operator value;

How to query the database table structure and primary key

Desc tabl_name;

Create a table

Create table Name

(

Column name 1 data type,

....

)

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Mysql database interview summary and mysql database interview

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Mysql database interview summary and mysql database interview

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support