Collection: common MySQL interview questions

Source: Internet
Author: User

The following are some MySQL questions that often come up in interviews and study. Example of SQL statement optimization: 1) Try to avoid using the != or <> operator in the WHERE clause; otherwise the engine will abandon the index and perform a full table scan. 2) Try to avoid testing a field for NULL in the WHERE clause; otherwise the engine will abandon the index and perform a full table scan, for example: SELECT id FROM t WHERE num IS NULL

1. Primary key, super key, candidate key, and foreign key

Primary key:

A combination of data columns or attributes in a database table that uniquely and completely identifies a stored data object. A table can have only one primary key, and the value of the primary key cannot be missing, that is, it cannot be NULL.

Super key:

An attribute set that uniquely identifies a tuple in a relation is called a super key of the relational schema. A single attribute can serve as a super key, and multiple attributes combined can also serve as a super key. Super keys include candidate keys and primary keys.

Candidate Key:

The minimal super key, that is, a super key with no redundant attributes.

Foreign key:

A key that appears in one table but is the primary key of another table is called a foreign key of the first table.
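
A hedged sketch of these concepts (the table and column names are hypothetical, not from the article): in the orders table below, order_id is the primary key, (order_id, user_id) is one of its super keys, and user_id is a foreign key referencing the users table.
CREATE TABLE users (user_id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE orders (
    order_id INT PRIMARY KEY,                         -- primary key (also a candidate key)
    user_id  INT,
    FOREIGN KEY (user_id) REFERENCES users (user_id)  -- foreign key
);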

2. Four features and implications of database transactions

ACID refers to the four basic properties a database transaction (transaction) must have in order to be executed correctly: atomicity, consistency, isolation, and durability.
Atomicity: all operations in the transaction either complete in full or do not happen at all; the transaction cannot stop halfway through. If an error occurs during execution, the transaction is rolled back (Rollback) to the state before the transaction began, as if the transaction had never been executed.
Consistency: the integrity constraints of the database are not violated before the transaction begins or after it ends.
Isolation: transactions execute in an isolated state, so that each appears to be the only operation the system performs during a given time. If two transactions run at the same time and perform the same function, isolation ensures that each transaction sees itself as the only one using the system. This property is sometimes called serializability: to prevent transactions from interfering with one another, requests must be serialized so that only one request operates on the same data at a time.
Durability: after the transaction completes, the changes the transaction made to the database are persisted in the database and will not be rolled back.
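
For illustration, here is a minimal hedged sketch of atomicity and durability in MySQL, assuming a hypothetical account table with columns id and balance (not from the article): either both updates take effect at COMMIT, or a ROLLBACK undoes them together.
START TRANSACTION;
UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;
COMMIT;       -- makes both changes durable
-- ROLLBACK;  -- alternatively, undoes everything since START TRANSACTION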

3. What is a view, and can a view be updated?

A view is a virtual table. Unlike a table that contains data, a view contains only a query that retrieves data dynamically when the view is used; it holds no columns or data of its own. Using views simplifies complex SQL operations, hides implementation details, and protects the data; once created, a view can be used in the same way as a table.
A view cannot be indexed and cannot have associated triggers or default values; if the view definition itself contains an ORDER BY, that ORDER BY will be overridden by the outer query.
Creating a view: CREATE VIEW xxx AS xxxxxxxxxxxxxx;
Some views can be used to update the underlying base table, but only if they do not use grouping, aggregate functions, DISTINCT, UNION, joins, or subqueries. Views, however, are mainly used to simplify retrieval and protect data, not for updating, and most views are not updatable.
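
As a hedged sketch (the employee table and its columns are hypothetical, not from the article), a plain single-table view is typically updatable, while a view that aggregates is not:
-- Hypothetical base table: employee(id, name, dept_id, salary)
CREATE VIEW v_dept10 AS SELECT id, name, salary FROM employee WHERE dept_id = 10;
UPDATE v_dept10 SET salary = salary * 1.1 WHERE id = 3;   -- usually allowed for a simple single-table view
CREATE VIEW v_dept_avg AS SELECT dept_id, AVG(salary) AS avg_salary FROM employee GROUP BY dept_id;
-- v_dept_avg uses GROUP BY and an aggregate, so it cannot be updated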

4. The difference between DROP, DELETE, and TRUNCATE

In brief: DROP removes the table itself; TRUNCATE deletes all data in the table and resets the auto-increment ID so it starts again from 1; DELETE deletes data from the table and can take a WHERE clause.

(1) The DELETE statement deletes one row from the table at a time and records each row's deletion in the log as part of a transaction, so the operation can be rolled back. TRUNCATE TABLE deletes all the data from the table at once and does not record individual row deletions in the log, so the deleted rows cannot be recovered. Triggers related to the table are not activated during a TRUNCATE, and it executes quickly.

(2) Space occupied by the table and its indexes. After a TRUNCATE, the space occupied by the table and its indexes is restored to the initial size; a DELETE operation does not reduce the space occupied by the table or its indexes; a DROP statement frees all the space the table occupied.

(3) In terms of speed, generally: DROP > TRUNCATE > DELETE.

(4) Scope of application. TRUNCATE can only operate on tables; DELETE can operate on tables and views.

(5) TRUNCATE and DELETE delete only the data; DROP deletes the entire table (structure and data).

(6) TRUNCATE, and DELETE without a WHERE clause, delete only the data and do not delete the table structure (definition). A DROP statement deletes the table structure along with the constraints (constraints), triggers (triggers), and indexes (indexes) that depend on it; stored procedures/functions that depend on the table are preserved, but their status changes to invalid.

(7) The DELETE statement is DML (Data Manipulation Language); the operation is placed in the rollback segment and takes effect only after the transaction is committed. If there is a corresponding trigger, it fires when the statement executes.

(8) TRUNCATE and DROP are DDL (Data Definition Language); the operation takes effect immediately, the original data is not placed in the rollback segment, and it cannot be rolled back.

(9) Without a backup, use DROP and TRUNCATE with caution. To delete some rows of data, use DELETE and pay attention to the WHERE clause to limit the scope of the effect, and make sure the rollback segment is large enough. To delete a table, use DROP; to keep the table but delete the data in it, use TRUNCATE if the operation is unrelated to a transaction. If it is related to a transaction, or if you want triggers to fire, use DELETE.

(10) TRUNCATE TABLE is fast and efficient because:
TRUNCATE TABLE is functionally the same as a DELETE statement without a WHERE clause: both delete all rows in the table. However, TRUNCATE TABLE is faster than DELETE and uses fewer system and transaction-log resources. The DELETE statement deletes one row at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE deletes data by releasing the data pages used to store the table data and records only the page releases in the transaction log.

(11) TRUNCATE TABLE deletes all rows in the table, but the table structure and its columns, constraints, indexes, and so on remain unchanged. The counter used for new row identity values is reset to the column's seed value. If you want to retain the identity counter, use DELETE instead. If you want to remove the table definition together with its data, use the DROP TABLE statement.

(12) For a table referenced by a FOREIGN KEY constraint, you cannot use TRUNCATE TABLE; use a DELETE statement without a WHERE clause instead. And because TRUNCATE TABLE is not logged row by row, it cannot activate triggers.
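
For reference, a hedged side-by-side sketch (the table name t and column num are hypothetical):
DELETE FROM t WHERE num = 0;   -- removes matching rows only; logged per row, can be rolled back, fires triggers
TRUNCATE TABLE t;              -- removes all rows at once; keeps the structure, resets the auto-increment counter, no row-level logging
DROP TABLE t;                  -- removes the table entirely: data, structure, indexes, constraints, and triggers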

5. How the index works and its types

A database index is a sorted data structure in the database management system that helps to quickly query and update data in database tables. Indexes are typically implemented using B-trees and their variants, B+ trees.

In addition to the data itself, the database system maintains data structures that satisfy specific lookup algorithms and reference (point to) the data in some way, so that advanced search algorithms can be implemented on top of these structures. These data structures are the indexes.

There is a cost to indexing a table: one is to increase the storage space for the database, and the other is to spend more time inserting and modifying the data (because the index changes as well).

A common illustration works as follows: on the left is a data table with two columns and seven records, where the leftmost value is the physical address of each record (note that records that are logically adjacent are not necessarily physically adjacent on disk). To speed up searches on Col2, a binary search tree can be maintained on the right, in which each node contains an index key value and a pointer to the physical address of the corresponding data record, so that binary search can locate the corresponding data with O(log2 n) complexity.

Creating an index can greatly improve the performance of your system.

First, by creating a unique index, you can guarantee the uniqueness of each row of data in a database table.

Second, it can greatly speed up the retrieval of data, which is the main reason for creating indexes.

Third, it can speed up joins between tables, which is particularly meaningful for enforcing referential integrity of the data.

Finally, when using grouping and sorting clauses for data retrieval, you can also significantly reduce the time to group and sort in queries.

In short, by using indexes, the query optimizer can be leveraged during query processing to improve system performance.

Perhaps someone will ask: since indexes have so many advantages, why not create an index on every column in the table? Because adding indexes also has many disadvantages.

First, it takes time to create indexes and maintain indexes, and this time increases as the amount of data increases.

Second, indexes occupy physical space. In addition to the space taken by the data table itself, each index also takes a certain amount of physical space, and a clustered index requires even more space.

Third, when data in the table is added, deleted, or modified, the indexes must be maintained dynamically, which slows down data maintenance.

Indexes are built on certain columns of a database table, so when creating indexes you should consider which columns can be indexed and which cannot. In general, indexes should be created on the following kinds of columns (a short sketch follows this list):
1) columns that are frequently searched, to speed up searches;
2) the column that serves as the primary key, to enforce its uniqueness and organize the arrangement of data in the table;
3) columns that are often used in joins, which are mostly foreign keys, to speed up joins;
4) columns that often need to be searched by range, because the index is already sorted and the specified range is contiguous;
5) columns that often need to be sorted, because the index is already sorted, so a query can use the index's ordering to shorten sorting time;
6) columns that are often used in WHERE clauses, to speed up the evaluation of conditions.
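
A hedged sketch of such indexes (the table and column names are hypothetical, not from the article):
CREATE INDEX idx_employee_lname ON employee (lname);                  -- frequently searched column
CREATE INDEX idx_orders_customer ON orders (customer_id);             -- foreign-key column used in joins
CREATE INDEX idx_orders_status_date ON orders (status, created_at);   -- columns used in WHERE and ORDER BY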

Similarly, indexes should not be created for some columns. In general, these columns that should not be indexed have the following characteristics:

First, indexes should not be created on columns that are rarely used or referenced in queries. Since these columns are rarely used, indexing them does not improve query speed; on the contrary, adding the index slows down system maintenance and increases space requirements.

Second, indexes should not be added to columns that have only a few distinct values. Because such columns have very few values, for example the gender column of a personnel table, the rows in the query result set make up a large proportion of the rows in the table; in other words, a large fraction of the table has to be searched anyway, so adding an index does not noticeably speed up retrieval.

Third, columns defined with the text, image, or bit data types should not be indexed, because these columns either hold very large amounts of data or have very few distinct values.

Fourth, an index should not be created when the need to modify data far outweighs the need to retrieve it. Modification performance and retrieval performance are in conflict: adding indexes improves retrieval performance but reduces modification performance, while removing indexes improves modification performance but reduces retrieval performance. Therefore, when modification needs far outweigh retrieval needs, indexes should not be created.

Depending on the capabilities of the database, three kinds of indexes can be created in the database designer: unique indexes, primary key indexes, and clustered indexes.

Unique index

A unique index is one that does not allow any two rows to have the same index value.

When duplicate key values exist in the existing data, most databases do not allow a newly created unique index to be saved with the table. A database may also prevent new data that would create duplicate key values from being added to the table. For example, if a unique index is created on the employee's last name (lname) in the employees table, no two employees can have the same last name.

Primary key index

Database tables often have one column, or a combination of columns, whose values uniquely identify each row in the table. This column is called the table's primary key. Defining a primary key for a table in a database diagram automatically creates a primary key index, which is a specific type of unique index. This index requires every value of the primary key to be unique. When a primary key index is used in a query, it also allows fast access to the data.

Clustered index

In a clustered index, the physical order of the rows in the table is the same as the logical (indexed) order of the key values. A table can contain only one clustered index.

If an index is not a clustered index, the physical order of the rows in the table does not match the logical order of the key values. Clustered indexes generally provide faster data access than nonclustered indexes.
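
In MySQL's InnoDB storage engine, for example, the primary key acts as the clustered index; a hedged sketch with a hypothetical employees table:
CREATE TABLE employees (
    id    INT PRIMARY KEY,                 -- primary key index; InnoDB clusters rows by it
    lname VARCHAR(50),
    email VARCHAR(100),
    UNIQUE KEY uq_employees_email (email)  -- unique index: no two rows may share an email
) ENGINE = InnoDB;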

Principle of locality and disk pre-reading

Due to the characteristics of the storage medium, disk access is much slower than main-memory access; combined with the cost of mechanical movement, disk access speed is often only a few hundredths of that of main memory, so to improve efficiency, disk I/O must be minimized. To this end, the disk is usually not read strictly on demand; instead, every read is a read-ahead: even if only one byte is needed, the disk starts from that position and sequentially reads a certain length of data into memory. The rationale is the well-known principle of locality in computer science: when a piece of data is used, the data near it is usually used soon afterwards, and the data a program needs while running is usually relatively concentrated.

Due to the high efficiency of disk sequential reads (no seek time required and minimal rotational time), pre-reading can improve I/O efficiency for programs with locality.

The read-ahead length is generally an integer multiple of the page size. A page is the logical block in which the computer manages memory: hardware and operating systems usually divide main memory and disk storage into contiguous blocks of equal size, each called a page (in many operating systems the page size is typically 4 KB), and main memory and disk exchange data in units of pages. When the data a program wants to read is not in main memory, a page fault is triggered; the system sends a read signal to the disk, the disk finds the starting position of the data and sequentially reads one or more pages back into memory, then the exception returns and the program continues to run.

Performance analysis of B-tree/B+ tree indexes

We can now analyze the performance of the B-tree/B+ tree index.

As mentioned above, index structures are generally evaluated by the number of disk I/O operations. Starting from the B-tree: by the definition of a B-tree, a single lookup visits at most h nodes. Database system designers cleverly exploit the disk read-ahead principle by setting the size of a node equal to one page, so that each node can be fully loaded with a single I/O. To achieve this, the following technique is used when implementing a B-tree in practice:

Each time a new node is created, a full page of space is requested for it directly, so that a node is physically stored within one page; together with page-aligned storage allocation, this ensures that reading a node requires only one I/O.

A single B-tree lookup therefore requires at most h-1 I/Os (the root node is resident in memory), with an asymptotic complexity of O(h) = O(log_d N). In practice, the out-degree d is a very large number, usually more than 100, so h is very small (usually no more than 3).
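
As a rough worked example (the numbers are illustrative, not from the article): with out-degree d = 100 and N = 1,000,000 records, h ≈ log_100 1,000,000 = 3, so a lookup touches about three nodes, i.e. roughly two or three disk I/Os.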

With a red-black tree structure, by contrast, h is clearly much greater. Because logically adjacent nodes (parent and child) may be physically far apart and locality cannot be exploited, the I/O asymptotic complexity of a red-black tree is also O(h), but its efficiency is significantly worse than that of a B-tree.

In summary, using a B-tree as the index structure is very efficient.

6. Types of joins

Execution in Query Analyzer:
-- Create tables table1 and table2:
CREATE TABLE table1 (id INT, name VARCHAR(10))
CREATE TABLE table2 (id INT, score INT)
INSERT INTO table1 SELECT 1, 'Lee'
INSERT INTO table1 SELECT 2, 'Zhang'
INSERT INTO table1 SELECT 4, 'Wang'
INSERT INTO table2 SELECT 1, 90
INSERT INTO table2 SELECT 2, 100
INSERT INTO table2 SELECT 3, 70
The tables then contain:
————————————————-
table1            |  table2
id    name        |  id    score
1     Lee         |  1     90
2     Zhang       |  2     100
4     Wang        |  3     70
————————————————-

The following are performed in Query Analyzer
I. Outer joins
1. Concept: includes the left outer join, the right outer join, and the full outer join

2. Left outer join: LEFT JOIN or LEFT OUTER JOIN
(1) The result set of a left outer join includes all rows of the left table specified in the LEFT OUTER clause, not just the rows that match the join columns. If a row in the left table has no matching row in the right table, all select-list columns from the right table are NULL in the corresponding result-set row.
(2) SQL statements
SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id
————-Results ————-
id    name    id    score
——————————
1     Lee     1     90
2     Zhang   2     100
4     Wang    NULL  NULL
——————————
NOTE: contains all rows of table1; the corresponding fields from table2 are returned where the join condition is met, and NULL is shown where it is not

3. Right outer join: RIGHT JOIN or RIGHT OUTER JOIN
(1) A right outer join is the reverse of a left outer join: all rows of the right table are returned. If a row in the right table has no matching row in the left table, NULL is returned for the columns of the left table.
(2) SQL statements
SELECT * FROM table1 RIGHT JOIN table2 ON table1.id = table2.id
————-Results ————-
id    name    id    score
——————————
1     Lee     1     90
2     Zhang   2     100
NULL  NULL    3     70
——————————
NOTE: contains all rows of table2; the corresponding fields from table1 are returned where the join condition is met, and NULL is shown where it is not

4. Full outer join: FULL JOIN or FULL OUTER JOIN
(1) A full outer join returns all rows from both the left table and the right table. When a row has no matching row in the other table, the select-list columns of the other table contain NULL; when there is a matching row between the tables, the entire result-set row contains the data values of the base tables.
(2) SQL statements
SELECT * FROM table1 FULL JOIN table2 ON table1.id = table2.id
————-Results ————-
id    name    id    score
——————————
1     Lee     1     90
2     Zhang   2     100
4     Wang    NULL  NULL
NULL  NULL    3     70
——————————
Note: returns the combination of the left join and right join results shown above

II. Inner join
1. Concept: an inner join is a join that uses comparison operators to compare the values of the columns being joined

2. Inner join: JOIN or INNER JOIN

3. SQL statement
SELECT * FROM table1 JOIN table2 ON table1.id = table2.id
————-Results ————-
id    name    id    score
——————————
1     Lee     1     90
2     Zhang   2     100
——————————
Note: returns only the rows of table1 and table2 that match the join condition

4. Equivalent statements (same execution result as the above)
A: SELECT a.*, b.* FROM table1 a, table2 b WHERE a.id = b.id
B: SELECT * FROM table1 CROSS JOIN table2 WHERE table1.id = table2.id (Note: after a CROSS JOIN the join condition can only be specified with WHERE, not with ON)

III. Cross join

1. Concept: a cross join without a WHERE clause produces the Cartesian product of the tables involved in the join. The number of rows in the first table multiplied by the number of rows in the second table equals the size of the Cartesian-product result set. (A cross join of table1 and table2 produces 3*3 = 9 records)

2. Cross join: CROSS JOIN (without a WHERE condition)

3. SQL statement
SELECT * FROM table1 CROSS JOIN table2
————-Results ————-
id    name    id    score
——————————
1     Lee     1     90
2     Zhang   1     90
4     Wang    1     90
1     Lee     2     100
2     Zhang   2     100
4     Wang    2     100
1     Lee     3     70
2     Zhang   3     70
4     Wang    3     70
——————————
Note: returns 3*3 = 9 records, i.e. the Cartesian product

4. Equivalent statement (same execution result as the above)
A: SELECT * FROM table1, table2

7. Database normal forms

1. First normal form (1NF)

In any relational database, first normal form (1NF) is the basic requirement for a relational schema; a database that does not satisfy first normal form (1NF) is not a relational database.
First normal form (1NF) means that every column of a database table is an indivisible atomic data item: the same column cannot hold multiple values, that is, an attribute of an entity cannot have multiple values or repeated attributes. If repeated attributes appear, a new entity may need to be defined, composed of those repeated attributes, with a one-to-many relationship between the new entity and the original entity. In first normal form (1NF), each row of a table contains information about only one instance. In short, first normal form means no repeating columns.

2. Second normal form (2NF)

Second normal form (2NF) builds on first normal form (1NF): to satisfy second normal form (2NF), first normal form (1NF) must be satisfied first. Second normal form (2NF) requires that every instance or row in a database table be uniquely distinguishable. To achieve this, it is usually necessary to add a column to the table to store the unique identifier of each instance. This unique attribute column is called the primary key (also known as the primary code).
Second normal form (2NF) also requires that the attributes of an entity depend entirely on the primary key. Full dependence means there must be no attribute that depends on only part of the primary key; if there is, that attribute and that part of the primary key should be separated out to form a new entity, with a one-to-many relationship between the new entity and the original entity. In short, second normal form means that non-key attributes fully depend on the primary key.

3. Third normal form (3NF)

Satisfying third normal form (3NF) requires first satisfying second normal form (2NF). In short, third normal form (3NF) requires that a database table not contain non-primary-key information that is already contained in other tables. For example, suppose there is a department information table in which each department has a department number (dept_id), a department name, a department description, and so on. Once the department number appears in the employee information table, the department name, department description, and other department-related information should not be added to the employee information table again. If there is no department information table, one should be built according to third normal form (3NF); otherwise there will be a great deal of data redundancy. In short, third normal form means that non-key attributes do not depend on other non-key attributes. (My understanding is that it eliminates redundancy.)
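
A hedged sketch of the department example above (the table and column names are illustrative): the employee table stores only the department number, and all other department information lives in its own table.
CREATE TABLE department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50),
    dept_desc VARCHAR(200)
);
CREATE TABLE employee (
    emp_id  INT PRIMARY KEY,
    name    VARCHAR(50),
    dept_id INT,                                             -- only the department number is stored here
    FOREIGN KEY (dept_id) REFERENCES department (dept_id)
);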

8. The idea of database optimization

I borrowed lessons from the course on database optimization.

1.SQL Statement Optimization

1) Try to avoid using the != or <> operator in the WHERE clause; otherwise the engine will abandon the index and perform a full table scan.
2) Try to avoid testing a field for NULL in the WHERE clause; otherwise the engine will abandon the index and perform a full table scan, for example:
SELECT id FROM t WHERE num IS NULL
You can set a default value of 0 on num, make sure the num column in the table contains no NULL values, and then query:
SELECT id FROM t WHERE num = 0
3) Using EXISTS instead of IN is often a good choice.
4) Replace the HAVING clause with a WHERE clause where possible, because HAVING filters the result set only after all records have been retrieved.
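
A hedged sketch of points 3) and 4), reusing the table t above and a hypothetical orders table:
-- 3) EXISTS instead of IN (the subquery can stop at the first match):
SELECT id FROM t WHERE EXISTS (SELECT 1 FROM orders o WHERE o.num = t.num);
-- rather than: SELECT id FROM t WHERE num IN (SELECT num FROM orders);
-- 4) filter rows with WHERE before grouping instead of with HAVING afterwards:
SELECT num, COUNT(*) FROM t WHERE num > 0 GROUP BY num;
-- rather than: SELECT num, COUNT(*) FROM t GROUP BY num HAVING num > 0;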

2. Index optimization

See the discussion of indexes in question 5 above.

3. Database structure optimization

1) Normalization: for example, eliminating redundancy (saves space). 2) Denormalization: for example, adding appropriate redundancy (reduces joins). 3) Table partitioning: partitioning separates data physically, and data in different partitions can be stored in data files on different disks. When querying the table, only the relevant partition needs to be scanned instead of the whole table, which significantly shortens query time; placing partitions on different disks also spreads the disk I/O for the table across those disks, so a well-configured partitioning scheme evenly distributes I/O contention. This approach is worth taking when the data volume is large; for example, table partitions can be created automatically by month (see the sketch after point 4 below).
4) Table splitting: vertical splitting and horizontal splitting. Case: a simple shopping system contains the following tables: 1. a product table (100,000 rows, stable); 2. an order table (2,000,000 rows, and growing); 3. a user table (1,000,000 rows, and growing). Taking MySQL as the example for horizontal and vertical splitting, a single MySQL table can typically handle data on the order of millions of rows, or up to around ten million rows for static data.
Vertical splitting. Problem solved: I/O contention between tables. Problem not solved: the pressure from data growth within a single table. Scheme: place the product table and the user table on one server, and the order table on another server.
Horizontal splitting. Problem solved: the pressure from data growth within a single table. Problem not solved: I/O contention between tables. Scheme: split the user table by gender into a male-user table and a female-user table; split the order table into completed orders and unfinished orders; put the product table and the unfinished-orders table on one server, the completed-orders table on another server, the male-user table on one server and the female-user table on another (women love shopping, haha).
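
A hedged sketch of monthly range partitioning (table and column names are hypothetical): a query for a single month then scans only that partition.
CREATE TABLE orders (
    order_id   INT NOT NULL,
    user_id    INT NOT NULL,
    created_at DATE NOT NULL,
    amount     DECIMAL(10, 2)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p202401 VALUES LESS THAN (TO_DAYS('2024-02-01')),
    PARTITION p202402 VALUES LESS THAN (TO_DAYS('2024-03-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);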

4. Server hardware optimization

This mainly comes down to how much money you are willing to spend on hardware.

9. The difference between a stored procedure and a trigger

Triggers are very similar to stored procedures: a trigger is also a set of SQL statements. The only difference is that a trigger cannot be invoked with an EXECUTE statement; it is fired (activated) automatically when the user executes a Transact-SQL statement. A trigger is a stored procedure that executes when data in a specified table is modified. Triggers are typically created to enforce referential integrity and consistency among logically related data in different tables. Because users cannot bypass a trigger, triggers can be used to enforce complex business rules and ensure data integrity. Triggers differ from stored procedures in that triggers are executed mainly through events, while stored procedures can be called directly by name. When operations such as UPDATE, INSERT, or DELETE are performed on a table, SQL Server automatically executes the statements defined by the trigger, ensuring that the processing of the data conforms to the rules defined by those SQL statements.
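
A hedged MySQL-style sketch (all table, trigger, and procedure names are hypothetical): the trigger fires automatically on INSERT, whereas the stored procedure must be called explicitly.
DELIMITER //
CREATE TRIGGER trg_orders_after_insert
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
    UPDATE order_stats SET total_orders = total_orders + 1;   -- runs automatically on every insert
END //
CREATE PROCEDURE sp_count_orders()
BEGIN
    SELECT COUNT(*) FROM orders;
END //
DELIMITER ;
CALL sp_count_orders();   -- a procedure runs only when called by name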