MySQL FAQ and Common Questions

Source: Internet
Author: User

1. Primary key, superkey, candidate key, and foreign key

Primary key:

A primary key is a column, or combination of columns, whose values uniquely identify each row stored in the table. A table can have only one primary key, and a primary key value cannot be missing, that is, it cannot be NULL.

Superkeys:

A set of attributes that uniquely identifies a tuple in a relation is called a superkey of that relation schema. A single attribute can be a superkey, and a combination of attributes can also be a superkey. Superkeys include candidate keys and primary keys.

Candidate Key:

A candidate key is a minimal superkey, that is, a superkey with no redundant attributes.

Foreign key:

A foreign key in one table is a column (or set of columns) that references the primary key of another table.
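To make the definitions concrete, here is a minimal sketch in MySQL DDL; the table and column names are illustrative and not from the original text:

-- dept_id is the primary key of departments (and therefore also a minimal superkey).
CREATE TABLE departments (
    dept_id   INT NOT NULL,
    dept_name VARCHAR(50) NOT NULL,
    PRIMARY KEY (dept_id)
);

-- employees.dept_id is a foreign key that references the primary key of departments.
CREATE TABLE employees (
    emp_id  INT NOT NULL,
    name    VARCHAR(50) NOT NULL,
    dept_id INT,
    PRIMARY KEY (emp_id),
    FOREIGN KEY (dept_id) REFERENCES departments (dept_id)
);

In employees, (emp_id) is a candidate key, while (emp_id, name) is a superkey but not a candidate key, because emp_id alone already identifies a row.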

2. Four features and meanings of database transactions

The four properties required for a database transaction to execute correctly are known as ACID: Atomicity, Consistency, Isolation, and Durability.
Atomicity: either all operations in the transaction complete, or none of them do; it is impossible to stop partway through. If an error occurs during execution, the transaction is rolled back to the state before it started, as if it had never been executed.
Consistency: the database's integrity constraints are not violated before or after the transaction.
Isolation: transactions execute in isolation, so each appears to be the only operation the system is performing during that time. If two transactions run concurrently and perform the same function, isolation ensures that each behaves as if it were the only transaction in the system. This property is sometimes called serializability: to prevent transactions from interfering with each other, requests must be serialized so that only one request operates on a given piece of data at a time.
Durability: once the transaction completes, the changes it made to the database are stored permanently and will not be rolled back.
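As a minimal illustration of atomicity (the accounts table and the amounts are hypothetical, not from the original text), either both updates below take effect or neither does:

START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;   -- makes both changes durable; ROLLBACK here would undo both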

3. What is the role of a view? Can a view be changed?

A view is a virtual table. Unlike a table, it contains no data of its own; a view holds only a query that dynamically retrieves data when the view is used, not any columns or rows. Views can simplify complex SQL operations, hide implementation details, and protect data. Once created, a view can be used in the same way as a table.
A view cannot be indexed, nor can triggers or default values be associated with it. If the view itself contains an ORDER BY and the SELECT that retrieves from the view also contains one, the view's ORDER BY is overridden.
CREATE VIEW view_name AS <select statement>;
Views that do not use joins, subqueries, grouping, aggregate functions, DISTINCT, or UNION can generally be updated, and the update is applied to the underlying base table. However, views are mainly used to simplify retrieval and protect data, not for updates, and most views cannot be updated.
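A small sketch of creating and using a view (table and column names are illustrative): the view hides the join and exposes a simplified, read-oriented result set.

CREATE VIEW employee_departments AS
SELECT e.emp_id, e.name, d.dept_name
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id;

-- The view is then queried like an ordinary table:
SELECT * FROM employee_departments WHERE dept_name = 'Sales';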

4. Differences between drop, delete and truncate

DROP removes the table itself. TRUNCATE removes all data from the table and resets the auto-increment counter, so newly inserted rows start again from 1. DELETE removes table data row by row without resetting the auto-increment counter, and it can take a WHERE clause.

(1) The DELETE statement removes one row from the table at a time and records each row deletion in the log as a transaction record so that the operation can be rolled back. TRUNCATE TABLE removes all data from the table at once and does not record individual row deletions in the log, so deleted rows cannot be recovered. TRUNCATE also does not fire the table's DELETE triggers, and it executes quickly.

(2) Space occupied by tables and indexes: after a TRUNCATE, the space occupied by the table and its indexes is restored to its initial size; DELETE does not reduce the space occupied by the table or its indexes; DROP releases all the space occupied by the table.

(3) In terms of speed, generally: DROP > TRUNCATE > DELETE.

(4) Scope: TRUNCATE can only be applied to a table; DELETE can be applied to a table or a view.

(5) TRUNCATE and DELETE delete only data, while DROP removes the entire table (structure and data).

(6) TRUNCATE, and DELETE without a WHERE clause, delete only the data, not the table structure (definition). DROP removes the table structure along with the constraints, triggers, and indexes that depend on it; stored procedures and functions that depend on the table are kept, but their status changes to invalid.

(7) DELETE is a DML (Data Manipulation Language) statement; the operation is placed in the rollback segment and takes effect only after the transaction is committed. If a corresponding trigger exists, it fires when the statement executes.

(8) TRUNCATE and DROP are DDL (Data Definition Language) statements; they take effect immediately, the original data is not written to the rollback segment, and they cannot be rolled back.

(9) Exercise caution with DROP and TRUNCATE when there is no backup. To delete some of the rows, use DELETE with a WHERE clause to limit the scope, and make sure the rollback segment is large enough. To remove a table, use DROP. If you want to keep the table but delete all of its data, use TRUNCATE when no transaction is involved; if a transaction is involved, or you want triggers to fire, use DELETE.

(10) TRUNCATE TABLE is fast and efficient because:
The truncate table function is the same as the DELETE statement without the WHERE clause: both DELETE all rows in the table. However, truncate table is faster than DELETE and uses less system and transaction log resources. The DELETE statement deletes a row at a time and records one row in the transaction log. Truncate table deletes data by releasing the data pages used to store TABLE data, and only records the release of pages in transaction logs.

(11) TRUNCATE TABLE deletes all rows in the table, but the table structure and its columns, constraints, and indexes remain unchanged. The counter used for new rows' identity values is reset to the column's seed. To preserve the identity counter, use DELETE instead. To remove the table definition together with its data, use the DROP TABLE statement.

(12) TRUNCATE TABLE cannot be used on a table referenced by a FOREIGN KEY constraint; use a DELETE statement without a WHERE clause instead. Because TRUNCATE TABLE is not logged row by row, it cannot fire triggers.
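For illustration, the three statements side by side (the orders table and the WHERE condition are hypothetical):

DELETE FROM orders WHERE order_date < '2020-01-01';  -- row by row, logged, can be rolled back, fires DELETE triggers
TRUNCATE TABLE orders;                               -- removes all rows, resets AUTO_INCREMENT, minimal logging, no triggers
DROP TABLE orders;                                   -- removes the table definition together with its data and indexes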

5. indexing principles and types

A database index is a sorted data structure in the database management system that helps you quickly query and update data in a database table. Indexes are usually implemented with B-trees and their variant, B+ trees.

In addition to the data itself, the database system maintains data structures that satisfy specific search algorithms. These structures reference (point to) the data in some way, so that advanced search algorithms can be run over them. Such a data structure is an index.

Creating indexes on a table has a cost: first, it increases the storage space used by the database; second, inserts and updates take more time, because the indexes must be updated as well.

(Figure: a two-column table of seven records and a binary search tree built on Col2, with each node pointing to a record's physical address.)

The figure shows one possible way to implement an index. On the left is a data table with two columns and seven records; the leftmost values are the physical addresses of the data records (note that logically adjacent records are not necessarily physically adjacent on disk). To speed up searches on Col2, you can maintain the binary search tree shown on the right: each node holds an index key value and a pointer to the physical address of the corresponding record, so a lookup can use binary search to find the data in O(log2 n) time.

Creating indexes can greatly improve the system performance.

First, you can create a unique index to ensure the uniqueness of each row of data in the database table.

Second, it can greatly speed up data retrieval, which is also the main reason for creating an index.

Third, it can speed up joins between tables, which is particularly meaningful for enforcing referential integrity.

Fourth, when you use grouping and sorting clauses to retrieve data, you can also significantly reduce the time for grouping and sorting in queries.

Fifth, indexes allow the query optimizer to use them during query processing, improving system performance.

Some may ask: if indexes have so many advantages, why not create an index on every column in the table? Because adding indexes also has many disadvantages.

First, it takes time to create and maintain indexes. This time increases with the increase of data volume.

Second, indexes occupy physical space. In addition to data tables, each index occupies a certain amount of physical space. To create a clustered index, the required space is larger.

Third, when adding, deleting, and modifying data in the table, the index must also be dynamically maintained, which reduces the Data Maintenance speed.

Indexes are created on particular columns of a database table, so when creating an index you should consider which columns should be indexed and which should not. In general, indexes should be created on the following columns:
columns that are frequently searched, to speed up searches;
columns used as the primary key, to enforce the column's uniqueness and to organize the arrangement of data in the table;
columns frequently used in joins (these are mostly foreign keys), to speed up joins;
columns that are searched by range, because the index is already sorted and the specified range is contiguous;
columns that frequently need to be sorted, because the index is already sorted, so queries can use it and spend less time sorting;
columns frequently used in WHERE clauses, to speed up condition evaluation.

Similarly, indexes should not be created on some columns. In general, columns that should not be indexed have the following characteristics:

First, indexes should not be created on columns that are rarely used or referenced in queries. Since these columns are rarely used, indexing them does not improve query speed; on the contrary, the extra indexes slow down maintenance and increase space requirements.

Second, indexes should not be added to columns with very few distinct values, such as a gender column in a personnel table. Because the rows matching any one value make up a large proportion of the rows in the table, adding an index does not significantly speed up the search.

Third, indexes should not be added to columns defined with the text, image, or bit data types. This is because the data in these columns is either very large or has very few distinct values.

Fourth, when modification performance matters far more than retrieval performance, you should not create an index. Modification performance and retrieval performance pull in opposite directions: adding indexes improves retrieval but hurts modification, while removing indexes improves modification but hurts retrieval. Therefore, when modification performance is far more important than retrieval performance, do not create an index.

Depending on the database's features, three kinds of index can be created in the database designer: unique indexes, primary key indexes, and clustered indexes.

Unique Index

A unique index is an index that does not allow any two rows to have the same index value.

When duplicate key values already exist in the data, most databases do not allow the newly created unique index to be saved with the table. The database may also prevent new data that would create duplicate key values from being added to the table. For example, if a unique index is created on the employee's last name (lname) in the employee table, no two employees can have the same last name.

Primary Key Index

A database table often has a column, or combination of columns, whose values uniquely identify each row in the table; this column is called the table's primary key. When you define a primary key for a table in a database diagram, a primary key index is created automatically. A primary key index is a specific type of unique index: it requires every value of the primary key to be unique. When the primary key index is used in a query, it also allows fast access to the data.

Clustered Index

In a clustered index, the physical order of the rows in the table is the same as the logical (index) order of the key values. A table can have only one clustered index.

If an index is not a clustered index, the physical order of rows in the table does not match the logical order of the key values. Compared with non-clustered indexes, clustered indexes generally provide faster data access.
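A brief sketch of how these index types are declared in MySQL (names are illustrative); note that in the InnoDB storage engine the primary key also serves as the clustered index:

CREATE TABLE employee (
    emp_id  INT NOT NULL,
    lname   VARCHAR(50),
    dept_id INT,
    PRIMARY KEY (emp_id)   -- primary key index; clustered in InnoDB
);

CREATE UNIQUE INDEX idx_employee_lname ON employee (lname);   -- unique index: no two rows may share lname
CREATE INDEX idx_employee_dept ON employee (dept_id);         -- ordinary secondary index on a join column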

Principle of locality and disk pre-read

Because of the characteristics of the storage medium, accessing a disk is much slower than accessing main memory; adding the cost of mechanical movement, disk access is typically several orders of magnitude slower than main-memory access, so to improve efficiency disk I/O must be minimized. To achieve this, the disk is usually not read strictly on demand; instead it pre-reads: even if only one byte is needed, the disk reads a certain length of data sequentially from that position into memory. The theoretical basis for this is the well-known principle of locality in computer science: when a piece of data is used, nearby data is usually used soon afterwards, and the data needed while a program runs tends to be clustered together.

Because sequential disk reads are highly efficient (no seek time and very little rotational latency), pre-reading can improve I/O efficiency for programs with good locality.

The pre-read length is generally an integer multiple of the page size. A page is the logical block of computer memory management: hardware and the operating system usually divide main memory and the disk storage area into contiguous blocks of equal size, each called a page (in many operating systems the page size is typically 4 KB), and main memory and the disk exchange data in units of pages. When the data a program wants to read is not in main memory, a page fault is raised; the system sends a read signal to the disk, the disk finds the starting position of the data and reads one or more pages sequentially into memory, and then the fault handler returns and the program continues to run.

B-/+ Tree index Performance Analysis

At last, we can analyze the performance of B-/+ Tree indexes.

As mentioned above, index structures are generally evaluated by the number of disk I/O operations. From the definition of a B-Tree, a single search visits at most h nodes. Database system designers cleverly exploit the disk pre-read principle by setting the size of one node equal to one page, so that each node can be loaded in full with a single I/O. To achieve this, the following technique is used when implementing a B-Tree:

Each time a new node is created, a full page of space is requested directly, which guarantees that one node is physically stored in one page; since memory allocation is page-aligned, reading one node requires only one I/O operation.

A B-Tree search therefore needs at most h-1 I/O operations (the root node is resident in memory), and the asymptotic complexity is O(h) = O(logdN). In practice the out-degree d is a very large number, usually more than 100, so h is very small (usually no more than 3).
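As a rough worked example (the numbers are illustrative, not from the original text): a B-Tree of out-degree d and height h can index on the order of d^h keys, so

    h \approx \lceil \log_d N \rceil, \qquad d = 100,\ N = 10^6 \;\Longrightarrow\; h = \lceil \log_{100} 10^6 \rceil = 3

that is, a table with about a million indexed keys needs at most roughly three node reads per lookup, and fewer when the upper levels of the tree are cached in memory.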

A red-black tree, by contrast, has a much greater depth h. Because logically adjacent nodes (parent and child) may be physically far apart, locality cannot be exploited, so although the I/O complexity of a red-black tree is also O(h), its efficiency is clearly much lower than that of a B-Tree.

In conclusion, the efficiency of using B-Tree as the index structure is very high.

6. Join types

Query analyzer execution:
-- Create tables table1 and table2:
create table table1 (id int, name varchar(10));
create table table2 (id int, score int);
insert into table1 select 1, 'lee';
insert into table1 select 2, 'zhang';
insert into table1 select 4, 'wang';
insert into table2 select 1, 90;
insert into table2 select 2, 100;
insert into table2 select 3, 70;
The tables now contain:
-------------------------------------------------
table1            |  table2
id    name        |  id    score
1     lee         |  1     90
2     zhang       |  2     100
4     wang        |  3     70
-------------------------------------------------

Run the following commands in the query Analyzer:
1. Outer join
1. Concept: includes left outer join, right outer join, and full outer join

2. left join: left join or left outer join
(1) The result set of a left outer join includes all rows from the left table specified in the LEFT OUTER clause, not only the rows matched by the join column. If a row in the left table has no matching row in the right table, all select-list columns from the right table are NULL in that row of the result set.
(2) SQL statements
Select * from table1 left join table2 on table1.id = table2.id
------------- Result -------------
id    name     id      score
1     lee      1       90
2     zhang    2       100
4     wang     NULL    NULL
------------------------------
Note: all rows from table1 are returned; the matching columns from table2 are filled in according to the join condition, and non-matching columns are shown as NULL.

3. right join: right join or right outer join
(1) The right outer join is the reverse of the left outer join: all rows from the right table are returned. If a row in the right table has no matching row in the left table, NULL is returned for the left table's columns.
(2) SQL statements
Select * from table1 right join table2 on table1.id = table2.id
------------- Result -------------
id      name     id     score
1       lee      1      90
2       zhang    2      100
NULL    NULL     3      70
------------------------------
Note: all rows from table2 are returned; the matching columns from table1 are filled in according to the join condition, and non-matching columns are shown as NULL.

4. Full outer join: full join or full outer join
(1) A full outer join returns all rows from both the left and right tables. When a row has no matching row in the other table, the other table's select-list columns contain NULL; when rows match, the result row contains the data values from both base tables. (Note: MySQL itself does not support FULL OUTER JOIN; the same result can be obtained by UNIONing a left join and a right join.)
(2) SQL statements
Select * from table1 full join table2 on table1.id = table2.id
------------- Result -------------
id      name     id      score
1       lee      1       90
2       zhang    2       100
4       wang     NULL    NULL
NULL    NULL     3       70
------------------------------
Note: returns the union of the left and right join results (see the left and right joins above).

2. Inner join
1. Concept: an inner join uses a comparison operator to compare the values of the columns being joined.

2. inner join: join or inner join

3. SQL statements
Select * from table1 join table2 on table1.id = table2.id
------------- Result -------------
id    name     id    score
1     lee      1     90
2     zhang    2     100
------------------------------
Note: only the rows of table1 and table2 that satisfy the join condition are returned.

4. equivalent (same as the following execution)
A: select a.*, b.* from table1 a, table2 b where a.id = b.id
B: select * from table1 cross join table2 where table1.id = table2.id (Note: a cross join can only use a WHERE clause; it cannot use an ON clause.)

3. Cross join

1. Concept: A cross join without a WHERE clause will generate the Cartesian product of the table involved in the join. The number of rows in the first table multiplied by the number of rows in the second table is equal to the size of the Cartesian result set. (Table1 and table2 generate 3*3 = 9 records)

2. Cross join: cross join (without a WHERE condition)

3. SQL statements
Select * from table1 cross join table2
------------- Result -------------
id    name     id    score
1     lee      1     90
2     zhang    1     90
4     wang     1     90
1     lee      2     100
2     zhang    2     100
4     wang     2     100
1     lee      3     70
2     zhang    3     70
4     wang     3     70
------------------------------
Note: 3*3 = 9 records are returned, that is, Cartesian product.

4. equivalent (same as the following execution)
A: select * from table1, table2

7. Database normal forms

1. First normal form (1NF)

In any relational database, the first normal form (1NF) is a basic requirement of the relational model; a database that does not satisfy 1NF is not a relational database.
First normal form (1NF) means that every column in a database table is an indivisible, atomic data item: the same column cannot hold multiple values, that is, an attribute of an entity cannot have multiple values or repeating attributes. If repeating attributes exist, you may need to define a new entity made up of those attributes, with a one-to-many relationship to the original entity. In 1NF, each row of the table contains information about only one instance. In short, first normal form means no repeating columns.

2. Second normal form (2NF)

Second normal form (2NF) builds on first normal form (1NF): to satisfy 2NF, a table must first satisfy 1NF. 2NF requires that every instance (row) in a database table can be uniquely distinguished. To achieve this, you usually add a column to the table that stores a unique identifier for each instance; this column is called the primary key.
2NF also requires that the attributes of an entity depend fully on the primary key. Full dependence means that no attribute may depend on only part of the primary key; if such an attribute exists, it and the part of the key it depends on should be separated out into a new entity, which has a one-to-many relationship with the original entity. In short, second normal form means that non-key attributes must not partially depend on the primary key.

3. Third normal form (3NF)

Third normal form (3NF) requires that a table first satisfy second normal form (2NF). In short, 3NF requires that a database table not contain non-key information that is already contained in another table. For example, suppose there is a department table in which each department has a department number (dept_id), a department name, a department description, and so on. Once the department number is listed in the employee table, the department name, description, and other department-related information must not be added to the employee table as well. If the department table did not exist, it should be built according to 3NF; otherwise there would be a great deal of data redundancy. In short, third normal form means that attributes must not depend on other non-key attributes. (My understanding: it eliminates redundancy.)
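A small sketch of the example above in SQL (table and column names are illustrative): the employee table stores only the department number, and all other department attributes live in the department table.

CREATE TABLE dept_info (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50),
    dept_desc VARCHAR(200)
);

CREATE TABLE emp_info (
    emp_id  INT PRIMARY KEY,
    name    VARCHAR(50),
    dept_id INT,   -- only the department number is stored here
    FOREIGN KEY (dept_id) REFERENCES dept_info (dept_id)
);
-- Adding dept_name or dept_desc to emp_info would make them depend on the
-- non-key attribute dept_id, violating 3NF and duplicating data.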

 

8. Database optimization ideas

These notes on database optimization are based on a MOOC course.

1. SQL statement Optimization

1) Try to avoid using the != or <> operators in the WHERE clause; otherwise the engine will abandon the index and perform a full table scan.
2) Try to avoid NULL checks on a column in the WHERE clause; otherwise the engine will abandon the index and perform a full table scan. For example:
Select id from t where num is null
Instead, you can set a default value of 0 on num, make sure the num column contains no NULL values, and then query:
Select id from t where num = 0
3) using exists instead of in is a good choice.
4) Replace HAVING clauses with WHERE clauses where possible, because HAVING filters the result set only after all records have been retrieved.
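A hedged sketch of point 3 (the tables a and b and the column num are illustrative):

select num from a where num in (select num from b);
-- can often be rewritten as
select num from a where exists (select 1 from b where b.num = a.num);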

2. INDEX OPTIMIZATION

See the discussion of indexes in section 5 above.

3. Database Structure Optimization

1) Normalization: for example, eliminating redundancy (saves space).
2) Denormalization: for example, adding redundancy where appropriate (reduces joins).
3) Table partitioning: partitioning physically separates the data, so data in different partitions can be stored in data files on different disks. When the table is queried, only the relevant partition has to be scanned instead of the whole table, which significantly shortens query time; partitions on different disks also spread the table's data across the I/O of several disks, and a well-configured partitioning scheme distributes data transfer evenly and reduces disk I/O contention. This approach suits tables with large data volumes; for example, partitions can be created automatically by month (see the partitioning sketch after the case study below).
4) Table splitting, which can be vertical or horizontal. Case study: a simple shopping system currently involves the following tables: 1. a product table (10 million rows, stable); 2. an order table (200 million rows and growing); 3. a user table (100 million rows and growing). Taking MySQL as an example (MySQL can handle static data on the order of millions of rows), consider both kinds of split.
Vertical splitting: solves I/O contention between tables, but does not solve the pressure from the growing data volume of a single table. Scheme: put the product table and the user table on one server and the order table on another server.
Horizontal splitting: solves the growing data volume of a single table, but does not solve I/O contention between tables.
Scheme: split the user table by gender into a male-user table and a female-user table, and split the order table into completed orders and uncompleted orders; then put the product table and the uncompleted-order table on one server, the completed-order table and the male-user table on another server, and the female-user table on its own server (women love shopping, haha).
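As a sketch of point 3 above (monthly partitioning; table and column names are illustrative), MySQL range partitioning might look like this:

CREATE TABLE orders_archive (
    order_id   INT NOT NULL,
    user_id    INT NOT NULL,
    created_at DATE NOT NULL
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p202301 VALUES LESS THAN (TO_DAYS('2023-02-01')),
    PARTITION p202302 VALUES LESS THAN (TO_DAYS('2023-03-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);
-- A query restricted to one month only scans the matching partition(s)
-- instead of the whole table.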

4. server hardware optimization

This is a little more expensive!

9. Differences between stored procedures and triggers

Triggers are very similar to stored procedures; a trigger is also a set of SQL statements. The only difference between the two is that a trigger cannot be called with an EXECUTE statement; instead it is fired (activated) automatically when the user executes a Transact-SQL statement. A trigger is a stored procedure that runs when data in a specified table is modified. Triggers are typically created to enforce the referential integrity and consistency of logically related data in different tables. Because triggers cannot be bypassed, they can be used to enforce complex business rules and ensure data integrity. Triggers differ from stored procedures in that a trigger is executed in response to an event, whereas a stored procedure is called directly by its name. When operations such as UPDATE, INSERT, or DELETE are performed on a table, SQL Server automatically runs the SQL statements defined in the trigger, which guarantees that data processing complies with the rules defined by those statements.
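A minimal MySQL-flavored sketch (table, column, and routine names are illustrative): the trigger fires automatically on INSERT, while the stored procedure runs only when called by name.

CREATE TRIGGER trg_orders_insert
AFTER INSERT ON orders
FOR EACH ROW
    UPDATE order_stats SET total_orders = total_orders + 1;

DELIMITER //
CREATE PROCEDURE add_order(IN p_user_id INT, IN p_amount DECIMAL(10,2))
BEGIN
    INSERT INTO orders (user_id, amount) VALUES (p_user_id, p_amount);
END //
DELIMITER ;

CALL add_order(1, 99.90);   -- explicit call; the trigger above then fires automatically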
