Mysql details-basic concepts of databases

Source: Internet
Author: User
1. Each relational database of a Transaction uses a Transaction as the basic unit of operations. In relational databases, transaction operations are defined as follows: a transaction is a program execution unit consisting of a series of operation sequences. These operations are either done or not done, is an inseparable work unit. 2. Transactions

1. Each relational database of a Transaction uses a Transaction as the basic unit of operations. In relational databases, transaction operations are defined as follows: a transaction is a program execution unit consisting of a series of operation sequences. These operations are either done or not done, is an inseparable work unit. 2. Transactions

. 1. Transactions

Each relational database uses transactions as the basic unit of operations. In relational databases, transaction operations are defined as follows:
A transaction is a program execution unit consisting of a series of operation sequences. These operations are either done or not done, and are an inseparable unit of work.

2. Features of transactions

The ACID attribute of a transaction, that is, Atomicity, Consistency, Isolation, and Durability ).

Atomicity: ensure that all or all operations in the transaction are not executed. For example, to execute a transfer transaction, either the transfer is successful or the transfer fails. If the transfer is successful, the amount is transferred from the transfer-out account to the destination account, and the amount of the two accounts changes accordingly. If the transfer fails, the amount of the two accounts remains unchanged. No money is deducted from the transfer-out account, but the target account does not receive the money.
Consistency: ensure that the database always maintains data consistency-the transaction operation is consistent before and after the transaction operation, regardless of whether the transaction is successful or not. In the above example, the database maintains data consistency before and after the transfer.
Isolation: if multiple transactions are executed concurrently, the result should be the same as the serial execution of multiple transactions. Obviously, the simplest (strict) isolation is to execute all transactions in a serial mode: execute the transaction first, and execute the next transaction only after it is executed. However, the database efficiency is low. For example, two different transactions only read the same batch of data, which can be performed concurrently. Different isolation levels are available to control the effect of concurrent execution.
Durability: Durability indicates that the impact on the database is persistent after the transaction operation is completed. Even if the database is damaged due to a fault, the database should be able to recover. Generally, logs are used.

3. transaction isolation level

If the isolation level is not set, what will happen? (Necessity for setting the isolation level)
Lost update (Lost update ):
Both transactions update a row of data at the same time, but the second transaction fails to exit,
Both changes to the data are invalid.

Dirty Reads (Dirty standalone ):
One Transaction starts to read data from a row, but the other transaction has updated this number.
Data is not submitted in a timely manner. This is quite dangerous, because it is very likely that all operations
Are rolled back.

Non-repeatable Reads (repeated read is not allowed ):
A transaction reads data from the same row twice but returns different results.

Second lost updates problem (invalid write ):
Special cases that cannot be read repeatedly. Two concurrent transactions read the same row of data at the same time.
One of them submits modifications, and the other also submits modifications. This will cause
The first write operation is invalid.

Phantom Reads ):
The transaction performs two queries during the operation. The results of the second query include the first query.
Data not displayed in the query (the SQL statements for the two queries are not required to be the same here ). This is
This is because another transaction inserts data during the two queries.

Transaction isolation level (transaction isolation levels): isolation level is the level of the system's control over transaction concurrency. (International Organization for Standardization) ANSI/iso SQL divides it into SERIALIZABLE, REPEATABLE READ, READ committed, READ uncommitted) four levels. To achieve the isolation level, the database usually uses a Lock ). Generally, you only need to set the isolation level during programming. The database will set the specific lock used. First, we will introduce four levels, and then give an example to explain the concurrency problems that may occur in the next three levels (Repeatable read, read committed, read uncommitted.

SERIALIZABLE: all transactions are executed serially one by one to avoid phantom read (phantom reads ). For databases that implement Concurrency Control Based on locks, serialization requires that when you perform a range query (for example, select a user of the age of 10 to 30, you need to obtain the range lock ). If it is not a database that implements Concurrency Control Based on locks, you need to roll back the transaction when checking that there is a transaction that violates the serial operation.

Repeatable read: All data obtained by the Select statement cannot be modified. This prevents data inconsistency before and after a transaction. However, there is no way to control Phantom reads, because at this time other transactions cannot change the selected data, but data can be added because the previous transaction has no range lock.

READ committed: the READ data can be modified by other transactions. This may cause non-repeated reading. That is to say, the read lock is obtained when the transaction reads data, but is released immediately after reading the data (you do not need to wait until the transaction ends), and the write lock is released after the transaction is committed. After the read lock is released, data may be modified by other things. This level is also the default isolation level of SQL Server.

READ uncommitted: This is the lowest isolation level, allowing other transactions to see uncommitted data. This level causes Dirty Read (Dirty Read ).

Isolation level and possible causes:

Isolation level Dirty read Non-repeated read Phantom read
Read not submitted YES YES YES
Read committed NO YES YES
Repeatable read NO NO YES
Serializing NO NO NO


Mysql users can use the set transaction statement to change the isolation level of a single session or all new connections. Its syntax is as follows:

SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL {READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE}

Note: The default action (without session or global) is to set the isolation level for the next (not started) transaction. If you use the GLOBAL keyword, the statement sets the default transaction level globally for all new connections (except non-existing connections) created from that point. You need the SUPER permission to do this. Use the SESSION keyword to set the default transaction level for future transactions executed on the current connection. Any client can freely change the session isolation level (even in the middle of the transaction), or set the isolation level for the next transaction.

You can use the following statements to query the global and session transaction isolation levels:

SELECT @@global.tx_isolation; SELECT @@session.tx_isolation; SELECT @@tx_isolation;
4. Database locks

For details about the lock mechanism and related locks in mysql, see the following
Http://blog.csdn.net/wangxiaotongfan/article/details/51367069

5. "relationship" Specification in relational databases

The basic idea of Relationship Mode normalization is to eliminate data redundancy in the Relationship Mode, eliminate inappropriate parts of the data dependency, and solve the abnormal phenomenon of data insertion and deletion. This requires the relational model to meet certain conditions. We refer to the different standards set up for different degrees of standardization requirements in the process of standardization of the relationship model as the paradigm. Due to the different degree of standardization, different paradigms are generated.
The relationships between the six paradigms are 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF.
1NF? 2NF? 3NF? BCNF? 4NF? 5NF ;
Here we only study the situations of the first five paradigms.

First paradigm

If all attributes of relational mode R are simple attributes, that is, each attribute cannot be further divided, R is called the first paradigm.
Data redundancy, insertion, deletion, and update exceptions cannot be avoided.

Second Paradigm (eliminating the dependence of some molecular functions)

If the relational mode R conforms to the first paradigm, and every non-primary attribute fully depends on the primary key of the R, it is called the second paradigm.

Conclusion

1. Remove from the first paradigm relationship unless the primary attribute is dependent on some functions of the primary relationship, the second paradigm relationship can be obtained.
2. If the relational key of R is a single attribute, or all the attributes of R are primary attributes, R complies with the second paradigm.
2NF normalization refers to a set of 1NF relational patterns that are decomposed by projection into 2NF relational patterns. The principle behind decomposition is "one thing, one place", so that a link can only describe the relationship between one entity or another. If there is more than one entity or link, the projection is decomposed.

If the relational mode R is the first paradigm, and every non-primary attribute function in R depends on a candidate key of R, it is called the second paradigm.

Example

The second Paradigm (2NF) is established on the basis of the first paradigm (1NF), that is, to satisfy the second Paradigm (2NF) must satisfy the first paradigm (1NF) first ). The second Paradigm (2NF) requires that each instance or row in the database table be able to be distinguished by a unique region. To implement differentiation, you usually need to add a column to the table to store the unique identifier of each instance. This unique attribute column is called as the primary keyword, primary key, and primary code.

For example, the employee ID (emp_id) column is added to the employee information table. Because each employee's employee ID is unique, each employee can be uniquely distinguished.

In short, the second Paradigm (2NF) is that non-primary attributes fully depend on primary keywords.

Full dependency is an attribute that cannot depend only on A part of the primary keyword (with function dependency W → A, if XW exists, X → A is true, so w→a is A local dependency; otherwise, w→a is A full function dependency ). If this attribute exists, this part of the attribute and the primary keyword should be separated to form a new object. The relationship between the new object and the original object is one-to-multiple.

Assume that the course selection relation table is SelectCourse (student ID, name, age, course name, score, and credits), and the keywords are combined keywords (student ID, course name), because the following deciding relation exists:

(Student ID, course name) → (name, age, score, credits)

This database table does not meet the second paradigm because of the following decision relationships:

(Course name) → (credits) (student ID) → (name, age)

That is, fields in the combined keywords determine non-keywords.

Because 2NF is not met, this course selection relation table has the following problems:

(1) data redundancy:

The same course is selected by n students, and "Credits" are repeated for n-1 times. The same student takes m courses, and the names and ages are repeated for m-1 times.

(2) Update exception:

If you have adjusted the credits of a course, the "Credits" value of all rows in the data table must be updated. Otherwise, different credits may occur for the same course.

(3) insertion exception:

For example, if you want to open a new course, no one will take the course. In this way, the course name and credits cannot be recorded in the database because there is no "student ID" keyword.

(4) Deletion exception:

Assuming that a group of students have completed their electives, These electives should be deleted from the database table. However, the course name and credit information are also deleted. Obviously, this will also cause insertion exceptions.

Change the SelectCourse table to the following three tables:

Student: Student (Student ID, name, age );

Course: Course (Course name, credits );

Course Selection relationship: SelectCourse (student ID, course name, score ).

Such database tables conform to the second paradigm, eliminating data redundancy, update exceptions, insertion exceptions, and deletion exceptions.

In addition, all database tables with single keywords comply with the second paradigm, because it is impossible to have a combination of keywords.

Third Paradigm (eliminating transmission dependencies)

If the relational model conforms to the second paradigm, and each non-primary attribute does not pass the primary key of the function dependent on R, it is called the third paradigm.
3NF normalization refers to a set of 2NF relational patterns converted into 3NF relational patterns through projection decomposition. It follows the same principle as 2NF, "one thing, one place ", let a link describe the relationship between an object or an object.
If the relational mode R is the second paradigm, and each non-primary attribute does not pass a candidate key dependent on R, R is called the third paradigm.

The third paradigm (3NF) must satisfy the second Paradigm (2NF) first ). The third paradigm (3NF) requires that a database table do not contain information about non-primary keywords already contained in other tables.

For example, there is a department information table, where each department has a department ID (dept_id), department name, Department profile, and other information. After the Department numbers are listed in the employee information table, you cannot add the Department name, Department profile, and other information related to the department to the employee information table. If the department information table does not exist, it should also be constructed based on the third paradigm (3NF), otherwise there will be a large amount of data redundancy.

Third Paradigm (3NF): Based on the second paradigm, if there is no transfer function dependency for any candidate keyword segment in the data table, it complies with the third paradigm. In short, the third paradigm is that attributes do not depend on other non-primary attributes.

The so-called transfer function dependency refers to the existence of the "A → B → C" decision relationship, then the C transfer function depends on.

Therefore, database tables that meet the third paradigm should not have the following dependency:

Keyword field → non-Keyword field x → non-Keyword field y

Assume that the Student relationship table is Student (Student ID, name, age, school, school location, school phone number), and the keyword is single keyword "Student ID" because of the following decision relationships:

(Student ID) → (name, age, school, school location, school phone number)

This database complies with 2NF but does not comply with 3NF because of the following decision relationships:

(Student ID) → (school location, school phone number)

That is, the transfer function dependency of the non-Keyword section "school location" and "school phone" on the keyword section "student ID" exists.

It can also cause data redundancy, update exceptions, insertion exceptions, and deletion exceptions. You can analyze and learn this information on your own.

The student relationship table is divided into the following two tables:

Student (student ID, name, age, school );

School: (school, location, phone number ).

Such database tables conform to the third paradigm, eliminating data redundancy, update exceptions, insertion exceptions, and deletion exceptions.

BC paradigm
If the relational mode R belongs to the first paradigm and all functions depend on X → Y (Y does not belong to X), the decisive factor X contains a candidate key of R, it is called the BC paradigm.
BCNF has the following properties:
The relationship that satisfies BCNF will eliminate some function dependencies on keys and transmit function dependencies for any attribute (primary or non-primary attribute). That is to say, R belongs to BCNF, and R belongs to 3NF.
If R belongs to 3NF, R does not necessarily meet BCNF.
This paradigm solves four existing exceptions.

Fourth paradigm

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.