Database and database Learning

Last Update:2015-05-09 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Database and database Learning
Relational system

Purpose of this chapter:
RDBMS query procedure
Query Optimization concepts
Basic methods and technologies

Query Optimization classification:
Algebra Optimization
Physical Optimization

RDBMS query processing stage:

1. query analysis 2. query check 3. Query Optimization 4. query execution

Scanning, lexical analysis, and syntax analysis of query statements
Identifies a language symbol from a query statement
Perform syntax check and syntax analysis

Query check

Check the semantics of valid query statements based on the data dictionary
Check the user's access permissions based on the user permissions and integrity constraints defined in the data dictionary.
After the check is passed, convert the SQL query statement into an equivalent relational Algebra Expression.
RDBMS generally uses the query tree (syntax analysis tree) to represent extended relational algebra expressions.
Converts the external name of the database object to an internal representation.

Query Optimization

Query Optimization: select an efficient Query Processing Policy
Query Optimization classification:
Algebra optimization: optimization of relational algebra expressions
Physical optimization: Selection of access paths and underlying Operation algorithms
The basis for selecting the query optimization method:
Rule-based)
Cost-based)

Query execution

Generate a query plan based on the execution policy obtained by the optimizer
Code generator generates the code for executing the query plan

Select Operation implementation

[Example 1] Select * from student where <condition expression>; Consider the following situations: C1: unconditional; C2: Sno = '2016'; C3: sage> 20; C4: Sdept = 'cs 'AND Sage> 20;

Select a typical implementation method:
1. Simple full table Scan Method
Scan the basic tables in sequence to check whether each tuple meets the selection criteria one by one, and use the tuples that meet the criteria as the result output.
Suitable for small tables, not large tables
2. Index (or hash) scan method
It is suitable for selecting attributes with indexes (such as B + tree indexes or Hash indexes)
You can use the index to first find the primary code or pointer of the tuples that meet the conditions, and then find the tuples directly in the basic query table through the tuples pointer.

[Example 1-C2] Taking C2 as an example, Sno = '20160301', And Sno has an index (or Sno is a hash code) using an index (or hash) the pointer WITH Sno as '000000' is retrieved from the student table using the tuple pointer. [Example 1-C3] Taking C3 as an example, Sage> 20, in addition, on the Sage, the B + Tree Index uses the B + tree index to locate the index item Sage = 20, using this as the entry point in the sequence set of the B + tree, we can get all the tuples of Sage> 20 and use these tuples to retrieve all students older than 20 in the student table.

[Example 1-C4] Taking C4 as an example, Sdept = 'cs 'AND Sage> 20. If both Sdept AND Sage have an index: algorithm 1: use the above two methods to find a set of tuples pointer of Sdept = 'cs 'and another set of tuples pointer of Sage> 20, respectively, and find the intersection of these two pointers in the student table for retrieval. algorithm 2: locate a set of tuples pointer of Sdept = 'cs ', and use these tuples pointer to search for the obtained tuples in the student table to check other selection conditions (such as Sage> 20) whether or not the tuples that meet the conditions are output as results.

Connection operation implementation

Connection operations are one of the most time-consuming operations in query processing.
This section only describes the most common implementation algorithms for equijoin (or natural join ).

[Example 2] SELECT * FROM Student, SC where Student. Sno = SC. Sno;

Nested loop)

Each tuples (s) of the external layer loop (Student), retrieves each tuple (SC) in the inner layer loop (SC)
Check whether the two tuples are equal in the connection property (sno ).
If the connection condition is met, the result is output after the concatenation until the tuples in the outer loop table are processed.

Sort-merge method (sort-merge join or merge join)

Suitable for sorting of connected tables
Sort-merge join method steps:
If the connected tables are not sorted in order, sort the Student and SC tables by the connection property Sno.
Take the first Sno in the Student table and scan the tuples with the same Sno in the SC table in sequence.

Index join method

Steps:
① Create an Sno index on the SC table.
② For each of Student's tuples, the Sno value searches for the corresponding SC tuples through the SC Index
③ Connect these SC tuples with Student tuples
Loop execution ② ③ until the tuples in the Student table are processed

Hash Join method

Use the connection property as the hash code and use the same hash function to hash the tuples in R and S into the same hash file.
Steps:
Partitioning phase ):
Repeat a table (such as R) that contains fewer tuples
Distribute its tuples to the buckets of the hash table using the hash function.
Probing phase: join phase)
Process another table (S) once
Hash the tuples of S to an appropriate hash bucket.
Concatenates the tuples with all the tuples in the bucket that come from R and match them.

Sort-merge join method steps (continued ):
When the first SC tuples with different Sno values are scanned, the Student table is returned to scan its next tuples, And the tuples with the same Sno in the SC table are scanned to connect them.
Repeat the preceding steps until the Student table has been scanned.
You only need to scan Student and SC tables once.
If the two tables are unordered, the execution time must be followed by the sorting time of the two tables.
For two large tables, sort the table first and then use the sort-merge join method to execute the join operation. The total time will be greatly reduced.
The preceding hash join algorithm assumes that the smaller tables in the two tables can be completely put into the memory hash bucket after the first stage.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Database and database Learning

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Database and database Learning

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support