SQL Application and Development: (5) connecting multiple data tables and multiple SQL tables
Different tables in the database store different data. Users often need to use the data in multiple tables to combine and extract the required information. If a query needs to operate on multiple tables, it is called a join query. The join query result set or result table is called a join between tables. In fact, data is queried through the Association of common columns between tables. It is the most important feature of relational database queries.
The connection result is that there is no limit on adding new types of data to the database, which is flexible. A new table is always created through a connection to contain data in different tables. If a new table has an appropriate field, you can connect it to an existing table.
1. Simple connection
The simplest connection method is to separate different base tables with commas in The FROM clause of the SELECT statement, the equivalent join created using the WHERE clause connects the values of one or more columns in the first base table to the values of one or more columns in the second base table. In this way, only the rows with the matching values of the columns in the two base tables are displayed in the query results. However, whether or not the columns in different tables have the same column name, you can add a table name to limit the column name.
For example, in the "sales relationship system" of the database, you can query a table that contains "sales clerk information" and "customer information", where the "sales clerk number" is greater than 1004, add a column "Clerk ID" to the SELECT list ":
SELECT salesman information. Clerk ID, clerk information. Clerk ID, customer information. Customer name, customer information. Customer address, customer information. Contact number
FROM salesman information, Customer Information
WHERE salesman information. Clerk ID = Customer Information. Clerk id and clerk information. Clerk ID> 1004
After execution, it is the result of our condition.
In this query, we can also use the relevant name. The "salesman information" table is replaced by alias A, and the "customer information" table is replaced by B. When the connected table is referenced, except for the FROM clause, A and B can be used in any part of the SELECT statement, so the preceding statement becomes:
Select a. Clerk Number, A. Clerk Number, B. Customer name, B. Customer address, B. Contact number
FROM salesman information A, customer information B
Where a. Clerk No. = B. Clerk No. AND A. Clerk No.> 1004
The execution result is exactly the same as the previous one.
This also laid the foundation for learning the connection between multiple tables. For example, in the Database "jewelry sales system", a query is created based on three tables: "jewelry information", "jewelry supplier information", and "sales details. Return the "jewelry name" column in the "jewelry information" table, and the "commodity name" column in the "commodity information" table and the "profit" column in the "sales details" table ":
Select a. Jewelry name, B. Name of the jeweler, C. Profit
FROM commodity information as a, commodity information as B, sales details AS C
Where a. Jewelry code = C. Jewelry code and a. jewelry dealer No. = B. jewelry dealer No.
After execution, the query result set is displayed.
2. Internal Connection
INNER join is the most common connection query. Generally, the inner jion keyword is used to specify the INNER join. However, INNER is not required. If you only use the JOIN keyword, you must define an ON clause. The inner join query operation lists the data rows that match the connection condition. It uses the comparison operator to compare the joined column values.
2.1 equijoin
The so-called equijoin interpretation uses the equal sign (=) operator in the connection condition to compare the values of connected columns. All columns in the connected table, including repeated columns, are listed in the query results. In other words, a query that connects base tables through equal column values is an equivalent join query.
Two Representation modes can be used to specify the connection conditions for an equijoin query. For example, in the Database "sales association system", a query is created based on the "Product Information" and "supplier information" tables. Returns if the query condition is equal to the supplier number in two tables, the returned columns must be the "Product Name" in the "Product Information" table and the "supplier name" and "contact name" in the "supplier information" table ".
Select a. Item Name, B. Supplier name, B. Contact Name, A. Unit Price
FROM product information A, supplier name B
Where a. Supplier No. = B. Supplier No.
In the WHERE clause of the preceding statement, use "=" to specify that the query is an equivalent join query.
You can also use the inner join keyword in the FROM clause of the query statement to connect the specified query to an equivalent JOIN query.
Select a. Item Name, B. Supplier name, B. Contact Name, A. Unit Price
FROM product information a inner join supplier information B
On a. Supplier No. = B. Supplier No.
The execution result is the same as that of the preceding statement.
You can also use the order by clause to sort the query results. For example, the query of the above-mentioned equi-join query is sorted in descending order of the column "unit price.
Select a. Item Name, B. Supplier name, B. Contact Name, A. Unit Price
FROM product information a inner join supplier information B
On a. Supplier No. = B. Supplier No.
Order by a. Unit Price DESC
After running, the only difference from the preceding results is that the query results are sorted in descending order based on the "unit price.
2.2 Non-equivalent connections
The equal sign is not applicable to the connection conditions of the equijoin query, and other comparison operators constitute the non-equijoin query. That is to say, a non-equijoin query uses a comparison operator other than the equal operator to compare the column values of the connected column in the connection condition. In non-equi join queries, the following operators can be used:>, <,> =, <= ,! =, You can also use keywords such as BETWEEN... AND.
For example, a query is created based on the "supplier information" and "warehouse receiving ticket information" tables in the Database "Sales Management System. The query condition is that the "supplier number" in the two tables is not equal, and the "supplier name" and "supplier City" columns are returned ".
SELECT Supplier name, supplier City
FROM supplier information a inner join warehouse receiving ticket information B
On a. Supplier ID! = B. Supplier ID
After the preceding statement is executed, the query result returns 80 rows, most of which are repeated. We use the following statement to remove duplicate rows.
Select distinct Supplier name, supplier City
FROM supplier information a inner join warehouse receiving ticket information B
On a. Supplier ID! = B. Supplier ID
The information in the query results is exactly the same as that in the "supplier name" and "supplier City" columns in the "supplier information" table. Therefore, the preceding query has no practical application value.Non-equi-join queries often need to be used in conjunction with other connection queries, especially with equi-join queries.
2.3 natural connections
A natural join uses the equal sign (=) operator in the connection condition to compare the column values in the connected column. However, it uses the selection list to indicate the columns contained in the query result set, delete duplicate columns in the connection table. Simply put, removing duplicate attribute columns in the equijoin is a natural join.
Columns with the same name are automatically connected for record matching. Natural connections do not have to specify any equivalent connection conditions. The SQL implementation method identifies columns with the same name and then forms a match. However, although you can specify the columns included in the query results, you cannot specify the columns to be matched.
For example, in the Database "jewelry marketing system", a natural connection query is created based on the "customer information" and "commodity information" tables. The query condition for this connection is that the "city where the consumer is located" and the "city where the jeweler is located" are the same in the two tables, the query results are returned Based on the column "Consumer name", "Consumer address", "commodity name", and "commodity address.
SELECT consumer name, consumer address, city where the consumer is located AS city, and name of the jeweler. Jewelry dealer address
FROM customer information A INNER JOIN B
On a. City where consumers are located = B. City where jewelry dealers are located
After the preceding statement is executed, the result is analyzed. Although the use of natural queries can eliminate duplicate rows in the query results, it can be found from the query results of the preceding statement that the query results are also formed by cartesian products.
3. External Connection
If at least one row in the same two tables meets the connection conditions, the inner Join Operation returns the row. The outer connection returns all matched rows and some unmatched rows, depending on the type of the established connection. SQL supports three types of external connections:
Left Outer JOIN: returns all matched rows and returns all unmatched rows from the table on the left of the keyword JOIN.
Outer right JOIN: returns all matched rows and returns all unmatched rows from the table on the right of the JOIN keyword.
Full join: returns all matched rows and unmatched rows.
3.1 left Outer Join
The query result set of the left Outer Join includes all rows in the specified left table, not just the rows matched by the connection column. If a row in the left table does not find a matched row in the right table, the corresponding position of the right table in the result set is NULL.
In the SELECT statement for left outer join query, use the left outer join keyword to connect two base tables.
For example, in the Database "jewelry marketing system", the left outer join query is performed on the "Consumer Information" and "commodity information" tables in the same city on the condition of consumers and commodity owners.
SELECT consumer name, consumer address, consumer city AS city, jeweler name, and jeweler address
FROM customer information a left outer join merchant information B
On a. City where consumers are located = B. City where jewelry dealers are located
After execution, the query results contain NULL-worthy data in the three rows. Although the three rows do not match the columns, they are still included in the query results, the reason is that the information in the "Consumer name" column in the three rows is included in the left table. This means that when performing a left outer join query, no matter whether the rows in the left table can find matching rows in the right table, the query results will be displayed in this row, the only difference from other rows is that the position corresponding to the row in the right table is replaced by a NULL value.
3.2 Right Outer Join
The right Outer Join refers to the reverse join of the left Outer Join, except that all rows in the right table are specified in the inserting result set. If no matched row is found in the left table for a row in the right table, the corresponding position of the Left table in the result set is NULL.
In the SELECET statement of the right outer join query, use the right outer join keyword to connect two base tables.
For example, in the Database "commodity marketing system", right outer links are made to the "Consumer Information" and "commodity information" tables on the condition of consumers and commodity owners in the same city.
SELECT consumer name, consumer address, consumer city AS city, jeweler name, and jeweler address
FROM customer information a right outer join jeweler information B
On a. City where consumers are located = B. City where jewelry dealers are located
After execution, the query results contain NULL data. Although the two rows do not match the columns, they are still included in the query results, the reason is that the information in the "commodity name" column in the two rows is contained in the right table. This means that when you perform a right outer join query, no matter whether the rows in the right table can find matching rows in the left table, the query results will be displayed in this row, the only difference from other rows is that the position corresponding to the row in the left table is replaced by a NULL value.
3.3 full connection
Returns all rows in the left and right tables. If a row does not match a row in another table, the corresponding column value of the other table is NULL. If the right row matches a table, the entire result set contains the data value of the base table.
In the SELECT statement for a fully connected query, use the full outer join keyword to connect two base tables.
For example, in the Database "commodity marketing system", the "Consumer Information" and "commodity information" tables are fully connected based on the consumer and commodity in the same city.
SELECT consumer name, consumer address, consumer city AS city, jeweler name, and jeweler address
FROM customer information a full outer join jeweler information B
On a. City where consumers are located = B. City where jewelry dealers are located
After execution, the query results contain six rows containing NULL value data. Although these six rows do not have matching columns, they are still included in the query results, the reason is that the information in the "commodity name" column in the two rows is contained in the right table. This means that when performing a full join query, whether the left table or the right table can find matching rows, it will display the row in the query results, instead, use NULL instead of finding a matching position.
4. Joint Query
If there are multiple different query results, but you do not want to connect them together to form data. In this case, you can use the UNION clause. A query using the UNION clause is called a UNION query. It can combine the results of two or more queries into a single result set, this result set contains all row data in all query result sets in the Union query. The Union query is different from the join query for the columns in two tables. The former is to combine the rows in two tables, and the latter is to match the column data in the two tables. The syntax of the joint query is as follows:
SELECT <select_list>
FROM <table_reference>
[WHERE <search_condition>]
{UNION [ALL]
SELECT <select_list>
FROM <table_reference>
[WHERE <search_condition>]}
[Order by <order_condition>]
When performing a joint query, the UNION clause automatically deletes duplicate rows. The column title of the query result is the column title of the first query statement. Therefore, you must define the column title in the first query statement.
For example, in the "customer information" table of the "jewelry marketing system" in the database, query the name, home address, and contact number of the consumer whose city is "Beijing, add a Category column as "Category" and the column content as "consumer". From the "commodity information" table, query the information of the jeweler in the city where the jeweler is located, and add a column with the content of the column being "Jeweler". Finally, combine the two query results.
SELECT consumer name AS name, consumer address AS home address, contact number, 'consumers' AS Category
FROM customer information
WHERE consumer city = 'beijing'
UNION
SELECT name, address, phone number, 'preferer'
FROM jeweler Information
WHERE city = 'beijing'
In SELECT statements using UNION, if you want to sort the UNION query results, you must use the column name, column title, or column number in the first query statement. In addition, the ORDER clause 'ORDER BY' is the most useful number to specify the ORDER. If no number is required, the column names in the subquery of the Union query must be the same. You can also use aliases to unify the column names.
In addition, when sorting the results of the Union query, the order by clause must be placed behind the SELECT clause.
For example, in the "customer information" table of the "Sales Management System", query the name, home address, and contact number of a customer whose "customer number" is not greater than 1005, add the "function" column for The type column, and the column content is "customer". From the "salesman information" table, query the name, home address, and contact number of a clerk whose "Clerk Number" is not greater than 1005, and add a column with the content as "clerk". Finally, combine the two query results and sort them in ascending order of the type "function.
SELECT customer name AS name, customer address AS home address, contact number, and 'customer' AS Function
FROM customer information
WHERE customer No. <= 1005
UNION
SELECT salesman name, home address, phone number, 'salesman'
FROM salesman Information
WHERE salesman No. <= 1005
Order by function
Union all is another method for table UNION. The only difference between it and UNION is that it does not delete duplicate rows or automatically sort rows. If you want to display duplicate rows in the query results when performing a UNION query on a table, you can use union all. Here, we will not give examples one by one.
5. Cross-join and self-join
There are two special connection query methods in connection query: Cross-join and self-join. In the result of a cross-join query, each of the two tables may have one row in pairs. A table is connected to itself for query.
5.1 cross join
The two tables will generate all possible combinations of rows from the two tables. The Cartesian product of all rows in the joined two tables is returned when the WHERE clause is not included, that is, the number of rows returned to the result set is equal to the number of rows that meet the query conditions in the first table multiplied by the number of rows that meet the query conditions in the second table. When a cross-join clause has a WHERE clause, the return result is the Cartesian product that connects all rows of the two tables minus all rows omitted by the WHERE clause.
The cross-join operation is very similar to the basic connection operation described above. The only difference is that in the FROM clause, multiple table names are separated by the cross join keyword instead of commas, the ON keyword cannot be used to limit the connection conditions like the internal and external connections, but the connection conditions can be limited to the WHERE clause.
For example, in the Database "sales management system", the "Product Information" table and "supplier information" table are crossly connected. It is required to return information about the city where the origin and supplier of the product are both "Shanghai" or "Nanjing.
SELECT Product Information. Product Name, product information. Origin, supplier information. Supplier name
FROM product information cross join supplier information
WHERE product information. Origin = supplier information. City of the Supplier
AND (product information. Origin = 'shanghai' OR product information. Origin = 'nanjing ')
Order by product information. Origin
5.2 self-connection
Connections can be performed between different tables, or a table can be connected to itself. This connection is a self-connection. The corresponding query is called a subconnection query. A sub-join is an internal or external connection to the table itself.
Sub-Join Operations can use aliases to connect a table. Essentially, this seed join method is similar to the join operation for two tables. Only an alias is assigned to the table every time it is listed.
For example, you can perform a self-connection to the "customer information" table in the "sales management system" to query the customer information with the same "customer address, in the query results, only the information of "customer address": "Haidian District, Beijing" or "Chaoyang District, Beijing" is returned.
Select a. Customer name, A. contact number, A. Customer address, B. Customer name, B. Contact number
FROM customer information a inner join customer information B
On a. Customer address = B. Customer address
Where a. Customer address = 'haidian district, Beijing'
Or a. Customer address = 'chaoyang district, Beijing'
Because this query involves the connection between the "customer information" table and itself, the "customer information" table is displayed in two roles. To distinguish these roles, you must provide two different aliases (A and B) for the "customer information" table in the FROM clause. These two aliases are used to limit the names of the columns in the query. If you want to remove the line that the customer matches with itself, you only need to add a condition in the preceding statement to change the preceding statement:
Select a. Customer name, A. contact number, A. Customer address, B. Customer name, B. Contact number
FROM customer information a inner join customer information B
On a. Customer address = B. Customer address
WHERE (A. Customer address = 'haidian district, Beijing'
Or a. Customer address = 'chaoyang district, Beijing ')
And a. Customer name <> B. CUSTOMER NAME
Compared with the preceding query results, it is found that the customer's own matching rows are exceeded in the query results.
6. Learning Summary
When studying the content of this chapter, it is a headache, and the content is indeed a little more. However, the connection is very close, and it is not very laborious to learn. Finally, we sum up the learned knowledge.
In the connection learning of multiple data tables, the relationship between data tables is very complex and more knowledge is learned during continuous exercises.