A big and comprehensive choice between SQL and functions: Who is a more effective SQL ?, More effective SQL

Source: Internet
Author: User

A big and comprehensive choice between SQL and functions: Who is a more effective SQL ?, More effective SQL

Reprinted please indicate the source: jiq •'s technical Blog

Note that the Oracle11g database is discussed here. It is not applicable to SQL Server, MYSQL, and other databases.


1. Problem Description

There are two tables in the current system. One is the product table and the other is the attachment table.

PRODUCT (PRODUCTID, PRODUCTNAME, PRODUCTPRICE, PICTURE)

ATTACHMENT (ATTACHMENTID, FILEPATH, BUSINESSTYPE, BUSINESSID)

The product table contains the product ID, product name, product price, image, and other fields. The attachment table contains the attachment ID, full path of the attachment, and the module ID of the attachment (for example, the product module is 105 ), ID of the record to which the attachment belongs (here is the product ID ).

The requirement is to display the list of all products on the query interface of all products, and a "video or not" field is required after each record to indicate whether the current product record contains a video. In addition, you can query products based on whether videos are available in the query condition area above.


2. Solution

Concepts such as left link, regular expression, function, deduplication, and ROWNUM

Method 1: A large and comprehensive SQL statement

When I did this, I first thought of using a complicated SQL statement to spell out an SQL statement to find all the product records to be displayed.

Therefore, first think of the Left link, because we need to ensure that all records in the product table on the left can be displayed without any omissions, so we have written such an SQL statement:

select T.*,A.BUSINESSTYPEfrom PRODUCT T left join ATTACHMENT AonT.PRODUCTID=A.BUSINESSID and regexp_like(A.FILEPATH, '^.*.(rmvb|avi|mtv|3gp|flv|mpeg|mov)')

Link the product table and the attachment table to the left. The attachment table on the right is connected to the left table record only when the BUSINESSID is equal to the product ID on the left and the attachment is a video file.

Note: The where condition uses a regular expression to judge, that is, the regexp_like function.


However, when a product has two video files, the problem arises. For example, if the product with ID 331 has two video attachments, the ATTACHMENT table has two ATTACHMENT records to meet the following requirements:

T.PRODUCTID=A.BUSINESSID and regexp_like(A.FILEPATH, '^.*.(rmvb|avi|mtv|3gp|flv|mpeg|mov)')
Where condition, the two attachment records will be connected to the left table, because the left link does not mean that the Left table is used as the benchmark, the number of left tables, and the number of connected records, in this case, the number of records connected is more than the number of records in the original left table. The connection result is as follows:

PRODUCTID... BUSINESSID BUSINESSTYPE

331 331 105

331 331 105

You may have already thought that we can perform de-duplication, so the following statement appears:

select distinct T.*,A.BUSINESSTYPEfrom PRODUCT T left join ATTACHMENT AonT.PRODUCTID=A.BUSINESSID and regexp_like(A.FILEPATH, '^.*.(rmvb|avi|mtv|3gp|flv|mpeg|mov)')

Because it seems that the above repeated records are identical, the distinct keyword is used for record deduplication.

Unfortunately, this statement compilation error occurs. At first, I thought that if there are more than two videos in a product, the record records are identical.

Unfortunately, this is different in my scenario, because the product table contains a PICTURE field, which is BLOB type and stores the binary data of images. The distinct keyword is not allowed to be deduplicated.


You can remove duplicates in the following way:

select * from(select distinct T.*,A.BUSINESSTYPE,ROW_NUMBER() OVER(PARTITION BY T.PRODUCTID order by T.PRODUCTNAME) rkfrom PRODUCT T left join ATTACHMENT AonT.PRODUCTID=A.BUSINESSID and regexp_like(A.FILEPATH, '^.*.(rmvb|avi|mtv|3gp|flv|mpeg|mov)'))where rk=1

That is, based on the repeated PRODUCTID, as long as the repeat takes only the first record.

A qualified SQL statement has been constructed. The query result is that all records in the product table are listed only once. Each product record is followed by the BUSINESSTYPE field, if this field has a value, it indicates that this product record has a video attachment. If there is no value, it indicates that there is no video attachment.

When querying, you only need to set a layer outside the preceding SQL statement with or without a video as the query condition, and add the where BUSINESSTYPE = '2016' condition.


Method 2: advanced features such as functions

After communicating the above method with more experienced colleagues in the project team, I found that this method was not elegant enough. I felt that my database experience was too small, there are many elegant ways to solve this problem.

The solution above is too complicated first. Second, you can check whether the BUSINESSTYPE field has a value or 105 in the record after the connection.

In more cases, we may not only need to keep the values of a field in the right table after the join operation, but also need to convert the values of some columns !!!

For example, we want to add a column next to the product table. If the value of this column is 0, there is no video attachment; if it is 1, there is a video attachment. In other scenarios, we want to add a column after the product table. The value in this column indicates the number of accessories owned by the product. Where the information comes from, it must be from the attachment table, but is it possible to achieve this simply by linking two tables?


At this time, we need to use functions to implement:

select distinct T.*,FUNC_HASATTACHMENT(t.PRODUCTID)from PRODUCT T

It seems that this method is more elegant, more scalable, and more functions can be added later. All the information you want to know can be achieved. Theoretically, all functions can be implemented.

The following describes how this function returns whether there are attachments based on the value of the product ID (0-no video attachment, 1-video attachment)

create OR REPLACE function FUNC_HASATTACHMENT(productID in decimal)return numberisnum number;result number;BEGINresult:=0;select count(ATTACHMENTID) into num from ATTACHMENT where BUSINESSTYPE='105' and BUSINESSID=productID and regexp_like(A.FILEPATH, '^.*.(rmvb|avi|mtv|3gp|flv|mpeg|mov)');if(num>0) then result:=1;else result:=0;endif;return result;END;

You can see that we only need to input the product ID. Based on this ID, we can figure out whether this product has video attachments in the attachment table.

It seems that when we consider a database operation and feel a little troublesome, we should try to jump out of the "Big and full SQL" Strange Circle and try to use processes/functions and triggers, view and other advanced features to help solve the problem!


The following is a function description:

CREATE [or replace] FUNCTION function_name (arg1 [{IN | OUT | in out}] type1 [DEFAULT value1], [arg2 [{IN | OUT | in out}] type2 [DEFAULT value1],... [argn [{IN | OUT | in out}] typen [DEFAULT valuen]) [authid definer | CURRENT_USER] RETURN return_type IS | AS <type. variable declaration> BEGIN execution RETURN expressionEXCEPTION Exception Handling END function_name;

More details reference: http://blog.csdn.net/jiyiqinlovexx/article/details/13417455


Query Optimization-batch query:

Because we still have a large number of products, thousands or more, and there are many images, it is impossible to return all the images at a time, so we need to query them in batches and load only a portion of each click.

Mode: Click the query button, and the foreground will request the RPC interface on the server to return the first batch of 500 data records, and create a unique sessionID for this query, each time you click the "load more" button on the interface, you can use this sessionID and the two parameters that have been returned to the server to call the RPC interface on the server to continue to return the next batch of data.

If you click the "load all" button, start a background worker thread on the client and keep running the "load more" process.


The core is to record the sessionID of the current query session, how many records have been returned, and the SQL statements used for batch query.

The following is the SQL statement for batch query, and the query records of-are returned:

select * from(select A.*,ROWNUM RN from(select distinct T.*,FUNC_HASATTACHMENT(t.PRODUCTID) from PRODUCT T) A where ROWNUM<={1000})where RN>500

Note: In the program, 500 and 1000 are passed as parameters.



What are Aggregate functions in SQL statements?

The content here is purely plagiarism ~, I hope it will be useful ~

8.2 application of Aggregate functions

Aggregate functions are widely used in the query and Analysis of database data. This section describes the application of Aggregate functions.
8.2.1 SUM function -- SUM ()

SUM () is used to SUM data and return the SUM of all values in the selected result set. The syntax is as follows.

Select sum (column_name)

FROM table_name

Note: The SUM () function can only act on numeric data, that is, the data in column column_name must be numeric.

Use of instance 1 SUM function

Query the total salaries of all male teachers from the TEACHER table. For the structure and data of the TEACHER table, see Table 5-1 in section 5.2.1. Instance code:

Select sum (SAL) AS BOYSAL

FROM TEACHER

Where tsex = 'male'

The running result is 8.1.

Figure 8.1 total salaries of all male teachers in the TEACHER table

Processing of NULL values by instance 2 SUM function

Query the total salaries of teachers over 40 years old from the TEACHER table. Instance code:

Select sum (SAL) AS OLDSAL

FROM TEACHER

Where age> = 40

The running result is 8.2.

Figure 8.2 Total salaries of all instructors older than 40 in the TEACHER table

When a column is summed, if the column has a NULL value, the SUM function ignores this value.
8.2.2 counting function -- COUNT ()

The COUNT () function is used to calculate the number of records in a table or the number of values in a column. The calculation content is specified by the SELECT statement. When using the COUNT function, you must specify a column name or use an asterisk. An asterisk indicates that all records in a table are calculated. The two methods are as follows.

* COUNT (*): calculates the total number of rows in the Table. Even if the row data in the table is NULL, it is included.

* COUNT (column): calculates the number of rows contained in the column. If the data of a row in this column is NULL, this row is not counted in the total number of statistics.
1. Use the COUNT (*) function to COUNT the number of rows in the table

The COUNT (*) function returns a function that meets the search criteria in the WHERE clause of the SELECT statement.

Use of instance 3 COUNT (*) Function

Queries the number of rows of all records in the TEACHER table. Instance code:

Select count (*) AS TOTALITEM

FROM TEACHER

The running result is 8.3.

Figure 8.3 COUNT the number of rows in a table using the COUNT (*) Function

In this example, if the SELECT statement does not contain the WHERE clause, all rows in the table are considered to meet the SELECT statement. Therefore, the SELECT statement returns the count of all rows in the table, the results are consistent with the data in the TEACHER table listed in table 5-1 in section 5.2.1.

If the DBMS stores the number of rows in the Table in its system table, COUNT (*) will return the number of rows in the table quickly, because at this time, the DBMS does not have to read the table from start to end, it also counts rows in the physical table and extracts the row count directly from the system table. If the DBMS does NOT store the number of rows in the system table, it uses the column with the not null constraint as the parameter. Using the COUNT () function, it may COUNT the rows in the table more quickly.

Note:

The COUNT (*) function accurately returns the total number of rows in the table, and returns the correct row COUNT only when the parameter column of the COUNT () function does not have a NULL value, therefore, you can use the COUNT () function to replace the COUNT (*) function only when a column restricted by not null is used as a parameter.
2. Use the COUNT () function to COUNT data in a column

The COUNT () function can be used to COUNT data values in a column. And ignore all columns... the remaining full text>

How to organize an SQL statement in a concise manner, including Aggregate functions and judgment values

Select B. a, a. B, a. c from
(Select B, sum (C) as c from TABLE group by B),
(Select A, B from TABLE where (substr (A,-1) = 'A') B
Where a. B = B. B

--------- To LZ ----------
How long does it take for you to execute a and B queries?
If the two subqueries are fast, we recommend that you create an index in field B.

By the way, there is another problem. When using substr, it will invalidate the index. Your (substr (A,-1) = 'A' can be changed to a like '% ', it all means

--------------------------
If there are other where conditions, you can consider placing the index conditions on the front side. If the query speed is slow due to a large amount of data, there are not many good methods, but you can consider using partitions, but it seems not worth it.
Try using a forced index first.
The table is the table of the current subquery, and the index is the index that can be used. For example, the index on field B is used in the first subquery, and the index on field a is used in the second subquery.

Select B. a, a. B, a. c from
(Select/* + index (TABLE Name, index name) */B, sum (C) as c from TABLE group by B),
(Select/* + index (TABLE Name, index name) */A, B from TABLE where A like = '% A') B
Where a. B = B. B

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.