Improve the efficiency and accuracy through well-formatted SQL statements, and improve the efficiency and accuracy of SQL statements

Source: Internet
Author: User

Improve the efficiency and accuracy through well-formatted SQL statements, and improve the efficiency and accuracy of SQL statements

Well-formatted SQL statements do not run better than messy SQL statements. In fact, the database is not very concerned with SQL statements. You put the comma before or after the field name. To make your own ideas clear, you should make an efficient SQL writer. I suggest you follow these format rules. In this article, I will share how to improve productivity through well-formatted SQL statements. My defined Efficiency refers to the ability to output accurate results from SQL, and the code is clear, easy to understand, modify, and debug. Only SELECT statements are listed, because SQL statement 99% I write is a query statement. Formatting SQL code is very personalized, and I know it varies from person to person. developers think their formatting rules are the most reasonable.

Example

The following is a typical SQL application scenario. The data in the business report comes from three tables: Customer table, sales table, and Region table. Based on the data in March, this report shows the total number of customers and sales in each administrative region. This requirement can be achieved through a simple SQL statement, and three tables need to be joined for query.

Possible data problems

Although SQL is very simple, ensuring that your results are correct is still the real key, because there are some reasons that may cause errors:

Data may come from different data sources. That is to say, you cannot guarantee the integrity between these tables. For example, you cannot assume that all the zip codes in the customer table are valid and must exist in the region table.
Applications that enter customer table data may capture unverified location data and may include incorrect zip codes.
The zip code table may not be complete. The newly released ZIP Code may not be imported into the table as soon as it is released.

Principle 1

For me, obtaining the correct results from SQL is the top priority compared to writing clear and easy-to-read SQL statements. The first thing I need to do is write the following SQL statement to get the total number of customers. After I write the entire statement, I will adjust it.

The first statement I wrote is as follows:

SELECTCOUNT(DISTINCT cust_id) as count_customersFROMcustomers  Result:  count_customers  “10” 

This query is very important because it closely follows the first principle. Because there is no SQL management query, there is no dependency. I know this is the correct result of the number of customers. I will write down this result, because I always need to use this number to measure the subsequent SQL statements (whether they are correct), which will be mentioned many times later in this article.

The next step is to add necessary fields and tables to complete the query. I specifically highlight the word "add", because according to my rules, I will comment out the query that can get the same results when applying the first principle. The following is the final formatted query statement.

Format SQL

The following is the formatted SQL statement recommended based on my formatting ideas.

SELECT  0  ,c.cust_post_code  ,p.location  ,COUNT(DISTINCT c.cust_id) number_customers  ,SUM(s.total_amount) as total_sales  FROM  customers c  JOIN post_codes p ON c.cust_post_code = p.post_code  JOIN sales s ON c.cust_id = s.cust_id  WHERE  1=1  AND s.sales_date BETWEEN ‘2015-01-01' AND ‘2015-01-31'  —AND s.order_id = 5  GROUP BY  c.cust_post_code  ,p.location 

Always use table alias

Time will prove that this is necessary. If you do not use an alias for each field in the SQL statement, you may add another field with the same name to the query statement in the future. At that time, your query and even the report will produce an error (a duplicate field name is displayed ).

Put the comma before the field

When debugging or testing my query statements, you can easily comment out a field without modifying other rows. No comma is missing or unnecessary. Otherwise, you may need to adjust the comma to ensure that the statement is correct. If you need to debug statements frequently, doing so will bring great convenience and improve efficiency. This method is also applicable to the "SELECT" and "group by" clauses.

During development, I used "SELECT 0" as the statement to start, and it was easy to delete it when it was migrated to the formal environment. In this way, we can write all the fields in front of them. Without this "0", if I want to comment out the first field (in this example, it is "c. cust_post_code"), I have to deal with the comma problem. I must temporarily comment it out and add it back in the future. The same is true in the "group by" statement. This "0" is additionally added.

Place JOIN into independent rows

Place the JOIN statement in the independent row with the following:Benefits:

By doing so, we can easily see all the tables involved in this query statement. You only need to check the rolling JOIN statement.

If you use JOIN to list all the table and expression relationships in the WHERE clause, you can place all logical relationships in one place. We cannot always put the JOIN statement into a row, but at least it should be put together.

In this case, it is relatively easy to comment out the JOIN statement. This is very useful during debugging. You may need to know if it is "JOIN" that causes data problems.

Column mode editing

When processing a large number of fields, the column mode editing is very convenient. The following is my first dynamic GIF presentation. you can comment out all non-clustered fields. I used the column mode to edit, not just comment out the field:

Create all indexes

When using UNION statements with many fields:

Comment out the field list of the "group by" clause

Test query results

I have to use the external connection "OUTER" to list all customers, because not all customers' zip codes have corresponding zip codes in the region table. I can include and exclude different fields and tables for repeated operations to ensure that the results of my query are the same as those of the first query (the statement used to separately query the customer, in fact, this is to observe the first principle.

SELECT0,c.cust_post_code—,p.location,COUNT(DISTINCT c.cust_id) number_customers,SUM(s.total_amount) as total_salesFROMcustomers c—LEFT OUTER JOIN post_codes p ON c.cust_post_code = p.post_codeJOIN sales s ON c.cust_id = s.cust_idWHERE1=1AND s.sales_date BETWEEN ‘2015-01-01' AND ‘2015-01-31'—AND c.cust_post_code = 2000—AND p.post_code = 200GROUP BYc.cust_post_code—,p.location 

For me, an SQL statement like this means that I have to write an independent test to check data. I can use the first principle to verify the accuracy of my data query by using the comments. This improves my efficiency and reports.

The above is all the content of this article, hoping to help you learn.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.