Improve efficiency and accuracy with well-formed SQL

Improve efficiency and accuracy with well-formed SQL _mssql

Last Update:2017-01-18 Source: Internet

Author: User

Tags aliases zip

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Well-Formed SQL does not work better than a messy SQL operation. The database doesn't really care. In SQL statements, you put commas in front of or behind the field name. For your own clarity, you should be an efficient SQL writer, and I recommend that you follow these formatting rules. In this article I'll share how to improve productivity with well-formed SQL statements. My definition of efficiency refers to the accurate results from the SQL output, and the code is clear and easy to understand, modify, and Debug. I only list the "SELECT" statement, because I write SQL statements 99% are query statements. Formatting SQL code is a very personal thing, and I'm very aware that developers think their formatting rules are the most reasonable.

Examples of problems

The following is a typical SQL scenario where the data from the business report comes from three tables, customer tables, sales tables, and geographical tables. Based on January 2015 data, the report needs to show the total number of customers and sales in each administrative district. This requirement can be achieved through a simple SQL statement, which requires three tables associated with the query.

Problems that may occur with data

While SQL is simple, ensuring that your results are correct is still the real key, because there are some reasons that can cause errors:

The data may come from a different data source. That means you can't guarantee the integrity between these tables. For example, you cannot assume that all zip codes in the Customer table are valid ZIP codes and must be present in the local table.
The application of input Customer table data may capture unverified location data and may include faulty zip codes.
The Postal Code table may not be complete. The newly published ZIP code may not be imported into the table in time after publication.

First principle

For me, getting the right results from SQL is the first priority compared to writing clear, readable SQL. The first thing I'm going to do is write the following SQL statement to get the total number of customers. I'll adjust it again after I've finished writing the whole statement.

The first sentence I wrote was this:

SelectCount (DISTINCT cust_id) as count_customersfromcustomers result 
 
: 
 
count_customers 
 
"10"

This query is important because it is tightly centered around the first principle. Because there is no SQL management query, there is no dependency, I know this is the correct result of the number of customers. I'm going to write this down because I always have to take this number to measure the next SQL (or not), which is also mentioned several times later in this article.

The next thing to do is to add the necessary fields and tables to complete the query. I purposely highlighted the word "add" because, according to my rules, I would comment out queries that could get the same result when applying the first principle. The following is my final formatted query statement.

Format SQL

Here is the format SQL that I recommend based on my formatting ideas.

SELECT 
 
0 
 
, C.cust_post_code 
 
, P.location 
 
, COUNT (DISTINCT c.cust_id) number_customers 
 
, SUM ( S.total_amount) as Total_sales 
 
from 
 
customers c 
 
JOIN post_codes p on c.cust_post_code = P.post_code 
 
JOIN sales s on c.cust_id = s.cust_id 
 
WHERE 
 
1=1 and 
 
s.sales_date BETWEEN ' 2015-01-01 ' and ' 2015-01-31 '-and 
 
s.order_id = 5 
 
GROUP 
 
by C.cust_post_code 
 
, P.location

Always use table aliases

Time will prove necessary to do so. If you do not use aliases for every field used in the SQL statement, you may be able to add a different name segment to the query at some point in the future. By then your query and even the report will produce errors (the name of the renamed section).

Before the comma is placed in the field

When debugging or testing my query, doing so makes it easy to comment out a field without having to modify other rows, and all commas are not missing or superfluous. If you don't, you might want to adjust the comma to make sure the statement is correct. If you often want to debug the statement, this will bring great convenience, efficiency will be higher. This practice applies to the "SELECT" section as well as the "GROUP by" clause.

At development time I used "SELECT 0" as the beginning of the statement, and it was easy to remove it when migrating to the formal environment. So we can write all the fields in front of them. Without this "0", if I wanted to comment out the first field ("C.cust_post_code" in this case), I would have to deal with the following comma question. I have to comment it out temporarily and add it back in the future. The same is true in the GROUP by statement. This "0" is an extra bonus.

Put "JOIN" on a separate line

Putting a "JOIN" statement on a stand-alone line has the following benefits :

It's easy to see all the tables involved in this query, just look at the scrolling "JOIN" statement.

You can put all the logical relationships in one place by using JOIN instead of listing all of the table and expression relationships in the WHERE clause. We can't always. The JOIN statement is placed on a line, but should at least be put together.

It is also relatively easy to comment out the "JOIN" statement in this way. This is useful when debugging, and you may need to know if "JOIN" is causing the data problem.

Column mode editing

Column mode editing is convenient when working with a large number of fields. Here's the first animated GIF show I've ever done, and you can comment out all nonclustered fields. I used column mode editing, not just commenting out the fields:

Create all Indexes

When you use more than a union statement for a field:

Comment out the list of fields for the GROUP by clause

Test Query Results

I have to use the external connection "OUTER" to list all the customers, because not all of the customer's ZIP code has a corresponding ZIP code in the local table. I can make sure that the results of my query are the same as the results of the first query (the statement that queries the customer separately) by including and excluding different fields and tables, which is actually the adherence to the primary principle.

Select0,c.cust_post_code-,p.location,count (DISTINCT c.cust_id) number_customers,sum (s.total_amount) as Total_ Salesfromcustomers c-left OUTER JOIN post_codes p on c.cust_post_code = P.post_codejoin sales s on c.cust_id = S.CUST_IDWH Ere1=1and s.sales_date BETWEEN ' 2015-01-01 ' and ' 2015-01-31 '-and c.cust_post_code = 2000-and P.post_code = 200GROUP BYc.cu St_post_code-,p.location

SQL like this means that I have to write independent tests to check the data. I can use the first principle to verify the accuracy of my query data by commenting out the lines of statements. Doing so improves my efficiency and reporting.

The above is the entire content of this article, I hope to help you learn.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More