Four technologies to help you improve SQL Server performance

Last Update:2016-02-18 Source: Internet

Author: User

Tags joins server memory sql using

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Sometimes, in order for the application to run faster, all that is done is to make minor adjustments here or there. But the key is to determine how to adjust! Sooner or later you'll encounter this scenario: the SQL query in your application doesn't respond the way you want. It either doesn't return data, or it takes a surprisingly long time. If it lowers the speed of the enterprise application, the user must wait a long time. Users expect applications to respond quickly and their reports can return analytic data in an instant. As far as I'm concerned, I'm impatient if a page takes more than 10 seconds to load while surfing the web.

In order to solve these problems, it is important to find the root of the problem. So, where do we start? The root cause is usually the database design and the query that accesses it. I'll cover four technologies that can be used to improve the performance of SQL Server-based applications or to improve their scalability. I will carefully explain the use of the left join, cross join, and the retrieval of the identity value. Keep in mind that there is no magic solution at all. Adjusting your database and its queries requires time, analysis, and a lot of testing. These technologies have proven to work, but for your application, some of these technologies may be more appropriate than others.

Return IDENTITY from INSERT

I decided to start with a lot of questions: how to retrieve the identity value after executing SQL INSERT. Typically, the problem is not how to write queries that retrieve values, but where and when to retrieve them. In SQL Server, the following statement can be used to retrieve the IDENTITY value created by the most recent SQL statement running on the active database connection:

SELECT @ @IDENTITY

This SQL statement is not complex, but one thing to keep in mind is that if this latest SQL statement is not an INSERT, or if you run this SQL on another connection to non-INSERT SQL, you will not get the expected value. You must run the following code to retrieve the IDENTITY immediately following the INSERT SQL and on the same connection as follows:

INSERT into Products (ProductName) VALUES (' Chalk ')

SELECT @ @IDENTITY

Running these queries against the Northwind database on one connection returns the IDENTITY value of a new product called Chalk. So, in the Adovisual basic application, you can run the following statement:

<ccid_nobr>

<ccid_code>set oRs = Ocn.execute ("Set NOCOUNT On;insert into Products _ (ProductName) VALUES (' Chalk '); SELECT @ @IDENTITY ") Lproductid = oRs (0)

This code tells SQL Server not to return the row count of the query, then executes the INSERT statement and returns the IDENTITY value that was just created for the new row. The set NOCOUNT on statement indicates that the recordset returned has a row and a column containing the new IDENTITY value. Without this statement, an empty recordset is returned first (because the INSERT statement does not return any data), then a second recordset is returned, and the second recordset contains the IDENTITY value. This can be confusing, especially because you never wanted INSERT to return a recordset. This happens because SQL Server sees the row count (that is, one row is affected) and interprets it as representing a recordset. As a result, the real data is pushed back to the second set of records. Of course you can get this second recordset using the NextRecordset method in ADO, but it is more convenient and more efficient if you can always return the recordset first and return only that recordset.

Although this method works, you need to add some extra code to the SQL statement. Another way to get the same result is to use the SET NOCOUNT on statement before INSERT and place the SELECT @ @IDENTITY statement in the for INSERT trigger in the table, as shown in the following code fragment. In this way, any INSERT statement that enters the table will automatically return the IDENTITY value.

<ccid_nobr>

<ccid_code>create TRIGGER Trproducts_insert on products for Insert as  SELECT @ @IDENTITY  GO

The trigger starts only when an insert occurs on the Products table, so it always returns an IDENTITY after a successful insert. With this technique, you can always retrieve the IDENTITY value in the application in the same way.

Inline views and temporary tables

At some point, a query needs to join data with some other data that might be collected only by performing a GROUP by and then executing a standard query. For example, if you want to query information about the latest five orders, you first need to know which orders are. This can be retrieved using an SQL query that returns the order ID. This data is stored in a temporary table (this is a common technique) and then joined with the Products table to return the number of items sold on these orders:

<ccid_nobr>

<ccid_code>create TABLE #Temp1 (OrderID INT NOT NULL, _ OrderDate DATETIME not null) INSERT into #Temp1 (OrderID, Or derdate) Select TOP 5 O.orderid, o.orderdate from Orders o ORDER by o.orderdate DESC SELECT p.productname, SUM (OD. Quantity) as productquantity from #Temp1 t  INNER joins [Order Details] od on t.orderid = od. OrderID INNER JOIN products p on od. ProductID = P.productid  GROUP by P.productname ORDER by P.productname DROP TABLE #Temp1

These SQL statements create a temporary table, insert data into the table, join other data to the table, and then drop the temporary table. This causes this query to perform a large number of I/O operations, so you can rewrite the query and replace the temporary table with an inline view. An inline view is simply a query that can be joined to the FROM clause. So, instead of consuming large amounts of I/O and disk access on temporary tables in tempdb, you can use inline views to get the same result:

<ccid_nobr>

<ccid_code>select P.productname,  SUM (OD. Quantity) as productquantity from (SELECT TOP 5 O.orderid, o.orderdate from Orders o  ORDER by o.orderdate DESC) t
        inner JOIN [Order Details] od on t.orderid = od. OrderID INNER JOIN products p on od. ProductID = P.productid GROUP by P.productname ORDER by P.productname

This query is not only more efficient than the previous query, but also shorter in length. Temporary tables can consume a lot of resources. If you only need to join data to other queries, you can try using inline views to conserve resources.

Avoid left joins and NULL

Of course, there are times when you need to perform a left JOIN and use a NULL value. However, they do not apply to all situations. Changing the way SQL queries are built can result in a less than a few seconds of a report that runs for a few minutes. Sometimes, you must adjust the shape of the data in your query to fit the way your application requires it to be displayed. Although the TABLE data type reduces the resource-intensive situation, there are many areas in the query that can be optimized. A valuable common function of SQL is the left JOIN. It can be used to retrieve all rows from the first table, all matching rows in the second table, and all rows in the second table that do not match the first table. For example, if you want to return each customer and their orders, use the left JOIN to display the customers with orders and no orders.

This tool may be overused. Left joins consume a lot of resources because they contain data that matches NULL (nonexistent) data. In some cases, this is unavoidable, but the cost can be very high. The left join consumes more resources than the INNER join, so if you can rewrite the query so that the query doesn't use any left join, you get a pretty good return.

One technique for speeding up query speed with a LEFT join involves creating a table data type, inserting all the rows in the first table (the table to the left of the JOIN), and then updating the table data type with the values from the second table. This technique is a two-step process, but it can save a lot of time compared to a standard left JOIN. A good rule is to try out a variety of technologies and record the time required for each technology until you get the best execution-performance queries for your application.

When you test the speed of a query, it is necessary to run the query multiple times and then take an average. Because the query (or stored procedure) may be stored in the process cache in SQL Server memory, the first attempt seems to take a little longer, and all subsequent attempts are less time consuming. Additionally, when you run your query, other queries may be running against the same table. When other queries lock and unlock these tables, you may cause your queries to wait in a queue. For example, if you make a query when someone is updating the data in this table, your query may take longer to execute when you update the submission.

The simplest way to avoid slowing down with left joins is to design the database around them as much as possible. For example, suppose a product might have a category or it might not have a category. If the Products table stores the ID of its category, and there is no category for a particular product, you can store NULL values in the field. Then you must perform a left JOIN to get all the products and their categories. You can create a category with a value of "no category", which specifies that a foreign key relationship does not allow NULL values. By doing this, you can now use the INNER JOIN to retrieve all products and their categories. While this may seem like a workaround with redundant data, it can be a valuable technique because it eliminates the use of more resource-intensive left joins in SQL batch statements. Using this concept all in the database can save you a lot of processing time. Keep in mind that for your users, even a few seconds is important, because when you have many users accessing the same online database application, these seconds actually mean a lot.

Flexible use of the Cartesian product

For this technique, I will introduce you in great detail and advocate the use of the Cartesian product in some cases. For some reason, the Cartesian product (cross JOIN) has been a lot of condemnation, and developers are often warned not to use them at all. In many cases, they consume too much resources to be efficient to use. But like any tool in SQL, they can be valuable if used correctly. For example, if you want to run a query that returns monthly data (even if a customer does not have an order in a specific month), you can easily use the Cartesian product.

While this may seem like nothing magical, consider that if you have a standard INNER JOIN from customer to order (these orders are grouped by month and subtotal sales), only the month in which the customer has an order is obtained. Therefore, you will not receive a value of 0 for the month when the customer does not order any products. If you want to draw a diagram for each customer to show monthly and monthly sales, you might want this chart to include months with monthly sales of 0 to visually identify those months. If you use SQL from Figure 2 (the last page), the data skips a month with sales of $0, because in the order table there is no row for 0 sales (assuming you only store events that occur).

The code in Figure 3 (the last page) is long, but it can achieve the goal of getting all sales data, even months without sales. First, it extracts a list of all the months of the previous year, and then puts them in the first table data type sheet (@tblMonths). Next, this code gets a list of all the customer companies that have sales for that time period, and then puts them in another table data type sheet (@tblCus-tomers). Both tables store all the basic data necessary to create a result set, except for the actual number of sales. The first table lists all the months (12 rows), and the second table lists all the customers with sales for that time period (81 for me). Not every customer has purchased a product every month in the last 12 months, so performing a INNER join or a LEFT join does not return each customer for each month. These actions return only the customers and months that purchased the product.

The Cartesian product can return all customers for all months. The Cartesian product basically multiplies the first table with the second table, producing a rowset that contains the result of multiplying the number of rows in the first table with the number of rows in the second table. Therefore, the Cartesian product returns 972 rows to the table @tblFinal. The final step is to update the @tblFinal table with the monthly sales totals for each customer in this date range, and select the final rowset.

Cross joins can be used with caution if the resources that are consumed by the Cartesian product can be many and do not require a true Cartesian product. For example, if you perform a cross JOIN on a product and a category and then use the WHERE clause, DISTINCT, or GROUP by to filter out most rows, using the INNER JOIN results in the same result and is much more efficient. If you need to return data for all possibilities (such as when you want to populate a chart with a monthly sales date), the Cartesian product can be very helpful. However, you should not use them for other purposes, because INNER JOIN is much more efficient in most scenarios.

Supplements complement 0

Here are some other common techniques that can help improve the efficiency of SQL queries. Let's say you'll group all salespeople by region and subtotal their sales, but you want only those salespeople who are marked as active in those databases. You can group salespeople by area and use the HAVING clause to eliminate salespeople who are not active, or you can do so in the WHERE clause. Doing this in the WHERE clause reduces the number of rows that need to be grouped, so it is more efficient than performing this operation in the HAVING clause. The filter of a row-based condition in the HAVING clause forces the query to group the data that will be stripped in the WHERE clause.

Another technique for improving efficiency is to use the DISTINCT keyword to find separate reports for rows of data instead of using the GROUP by clause. In this case, SQL using the DISTINCT keyword is more efficient. Use GROUP by when you need to calculate aggregate functions (SUM, COUNT, MAX, and so on). Also, if your query always returns a unique row for itself, do not use the DISTINCT keyword. In this case, the DISTINCT keyword only increases the overhead of the system.

As you've seen, there are a number of techniques that can be used to optimize queries and implement specific business rules, and the trick is to make some attempts and then compare their performance. The most important thing is to test, test, and re-test.

<ccid_nobr>

<ccid_code>figure 2 returning all Customers and their Sales set NOCOUNT Ondeclare @dtStartDate DATETIME, @dtEndDate DATETIME, @dtDate datetimeset @dtEndDate = ' 5/5/1997 ' SET @dtEndDate = DATEADD (DD,-1, Cast ((MONTH (@dtEndDate) + 1) A S VARCHAR (2)) + '/01/' + CAST (year (@dtEndDate) as VARCHAR (4)) + ' 23:59:59 ' as DATETIME) SET @dtStartDate = DATEADD (MM ,-1 *, @dtEndDate) SELECT CAST (Year (o.orderdate) as VARCHAR (4) + "-" +case when MONTH (O.orderdate) < Ten then ' 0 ' ' + CAST (month (o.orderdate) as varchar (2)) ELSE cast (month (o.orderdate) as varchar (2)) END as Smonth,c.customerid,c. Companyname,c.contactname,sum (OD. Quantity * od. UnitPrice) as Msalesfrom Customers Cinner join Orders o on c.customerid = O.customeridinner joins [Order Details] od on o.o Rderid = od. Orderidwhere o.orderdate between @dtStartDate and @dtEndDateGROUP BYCAST (year (o.orderdate) as VARCHAR (4)) + "-" +case WH EN MONTH (O.orderdate) < Ten then ' 0 ' + cast (MONTH (o.orderdate) as VARCHAR (2)) ELSE cast (MONTH (O.orderdate) as VARCHAR (2)) End,c.customerid,c.companyname,c.contactnameorder Byc.companyname,smonthfigure 3 Cartesian Product at Workdeclare @tblMonths table (Smonth VARCHAR (7)) DECLARE @tblCustomers table (CustomerID CHAR (Ten), Com Panyname varchar (ContactName) DECLARE @tblFinal TABLE (smonth varchar (7), CustomerID CHAR (10), CompanyName varchar (), ContactName varchar (msales money) DECLARE @dtStartDate DATETIME, @dtEndDate datetime,@ Dtdate datetime,@i integerset @dtEndDate = ' 5/5/1997 ' SET @dtEndDate = DATEADD (DD,-1, Cast ((MONTH (@dtEndDate) + 1) As varchar (2)) + '/01/' + CAST (year (@dtEndDate) as varchar (4)) + ' 23:59:59 ' as DATETIME) SET @dtStartDate = DATEADD (M M,-1 *, @dtEndDate)-get all months into the first tableset @i = 0WHILE (@i <) Beginset @dtDate = DATEADD (mm,-1  * @i, @dtEndDate) INSERT into @tblMonths SELECT CAST (Year (@dtDate) as VARCHAR (4) + "-" +case when MONTH (@dtDate) < 10 Then ' 0 ' + cast (MONTH (@dtDate) as VARCHAR (2)) ELSE cast(MONTH (@dtDate) as VARCHAR (2)) END as Smonthset @i = @i + 1end-get All clients ' who had sales during ' period into the ' Y ' Tableinsert into @tblCustom Ersselect distinctc.customerid,c.companyname,c.contactnamefrom Customers cinner JOIN Orders o on c.customerid = O. Customeridwhere o.orderdate between @dtStartDate and @dtEndDateINSERT into @tblFinalSELECT m.smonth,c.customerid,c. Companyname,c.contactname,0from @tblMonths m Cross JOIN @tblCustomers c UPDATE @tblFinal SET msales = Mydata.msalesfrom @t  Blfinal F INNER JOIN (SELECT c.customerid,cast (Year (o.orderdate) as VARCHAR (4)) + "-" +case when MONTH (O.orderdate) < Ten then ' 0 ' + cast (month (o.orderdate) as varchar (2)) ELSE cast (month (o.orderdate) as varchar (2)) END as Smonth,sum (OD. Quantity * od. UnitPrice) as Msalesfrom Customers Cinner join Orders o on c.customerid = O.customeridinner joins [Order Details] od on o.o Rderid = od. Orderidwhere o.orderdate between @dtStartDate and @dtEndDateGROUP Byc.customerid,cast (year (o.orderdate) as VArchar (4)) + '-' +case when month (o.orderdate) < Ten then ' 0 ' + CAST (MONTH (o.orderdate) as VARCHAR (2)) ELSE cast (MONT H (O.orderdate) as VARCHAR (2)) END) MyData on f.customerid = MyData. CustomerID and f.smonth = Mydata.smonth SELECT f.smonth,f.customerid,f.companyname,f.contactname,f.msalesfrom @ Tblfinal Forder Byf.companyname,f.smonth

Reprinted from: http://www.aspnetjia.com/Cont-205.html

Four technologies to help you improve SQL Server performance

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More