Top 10 steps to optimize SQL Server database performance

Source: Internet
Author: User
Suppose you have developed a web application with a clean, friendly user interface, and within a short time it attracts thousands of registered users. Your customers, your management, your team, and you are all happy. But life is not a rose garden. As the number of users grows rapidly day by day, problems begin to emerge. Customers start complaining by email that the site is very slow (some of the emails are angry), they demand improvements, and you begin to lose users.

You start analyzing the application, and soon you discover the problem: the database executes very slowly whenever the application tries to store or update data. The tables have grown very large, containing hundreds of thousands of rows. When the test team runs a test against the production site, a sort-and-submit operation takes 5 minutes to succeed, whereas on the test site it takes only 2 or 3 seconds.

This is an old story. Thousands of application projects around the world have matured through it, and almost every developer, including me, has spent some time living it. So I know why it happens, and what to do about it.

Step 1: Apply proper indexing in the table columns in the database

Creating appropriate indexes on table columns should be the first step in database optimization, and I think proper indexing should be considered first for the following two reasons:

1. It is the most effective measure for improving application performance in the shortest time.
2. Adding an index to the database does not require you to modify the application, so you do not need to rebuild or redeploy it.

Of course, this is a quick performance win only if you find that indexing has not been done properly in the current database. If all the appropriate indexes are already in place, I do not recommend spending more time on this step.

What is an index?
I think you already know what an index is. But I have seen the concept confuse quite a few people, so let's revisit it with a little story.

There was a large library in an old city. It held thousands of books, but the books were not sorted on the shelves. So every time someone asked for a book, the librarian had to check book after book until he found the one the person wanted. Finding a book took hours, and the visitors had to wait a long time.

[This is like a table without a primary key. To find a row, the database engine has to scan the entire table and check every row for a match, which is very slow.]

As the number of books grew day by day and more people came to borrow them, the librarian's life became miserable. Then one day a wise man came to the library, saw the situation, and advised the librarian to number every book and arrange the books on the shelves according to their numbers. "What benefit will I get?" the librarian asked. The wise man replied: "Now, if someone gives you a book number and asks for that book, you can quickly find the shelf that contains that range of numbers, and within that shelf you can quickly find the book, because the books are ordered by their numbers."

[The book number is like the primary key of a data table. When you create a primary key on a table, a clustered index tree is created, and all the data pages containing the rows of the table are physically sorted in the file system by the primary key value; each data page holds its rows sorted by primary key. So when you query a table for a row by its key, the database server first uses the clustered index tree to find the data page that should contain the row, and then finds the row with that key within the page.]

"This is exactly what I need!" The excited librarian immediately began numbering the books and placing them on the corresponding shelves.
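As a minimal T-SQL sketch of the book-numbering analogy (assuming a hypothetical Books table): creating the primary key automatically creates the unique clustered index that physically orders the rows by the key.

```sql
-- Hypothetical table: the PRIMARY KEY constraint creates a unique
-- clustered index, so rows are physically ordered by BookNumber.
CREATE TABLE dbo.Books
(
    BookNumber INT NOT NULL PRIMARY KEY,  -- clustered index key
    Title      NVARCHAR(200) NOT NULL,
    Author     NVARCHAR(100) NOT NULL
);

-- A lookup by the clustered key is resolved by walking the
-- clustered index tree straight to the right data page:
SELECT Title, Author
FROM dbo.Books
WHERE BookNumber = 42;
```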
The arrangement took him a whole day, but at the end of it he tested how long it took to find a book, and he was very happy with the result.

[When you create a primary key on a table, internally a clustered index tree is created and the data pages are physically sorted by the primary key value within the data file. It is also easy to see why a table can have only one clustered index, built on only one key: just as a book can carry only one catalog number, the rows can be physically sorted in only one order.]

But wait! One problem remained unsolved. The next day, a person asked for a book knowing only its title (the book number was unknown). The poor librarian had no choice but to search shelf by shelf, starting from shelf 1, and finally found the book on shelf 67. It took him 20 minutes. Before the arrangement it used to take 2 or 3 hours to find a book, so this was still an improvement. But compared with the 30 seconds it took to find a book by its number, 20 minutes seemed far too long to him. So he asked the wise man how to improve it.

[This is like having a primary key on ProductID in a Product table, but no other index. When you query a product by product name, the database engine has no option but to scan all the physically sorted data pages until it finds a match.]

The wise man told the librarian: "Since you have already sorted all the books by their numbers, you cannot sort them again. Instead, create a catalog, an index, that lists every title along with its corresponding number. In this catalog, sort the titles alphabetically and group them by their first letter. If someone wants to find the book "Database Management System", follow these steps: 1. jump to the "D" section of the catalog and find the title; 2. read off the book number and fetch the book the usual way."

"You are a genius!" the librarian shouted. After several hours he created the title catalog and tested searching by title.
Now it took only one minute to find a book by its title. The librarian reasoned that people would also search by other standard criteria, so he created similar catalogs for them; now a book could be found by any of the common criteria (book number, title, author name) within a minute. The librarian's suffering ended, the queues quickly dissolved because searches were genuinely fast, and the librarian lived happily ever after. End of story.

By now I am sure you understand what indexes are, why they are important, and how they work internally. For example, if we have a Products table, a clustered index is created automatically with the primary key column, and we can additionally create a non-clustered index on the product name column. If we do so, the database engine builds an index tree for the non-clustered index (like the title catalog) with the product names sorted across the index pages. Each index page holds a range of product names along with their corresponding primary key values. So when a product is searched by name, the database engine first searches the non-clustered index tree for the name to find the primary key, and then uses the clustered index tree to find the actual row (like fetching the actual book).

The index tree structure is called a B+ tree (balanced tree). Searching an index tree starts at the root node; the intermediate nodes contain range values and tell the SQL engine where to go next. The leaf nodes contain the actual values: if it is a clustered index, the leaf nodes are the physical data pages; if it is a non-clustered index, the leaf nodes contain the index values along with the clustered index keys, which the database engine follows into the clustered index tree to reach the row. Locating a value in an index tree and jumping to the actual row from there usually takes very little time, which is why indexing generally speeds up data retrieval. So apply indexes in your database wherever they help return result sets faster.
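The clustered/non-clustered pairing described above can be sketched as follows, assuming a hypothetical Products table:

```sql
-- Hypothetical table: ProductID gets the clustered index via the
-- primary key; ProductName gets a separate non-clustered index.
CREATE TABLE dbo.Products
(
    ProductID   INT NOT NULL PRIMARY KEY,   -- clustered index
    ProductName NVARCHAR(100) NOT NULL,
    Price       MONEY NOT NULL
);

CREATE NONCLUSTERED INDEX NCLIX_Products_ProductName
    ON dbo.Products(ProductName);

-- Resolved in two steps: the non-clustered index tree yields the
-- ProductID, then the clustered index tree yields the actual row.
SELECT ProductID, Price
FROM dbo.Products
WHERE ProductName = N'Database Management System';
```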
Follow these steps to ensure that the appropriate indexes exist in your database.

Make sure every table in your database has a primary key. This guarantees that every table has a clustered index and that its pages are physically sorted by the key, so any retrieval that uses the primary key, and any sort operation on it, runs as fast as possible.

Create non-clustered indexes on columns that are:

1. frequently used in query conditions;
2. used to join other tables;
3. used as foreign keys;
4. highly selective (the condition selects between 0% and 5% of the total rows);
5. used in sort operations;
6. of type XML (primary and secondary XML indexes need to be created; more on this later).

Here is an example of creating an index:

CREATE INDEX NCLIX_OrderDetails_ProductID ON dbo.OrderDetails(ProductID)

Alternatively, you can use SQL Server Management Studio to create it.

Step 2: Create appropriate covering indexes

So you have created all the appropriate indexes in your database, right? Suppose you have created an index on the foreign key column ProductID of a Sales table (SalesID, SalesDate, SalesPersonID, ProductID, Qty), and assume that ProductID is a highly selective column (a query filtering on a ProductID value selects no more than 5% of the rows). Will queries against this table run fast? Yes; compared with a full table scan (scanning all the table's pages for the required data), the index on the foreign key column improves the query considerably.

Let's assume the Sales table contains 10,000 rows and the following SQL selects 400 of them (4% of the rows):

SELECT SalesDate, SalesPersonID FROM Sales WHERE ProductID = 112

Let's try to understand how the database engine executes it:

1. The Sales table has a non-clustered index on the ProductID column, so the engine searches the non-clustered index tree for ProductID = 112.
2. The index pages for ProductID = 112 contain the clustered index keys (the primary key values, i.e. the SalesIDs, of all rows with ProductID = 112, assuming a primary key has already been created on the Sales table).
3. For each of these primary keys (400 here), the database engine searches the clustered index tree to find the data page holding the actual row.
4. For each row found, the engine reads the values of the SalesDate and SalesPersonID columns from it.

Note that in the steps above, to answer the ProductID = 112 query, the database engine searches the clustered index tree 400 times to retrieve the additional columns. If the non-clustered index pages themselves also contained the other two columns specified by the query (SalesDate, SalesPersonID), the SQL engine would not have to perform steps 3 and 4 at all: it could produce the result set directly from the index pages. Such an index is called a covering index:

CREATE INDEX NCLIX_Sales_ProductID      -- index name
ON dbo.Sales(ProductID)                 -- column on which the index is created
INCLUDE (SalesDate, SalesPersonID)      -- additional column values to include

Step 3: Defragment the indexes

So, you have created all the appropriate indexes in your tables, or rather, the indexes mostly already existed in the database tables. Yet you may still not get the performance you expected. The likely cause is index fragmentation.

What is index fragmentation? Index fragmentation is the result of page splits caused by insert, update, and delete operations on the index pages. If an index is heavily fragmented, scanning and seeking through it takes longer, so data retrieval becomes slow. There are two types of fragmentation:

Internal fragmentation: deletes and updates leave the index or data pages sparsely filled (many empty slots per page), which increases query time.
External fragmentation: as data is inserted or updated, new index pages are allocated wherever space is available in the file system, so logically adjacent pages are no longer physically adjacent. The database server cannot use read-ahead effectively, because the related data pages may be scattered anywhere in the data files.

How do I know whether index fragmentation has occurred? Execute the following SQL in your database (it requires SQL Server 2005 or later; replace 'AdventureWorks' with your database name):

SELECT OBJECT_NAME(dt.object_id) AS TableName,
       si.name AS IndexName,
       dt.avg_fragmentation_in_percent AS ExternalFragmentation,
       dt.avg_page_space_used_in_percent AS InternalFragmentation
FROM (
    SELECT object_id, index_id,
           avg_fragmentation_in_percent, avg_page_space_used_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID('AdventureWorks'), NULL, NULL, NULL, 'DETAILED')
    WHERE index_id <> 0
) AS dt
INNER JOIN sys.indexes si
    ON si.object_id = dt.object_id
   AND si.index_id = dt.index_id
WHERE dt.avg_fragmentation_in_percent > 10
  AND dt.avg_page_space_used_in_percent < 75
ORDER BY dt.avg_fragmentation_in_percent DESC

Any index appearing in the result is fragmented, judged by the following criteria:

1. an ExternalFragmentation value > 10 indicates external fragmentation;
2. an InternalFragmentation value < 75 indicates internal fragmentation.

How do you remove the fragmentation? You have two options:

1. Reorganize the indexes, by executing:

ALTER INDEX ALL ON TableName REORGANIZE

2. Rebuild the indexes, by executing:

ALTER INDEX ALL ON TableName REBUILD WITH (FILLFACTOR = 90, ONLINE = ON)

You can also reorganize or rebuild an individual index by replacing the ALL keyword with the index name, or use SQL Server Management Studio instead. When should you reorganize, and when should you rebuild?
Reorganize an index when its external fragmentation is between 10 and 15 and its internal fragmentation is between 60 and 75; rebuild it in the other cases.

One important point about rebuilding: while an index on a table is being rebuilt, the entire table is locked (which does not happen while reorganizing). For a large production database a long lock is unacceptable, because a rebuild can take hours. Fortunately, since SQL Server 2005 there is a way out: use the ONLINE option to rebuild an index without locking the table.

Part 2

Step 4: Move SQL statements from the application into the database server

I know you may not like this suggestion. You may be using an ORM that generates all your data-access SQL on the fly, or you or your team may have a policy of keeping SQL inside the application. But if you need to optimize data-access performance, or to track down performance faults in your application, I suggest moving your SQL into the database. Why? Well, I have the following reasons:

1. Using stored procedures, views, functions, and triggers lets you remove duplicated SQL from the application and improves the reusability of your SQL code.
2. Implementing all SQL as database objects makes it possible to find the inefficient SQL that is responsible for the slow performance, and it puts your SQL in one manageable place.
3. With all the SQL and its usage visible in the database, you can design the best set of indexes for those statements (see the earlier parts of this series).

In short, although the indexing work alone will already let you find and fix the performance fault points in your application quickly, and these first steps may give you an immediate, real performance gain, moving the SQL to the server is mainly what enables the subsequent optimization steps: it makes it easy to apply the further techniques that optimize your data-access code.
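As a minimal sketch of point 1 above, assuming a hypothetical Sales table: an inline query repeated throughout the application can be replaced by one stored procedure that every caller shares.

```sql
-- Hypothetical example: centralizing a frequently repeated query
-- as a stored procedure instead of embedding it in the application.
CREATE PROCEDURE dbo.GetSalesByProduct
    @ProductID INT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT SalesDate, SalesPersonID, Qty
    FROM dbo.Sales
    WHERE ProductID = @ProductID;
END
GO

-- The application now issues a single, reusable call:
EXEC dbo.GetSalesByProduct @ProductID = 112;
```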
If you use an ORM for data access, you may find that your application performs well in your development and test environments, but not necessarily in production, where the data volumes are different. And if you keep your SQL in the application because an ORM lets you change your database type at any time, weigh that flexibility against the optimization options you give up.

Step 5: Identify inefficient SQL and apply best practices when writing it

Even with the best possible indexes in your database, if you access it with badly written SQL you will still experience slow performance. We all want to write good code, right? When we write data-access code for a particular requirement, there are usually many ways to do it, but a team is made up of people with different abilities, experience, and opinions, so code gets written in different ways and the optimal approach is easily lost. When writing code, our first thought is to make it work; the problems only appear once the code runs in production. Now is the time to review your code, and now is the best time to fix it.

I have collected some SQL best practices below. I am sure you already know most of them. The problem is that, in practice, we often do not apply them in our code (and of course we always have some good reason not to). But what happens then? The code runs slowly and the customers become unhappy. So knowing the best practices is not enough; what matters more is that you actually write your SQL following them.

Some SQL best practices:

Do not use "SELECT *" in queries.
1. Retrieving unnecessary columns increases the time consumed.
2. The database engine cannot take advantage of covering indexes (discussed in the previous part), so the query runs slowly.

Avoid unnecessary columns in the select list and unnecessary tables in join conditions.
1. Selecting unnecessary columns adds overhead to the query, especially when the columns are of large types.
2. Including unnecessary tables in joins forces the database engine to retrieve unnecessary data and increases execution time.
Do not use COUNT() in a subquery to test for existence. For example, instead of:

SELECT column_list FROM table WHERE 0 < (SELECT COUNT(*) FROM table2 WHERE ...)

use:

SELECT column_list FROM table WHERE EXISTS (SELECT * FROM table2 WHERE ...)

1. When you use COUNT(), SQL Server does not know that what you want is an existence check. It counts all the matching values, either by scanning the table or by scanning the smallest non-clustered index.
2. When you use EXISTS, SQL Server knows you are performing an existence check. As soon as it finds the first matching value it returns TRUE and stops looking. The same applies to using EXISTS instead of IN or ANY.

Try to avoid joining columns of different data types.
1. When two columns of different types are joined, one of them must be converted to the other; the column whose type is lower in precedence is the one converted.
2. If the conversion falls on an indexed column, that index cannot be used by the query optimizer. For example:

SELECT column_list FROM small_table, large_table
WHERE small_table.float_column = large_table.int_column

Here the int column is converted to float, because int is lower than float in type precedence. The index on large_table's int column cannot be used, although the index on small_table's float column can.

Try to avoid deadlocks.
1. Always access tables in the same order in all your stored procedures and triggers.
2. Keep your transactions as short as possible, and touch as little data as possible inside them.
3. Never wait for user input in the middle of a transaction.

Write SQL using a set-based approach instead of a procedural approach.
1. The database engine is optimized for set-based SQL, so avoid procedural approaches (using cursors to process rows one by one) for large result sets (more than 1,000 rows).
2. How do you get rid of "procedural SQL"? Consider these techniques:
- replace user-defined functions that fetch data with inline subqueries;
- replace cursor-based code with correlated subqueries;
- and if procedural code is really required,
use at least a table variable to hold intermediate results instead of a cursor.

Further good practices: do not use COUNT(*) just to check whether a table has any records; avoid dynamic SQL; avoid using temporary tables where they are not needed; use full-text search instead of LIKE for searching textual data; try to use UNION instead of OR in queries; and use a lazy-loading strategy for large objects.

Use VARCHAR(MAX), VARBINARY(MAX), and NVARCHAR(MAX).
1. In SQL Server 2000 a row could not store more than 8,000 bytes, because of SQL Server's internal 8 KB page size. So to store larger data in a single column you had to use the TEXT, NTEXT, or IMAGE types. In SQL Server 2005 and later, the MAX types let a single column store up to 2 GB while still behaving like the ordinary string types, so prefer them.

Apply these good practices in your user-defined functions, stored procedures, triggers, views, and transactions as well.

Part 3

Step 6: Apply some advanced indexing techniques

Create indexes on XML columns. A primary XML index:

CREATE PRIMARY XML INDEX index_name ON <object>(xml_column)

A secondary XML index:

CREATE XML INDEX index_name ON <object>(xml_column)
USING XML INDEX primary_xml_index_name
FOR { VALUE | PATH | PROPERTY }

Step 7: Apply de-normalization, and use history tables and pre-calculated columns

Part 4

Step 8: Diagnose performance problems, making effective use of SQL Profiler and the Performance Monitoring Tool: http://www.codeproject.com/KB/database/DiagnoseProblemsSQLServer.aspx
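The cursor-to-set-based rewrite recommended in Step 5 can be sketched as follows (a hypothetical example, assuming a Sales table with a Qty column):

```sql
-- Procedural version (avoid): a cursor touches rows one at a time.
DECLARE @SalesID INT;
DECLARE sales_cur CURSOR FOR
    SELECT SalesID FROM dbo.Sales WHERE ProductID = 112;
OPEN sales_cur;
FETCH NEXT FROM sales_cur INTO @SalesID;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE dbo.Sales SET Qty = Qty - 1 WHERE SalesID = @SalesID;
    FETCH NEXT FROM sales_cur INTO @SalesID;
END
CLOSE sales_cur;
DEALLOCATE sales_cur;

-- Set-based version (prefer): one statement the engine can optimize.
UPDATE dbo.Sales
SET Qty = Qty - 1
WHERE ProductID = 112;
```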