Objective
The previous article describes how to improve the query performance of a database by building an index, which is really just the beginning. Subsequent to the lack of proper maintenance, the index you created earlier can even become a drag, becoming an accomplice to the decline in database performance.
Find Fragments
Eliminating fragmentation may be the most common task for index maintenance, and Microsoft's official recommendation is to use REORGANIZE to "reorganize" the index when the fragmentation level is between 5% and 30%, and to "rebuild" the index using REBUILD if it reaches more than 30%. There are a number of factors that you may want to consider when deciding which means and timing to use, and the following 4 are what you have to consider:
- Plan for backup
- Load on the server
- Disk space remaining
- Reply (Recovery) model
PS: Although fragmentation is closely related to performance, in some specific cases he can be ignored. For example, you have a table with a clustered index, and almost all of the processing for that table is simply extracting a single piece of data from the primary key. The impact of fragmentation on this occasion can be negligible.
So how do you determine the fragmentation status of an index? Use System functions sys.dm_db_index_physical_stats and System catalog sys. Indexes, the sample script is as follows:
--Get information for all indexes on the specified table (example: Orddemo) Select sysin.name as IndexName , sysin.index_id , func.avg_fragmentation_in _percent , Func.index_type_desc as Indextype , Func.page_countfrom sys.dm_db_index_physical_stats (DB_ ID (), object_id (N ' Orddemo '), NULL, NULL, NULL) as Funcjoin sys.indexes as Sysinon func.object_id = sysin.object_ ID and func.index_id = sysin.index_id--The index_id of the clustered index is a non-clustered index of index_id>1--the following script is filtered with a WHERE clause (excluding tables without indexes)--The script returns All indexes on the database may take a long time! SELECT Sysin.name as IndexName , sysin.index_id , Func.avg_fragmentation_in_percent , func.index_ Type_desc as Indextype , Func.page_countfrom sys.dm_db_index_physical_stats (db_id (), NULL, NULL, NULL, NULL ) as Funcjoin sys.indexes as Sysinon func.object_id = sysin.object_id and func.index_id = Sysin.index_idwhere s ysin.index_id>0;
The output is as follows
The fragment of the sample database is 0 because the fragments were generated when the additions and deletions were performed, and our database has not done anything like this.
Fill factor
As mentioned above, the data is stored in the database as a 8KB data page, assuming you have a table with a clustered index, and whenever data is inserted, the database will find the Insert location (data page) and write the information according to the primary key. If the data page is full or not enough space to hold the new data, the database creates a new 8KB data page, and the new process creates I/O consumption.
Fill factor to reduce the occurrence of this situation, if you set a fill factor of 10, then your data initially only use 10% of the 8KB data page, when inserting a new record basically do not have to worry about unnecessary I/O consumption, because the data page reserved 90% of space.
The fill factor is also a double-edged sword, which reduces the performance of read operations while increasing write performance.
The fill factor works only when an index or rebuild (REBUILDI) index is established and is not valid for general DML operations (data pages are always populated to 100%) "
The following script helps you understand the index's fill factor values:
SELECT object_name (object_id) as TableName , NAME as IndexName , Type_desc , Fill_factorfrom Sys.indexeswhere --This is where filtering is done to represent only clustered and nonclustered indexes type_desc<> ' HEAP '
You can also view the default fill factor values on the data server:
SELECT Description , value_in_usefrom sys.configurationswhere Name = ' fill factor (%) '
ps:0 indicates that no reserved space is reserved.
Set the fill factor value by using the following script:
ALTER INDEX [Idx_refno] on [Orddemo]rebuild with (fillfactor=) go--if you want to set the default value on the server, use the following script sp_configure ' show advanced opt Ions ', 1goreconfiguregosp_configure ' fill factor ', 90GORECONFIGUREGO
A large fill factor (more than 90%) is recommended on a table with a static table (accidental update), and a lower fill factor (70%-80%) is recommended on tables that read and write frequently. In particular, when your clustered index is built on a self-increment field, it is no problem to set the fill factor to 100%, because the newly inserted data is always at the end of all data and does not occur between inserting records and records.
Rebuilding (REBUILD) indexes to improve indexing efficiency
The role of rebuilding the index as the name implies, the benefits he brings include the elimination of fragmentation, statistical value (statistics) updates, and the alignment of the physical sort order in the data page. In addition, he compresses the data page based on the fill factor and, if necessary, adds a data page. A lot of benefits, but this operation is very resource-intensive, it will take quite a long time. If you decide to start rebuilding the index, you also need to know that he has two modes of work:
Offline Mode: This is the default Rebuild index mode, which locks the table until the rebuild is complete. If the table is large, it can cause the user (for several hours) to be unable to use the table. Offline mode works faster than online mode and consumes less space in tempdb.
Online mode: If the objective condition does not allow you to lock the table, you can only select the online mode, which will consume more time and server resources. It is worth mentioning that if your table contains varchar (max), nvarchar (max), Text Type field, you will not be able to rebuild the index in this mode.
"Tip: This mode is only supported in the dev/Enterprise Edition, other versions use offline mode by default! 】
The following is a sample script to rebuild the index:
--Rebuild index in online mode idx_refnoalter index [IDX_REFNO] on [Orddemo]rebuild with (fillfactor=80, online=on) go--offline mode rebuild index Idx_refno Alter INDEX [IDX_REFNO] on [Orddemo]rebuild with (fillfactor=80, Online=off) go--rebuild all indexes on the Orddemo table alter index all on [ord Demo]rebuild with (fillfactor=80, Online=off) go--Rebuild Index Idx_reno (drop_existing=on) CREATE CLUSTERED Index [IDX_REFNO] on [ Orddemo] (REFNO) with (drop_existing = On,fillfactor = 70,online = ON) go--use DBCC DBREINDEX to rebuild all indexes on the Orddemo table DBCC DBREINDEX (' Orddemo ') go--using DBCC DBREINDEX to reconstruct an index on the Orddemo table DBCC dbreindex (' Orddemo ', ' idx_refno ', ' a ') GO
"DBCC Dbreindex will be deprecated in subsequent releases"
Based on the author's personal experience, it is better to use bulk-log recovery (bulk-logged recovery) or simple recovery for rebuilding operations on a table with a large amount of data, which prevents the log file from being too large. However, it is necessary to remind you that the backup chain of the database will be interrupted when switching the recovery mode, so if you are in full recovery mode (recovery), remember to rebuild and then switch back.
It is very dangerous to rebuild with patience, long may take 1 days, and it is risky to interrupt him (the database may go into recovery mode).
The user performing the operation must be the owner of the table, or a member of the sysadmin of that server, or the db_owner/db_ddladmin of the database.
Restructure (REORGANIZE) index to improve indexing efficiency
The reorganization will not lock any objects, he is an optimization of the current b-tree, organize data page processing and defragmentation. The example script for the reorganization index processing is as follows:
--Restructure the "IDX_REFNO" index on the "Orddemo" table by ALTER INDEX [IDX_REFNO] on [orddemo]reorganizego--Reorganization Orddemo Table All indexes on the ALTER index all on [ orddemo]reorganizego--restructure all indexes on Orddemo table in AdventureWorks2012 database DBCC INDEXDEFRAG (' AdventureWorks2012 ', ' Orddemo ') go-- Re-index IDX_REFNODBCC indexdefrag (' AdventureWorks2012 ', ' Orddemo ', ' idx_refno ') on Orddemo table in AdventureWorks2012 database GO
Note: The user performing the operation must be the owner of the table, or the sysadmin of the server, or the db_owner/db_ddladmin of the database.
Missing index found
Now that you've learned about the performance gains from indexing, it's hard to actually build enough correct and necessary indexes in the first place, how can we tell which tables need to be indexed and which are not?
Typically, SQL Server executes a query script with an existing index, and if no index is found, it is automatically generated and stored in the DMV (dynamic management view). This information is purged whenever the SQL Server service restarts, so it is best to keep the SQL Server service running until all of the business logic runs through the process of getting the missing indexes.
You can refer to the following links for more information about:
- Sys.dm_db_missing_index_details
- Sys.dm_db_missing_index_group_stats
- Sys.dm_db_missing_index_groups
- Sys.dm_db_missing_index_columns (Index_handle)
Provides a ready-made script:
SELECT Avg_ Total_user_cost * Avg_user_impact * (User_seeks + User_scans) as Possibleimprovement, Last_user_seek, Last_user_scan , statement as Object, ' CREATE INDEX [idx_ ' + CONVERT (varchar,gs. Group_handle) + ' _ ' + CONVERT (varchar,d.index_handle) + ' _ ' + replace (replace (replace ([statement], '] ', '), ' [', ' "), '. ' , ') + ' + ' + ' on ' + [statement] + ' (' + ISNULL (equality_columns, ') + case when equality_columns are not Null and Inequality_columns is isn't NULL then ', ' ELSE ' END + ISNULL (inequality_columns, ') + ') ' + Isnul L (' INCLUDE (' + included_columns + ') ', ') as Create_index_syntaxfrom sys.dm_db_missing_index_groups as Ginner JOIN Sys.dm_db_missing_index_group_stats as Gson gs.group_handle = G.index_group_handleinner JOIN sys.dm_db_missing_index_ Details as DON g.index_handle = D.index_handleorder by possibleimprovement DESC
PS: The information you obtain is a list of proposals, the final decision is yours, and the DMV saves up to 500 indexes.
Found unused indexes we built indexes to improve performance, but it would be cumbersome if the established index was not exploited.
For the same reason as the previous section, keep the SQL Server service running until all the business logic runs through it. Run the script:
SELECT Ind. index_id, obj. Name as TableName, Ind. Name as IndexName, Ind. Type_desc, Indusage.user_seeks, Indusage.user_scans, indusage.user_lookups, indusage.user_ Updates, Indusage.last_user_seek, indusage.last_user_scan, ' drop index [' + Ind.name + '] on [' + Obj.name + '] ' as dropindexcommandfrom sys.indexes as Indjoin sys.objects as Objon ind.object_id=obj. Object_idleft JOIN sys.dm_db_index_usage_stats indusageon ind.object_id = Indusage.object_idand ind . Index_id=indusage.index_idwhere ind.type_desc<> ' HEAP ' and obj.type<> ' S ' and objectproperty ( obj.object_id, ' isusertable ') = 1AND (isnull (indusage.user_seeks,0) = 0AND isnull (indusage.user_scans,0) = 0AND isnull (indusage.user_lookups,0) = 0) ORDER by Obj.name,ind. Namego
After obtaining this information, you decide what action to take. But when you decide to delete an index, note the following two points:
- If the current index is a primary key or a unique key, he can guarantee the integrity of the data
- A unique index, even if it is not used by itself, can provide information to the optimizer to help it generate a better execution plan
Build an indexed view (indexed view) to improve performance
A view is a stored query that behaves like a table. It has two main benefits:
- Restrict users to access only specific fields and specific data in a few tables
- Allows developers to organize raw information into user-oriented logical views in a customized way
Indexed views Parse/Optimize query statements when they are created, and store relevant information in a physical form in a database. Before you decide to use indexed views, consider the following recommendations:
- Views should not refer to other views
- Try to refer to any original table
- The field name must explicitly define the appropriate alias
It is not appropriate to use an indexed view if the processing query for the object is much less updated, or if the original table is a frequently updated table.
If you have a query that contains more totals (aggregation)/unions (joins) and the table has a large amount of data, consider using indexed views. The following parameters must be set using the indexed view (Numeric_roundabort is off and the rest is on)
- ARITHABORT
- Concat_null_yields_null
- Quoted_identifier
- Ansi_warnings
- Ansi_nulls
- Ansi_padding
- Numeric_roundabort
Sample script:
CREATE VIEW poviewwith schemabindingasselect POH. PurchaseOrderID , POH. OrderDate , EMP. LoginID , v.name as VendorName , SUM (POD. OrderQty) as OrderQty , SUM (POD. Orderqty*pod. UnitPrice) as Amount , COUNT_BIG (*) as Countfrom [purchasing].[ PurchaseOrderHeader] as Pohjoin [purchasing].[ PurchaseOrderDetail] as Podon POH. PurchaseOrderID = POD. Purchaseorderidjoin [humanresources].[ Employee] as Empon POH. Employeeid=emp. Businessentityidjoin [purchasing].[ Vendor] as VON POH. Vendorid=v.businessentityidgroup by POH. PurchaseOrderID , POH. OrderDate , EMP. LoginID , v.namego--build a clustered index on the view so that it becomes indexed view create UNIQUE CLUSTERED index Indexpoview on Poview ( PurchaseOrderID) GO
You can compare query statements with query execution plans for indexed views, which provide better query performance in the way that they are indexed:
The query optimizer of SQL Server always tries to find the best execution plan, sometimes although you build an indexed view, but the optimizer still uses the index on the original table, you can use the with NOEXPAND to force the index on the indexed view (not the index on the original table).
Indexed views are supported on various editions of SQL Server 2012, and query processors in either the development or Enterprise editions can even optimize queries that match indexed views.
The indexed view must be built with the with SCHEMABINDING to ensure that the fields used are not modified.
If the indexed view contains a GROUP by clause, you must include COUNT_BIG (*) in the SELECT clause, and you cannot specify having, CUBE, and ROLLUP.
Use indexes on calculated fields (Computed Columns) to improve performance
First, let's introduce the calculated field (Computed Columns), which refers to the other fields of the same table by an expression, and then computes a result. The value of this field will be recalculated each time it is called, unless you are building with the PERSISTED tag.
There are a few things to consider before deciding whether to index on a calculated field:
- Calculated fields as Image, Text, or ntext, which can only be used as a non-critical field of a nonclustered index (non-key column)
- calculated field expression cannot be a REAL or FLOAT type
- The calculated field should be accurate (? )
- The calculated field should be deterministic (enter the same value, output the same result)
- Calculated fields If a function is used, whether it is a user function or a system function, the owner of the table and function must be the same
- Functions for multiple rows of records (e.g. SUM, AVG) cannot be used in a calculated field
- Adding or deleting changes will change the value of the index on the calculated field, so the following 6 parameters must be set.
SET ANSI_NULLS on
SET ansi_padding on
SET ansi_warnings on
SET ARITHABORT on
SET Concat_null_yields_null on
SET QUOTED_IDENTIFIER ON
SET Numeric_roundabort OFF
Let's look at a complete example:
1. Set the system variables and build our test data sheet
SET ansi_nulls ONSET ansi_padding ONSET ansi_warnings ONSET ARITHABORT ONSET concat_null_yields_null ONSET quoted_identif IER ONSET numeric_roundabort offselect [SalesOrderID] , [salesorderdetailid] , [Carriertrackingnumber] , [OrderQty] , [ProductID] , [Specialofferid] , [Unitprice]into salesorderdetaildemofrom [AdventureWorks2012]. [Sales]. [SalesOrderDetail] GO
2. Create a user-defined function, and then create a calculated field and use this function
CREATE Function[dbo]. [Udftotalamount] (@TotalPrice Numeric (10,3), @FreightTINYINT) RETURNS Numeric (10,3) with Schemabindingasbegindeclare @NetPrice Numeric (10,3) SET @NetPrice = @TotalPrice + (@TotalPrice * @Freight/100) RETURN @NetPriceENDGO--adding computed column Salesorderdetaildemo tablealter TABLE Salesorderdetaildemoadd [Netprice] as [dbo]. [Udftotalamount] (orderqty*unitprice,5) GO
3. Create a clustered index, open the performance indicator switch, and execute a query (note that at this point we have not indexed the calculated field!) )
CREATE Clustered Index Idx_salesorderid_salesorderdetailid_salesorderdetaildemoon Salesorderdetaildemo ( Salesorderid,salesorderdetailid) go--checking Salesorderdetaildemo with statistics option on to--measure Performanceset STATISTICS IO ONSET STATISTICS time ongo--checking SELECT statement without have Index on Computedcolumns Elect * from Salesorderdetaildemo WHERE netprice>5000go
The output performance results are as follows:
SQL Server parse and compile time: CPU time = 650 ms, Elapsed time = 650 Ms. SQL Server parse and compile time: CPU time = 0 ms, Elapsed time = 0 Ms.
(3864 row (s) affected)
Table ' Salesorderdetaildemo '. Scan count 1, logical reads 757, Physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB Physical reads 0, lob read-ahead reads 0.
SQL Server Execution times: CPU time = 562 ms, Elapsed time = 678 Ms. |
4. Before establishing an index on a calculated field, you can use the following script to confirm that the previously mentioned creation requirements are met: (return value: 0 not satisfied, 1 satisfied)
Selectcolumnproperty (object_id (' Salesorderdetaildemo '), ' netprice ', ' IsIndexable ') as ' indexable? ', ColumnProperty ( OBJECT_ID (' Salesorderdetaildemo '), ' netprice ', ' isdeterministic ') as ' deterministic? ', ObjectProperty (' OBJECT_ID ' Udftotalamount '), ' isdeterministic ') ' Udfdeterministic ', ColumnProperty (object_id (' Salesorderdetaildemo '), ' Netprice ', ' isprecise ') as ' precise? '
5. Build the index when required, and execute the previous query statement again
CREATE INDEX Idx_salesorderdetaildemo_netpriceon Salesorderdetaildemo (netprice) goselect * from Salesorderdetaildemo WHERE Netprice>5000go
The performance results are as follows:
SQL Server parse and compile time: CPU time = 0 ms, Elapsed time = 0 Ms. SQL Server parse and compile time: CPU time = 0 ms, Elapsed time = 0 Ms.
(3864 row (s) affected)
Table ' Salesorderdetaildemo '. Scan count 1, logical reads 757, Physical reads 0, Read-ahead reads 0, LOB logical reads 0, LOB Physical reads 0, lob read-ahead reads 0.
SQL Server Execution times: CPU time = 546 ms, Elapsed time = 622 Ms. |
Confirm the disk space occupied by the index
SELECT case index_id if 0 Then ' HEAP ' while 1 Then ' Clustered index ' ELSE ' non-clustered index ' END As Index_type, SUM (case is filledpage > Pagetodeduct then (filledpage-pagetodeduct) ELSE 0 END) * 8 Index_sizefrom ( SELECT partition_id, index_id, SUM (Used_page_count) as Filledpage, SUM (case if (index_id < 2) then (In_row_data_page_count + Lob_used_page_count + row_overflow_used _page_count) ELSE lob_used_page_count + row_overflow_used_page_count END ) as Pagetodeduct from sys.dm_db_partition_stats GROUP by partition_id,index_id) as Innertablegroup by Index_idgo
PS: The unit of output is in KB