MS SQL Server Query optimization method

Source: Internet
Author: User
Tags end execution functions sql server query mssql sql variables version
There are many reasons for the slow server| optimization query, common as follows: 1, no index or no index (this is the most common problem of query, is the defect of program design) 2, I/O throughput is small, resulting in a bottleneck effect. 3. No computed columns were created to cause the query to be not optimized. 4, less than 5 memory, slow network speed 6, the amount of data queried is too large (you can use multiple queries, other methods to reduce the amount of data) 7, lock or deadlock (this is also the most common problem of query, is the defect of program design) 8, sp_lock,sp_who, active user view, because of reading and writing competitive resources. 9, returned unnecessary row and column 10, the query statement is not good, no optimization can be done by the following methods to optimize the query:  1, the data, log, index to the different I/O devices, increase the reading speed, before the tempdb should be placed on the RAID0, SQL2000 not support. The larger the amount of data (size), the more important I/O is.  2, longitudinal, transverse partition table, reduce the size of the table (Sp_spaceuse)  3, upgrade hardware  4, according to query conditions, index, optimize the index, optimize access mode, limit the data volume of the result set. Note that the fill factor should be appropriate (preferably with the default value of 0). The index should be as small as possible, using a small number of bytes of Lie Jian index Good (reference to the creation of the index), not to a limited number of characters Jianjian a single index such as Gender field  5, improve speed; &NBSP;6, expanding the server's memory, Windows 2000 and SQL Server 2000 can support 4-8g memory. Configure virtual Memory: The virtual memory size should be configured based on the services that are running concurrently on the computer. Run Microsoft SQL Server? When you are 2000, consider setting the virtual memory size to 1.5 times times the physical memory installed on your computer. If you installed the Full-text Search feature and intend to run the Microsoft search service to perform full-text indexing and querying, consider: Configure the virtual memory size to be at least 3 times times the physical memory installed on your computer. Configure the SQL Server max server memory server configuration option to 1.5 times times the physical memory (half of the virtual memory size setting).  7, increase the number of server CPUs; but it must be understood that parallel processing of serial processing requires resources such as memory. The use of parallel or serial stroke is the automatic evaluation of MSSQL selection. A single task can be run on a processor by breaking it into multiple tasks. For example, delays in sorting, linking, scanning, and group by words, SQL Server determines the optimal level of parallelism based on the load of the system, and complex queries that consume a large number of CPUs are best suited andRow processing. However, the update operation Update,insert,delete cannot be processed in parallel.  8, if you are using like to query, the simple use of index is not possible, but full-text indexing, space consumption. Like ' a% ' using the index as '%a ' without using the index as a '%a% ' query, query time is proportional to the total length of the field value, so you cannot use the char type, but the varchar. A full-text index that is long for a field's value.  9, DB server and Application server separation, OLTP and OLAP separation  10, distributed partitioned views can be used to implement a federation of database servers. A consortium is a set of separately managed servers, but they work together to share the processing load of the system. This mechanism for creating a federation of database servers through partitioned data can expand a set of servers to support the processing needs of large, multi-tiered Web sites. For more information, see Designing a federated database server. (referencing the SQL Help file ' partitioned view ')   a, before implementing a partitioned view, you must first   b a partitioned table, define a distributed partitioned view on each member server after the member table is created, and each view has the same name. In this way, queries that reference the distributed partitioned view name can run on any one member server. The system operates like a copy of the original table on each member server, but in fact there is only one member table and one distributed partitioned view per server. The location of the data is transparent to the application. &NBSP;11, rebuilds the index DBCC REINDEX, DBCC INDEXDEFRAG, shrinks data and logs DBCC SHRINKDB,DBCC shrinkfile. Set up automatic shrink logging. For large databases do not set up automatic database growth, it can degrade server performance. There is a great deal of attention in the writing of T-SQL, the following list of common points: first, the process that the DBMS processes the query plan is as follows:   1, lexical, grammatical check   2 of the query statement, query optimizer that submits the statement to the DBMS    3, optimizer to do algebraic optimization and access path optimization   4, from the precompiled module to generate query planning   5, and then submitted to the system at the right time to process the execution   6, and finally return the results to the user second, Look at the structure of SQL Server data storage: A page size of 8K (8060) bytes, 8 pages for a panel, according to the B-treeStored. The difference between  12, commit, and Rollback Rollback: Roll back all things. Commit: Commit the current thing. There is no need to write things in dynamic SQL, if you want to write please write in the outside such as: Begin TRAN EXEC (@s) commit trans or write dynamic SQL as a function or stored procedure. &NBSP;13, in the query SELECT statement with the WHERE clause to limit the number of rows returned, to avoid table scanning, if the return of unnecessary data, waste the server's I/O resources, increased the burden of the network to reduce performance. If the table is large, locks the table during table scans, and prevents other joins from accessing the table, with serious consequences. &NBSP;14, SQL annotations Declare no effect on execution  15, use the cursor as much as possible, and it takes up a lot of resources. If you need to row-by-row, try to use non-cursor technology, such as: In the client loop, with temporary tables, table variables, with subqueries, with case statements and so on. Cursors can be categorized according to the extraction options it supports: Only the rows must be fetched in the order from the first line to the last line. Fetch NEXT is the only extraction operation allowed and the default way. Scrollable can randomly extract any row anywhere in the cursor. The technique of cursors becomes powerful under SQL2000, and his purpose is to support loops. There are four concurrent options READ_ONLY: Cursor positioning update (update) is not allowed, and there are no locks in the rows that make up the result set. Optimistic with ValueS: Optimistic concurrency control is a standard part of the theory of transaction control. Optimistic concurrency control is used in situations where there is only a small opportunity for a second user to update a row in the interval between opening the cursor and updating the row. When a cursor is opened with this option, there is no lock to control the rows in it, which helps to maximize its processing power. If the user attempts to modify a row, the current value of this row is compared to the value obtained when the row was last fetched. If any value changes, the server knows that someone else has updated the row and returns an error. If the value is the same, the server performs the modification. Select this concurrency option optimistic with row VERSIONING: This optimistic concurrency control option is based on row versioning. With row versioning, the table must have a version identifier that the server can use to determine whether the row has changed since it was read into the cursor. In SQL Server, this performance is provided by the timestamp data type, which is a binary number that represents the relative order of changes in the database. Each database has a global current timestamp value: @ @DBTS. Change with timestamp column every time in any way, SQL Server stores the current @ @DBTS value in the timestamp column first, and then increases the value of the @ @DBTS. If a table has a timestamp column, the timestamp is recorded at the row level. The server can compare the current timestamp value of a row with the timestamp value stored at the last fetch to determine whether the row has been updated. The server does not have to compare the values of all columns, just compare the timestamp columns. If the application requires optimistic concurrency based on row versioning for tables that do not have timestamp columns, then Reise considers optimistic concurrency control based on numeric values. SCROLL LOCKS This option to implement pessimistic concurrency control. In pessimistic concurrency control, the application attempts to lock the database row when the row of the database is read into the cursor result set. When a server cursor is used, an update lock is placed on the row when it is read into the cursor. If a cursor is opened within a transaction, the transaction update lock remains until the transaction is committed or rolled back, and the cursor lock is dropped when the next row is fetched. If you open a cursor outside of a transaction, the lock is discarded when the next row is fetched. Therefore, whenever a user needs full pessimistic concurrency control, the cursor should open within the transaction. Updating a lock prevents any other task from obtaining an update or exclusive lock, preventing other tasks from updating the row. However, the update lock does not block shared locks, so it does not prevent other tasks from reading rows unless the second task also requires a read with an update lock. Scroll locks can generate scroll locks based on the lock hints specified in the SELECT statement defined in the cursor. The scroll lock is fetched on each line during extraction and is persisted to the next fetch or the cursor is closed, whichever occurs first. The next time the fetch is fetched, the server acquires a scroll lock for the row in the new fetch and releases the scroll lock from the last fetch of the rows. The scroll LOCK is independent of the transaction lock and can be persisted after a commit or rollback operation. If the option to close the cursor at commit is off, the commit statement does not close any open cursors, and the scroll lock is persisted to the commit to maintain isolation of the extracted data. The type of the obtained scroll lock depends on the cursor concurrency option and the lock hint in the cursor SELECT statement. Lock prompt Read Only optimistic value optimistic row version control lock silent unlocked unlocked Unlocked update NOLOCK unlocked unlocked unlocked unlocked HOLDLOCK shared share share update UPDLOCK error update update update TABLOCKX error unlocked Lock updates Other unlocked unlocked unlocked updates * Specifies that the NOLOCK prompt will cause the table specified for that hint to be read-only in Gianpene. &NBSP;16, use Profiler to track the query, get the time needed for the query, find out the problem of SQL; Optimize indexing with the index optimizer  17, attention Union and UniOn the difference of all. UNION ALL good  18, pay attention to use distinct, do not use when not necessary, it is same as union will make query slow. Duplicate records in the query is no problem  19, query do not return unwanted rows, columns  20, with sp_configure ' query governor cost limit ' or set Query_governor_cost_ Limit to limit the resources that the query consumes. When an estimate query consumes more resources than the limit, the server automatically cancels the query and kills it before the query. Set Locktime sets the time of the lock  21, using the Select top 100/10 Percent to limit the number of rows returned by the user or set rowcount to restrict the line  22 of the operation, before SQL2000, generally do not use the following words: ' Is NULL ', ' <> ', '!= ', '!> ', '!< ', ' not ', ' not ' EXISTS ', ' Not in ', ' Don't like ', and ' like '%500 ', because they don't go the index is all table scan. Also, do not add a function to the column name in the WHERE clause, such as convert,substring, if you must use a function, create a computed column and then create an index instead. You can also work around it: WHERE SUBSTRING (firstname,1,1) = ' M ' instead of where FirstName like ' m% ' (index Scan), be sure to separate the function from the column name. And the index cannot be built too much and too large. Not in will scan the table multiple times, using EXISTS, not EXISTS, in, and left OUTER joins to replace them, especially the left-hand connection, and EXISTS is faster than in. The slowest is the not operation. If the value of the column is empty, the previous index does not work, and the 2000 optimizer is now able to handle it. The same is NULL, ' not ', ' not EXISTS ', ' not in ' can optimize her, and ' <> ' is still not optimized and cannot be indexed.  23, using Query Analyzer, view the query plan for the SQL statement and evaluate whether the analysis is optimized for SQL. The average 20% of the code occupies 80% of the resources, and the focus of our optimization is these slow places. &NBSP;24, if you use in or or, and so on, find that the query is not indexed, specify the index using the display declaration: SELECT * FROM PersonmeMber (INDEX = ix_title) WHERE ProcessID in (' Male ', ' female ')  25, the results of the need query are calculated in advance and placed in the table, then select the query. This is the most important means before SQL7.0. For example, hospital hospitalization costs are calculated.  26, MIN () and MAX () can use the appropriate index. &NBSP;27, the database has a principle is the code from the data closer to the better, so priority to choose Default, followed by Rules,triggers, Constraint (constraints such as the health of the main health checkunique ..., the maximum length of the data type, etc. are constraints), Procedure. This not only maintenance work small, the writing program quality is high, and the execution speed is fast. &NBSP;28, if you want to insert a large binary value into an image column, use a stored procedure and never insert with inline inserts (I don't know if Java is). Because the application first converts the binary value to a string (twice times the size), the server is converted to a binary value after the character. Stored procedures do not have these actions: Method: Create procedure P_insert as insert into table ( Fimage) VALUES (@image), which call this stored procedure in the foreground to pass in binary parameters, which improves the processing speed significantly.  29, between are faster than in-speed at some point, and between can quickly find a range based on the index. The query optimizer shows differences. SELECT * from Chineseresume where title in (' Male ', ' female ') select * from Chineseresume where between ' men ' and ' women ' are the same. Because in is more than once, it is sometimes slower.  30, where it is necessary to create indexes on global or local temporary tables, can sometimes improve speed, but not necessarily, because indexes also consume a lot of resources. He created the same as the actual table.  31, don't build things that don't work. For example, waste resources when generating reports. Use it only when it is necessary to use things.  32, or words, can be decomposed into multiple queries, and multiple queries are connected by union. Their speed is only related to whether or not the index is used, and if the query needs to use a federated index, it is more efficient to execute with UNION all. Multiple or phrases are not indexed, converted to union, and then try to match the index. A key question is whether to use the index. &NBSP;33, use as little as possibleView, it's inefficient. The view operation is slower than the direct table operation, and you can replace her with stored procedure. Specifically, do not use view nesting, nested views increase the difficulty of finding raw materials. We look at the nature of the view: It is an optimized SQL stored on the server that has generated a query plan. When retrieving data from a single table, do not use a view that points to multiple tables, retrieve it directly from the table, or simply include a view that contains the table, otherwise it adds unnecessary overhead and interferes with the query. MSSQL increases the functionality of the view index in order to expedite query for the view. &NBSP;34, do not use distinct and order by when necessary, these actions can be changed to client execution. They add extra overhead. This is the same reason as Union and union all. SELECT top ad.companyname,comid,position,ad.referenceid,worklocation, convert (varchar, ad.postdate,120) as Postdate1,workyear,degreedescription from Jobcn_query.dbo.COMPANYAD_query ad where Referenceid in (' JCNAD00329667 ', ' JCNAD132168 ', ' JCNAD00337748 ', ' JCNAD00338345 ', ' JCNAD00333138 ', ' JCNAD00303570 ', ' JCNAD00303569 ', ' JCNAD00303568 ', ' JCNAD00306698 ', ' JCNAD00231935 ', ' JCNAD00231933 ', ' JCNAD00254567 ', ' JCNAD00254585 ', ' JCNAD00254608 ', ' JCNAD00254607 ' , ' JCNAD00258524 ', ' JCNAD00332133 ', ' JCNAD00268618 ', ' JCNAD00279196 ', ' JCNAD00268613 ') Order by postdate Desc, In the in post-face list, place the most frequently occurring values at the front, with the fewest occurrences at the end, reducing the number of judgements. &NBSP;36, when used with select INTO, locks the system table (Sysobjects,sysindexes, and so on), blocking access to other connections. When you create a temporary table, you use a display declaration statement instead of SELECT INTO. DROP table T_LXH BEGIN TRAN SELECT * into T_lxh from chineseresume where name = ' XYZ '--commit in another connection select * from Sysobje The CTS can see that the SELECT into locks up the system table, and the Create table locks the system table (whether it is a temporary table or a system table). So don't use it in things!!! In this case, use a real table, or a temporary table variable, if it's a temporary table that you often use.  37, generally before the group by a having words can eliminate the extra lines, so try not to use them to do the work of culling. Their execution order should be optimal: Select the WHERE clause selects all the appropriate rows, group by is used to group a statistic row, and the HAVING clause is used to remove the unwanted groupings. So the group by has a having the expense is small, the query is fast. For large data rows to be grouped and having very consuming resources. If the purpose of group by does not include calculation, just grouping, then use distinct faster  38, update more than one record score more than once each time a fast, that is batch processing good  39, less temporary table, as far as possible with the result set and table class variables to replace it, A variable of table type is better than a temporary table  40, under SQL2000, the calculated field is indexed, and the conditions that need to be met are as follows:   a, the expression of the calculated field is determined   b, cannot be used in Text,ntext, Image data type   c, must be prepared with the following options ansi_nulls = on, ansi_paddings = ON, ...  41, try to put the data processing work on the server, reduce the overhead of the network, such as using stored procedures. Stored procedures are compiled, optimized, and organized into an execution plan, and stored in the database of SQL statements, is the control of the Flow language collection, of course, fast. Dynamic SQL, which is executed repeatedly, can use temporary stored procedures, which (temporary tables) are placed in tempdb. Because SQL Server did not support complex math calculations before, it was necessary to put this work on other layers and increase the overhead of the network. SQL2000 supports UDFs, now supports complex mathematical calculations, the return value of the function is not too large, such a large overhead. User-defined functions consume a large amount of resources, like cursors, if large results are returned using stored procedures  42, not in aUse the same function repeatedly, waste resources, put the result in the variable and call faster  43, SELECT COUNT (*) efficiency teaches low, try to adapt his writing, and exists fast. Note also the difference: Select COUNT (Field of NULL The return value of the From table and select count (Field of NOT null) from table is different!!! &NBSP;44, when the server has enough memory, the number of threads = maximum number of connections +5, which can maximize the efficiency; otherwise, use the thread pool of the configuration threads < maximum connections to enable SQL Server thread pooling to resolve, if still quantity = maximum number of connections +5, severely damage server performance. &NBSP;45, visit your table in a certain order. If you lock table a first and then lock table B, then lock them in this order in all stored procedures. If you (inadvertently) lock table B in a stored procedure and then lock table A, this can cause a deadlock. If the locking order is not well designed in advance, deadlocks can be difficult to detect  46, monitor the load of the corresponding hardware through SQL Server Performance Monitor Memory:page faults/sec counter If the value is occasionally higher, Indicates that the thread was competing for memory. If it continues to be high, memory may be a bottleneck. Process:   1,% DPC time refers to the percentage of the processor used to receive and provide services in the deferred program call (DPC) during the sample interval. (DPC is running at a lower interval than the standard interval priority). Because DPC is performed in privileged mode, the percentage of DPC time is part of the privileged time percentage. These times are calculated separately and are not part of the total interval calculation. This total shows the average busy time as a percentage of the instance time.   2,%processor Time counter if the value of this parameter lasts more than 95%, the bottleneck is CPU. Consider adding a processor or switching to a faster processor.   3,% Privileged Time is the percentage of the non idle processor hours used for privileged mode. (Privileged mode is a processing pattern designed for operating system components and manipulating hardware drivers.) It allows direct access to hardware and all memory. Another mode is User mode, which is a kind of limited processing mode designed for application, environment and integer system. The operating system converts application threads into privileged mode to access operating system services. % of privileged time included for interBreak and DPC provide the service time. A high privileged time ratio may be caused by a large number of gaps in a failed device.  This counter will display the average busy time as part of the sample time.   4,% User time represents CPU-consuming database operations, such as sorting, executing aggregate functions, and so on. If the value is high, consider adding an index to reduce the value by using simple table joins, and horizontally dividing the large table. Physical DISK:CURRETN Disk Queue Length counter this value should be no more than 1.5~2 times the number of disks. To improve performance, you can increase the disk. Sqlserver:cache Hit ratio counter the higher the value the better. If you continue below 80%, you should consider adding more memory. Note the value of this parameter is cumulative since the start of SQL Server, so the value will not reflect the current value of the system after a period of time.  47, analysis Select Emp_name form employee where salary > 3000 If salary is a float type in this statement, the optimizer optimizes it to convert (float,3000). Since 3000 is an integer, we should use 3000.0 in programming and not allow the DBMS to be transformed while it is running. Conversion of the same character and integer data.  48, query associated with write order select A.personmemberid, * from Chineseresume a,personmember b where Personmemberid = B.referenceid D a.personmemberid = ' JCNPRH39681 ' (a = b, b = ' number ') Select A.personmemberid, * from Chineseresume A,personmember B where A.personmemberid = B.referenceid and A.personmemberid = ' JCNPRH39681 ' and B.referenceid = ' JCNPRH39681 ' (a = b, b = ' number ', A = ' number ') Select A.personmemberid, * from Chineseresume a,personmember b where B.referenceid = ' JCNPRH39681 ' and A.personmemberid = ' JCNPRH39681 ' (B = ' number ', a = ' number ')  49, (1) IF No owner code is entered THEN code1=0 code2=9999 ELS E code1=code2= Owner code End If Execute SQL statement as: SELECT owner name from P2000 WHERE owner code &GT;=:CODE1 and owner code <=:code2 (2) IF no owner code input T HEN Select owner name from P2000 ELSE code= owner code SELECT owner code from P2000 WHERE owner code =:code end IF The first method uses only one SQL statement, the second method uses two SQL Statement. When the owner code is not entered, the second method is obviously more efficient than the first method because it has no constraints; When you enter the owner code, the second method is still more efficient than the first, not only a limited condition, but also the fastest query operation because of the equality operation. We write programs don't be afraid of trouble  50, about JOBCN now query paging new methods (as follows), using Performance Optimizer to analyze performance bottlenecks, if the I/O or the speed of the network, such as the next method optimization effective, if the CPU or memory, with the current method better. Please distinguish the following methods to show that the smaller the index the better. Begin DECLARE @local_variable Table (FID int identity (1,1), Referenceid varchar) insert INTO @local_variable (referenc EID) Select top 100000 Referenceid to Chineseresume ORDER by Referenceid select * from @local_variable where Fid > 40 FID <= and begin DECLARE @local_variable table (FID int identity (1,1), Referenceid varchar) insert INTO @l Ocal_variable (Referenceid) select top 100000 REferenceid from Chineseresume to updatedate select * from @local_variable where FID > and FID <= Begin CREATE TABLE #temp (FID int identity (1,1), Referenceid varchar) insert INTO #temp (referenceid) SELECT top 10000 0 Referenceid from Chineseresume to updatedate select * from #temp where FID > and FID <= a drop table #te MP End

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.