2. What factors do you think affect SQL Server efficiency when writing T-SQL (including stored procedures, functions, and views)?
Summary - Index usage
- Missing indexes, indexes that go unused, low I/O throughput, and missing computed columns all lead to poor query optimization, locking or deadlocks, and insufficiently optimized query statements.
- For large tables, statements that do not use an index, that scan when they should seek, or that force hints may over time prevent SQL Server from generating the optimal plan.
- Avoid filter conditions that disable index use, for example wrapping the column in a function: WHERE dbo.ufn_Test(ID) = 1 (see the sketch after this list).
- Ensure that an index suitable for the query exists.
- Create indexes based on the query conditions, optimize the indexes and access methods, and limit the size of the result set. Note that the fill factor should be appropriate (the default value of 0 is usually best).
- A LIKE query cannot always use an index: LIKE 'a%' can use the index, while LIKE '%a' and LIKE '%a%' cannot, so their query time is proportional to the total length of the field values. For such fields use varchar rather than char, and for long field values consider a full-text index, which does consume space.
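A minimal sketch of the two points above; the tables and columns are hypothetical, and indexes on OrderDate and Name are assumed:

    -- Non-SARGable: the function hides OrderDate from the index
    SELECT OrderID FROM dbo.Orders WHERE YEAR(OrderDate) = 2012;

    -- SARGable rewrite: the bare column allows an index seek
    SELECT OrderID FROM dbo.Orders
    WHERE OrderDate >= '20120101' AND OrderDate < '20130101';

    -- LIKE and indexes
    SELECT Name FROM dbo.Products WHERE Name LIKE 'a%';   -- can seek an index on Name
    SELECT Name FROM dbo.Products WHERE Name LIKE '%a%';  -- must scan; consider full-text search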
Summary - Cursors
- Frequently opening and closing a cursor consumes resources and places locks on the related tables.
- Where possible, use a set operation to process the data instead of a cursor (see the sketch after this list).
- In one reported case, a stored procedure was sped up by moving its function logic into the cursor's calculation.
- Consider rewriting the cursor with another method (verify whether the rewrite is actually effective; it may be worse than using the cursor directly), or else make sure the cursor itself is efficient.
- Individual supplement:
A) For cases where row-by-row processing is genuinely required, use a cursor directly; to avoid affecting the cursor's source data, consider a static cursor. When set-based processing could replace the cursor, also weigh the complexity of the resulting query and the data volume involved: if the rewritten query is very complex or touches a large amount of data, the cursor may produce better or more stable results. Determine the final scheme through testing and verification.
B) Select the appropriate cursor type.
C) If cursors regularly process large amounts of data, consider adjusting the "cursor threshold" server option, which controls asynchronous cursor population.
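A minimal sketch contrasting the two approaches; the table and columns are made up for illustration:

    -- Cursor version: row-by-row processing
    DECLARE @ID int;
    DECLARE cur CURSOR LOCAL STATIC FOR
        SELECT ID FROM dbo.Accounts WHERE Balance < 0;
    OPEN cur;
    FETCH NEXT FROM cur INTO @ID;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        UPDATE dbo.Accounts SET Flag = 1 WHERE ID = @ID;
        FETCH NEXT FROM cur INTO @ID;
    END
    CLOSE cur;
    DEALLOCATE cur;

    -- Set-based version: one statement does the same work
    UPDATE dbo.Accounts SET Flag = 1 WHERE Balance < 0;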
Summary - Temporary tables and table variables
- The choice between table variables and temporary tables must be made deliberately, and the data size must be estimated.
- Temporary tables and table variables should be treated differently.
- Individual supplement: table variables are not controlled by transactions, so writing data to them is faster than to temporary tables. However, SQL Server does not generate statistics for table variables, which means that when a table variable participates in a query the optimizer cannot evaluate its data and therefore cannot cost the query plan effectively. So when the data needs to be queried afterwards, a temporary table is generally recommended (this is especially important in complex queries). Note that table variables do not live only in memory: when the data volume is large, the data is still written to tempdb.
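A minimal sketch of the two choices; the table names are hypothetical:

    -- Table variable: no statistics, so the optimizer assumes very few rows
    DECLARE @t TABLE (ID int PRIMARY KEY);
    INSERT INTO @t SELECT ID FROM dbo.BigTable;   -- may actually hold many rows

    -- Temporary table: statistics are created, so joins are costed realistically
    CREATE TABLE #t (ID int PRIMARY KEY);
    INSERT INTO #t SELECT ID FROM dbo.BigTable;
    SELECT b.* FROM dbo.BigTable AS b JOIN #t AS t ON b.ID = t.ID;
    DROP TABLE #t;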
Discussion Summary - Comprehensive
- One advantage of stored procedures, views, and functions is the reuse of query plans (pre-compilation). Note, however, that recompilation can still occur, for example when a stored procedure references a temporary table or when a view's underlying structure changes.
- Avoid inappropriate data types, such as types inconsistent with the table's fields or implicit conversions (for example, converting characters to numbers).
- Use SET NOCOUNT ON to avoid unnecessary network traffic from DML operations, as in the sketch below.
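A minimal sketch of where the setting goes; the procedure and table are hypothetical:

    CREATE PROCEDURE dbo.usp_Example
    AS
    BEGIN
        SET NOCOUNT ON;  -- suppress the 'N rows affected' messages sent to the client
        UPDATE dbo.Orders SET Status = 1 WHERE Status = 0;
    END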
- The most typical case is a function applied to a field in a condition. Incorrect implementation logic causes performance problems; of course, this is not specific to any DB platform, it happens everywhere.
- When writing SQL statements, keep them simple and easy to read where you can. Causes of lost efficiency include (see the batching sketch after this list):
A) Overly large transactions, which cause serious blocking between transactions.
B) Returning useless data, which creates disk and network pressure.
C) Too many non-set operations, such as calling a function for every returned row.
D) Negative logic in queries.
E) Unnecessary sorting and querying of unneeded data.
F) Redundant judgment logic and multi-layer nesting of views.
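For item A, one common remedy is to break a large modification into short batches; a minimal sketch, with an illustrative table and batch size:

    -- Delete in small batches so each transaction stays short and blocking is limited
    DECLARE @rows int = 1;
    WHILE @rows > 0
    BEGIN
        DELETE TOP (5000) FROM dbo.AuditLog WHERE LogDate < '20120101';
        SET @rows = @@ROWCOUNT;
    END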
- Check for poorly written SQL statements and whether anything in them can be optimized (a rewrite and a paging sketch follow this list):
A) Check subqueries: can they be rewritten as simple joins?
B) Check what plan the database optimizer produces.
C) Avoid large transactions, to improve system concurrency.
D) Avoid returning large data volumes to the client; use paging, for example.
E) Avoid using cursors to join and traverse large data sets.
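Minimal sketches for items A and D; the tables are made up, and the paging syntax assumes SQL Server 2012 or later:

    -- A) A subquery rewritten as a join (equivalent only if CustomerID is unique in Customers)
    SELECT o.OrderID
    FROM dbo.Orders AS o
    WHERE o.CustomerID IN (SELECT c.CustomerID FROM dbo.Customers AS c WHERE c.Region = N'East');

    SELECT o.OrderID
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID
    WHERE c.Region = N'East';

    -- D) Paging: return one page of 50 rows instead of the whole result set
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    ORDER BY OrderID
    OFFSET 100 ROWS FETCH NEXT 50 ROWS ONLY;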
- Optimization of the database engine itself: every database's query engine keeps evolving, and statements that used to be efficient can become outdated.
- Database design optimization: the design structure of the data also changes how SQL statements should be written for different subjects and analysis directions.
- Some 70% of performance problems are caused by poorly written SQL: conditions that are not SARGable, abused cursors, overly long transactions, and business logic that pursues functionality while ignoring performance.
- Avoid DISTINCT on large tables.
- Put rules on T-SQL complexity; do not allow unlimited nesting, UNIONs, and so on. The size of the data retrieved by a T-SQL statement also needs to be estimated.
- Use a WHERE clause in SELECT statements to limit the number of returned rows and avoid table scans. Returning unnecessary data wastes the server's I/O resources, increases the network burden, and reduces performance. If the table is large, a table scan locks the table and blocks other connections from accessing it, with serious consequences.
- Stored procedures should be properly designed to improve database efficiency, mainly through well-structured tables and reasonable query methods.
- Keeping view data in memory for execution improves efficiency, but requires additional memory.
- Joins should hit an index, and nesting should be kept shallow; you can use temporary tables for testing.
- After designing an SQL statement, run EXPLAIN on it to check whether an index is used and whether a filesort occurs, and check whether the number of retrieved rows (rows) is too large. In general:
A) rows < 1,000 is acceptable.
B) rows between 1,000 and 10,000 may cause performance problems under intensive access, but is acceptable if access is not too frequent (say, once a minute); such statements are hard to optimize further, but you should keep an eye on them.
C) When rows exceeds 10,000, carefully review the SQL design, optimize the SQL, and optimize the database; in general, frequent execution should not be allowed (keep it below about once per hour).
D) When rows reaches the ten-million level, the statement cannot run as real-time SQL. Data imports are the exception, and their timing and frequency must be well controlled.
- Run EXPLAIN in advance, in the test environment or even the production environment, when designing SQL statements, especially slightly complex ones. A sketch of the SQL Server equivalents follows.
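On SQL Server, the equivalent of EXPLAIN is the execution plan plus the I/O and time statistics; a minimal sketch against a hypothetical table:

    SET STATISTICS IO ON;    -- report logical reads per table
    SET STATISTICS TIME ON;  -- report parse/compile and execution times
    SELECT OrderID FROM dbo.Orders WHERE OrderDate >= '20120101';
    GO
    -- Or request the estimated plan without running the query
    SET SHOWPLAN_XML ON;
    GO
    SELECT OrderID FROM dbo.Orders WHERE OrderDate >= '20120101';
    GO
    SET SHOWPLAN_XML OFF;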
- Avoid non-SARGable conditions in SQL statements.
- Avoid using cursors and triggers whenever possible.
- Avoid long transactions, blocking, and deadlocks.
- Complex logic can be implemented in multiple steps.
- Dynamic concatenation of SQL statements is another factor that affects SQL Server efficiency.
- Join queries over many tables in a stored procedure also affect SQL Server efficiency. Of course, this is a problem for many databases, probably because the CBO fails to generate a good execution plan; use temporary tables as a last resort.
- Use functions or stored procedures instead of complex views to improve performance.
Personal supplement: evaluate carefully when using user-defined functions; it is best to ensure that a function returning a single value is a deterministic function. The query optimizer does not create statistics for multi-statement table-valued functions, so using one in a joined query may produce a poor plan. The sketch below contrasts the two kinds of table-valued function.
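A minimal sketch of the difference; the function and table names are hypothetical:

    -- Multi-statement TVF: opaque to the optimizer, no statistics
    CREATE FUNCTION dbo.ufn_BigOrders_Multi()
    RETURNS @r TABLE (OrderID int)
    AS
    BEGIN
        INSERT INTO @r SELECT OrderID FROM dbo.Orders WHERE Amount > 1000;
        RETURN;
    END
    GO
    -- Inline TVF: expanded into the calling query like a view, so it can be costed
    CREATE FUNCTION dbo.ufn_BigOrders_Inline()
    RETURNS TABLE
    AS
    RETURN (SELECT OrderID FROM dbo.Orders WHERE Amount > 1000);
    GO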
- SQL Server does not have Oracle's non-blocking reads: there are not only write/write conflicts but also read/write conflicts. These should be avoided with great effort, for example by using short transactions, setting up and using indexes properly, and, if possible, splitting reads from writes. Many factors contribute to SQL bottlenecks: insufficient memory or hardware, too little memory allocated to SQL Server, a lack of useful indexes, poor network communication, and disk configuration such as tempdb placement, as well as whether the query optimizer is given favorable conditions for optimizing complex queries.
Personal supplement: for read/write conflicts, you can control the transaction isolation level (or use the related table hints), as sketched below.
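A minimal sketch using row versioning, assuming SQL Server 2005 or later and a database (here called MyDb as a placeholder) that you are allowed to reconfigure:

    -- Row versioning lets readers avoid blocking writers, and vice versa
    ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;

    -- Or opt in per session with snapshot isolation
    ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRAN;
    SELECT Balance FROM dbo.Accounts WHERE ID = 1;  -- reads a consistent row version without shared locks
    COMMIT;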
- Data magnitude: an SQL statement that is efficient at one data volume may not be efficient at a much larger one, and vice versa.
Personal supplement: T-SQL execution consists of two steps, compilation (evaluating and selecting a query plan) and execution. In the plan-selection step, the query optimizer evaluates candidate plans based on statistics and object structure; of course, if the query is too complex, or the information needed for cost evaluation is inaccurate, the evaluation result can be wrong. In most cases, though, the chosen plan does not need to be adjusted by hand as the data volume changes.
- Compared with Oracle, T-SQL in SQL Server lacks a complete set of analytic functions, and it also lacks some row/column conversion and tree (directory-structure) functions, although these can be handled with recursive CTEs, XML, and other features.
Personal supplement: the number of built-in functions is not directly comparable; each database has its own considerations, but the common functions are all there, and for anything missing you can consider writing a CLR function yourself. For row/column conversion there are PIVOT and UNPIVOT; for tree structures there is the hierarchyid data type and its associated functions. Two sketches follow.
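Two minimal sketches; the tables and columns are made up:

    -- Row-to-column conversion with PIVOT
    SELECT SalesYear, [1] AS Q1, [2] AS Q2, [3] AS Q3, [4] AS Q4
    FROM (SELECT SalesYear, Quarter, Amount FROM dbo.Sales) AS s
    PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS p;

    -- Tree traversal with a recursive CTE
    WITH Tree AS (
        SELECT ID, ParentID, Name, 0 AS Lvl FROM dbo.Folders WHERE ParentID IS NULL
        UNION ALL
        SELECT f.ID, f.ParentID, f.Name, t.Lvl + 1
        FROM dbo.Folders AS f JOIN Tree AS t ON f.ParentID = t.ID
    )
    SELECT ID, ParentID, Name, Lvl FROM Tree;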
Personal supplement:
- Complex queries require special control. The more complex the query, the more candidate plans there are to choose from and the wider the evaluation scope, which means a higher cost for the optimizer's plan evaluation, more factors involved, and therefore a less accurate assessment. So a complicated query means performance that is hard to guarantee. Understanding how SQL statements are processed helps in understanding this problem (search for "SQL statement processing" in the online help). In addition, pay special attention to two kinds of complex query:
A) Queries composed of multiple CTE definitions. Viewed one at a time, each CTE definition is very simple, but the final combination is often a very complicated query, which you can confirm by viewing the execution plan. Note that although the CTE definitions spell the steps out clearly, execution basically does not follow the steps as written.
B) Complex queries caused by nested views (including table-valued functions). Each view involved may be very simple, but the layered combination can ultimately form a complex query, which again shows up in the execution plan. (Avoid carrying modular design ideas into the database whenever possible.)
- Pay attention to controlling implementation methods. Common examples (sketches for B and C follow this list):
A) Replacing EXISTS (and IN) with JOIN. This is prone to problems, especially when the structure definition cannot guarantee that the join will not produce one-to-many results. EXISTS only has to consider whether data exists; JOIN also has to consider whether the relationship is one-to-one, one-to-many, or many-to-many, and the two lead to different query cost evaluations.
B) Too many values in IN (...). This is basically treated as a pile of OR conditions, each of which must be evaluated, so in many cases the query cost is high. Consider putting the values into a temporary table first and then querying against the temporary table.
C) Reduce OR conditions. In many cases, rewriting an OR condition as multiple UNION ALL branches is more efficient.
D) Pay attention to data type conversions in conditions and joins. Use explicit data type conversion as much as possible, and place the conversion on the constant, variable, parameter, or smaller-data side. Implicit (automatic) data type conversion picks the conversion target by type precedence; if the conversion lands on the large-data side, the query will perform relatively poorly.
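Minimal sketches for items B and C; the tables and values are illustrative:

    -- B) A long IN list moved into a temporary table
    CREATE TABLE #ids (ID int PRIMARY KEY);
    INSERT INTO #ids (ID) VALUES (1), (2), (3);   -- imagine hundreds of values here
    SELECT o.* FROM dbo.Orders AS o JOIN #ids AS i ON o.ID = i.ID;
    DROP TABLE #ids;

    -- C) OR rewritten as UNION ALL (valid when the branches cannot overlap)
    SELECT OrderID FROM dbo.Orders WHERE Status = 1
    UNION ALL
    SELECT OrderID FROM dbo.Orders WHERE Status = 2;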
Discussion post
Previous topics: 1. What factors do you consider when designing SQL Server objects to avoid performance problems?
Subsequent topics:
3. What do you think you should pay attention to when designing database application programs to ensure effective use of the database?
4. What performance problems have confused you during your SQL Server usage?