SQL Server FAQs and Cases (performance + maintenance)

Source: Internet
Author: User
Tags data structures joins cpu usage

Missing index

Case: There are a number of interface requests failed on a certain line, and the CPU usage is high on the server. The Activity Analyzer (a recent resource-intensive query) found that several queries had CPU time (milliseconds/sec) of more than 3000. It is inferred that the query did not take advantage of the index, which solved the problem by indexing the related tables.

Analysis: Most performance issues are caused by missing indexes or index failures (bookmark lookups). This problem is due to the fact that the connection between the underlying table (the configuration table) is not indexed, and as more data is configured, the query computation becomes larger, and the problem becomes apparent when the amount of data reaches a certain magnitude (thousands of to tens of thousands of) and the concurrency ratio is higher (thousands of). These issues can be indexed when writing queries (by estimating the total amount of data and concurrency), or periodically through the activity parser for more CPU-time queries.

tempdb contention

Case: A high-concurrency interface in the system has recently failed to request, and the CPU usage and hard disk time are found to be more normal on the server. The buffer latch wait time (milliseconds/sec) was found to be more than 30000 through the Activity Analyzer (resource wait). This issue was resolved after several targeted optimizations, by analyzing the performance of the stored procedure piecemeal and by looking at the online data to locate the tempdb latch issue.

Analysis: A latch is a lock used by a database to protect internal data structures, such as to prevent multiple threads from writing a data page at the same time by adding a latch before writing. The creation and deletion of temporary tables involves the allocation and recycling of data pages, and the use of temporary tables in high concurrency interfaces results in frequent changes to the page allocated by the record space, resulting in more lock waits (affecting concurrency performance). A temporary table is replaced with a table variable during the period, but the effect is not ideal due to similar mechanisms for staging tables and table variables in storage. It was later found that table joins used the Split function, and the temporary table is used in the function, which results in the creation and deletion of the N-fold temporary table, which is significant by replacing the split function with the like effect. Later, it is estimated that the concurrency is large and the request fails, and the optimization logic joins the filtered select into to the temporary table, which can reduce the creation of the temporary table (the number of SELECT into rows 0 o'clock seems to have no effect on performance) and has some effect. Then there was a problem, and by creating multiple tempdb files, you could share the allocation of temporary tablespace to multiple pages, reducing the contention for locks, and no more problems. These optimizations are gradual and incremental, although the final step is to solve the problem, but the greater the magnitude of the previous step optimization, the more concurrent high (>5000) and the need to use temporary table scenarios should be considered on the web side of the logical optimization (such as the addition of Redis cache).

Shrinking the database for too long

Case: There is not enough disk space in the system database, data cannot be written, it is urgent to make room by shrinking the database after deleting some historical data. Shrinking the entire database takes a long time (a few minutes), which affects the execution of the database and should be managed to reduce the impact of time.

Analysis: Due to improper planning of database space or a sharp increase in business data resulting in insufficient hard disk space to write data, you must shrink the database. The principle of database contraction is to move the data at the end of the file to the front of the file and truncate the file, which is proportional to the amount of data moved. You can shrink a fixed size of space in a few times (there is space to write, several times for example, each time 1G easy to grasp the duration of each contraction), to the early morning through the maintenance plan to shrink the database, reduce the impact on the online users.

Job execution failed

Case: A period of time every night an interface will have several access failed records, and later found to be due to maintenance plan job failure. The re-organization index in the maintenance plan was removed, the build index was recreated, and the statistics were updated without any such problems.

Analysis: Some of the problems that just happen in a certain time period may be due to job execution, which can be done by comparing the time between the start and end of the job and the occurrence of the error, and which job is causing the job. Some of the operations in the maintenance plan, such as re-organizing the index, recreating the build index, updating statistics, and so on, seem to have little effect (the database automatically updates statistics by default), but it is time-consuming and affects the use of databases on the line, and you can choose to delete them.

Failed to modify data structure

Case: A major table partition was made on a certain line, and the interchange failed during the exchange table name step. Later changes to the previous backup table to the online table caused the CPU to rise to more than 100, the intention to change the original table to the online table has been modified failed. After you temporarily disable SQL Server account, you can modify the name of the table.

Analysis: There is a schema-stable lock during SQL execution, and it is easy to modify the data structure failure for highly concurrent access tables. It is recommended that you temporarily stop the business or disable the SQL Server account before modifying the data structure.

Summarize
    1. If the CPU is high, you can find the longest-consuming query through the Activity Analyzer, which is a recent resource-intensive query. If you have more time-consuming queries (CPU time (milliseconds/sec) greater than 3000), you should try to optimize by missing indexes.
    2. If CPU usage and disk time are normal, you can find the wait category with the longest waiting time through the Activity Analyzer (resource waits). If buffer latch takes a long time (waiting time (milliseconds/second) to reach more than 30000), you should try to optimize for tempdb contention.
    3. You can view the job execution record to determine whether the problem is caused by the job
    4. It is best to shrink the database in batches, so that time can be controlled and the impact on the online business can be reduced.
    5. You need to modify the data structure in an online emergency, but you can temporarily stop the business or disable the SQL Server account if the operation fails due to a lock or other cause.

SQL Server FAQs and Cases (performance + maintenance)

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.