SQL server Full-text index performance problems

Source: Internet
Author: User
Tags error code memory usage ranges cpu usage management studio sql server management sql server management studio server memory


The full-text index performance of SQL Server has never been very good. I heard from my colleagues today and found a skill. I went back and tried it, and found that it can significantly improve the full-text index performance, generally, it can be increased by 2 Times +

 


Original SQL statement:


Select * from Table1 where Contants (*, 'test ')

Optimized SQL statement:


View sourceprint? Select * from Table1 where Contains (Content, Title), 'test ')

 


The difference between the two SQL statements is whether or not the column names after Contanis are specified. After testing, specifying the full-text index column can indeed greatly improve the full-text index performance of SQL Server.


To use all processors or kernels to the maximum extent, set sp_configure 'Max full-text crawl Ranges' to the number of CPUs of the system. For more information about the configuration options, see max full-text crawl range options.

Make sure that the base table has a clustered index. Use the integer data type for the first column of the clustered index. Avoid using GUID in the first column of the clustered index. Performing multi-range filling on clustered indexes results in the highest filling speed. We recommend that you use the integer data type for the column that acts as the full-text key.

Use the update statistics statement to UPDATE the STATISTICS of the base table. More importantly, update the statistics of the clustered index or full-text key for full filling. This helps to fill in multiple ranges to generate good partitions on the table.

To improve the incremental filling performance, generate secondary indexes for the timestamp column.

Before full filling on a large multi-CPU computer, we recommend that you set the max server memory value to temporarily limit the size of the buffer pool so that sufficient memory is available for the fdhost.exe process and operating system. For more information, see "estimate memory requirements for the filter background program host process (fdhost.exe)" next to this topic ".

Solving the problem of full filling performance
--------------------------------------------------------------------------------

To diagnose performance problems, check the full-text crawler logs. For more information about crawling logs, see troubleshooting errors in full-text filling (crawling.

If you are not satisfied with the fully-filled performance, we recommend that you perform the following troubleshooting steps in sequence.

Physical memory usage
During the full text period, the memory of fdhost.exe or sqlservr.exe may be insufficient. If the full-text crawling log shows that fdhost.exe is being restarted repeatedly, or the system returns error code 8007008, it means that one of these processes has insufficient memory. If fdhost.exe is generating a dump (especially on a large multi-CPU computer), the memory of the process may be insufficient.

Note:
For details about the full-text crawling memory buffer, see sys. dm_fts_memory_buffers (Transact-SQL ).
 

Possible reasons are as follows:

If the amount of physical memory available during full filling is zero, the SQL Server buffer pool may be occupying most of the system's physical memory.

The sqlservr.exe process attempts to occupy all available memory of the buffer pool (the maximum server memory configured ). If the memory size of the allocated hosts server is too large, the fdhost.exe process may face insufficient memory and shared memory allocation failure.

Note:
On a multi-CPU computer (such as a 64-channel IA64 computer), the buffer pool memory contention may occur between fdhost.exe and sqlservr.exe. Insufficient shared memory will cause batch retry, memory jitter, and the fdhost.exe process to be dumped.
 

You can solve this problem by setting the "maximum Server memory" value of the SQL Server buffer pool. For more information, see "estimate memory requirements for the filter background program host process (fdhost.exe)" next to this topic ". It may also be useful to reduce the batch size used for full-text indexing.

Paging problems

If the page file size is insufficient, the memory of fdhost.exe or sqlservr.exe may also be insufficient, for example, on a system with a small page file with limited growth.

If the crawling log does not indicate any memory-related faults, the performance may be degraded due to excessive paging.

Estimated memory demand for the filter background program host process (fdhost.exe)
The memory size required by the fdhost.exe process depends on the number of full-text crawling ranges used, the size of the inbound shared memory (ISM), and the maximum number of ISM instances.

You can use the following formula to roughly estimate the memory occupied by the filter background program host (in bytes ):

Number_of_crawl_ranges * ism_size * max_outstanding_isms * 2

The default values of the variables in the preceding formula are as follows:


Variable
Default value
 
Number_of_crawl_ranges
Number of CPUs
 
Ism_size
The x86 computer is 1 MB

X64 computers are 4 MB, 8 MB, or 16 MB, depending on the total physical memory
 
Max_outstanding_isms
X86 computer is 25

X64 is 5
 

The following table lists the criteria for estimating the memory demand for fdhost.exe. The formulas in this table use the following values:

F, which is the estimated memory size (MB) required by fdhost.exe ).

T, which is the total amount of physical memory available in the system (MB ).

M, which is the best "maximum server memory" setting.

Important
For basic information about formulas, see 1, 2, and 3 below.
 


Platform
Estimated memory demand for fdhost.exe (MB)-F1
Formula for calculating the maximum server memory-M2
 
Disable AWE x86
F = Number of crawl ranges * 50
M = minimum (T, 2000)-F-500
 
Enable AWE x86
F = Number of crawl ranges * 50
M = T-F-500
 
X64 or IA643
F = Number of crawl ranges * 10*8
M = T-F-500
 

1. If multiple full fills are in progress, calculate the memory demand for each fully filled fdhost.exe, such as F1 and F2. Then M is calculated based on T-sigma (Fi.

2 500 MB is the estimated memory size required by other processes in the system. If the system is performing other work, add this value accordingly.

3. ism_size is assumed to be 8 MB for x64 platforms.

Example: estimate the memory demand for fdhost.exe

This example is applicable to AMD64 computers with 8 gm ram and 4 dual-core processors. Calculate the estimated memory value F required by fdhost.exe. The number of network crawlers is 8.

F = 8*10*8 = 640

Then calculate the optimum value of "Max server memory" M. The total available physical memory (MB) T of the system is 8192.

M = 8192-640-500 = 7052

Example: set the maximum server memory

In this example, use the sp_configure and RECONFIGURETransact-SQL statements to set "maximum server memory" to the M value calculated in the previous example, that is, 7052.

Copy
USE master;
GO
EXEC sp_configure 'Max server memory ', 7052;
GO
RECONFIGURE;
GO

Set maximum server memory configuration options

Server memory options

How to view or change Server properties (SQL Server Configuration Manager)

How to set a fixed amount of memory (SQL Server Management Studio)

Factors that can reduce CPU usage
We hope that the performance of full filling is not optimal when the average CPU usage is lower than about 30%. This section discusses some factors that affect CPU usage.

Long wait page

To see if the waiting time for the page is too long, execute the following statements:

Copy
Execute select top 10 * FROM sys. dm_ OS _wait_stats ORDER BY wait_time_ms DESC;

The following table describes the types of waiting.


Wait type
Description
Possible solutions
 
PAGEIO_LATCH_SH (_ EX or _ UP)
This may indicate an IO bottleneck. In this case, the average disk queue length is usually high.
Moving full-text indexes to other file groups on other disks may help reduce IO bottlenecks.
 
PAGELATCH_EX (or _ UP)
This may indicate a large number of contention among multiple threads attempting to write the same database tutorial file.
Adding files to the file group where the full-text index is located may help reduce such contention.
 

For more information, see sys. dm_ OS _wait_stats (Transact-SQL ).

Low efficiency in scanning base tables

Full filling scans the base table to generate batches. In the following cases, such a table scan may be inefficient:

If a full-text index is being created for a base table with a high percentage of non-row columns, scanning the base table to generate batches may become a bottleneck. In this case, using varchar (max) or nvarchar (max) to move small data in rows may be useful.

If the base table is very fragmented, scanning may be inefficient. For details about how to calculate out-of-row data and index fragmentation, see sys. dm_db_partition_stats (Transact-SQL) and sys. dm_db_index_physical_stats (Transact-SQL ).

To reduce fragments, you can reorganize or regenerate clustered indexes. For more information, see reorganizing and re-indexing.

 

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.