Full-text search and analysis in MS SQL Server

Source: Internet
Author: User
Tags microsoft sql server mssqlserver create database

Full-text search works somewhat like LIKE '%keyword%', except that the text is first segmented into words and the search is performed on each resulting word.

Syntax:
Contains:

SELECT field1, field2
FROM table_name
WHERE CONTAINS(field, '"word1" OR "word2"')

Sorting the search results by relevance:

SELECT field1, field2
FROM table_name
INNER JOIN CONTAINSTABLE(table_name, field, '"word1" OR "word2"', 10) AS k
    ON table_name.id = k.[KEY]
ORDER BY k.RANK DESC


Freetext:

SELECT field1, field2
FROM table_name
WHERE FREETEXT(field, 'word1 word2')

Sorting the search results by relevance:

SELECT field1, field2
FROM table_name
INNER JOIN FREETEXTTABLE(table_name, field, 'word1 word2', 10) AS k
    ON table_name.id = k.[KEY]
ORDER BY k.RANK DESC

In the queries above, the 10 passed to FREETEXTTABLE or CONTAINSTABLE is the top_n_by_rank argument: only the 10 highest-ranked matches are returned.


I recently worked with full-text search and ran into some problems, which are summarized below.

Full-text index and query concepts (from SQL Server Books Online)

The primary design requirement for full-text indexing, querying, and synchronization is that every table registered for full-text search has a unique full-text key column (or a single-column primary key). A full-text index tracks the significant words used and where they are located.

For example, assume there is a full-text index on the DevTools table. The full-text index may indicate that the word Microsoft is found at word numbers 423 and 982 in the Abstract column, in the row associated with ProductID 6. This index structure supports efficient searches for all items that contain indexed words, as well as advanced search operations such as phrase searches and proximity searches.
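The index structure described above can be pictured as a positional inverted index. The following Python sketch is only an illustration of the idea, not how SQL Server stores its index; the sample rows and the helper names build_index and phrase_search are made up for this example.

```python
from collections import defaultdict

def build_index(rows):
    """Map each lowercased word to {row key: [1-based word positions]}."""
    index = defaultdict(dict)
    for key, text in rows.items():
        for pos, word in enumerate(text.lower().split(), start=1):
            index[word].setdefault(key, []).append(pos)
    return index

def phrase_search(index, phrase):
    """Return the keys of rows in which the words of `phrase` appear consecutively."""
    words = phrase.lower().split()
    if not words or words[0] not in index:
        return []
    hits = []
    for key, starts in index[words[0]].items():
        for start in starts:
            # Word i of the phrase must sit at position start + i in the same row.
            if all(start + i in index.get(w, {}).get(key, [])
                   for i, w in enumerate(words)):
                hits.append(key)
                break
    return sorted(hits)

# Hypothetical sample data: key = ProductID, value = Abstract text.
rows = {
    6: "Microsoft Visual Studio is a Microsoft product",
    7: "Studio tools from another vendor",
}
index = build_index(rows)
print(index["microsoft"])                     # positions of the word per row
print(phrase_search(index, "visual studio"))  # rows containing the phrase
```

Because each entry records word positions and not just row keys, phrase and proximity searches can be answered from the index alone, without rescanning the column text.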

To keep the full-text index from becoming bloated with words that do not help the search, noise words such as a, and, is, and the are ignored. For example, specifying "the products ordered during these summer months" is the same as specifying "products ordered during summer months". Rows containing either string are returned.
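As a rough illustration of noise-word handling, the Python sketch below strips ignored words from a query before searching. The NOISE_WORDS set and both function names are assumptions made for this example; the real lists live in the noise-word files and differ by language.

```python
# Hypothetical noise-word list; the real one comes from the noise-word file.
NOISE_WORDS = {"a", "and", "is", "the", "these"}

def strip_noise(query):
    """Return the query terms that remain after removing noise words."""
    return [w for w in query.lower().split() if w not in NOISE_WORDS]

def check_query(query):
    """Reject a query that consists only of noise words."""
    terms = strip_noise(query)
    if not terms:
        raise ValueError("query contains only ignored words")
    return terms

q1 = strip_noise("the products ordered during these summer months")
q2 = strip_noise("products ordered during summer months")
print(q1 == q2)  # both queries reduce to the same search terms
```

A query such as "the and is" leaves no terms at all after filtering, which is the same situation that produces the "only ignored words" error discussed later in this article.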

The \Mssql\Ftdata\Sqlserver\Config directory provides noise-word lists for multiple languages. This directory is created, and the noise-word files are installed, when Microsoft® SQL Server™ is set up with full-text search support. The noise-word files can be edited. For example, the system administrator of a high-tech company might add the word computer to their noise-word list. (If you edit a noise-word file, you must repopulate the full-text catalogs before the change takes effect.) The following table shows the noise-word files and their corresponding languages.

Noise-word file    Language
---------------------------------------
noise.chs          Simplified Chinese
noise.cht          Traditional Chinese
noise.dat          Language neutral
noise.deu          German
noise.eng          English (UK)
noise.enu          English (USA)
noise.esn          Spanish
noise.fra          French
noise.ita          Italian
noise.jpn          Japanese
noise.kor          Korean
noise.nld          Dutch
noise.sve          Swedish


When processing a full-text query, the search engine returns to Microsoft SQL Server the key values of the rows that match the search criteria. For example, consider a SciFi table in which the Book_No column is the primary key column.

Book_No    Writer    Title
---------------------------------------------
A025       Asimov    Foundation's Edge
A027       Asimov    Foundation and Empire
C011       Clarke    Childhood's End
V109       Verne     Mysterious Island


Suppose you want to use a full-text query to find the titles of the books that contain the word Foundation. In this case, the values A025 and A027 are obtained from the full-text index. SQL Server then uses these key values, together with the information in the other columns, to respond to the query.
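The two-stage lookup described above can be sketched in Python as follows: the index returns only key values, and the table is then consulted for the remaining columns. This is a simplified model using the SciFi rows from the example; build_title_index and search_titles are hypothetical helper names.

```python
import re

# The SciFi table from the example, keyed by Book_No.
sci_fi = {
    "A025": ("Asimov", "Foundation's Edge"),
    "A027": ("Asimov", "Foundation and Empire"),
    "C011": ("Clarke", "Childhood's End"),
    "V109": ("Verne", "Mysterious Island"),
}

def build_title_index(table):
    """Map each word in the Title column to the set of Book_No keys containing it."""
    index = {}
    for book_no, (_writer, title) in table.items():
        for word in re.findall(r"[a-z]+", title.lower()):
            index.setdefault(word, set()).add(book_no)
    return index

def search_titles(table, index, word):
    """Stage 1: get keys from the index. Stage 2: fetch the full rows by key."""
    keys = sorted(index.get(word.lower(), set()))
    return [(key, *table[key]) for key in keys]

title_index = build_title_index(sci_fi)
matches = search_titles(sci_fi, title_index, "Foundation")
for row in matches:
    print(row)
```

This mirrors why a unique key column is required: the key is the only link between an index hit and the row it came from.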

The following table shows the languages used to store full-text index data. The language is determined by the Unicode collation locale identifier selected during SQL Server setup.

Unicode collation locale identifier    Language used to store full-text data
---------------------------------------------------------------------------
Chinese phonetic symbols (Taiwan)      Traditional Chinese
Chinese pinyin                         Simplified Chinese
Chinese strokes                        Simplified Chinese
Chinese strokes (Taiwan)               Traditional Chinese
Dutch                                  Dutch
English (UK)                           English (UK)
French                                 French
General Unicode                        English (USA)
German                                 German
German phone book                      German
Italian                                Italian
Japanese                               Japanese
Japanese Unicode                       Japanese
Korean                                 Korean
Korean Unicode                         Korean
Spanish (modern)                       Spanish
Swedish/Finnish                        Swedish

 

All other Unicode collation locale identifier values not in this list are mapped to the neutral-language word breaker and stemmer, which use spaces to separate words.

The Unicode collation locale identifier setting applies to all data types that can be full-text indexed (such as char and nchar). If the sort order of a char, varchar, or text column is set to a language different from the language of the Unicode collation locale identifier, the Unicode collation locale identifier value is still used when full-text indexing and querying columns of the char, varchar, and text types.

 

Creating a full-text index (using an image column as the example; columns of other types work in much the same way)

Title: Full-text indexing an image column, a complete guide
Author: pengdali [original]
Keywords: full-text index, image


Today we had a once-in-a-century outage, so I read a book instead. In the evening I worked through full-text indexing and decided to post what I learned. I will write in as much detail as I can. Let's learn together, and please correct any mistakes!

1. Start the Microsoft Search service
Start menu --> SQL Server program group --> Service Manager --> drop-down list --> Microsoft Search service --> start it

2. In the ...\Microsoft SQL Server\MSSQL\FTDATA\SQLServer\Config directory, create a non-empty noise.chs file
Some people say to create an empty noise.chs file, but I always put a few meaningless letters in it.

3. Set up the environment
Open Query Analyzer and execute the following script:
--------------------------------------------

CREATE DATABASE test    -- create the test database
GO
USE test    -- switch to the test database
CREATE TABLE dali (ID int NOT NULL PRIMARY KEY, MyImage image, FileType varchar(255), FileName varchar(255))    -- create the dali table
-- The ID, MyImage, and FileType columns in the dali table are required. To index an image column, you need a primary key column, an image column, and a column that stores the file type.
-- Windows distinguishes file types by extension, so the FileType column stores the file extension.
--------------------------------------------

EXEC sp_fulltext_database 'enable'    -- enable the database for full-text indexing
EXEC sp_fulltext_catalog 'My_FullDir', 'create'    -- create a full-text catalog named My_FullDir

-- Look up the name of the primary key constraint on dali.ID
DECLARE @Key sysname; SELECT @Key = c.name FROM syscolumns a, sysconstraints b, sysobjects c WHERE a.id = object_id('dali') AND a.name = 'ID' AND a.id = b.id AND b.constid = c.id AND c.name LIKE 'PK%'
EXEC sp_fulltext_table 'dali', 'create', 'My_FullDir', @Key    -- these two statements register the table for full-text indexing

EXEC sp_fulltext_column 'dali', 'MyImage', 'add', 0x0804, 'FileType'    -- add the MyImage column to the full-text index; FileType is the type column

------------------------------------------------
4. On drive C, place a Word file with the .doc extension, an Excel file with the .xls extension, a web page with the .htm extension, and an image with the .bmp extension.
That makes four files in total. You can add more as needed!

5. Insert the data
Create the following stored procedure:

--------------------------------------------------
CREATE PROCEDURE sp_textcopy
    @srvname varchar(30),
    @login varchar(30),
    @password varchar(30),
    @dbname varchar(30),
    @tbname varchar(30),
    @colname varchar(30),
    @filename varchar(30),
    @whereclause varchar(40),
    @direction char(1)
AS
/* This uses the textcopy tool to load files into the database. If you have a front-end development tool, you can use it to insert the files instead. This is only a demonstration. */
DECLARE @exec_str varchar(255)
SELECT @exec_str = 'textcopy /S ' + @srvname + ' /U ' + @login + ' /P ' + @password + ' /D ' + @dbname + ' /T ' + @tbname + ' /C ' + @colname + ' /W "' + @whereclause + '" /F "' + @filename + '" /' + @direction
EXEC master..xp_cmdshell @exec_str
GO
----------------------------------------------------

INSERT dali VALUES (1, 0x, 'doc', 'vigorously Doc')    -- the second column, 0x, is a hexadecimal value for the image column; it is required, so do not write NULL. The third column is the file type, that is, the extension.

EXEC sp_textcopy 'your server name', 'sa', 'your password', 'test', 'dali', 'MyImage', 'C:\powerful doc.doc', 'where ID=1', 'I'
-- The parameters are, in order: instance name, user name, password, database name, table name, image column name, path and file name, condition (you must make sure it selects exactly one row), and I
----------------------------------------------------
INSERT dali VALUES (2, 0x, 'bmp', 'image')
EXEC sp_textcopy 'your server name', 'sa', 'your password', 'test', 'dali', 'MyImage', 'C:\image.bmp', 'where ID=2', 'I'    -- note that the condition is ID=2

INSERT dali VALUES (3, 0x, 'xls', 'Excel file')
EXEC sp_textcopy 'your server name', 'sa', 'your password', 'test', 'dali', 'MyImage', 'C:\excelfile.xls', 'where ID=3', 'I'    -- note that the condition is ID=3

INSERT dali VALUES (4, 0x, 'htm', 'Website')
EXEC sp_textcopy 'your server name', 'sa', 'your password', 'test', 'dali', 'MyImage', 'C:\web page.htm', 'where ID=4', 'I'    -- note that the condition is ID=4

-- In the statements above, make sure the file types match, the paths are correct, and each condition is correct and selects a single row.

6. Populate the full-text index

EXEC sp_fulltext_table 'dali', 'start_full'    -- the first parameter is the table name; the second starts a full population of the table's full-text index

7. Now you can start experimenting.

SELECT * FROM dali WHERE CONTAINS(MyImage, '"J instructor"')

Select * from dali where contains (MyImage, '')

------ END ----------
-- Test environment: SQL Server 2000 Enterprise Edition on Windows 2000 Advanced Server

Common questions about full-text indexing:

1. An error occurs when searching:
Server: Msg 7619, Level 16, State 1, Line 2
A clause of the query contained only ignored words.

In this case, edit the noise-word file for the corresponding language in \Mssql\Ftdata\Sqlserver\Config.

2. The noise-word file has been modified, but the problem above still occurs when querying Chinese text.
a. First, check whether the latest service pack is installed on your SQL Server. To check, run the following in Query Analyzer:
SELECT @@VERSION
If the version is lower than 8.00.760, SP3 has not been installed.

SQL Server service pack download:
http://www.microsoft.com/downloads/details.aspx?displaylang=zh-cn&FamilyID=9032f608-160a-4537-a2b6-4cb265b80766

Note: after downloading, extract the package and run setup.bat in the extracted directory.

b. When configuring the full-text index, select "Chinese (China)" as the word-breaker language.

c. The noise.chs file must contain at least one word, for example: ?

d. If you are able to modify the noise-word file while full-text search is running, the file is not being used by full-text search.
If you need full-text search to use this file:
Enterprise Manager -- expand your database -- right-click the full-text catalog -- rebuild the full-text catalog

3. After data in the table is changed, the new data cannot be found by full-text search.
Method 1: Right-click your table -- Full-Text Index Table -- start an incremental population
Method 2: Right-click your table -- Full-Text Index Table -- Change Tracking, so that later modifications are populated automatically (with some delay)

Here is an example for SQL Server 2005:

-- Check whether full-text indexing is enabled for the database (1 means enabled)
SELECT DATABASEPROPERTY('database_name', 'IsFulltextEnabled')
-- Enable full-text indexing
EXECUTE sp_fulltext_database 'enable'
-- Disable full-text indexing
EXECUTE sp_fulltext_database 'disable'
-- Create a full-text catalog
-- (to delete one: DROP FULLTEXT CATALOG catalog_name)
CREATE FULLTEXT CATALOG catalog_name
-- Each table can have only one full-text index, which is stored in the specified catalog. You can create a full-text index either with the wizard or with SQL.
-- index_name is the name of an existing unique index on the table, not the name of the unique column. If no such index exists, create a unique index first.
-- (to delete the index: DROP FULLTEXT INDEX ON table_name)
CREATE FULLTEXT INDEX ON table_name
(column1, column2, ...)
KEY INDEX index_name ON catalog_name
-- Full-text queries
SELECT * FROM table_name
WHERE CONTAINS(column_name, '"202*" OR "2*"')
SELECT * FROM table_name
WHERE FREETEXT(column_name, '"202*" and "2*"')
/* Notes:
The FREETEXT predicate searches all or specified columns of a table for a free-text string and
returns the rows that match that string. A FREETEXT query is therefore also called a free-form full-text query.

The CONTAINS predicate searches all or specified columns of a table for:
a word or phrase; the prefix of a word or phrase; a word near another word;
a word derived from another word; a repeated word.
*/
