Three solutions for searching Word content in combination with SQL Server full-text search

Source: Internet
Author: User



Besides using the APIs that Office provides to search Word documents, you can search Word files with SQL Server's full-text search. This article briefly summarizes three solutions based on that technology.

1. Full-text search with the Windows Indexing Service

Solution summary: for a detailed example, see http://database.ctocio.com.cn/51/11440551.shtml

Advantages: the files stay in file-system directories, independent of the database, in their original .doc format.

Disadvantages: the files can only be read this way, not written.
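
One way to query an Indexing Service catalog from SQL Server is a linked server over the Indexing Service OLE DB provider (MSIDXS). The following is a minimal sketch modeled on the linked-server example in Microsoft's sp_addlinkedserver documentation, not the method of the linked article. The linked-server name FileSystem, the catalog name Web, and the search phrase are assumptions for illustration, and the provider is only available on systems where the Indexing Service is installed.

```sql
-- Register an Indexing Service catalog as a linked server.
-- 'FileSystem' (linked-server name) and 'Web' (catalog name) are assumed values.
EXEC sp_addlinkedserver
    @server     = N'FileSystem',
    @srvproduct = N'Index Server',
    @provider   = N'MSIDXS',
    @datasrc    = N'Web';
GO

-- Pass an Indexing Service query through to the catalog.
-- The search phrase is a placeholder; single quotes are doubled inside the string.
SELECT *
FROM OPENQUERY(FileSystem,
    'SELECT Directory, FileName, DocAuthor, Size, Create
     FROM SCOPE()
     WHERE CONTAINS(''"SQL Server"'') > 0
       AND FileName LIKE ''%.doc%''');
GO
```

Because the query runs against the file-system index, SQL Server only reads document properties and matches here, which is consistent with the read-only limitation noted above.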

2. Full-text search with BLOB Data

Solution summary: store each .doc file in a database table as BLOB data in a varbinary(max) column, then run full-text search on that table. This is the most common solution.

A simple example that creates the table and inserts files:

------- Binary file query example
/*************************************/
USE master
GO
IF EXISTS (SELECT name FROM sys.databases WHERE name = N'BlobDataDemoDB')
    DROP DATABASE BlobDataDemoDB
GO
CREATE DATABASE BlobDataDemoDB
GO
-------- Enable full-text search
/*************************************/
-- Switch to the new database first so that full-text search is enabled there, not in master.
USE BlobDataDemoDB
GO
-- Needed on older versions only; from SQL Server 2008 on, user databases are always full-text enabled.
EXECUTE sp_fulltext_database 'enable'
GO
-- Create a table containing a BLOB column
/*************************************/
IF OBJECT_ID('SampleBlobTable') IS NOT NULL
    DROP TABLE SampleBlobTable
GO
CREATE TABLE SampleBlobTable
(
    [PKID]        INT IDENTITY(1, 1) PRIMARY KEY,
    [FileType]    NVARCHAR(32) NULL,
    [FileName]    NVARCHAR(255) NULL,
    [FileContent] VARBINARY(MAX) NULL,
    [AddTime]     DATETIME DEFAULT (GETDATE())
)
GO
IF EXISTS (SELECT * FROM sys.objects WHERE
    object_id = OBJECT_ID(N'[dbo].[CPP_InsertOneBlobDataToTable]')
    AND type IN (N'P', N'PC'))
    DROP PROCEDURE [dbo].[CPP_InsertOneBlobDataToTable]
GO
-- Create a stored procedure for inserting data into SQL Server
/*************************************/
CREATE PROCEDURE CPP_InsertOneBlobDataToTable
(
    @FileType    NVARCHAR(32),
    @FileName    NVARCHAR(255),
    @FileContent VARBINARY(MAX)
)
AS
INSERT SampleBlobTable ([FileType], [FileName], [FileContent], [AddTime])
VALUES (@FileType, @FileName, @FileContent, GETDATE())
GO
 
// C# console program that loads .doc files through the stored procedure above
///////////////////////////////////////////////
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

namespace BlobDataSearchDemo
{
    class Program
    {
        const string conn = @"Server=login\Agronet09;DataBase=BlobDataDemoDB;uid=sa;pwd=;";

        static void Main(string[] args)
        {
            SaveDoc2SQLServer(@"D:\2008Data\StreamData\Doc\dancing.doc", conn);
            SaveDoc2SQLServer(@"D:\2008Data\StreamData\Doc\tianlong Babu.doc", conn);
            SaveDoc2SQLServer(@"D:\2008Data\StreamData\Doc\English.doc", conn);
            Console.ReadKey();
        }

        private static void SaveDoc2SQLServer(string filepath, string conn)
        {
            FileInfo fi = new FileInfo(filepath);
            if (!fi.Exists)
            {
                return;
            }

            // Read the whole file once. Looping on Stream.Read and inserting inside the loop
            // could insert several partial rows, because Read may return fewer bytes than requested.
            byte[] b = File.ReadAllBytes(filepath);

            // The using blocks close the connection and dispose the command even if the insert fails.
            using (SqlConnection sqlConn = new SqlConnection(conn))
            using (SqlCommand cmdUploadDoc = new SqlCommand("CPP_InsertOneBlobDataToTable", sqlConn))
            {
                cmdUploadDoc.CommandType = CommandType.StoredProcedure;
                cmdUploadDoc.Parameters.Add("@FileName", SqlDbType.NVarChar, 255).Value = fi.Name;
                // Size -1 maps the parameter to varbinary(max).
                cmdUploadDoc.Parameters.Add("@FileContent", SqlDbType.VarBinary, -1).Value = b;
                // Store the extension without the leading dot, for example "doc".
                cmdUploadDoc.Parameters.Add("@FileType", SqlDbType.NVarChar, 32).Value =
                    fi.Extension.Replace(".", "");
                sqlConn.Open();
                cmdUploadDoc.ExecuteNonQuery();
            }
        }
    }
}

[Figure: query results]
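
The full-text index and the search query themselves are not listed in the example. The following is a minimal sketch of how they could look for SampleBlobTable; the catalog name BlobDataDemoCatalog, the index name UX_SampleBlobTable_PKID, and the search phrase are hypothetical names chosen for illustration. It also assumes that a Word filter is installed on the server (sys.fulltext_document_types lists the supported extensions). Microsoft's examples store the extension in the type column with its leading dot (for example .doc), so check which form your server accepts, since the loader stores it without the dot.

```sql
USE BlobDataDemoDB;
GO
-- Hypothetical catalog name, chosen for this sketch.
CREATE FULLTEXT CATALOG BlobDataDemoCatalog;
GO
-- A full-text index needs a named, unique, single-column, non-nullable key index.
-- The PRIMARY KEY on PKID has a system-generated name, so this sketch adds a named one.
CREATE UNIQUE INDEX UX_SampleBlobTable_PKID ON dbo.SampleBlobTable (PKID);
GO
-- TYPE COLUMN tells the engine which filter to apply to each stored document.
CREATE FULLTEXT INDEX ON dbo.SampleBlobTable
(
    FileContent TYPE COLUMN FileType
)
KEY INDEX UX_SampleBlobTable_PKID
ON BlobDataDemoCatalog
WITH CHANGE_TRACKING AUTO;
GO
-- Search the indexed document content; the phrase is a placeholder.
SELECT PKID, FileName, AddTime
FROM dbo.SampleBlobTable
WHERE CONTAINS(FileContent, N'"SQL Server"');
GO
```

With CHANGE_TRACKING AUTO the index is populated in the background, so rows inserted a moment ago may not appear in the results immediately.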

 

Note:

Advantages: the .doc files are imported into the SQL Server database, which makes them easy to read and to search with full-text search. If necessary, the files can also be written.

Disadvantages: a varbinary(max) value is limited to 2 GB, and a database that holds a large amount of BLOB data becomes bloated, which slows retrieval considerably.

3. Full-text search with FileStream

Solution summary: similar to solution 2, except that FILESTREAM stores the varbinary(max) data of each .doc file as a physical file outside the database; full-text search is then run on the table as before.

Prerequisites: full-text search must be installed and FILESTREAM must be enabled.
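
The FILESTREAM setup could look roughly like the following. This is a minimal sketch, assuming FILESTREAM has already been enabled for the instance in SQL Server Configuration Manager; the filegroup name FileStreamGroup, the logical file name BlobDataDemoFS, the folder D:\2008Data\FileStreamData, and the table name SampleFileStreamTable are hypothetical.

```sql
-- Allow both T-SQL and Win32 streaming access to FILESTREAM data.
EXEC sp_configure N'filestream access level', 2;
RECONFIGURE;
GO
-- Add a FILESTREAM filegroup and point it at a folder on an NTFS volume.
-- The parent folder must exist; the last folder in the path must not exist yet.
ALTER DATABASE BlobDataDemoDB
    ADD FILEGROUP FileStreamGroup CONTAINS FILESTREAM;
GO
ALTER DATABASE BlobDataDemoDB
    ADD FILE (NAME = N'BlobDataDemoFS', FILENAME = N'D:\2008Data\FileStreamData')
    TO FILEGROUP FileStreamGroup;
GO
USE BlobDataDemoDB;
GO
-- A table with a FILESTREAM column needs a unique, non-null ROWGUIDCOL column.
CREATE TABLE dbo.SampleFileStreamTable
(
    [RowGuid]     UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    [PKID]        INT IDENTITY(1, 1) PRIMARY KEY,
    [FileType]    NVARCHAR(32) NULL,
    [FileName]    NVARCHAR(255) NULL,
    [FileContent] VARBINARY(MAX) FILESTREAM NULL,
    [AddTime]     DATETIME DEFAULT (GETDATE())
)
FILESTREAM_ON FileStreamGroup;
GO
```

Apart from the FILESTREAM attribute and the ROWGUIDCOL column, the table has the same shape as SampleBlobTable, so the same kind of insert procedure and full-text index can be applied to it.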

 

References:

http://msdn.microsoft.com/zh-cn/library/bb933993.aspx

http://www.cnblogs.com/downmoon/archive/2010/05/06/1727546.html

http://www.cnblogs.com/downmoon/archive/2010/05/08/1730044.html

Advantages: as in solution 2, the .doc files are imported into SQL Server, which makes them easy to read and to search with full-text search, and they can be written if necessary. It also overcomes the disadvantages of solution 2: the varbinary(max) column holds only a reference, while the actual content is stored outside the database files, so the size is limited only by the free space on the NTFS volume.

Summary: this article briefly summarizes how to use SQL Server's full-text search to search the content of Word files. I think both solution 1 and solution 3 are workable. Discussion is welcome.



