Besides using the APIs provided by Office to search Word documents, you can search them with SQL Server's full-text search. This article briefly summarizes three solutions for searching Word files that way.
1. Full-text search with the Windows Indexing Service
Solution abstract: For a detailed example, see http://database.ctocio.com.cn/51/11440551.shtml
Advantages: The files stay in ordinary file-system directories, independent of the database, and keep their native .doc format.
Disadvantages: The files can only be read this way, not written.
2. Full-text search with BLOB Data
Solution abstract: Store each .doc file in a database table as BLOB data in a varbinary(max) column, and then run full-text search against that table. This is the most common solution.
An example of simple table insertion:
-- Binary file query example
/*************************************/
USE master
GO
IF EXISTS (SELECT name FROM sys.databases WHERE name = N'BlobDataDemoDB')
    DROP DATABASE BlobDataDemoDB
GO
CREATE DATABASE BlobDataDemoDB
GO
USE BlobDataDemoDB
GO
-- Enable full-text search (run inside BlobDataDemoDB, not master).
-- On SQL Server 2008 and later this call has no effect and is kept only for backward compatibility.
/*************************************/
EXECUTE sp_fulltext_database 'enable'
GO
-- Create a table containing a BLOB column
/*************************************/
IF OBJECT_ID('SampleBlobTable') IS NOT NULL
    DROP TABLE SampleBlobTable
GO
CREATE TABLE SampleBlobTable
(
    [PKID] int IDENTITY(1, 1) PRIMARY KEY,
    [FileType] nvarchar(32) NULL,
    [FileName] nvarchar(255) NULL,
    [FileContent] varbinary(max) NULL,
    [AddTime] datetime DEFAULT (GETDATE())
)
GO
IF EXISTS (SELECT * FROM sys.objects WHERE
    object_id = OBJECT_ID(N'[dbo].[CPP_InsertOneBlobDataToTable]')
    AND type IN (N'P', N'PC'))
    DROP PROCEDURE [dbo].[CPP_InsertOneBlobDataToTable]
GO
-- Create a stored procedure for inserting data into SQL Server
/*************************************/
CREATE PROCEDURE CPP_InsertOneBlobDataToTable
(
    @FileType nvarchar(32),
    @FileName nvarchar(255),
    @FileContent varbinary(max)
)
AS
INSERT SampleBlobTable ([FileType], [FileName], [FileContent], [AddTime])
VALUES (@FileType, @FileName, @FileContent, GETDATE())
GO
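The script above creates the table and the insert procedure but does not define the full-text index itself. A minimal sketch of that step follows, assuming the SampleBlobTable from this article; the catalog name BlobDataCatalog and the index name UX_SampleBlobTable_PKID are hypothetical. Note that the documentation lists document types with a leading dot (for example .doc), so it is worth checking how the values stored in FileType compare with sys.fulltext_document_types.

```sql
USE BlobDataDemoDB;
GO
-- Check that a filter for Word documents is registered on this instance.
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type = N'.doc';
GO
-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG BlobDataCatalog;
GO
-- A full-text index needs a named, unique, single-column, non-nullable key index.
-- The article's primary key has a system-generated name, so this sketch adds
-- an explicitly named unique index on the same column (hypothetical name).
CREATE UNIQUE INDEX UX_SampleBlobTable_PKID
    ON dbo.SampleBlobTable (PKID);
GO
-- TYPE COLUMN tells the full-text engine which filter to apply to each BLOB.
CREATE FULLTEXT INDEX ON dbo.SampleBlobTable
(
    FileContent TYPE COLUMN FileType
)
KEY INDEX UX_SampleBlobTable_PKID
ON BlobDataCatalog
WITH CHANGE_TRACKING AUTO;
GO
```

With CHANGE_TRACKING AUTO the index is populated in the background as rows are inserted, so newly uploaded documents may take a moment to become searchable.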
// C# console client: reads each .doc file and inserts it through the stored procedure
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

namespace BlobDataSearchDemo
{
    class Program
    {
        const string conn = @"Server=login\Agronet09;DataBase=BlobDataDemoDB;uid=sa;pwd=;";

        static void Main(string[] args)
        {
            SaveDoc2SQLServer(@"D:\2008Data\StreamData\Doc\dancing.doc", conn);
            SaveDoc2SQLServer(@"D:\2008Data\StreamData\Doc\tianlong Babu.doc", conn);
            SaveDoc2SQLServer(@"D:\2008Data\StreamData\Doc\English.doc", conn);
            Console.ReadKey();
        }

        private static void SaveDoc2SQLServer(string filepath, string conn)
        {
            FileInfo fi = new FileInfo(filepath);
            if (!fi.Exists)
            {
                return; // missing files are skipped silently
            }

            // Read the whole file at once; a single Stream.Read call is not guaranteed to fill the buffer.
            byte[] b = File.ReadAllBytes(filepath);

            // The using blocks close the connection and dispose the command even if the insert fails.
            using (SqlConnection sqlConn = new SqlConnection(conn))
            using (SqlCommand cmdUploadDoc = new SqlCommand("CPP_InsertOneBlobDataToTable", sqlConn))
            {
                cmdUploadDoc.CommandType = CommandType.StoredProcedure;
                cmdUploadDoc.Parameters.Add("@FileName", SqlDbType.NVarChar, 200).Value = fi.Name;
                // Size -1 maps the parameter to varbinary(max).
                cmdUploadDoc.Parameters.Add("@FileContent", SqlDbType.VarBinary, -1).Value = b;
                // The extension is stored without the leading dot, for example "doc".
                cmdUploadDoc.Parameters.Add("@FileType", SqlDbType.NVarChar, 32).Value =
                    fi.Extension.Replace(".", "");
                sqlConn.Open();
                cmdUploadDoc.ExecuteNonQuery();
            }
        }
    }
}
Query results:
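The query behind these results is not reproduced in the article. A minimal sketch of such a search is given here, assuming a full-text index already exists on the FileContent column of SampleBlobTable with FileType as its type column; the search phrase is only a placeholder.

```sql
USE BlobDataDemoDB;
GO
-- Rows whose document body contains the phrase (placeholder search term).
SELECT PKID, FileName, AddTime
FROM dbo.SampleBlobTable
WHERE CONTAINS(FileContent, N'"SQL Server"');
GO
-- The same search with a relevance rank, best matches first.
SELECT t.PKID, t.FileName, k.[RANK]
FROM CONTAINSTABLE(dbo.SampleBlobTable, FileContent, N'"SQL Server"') AS k
JOIN dbo.SampleBlobTable AS t
    ON t.PKID = k.[KEY]
ORDER BY k.[RANK] DESC;
GO
```

CONTAINS is enough when only matching rows are needed; CONTAINSTABLE is the variant to reach for when the results should be ordered by relevance.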
Note:
Advantages: The .doc files are imported into the SQL Server database, which makes them easy to read and to search with full-text queries. If necessary, the files can also be written.
Disadvantages: A varbinary(max) value is limited to 2 GB, and a database that stores a large amount of BLOB data becomes very bloated, which greatly reduces retrieval speed.
3. Full-text search with FileStream
Solution abstract: Similar to solution 2, except that FILESTREAM is used: the column is still declared as varbinary(max), but the .doc file is stored as a physical file outside the database data files. Full-text search is then run against the table in the same way.
Prerequisites: Full-text search must be installed and FILESTREAM must be enabled.
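The prerequisites and the table layout can be sketched as follows. This is only an illustration under stated assumptions: it reuses the BlobDataDemoDB database from solution 2, and the filegroup, logical file name, folder path, table, constraint, and catalog names are all hypothetical. FILESTREAM must also be enabled for the instance in SQL Server Configuration Manager before the script will work.

```sql
-- Instance level: allow both T-SQL and Win32 streaming access to FILESTREAM data.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO
-- Add a FILESTREAM filegroup and its data container (hypothetical names and path).
-- The parent folder must exist; the last folder in the path must not.
ALTER DATABASE BlobDataDemoDB
    ADD FILEGROUP FileStreamGroup1 CONTAINS FILESTREAM;
GO
ALTER DATABASE BlobDataDemoDB
    ADD FILE (NAME = N'BlobDataDemoDB_fs',
              FILENAME = N'D:\2008Data\StreamData\BlobDataDemoDB_fs')
    TO FILEGROUP FileStreamGroup1;
GO
USE BlobDataDemoDB;
GO
-- A FILESTREAM table needs a unique, non-null uniqueidentifier ROWGUIDCOL column.
CREATE TABLE dbo.SampleFileStreamTable
(
    PKID int IDENTITY(1, 1) NOT NULL,
    RowGuid uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    FileType nvarchar(32) NULL,
    FileName nvarchar(255) NULL,
    FileContent varbinary(max) FILESTREAM NULL,
    AddTime datetime DEFAULT (GETDATE()),
    CONSTRAINT PK_SampleFileStreamTable PRIMARY KEY (PKID)
) FILESTREAM_ON FileStreamGroup1;
GO
-- Full-text indexing works as for an ordinary varbinary(max) column.
CREATE FULLTEXT CATALOG FileStreamCatalog;
GO
CREATE FULLTEXT INDEX ON dbo.SampleFileStreamTable
(
    FileContent TYPE COLUMN FileType
)
KEY INDEX PK_SampleFileStreamTable
ON FileStreamCatalog;
GO
```

Naming the primary key constraint explicitly is a small design choice that lets the full-text index refer to it in KEY INDEX without looking up a system-generated name.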
References:
http://msdn.microsoft.com/zh-cn/library/bb933993.aspx
http://www.cnblogs.com/downmoon/archive/2010/05/06/1727546.html
http://www.cnblogs.com/downmoon/archive/2010/05/08/1730044.html
Advantages: As in solution 2, the .doc files are managed through the SQL Server database, which makes them easy to read and to search with full-text queries, and they can be written if necessary. This solution also overcomes the disadvantages of solution 2: the varbinary(max) column holds only a reference to the data, while the actual content is stored in the NTFS file system outside the database data files. The size of each file is limited only by the size of the NTFS volume.
Summary: This article briefly summarizes how to use SQL Server's full-text search to search the content of Word files. I think both solution 1 and solution 3 are workable. Discussion is welcome.