Use SQL Server to import and index Microsoft Word documents

Source: Internet
Author: User
Q: I need to import Microsoft Word documents into SQL Server and index them so that they can be used in relational queries. How can I import and index the documents?
A: SQL Server lets you import Word documents in several ways. Let's look at the most common ones. Whichever method you choose, you must first create a column of the image data type to hold the document. You can then use the textcopy.exe command-line tool to read the file into that column. For basic instructions on the tool, type textcopy /? at the command prompt. Another way to import Word documents into SQL Server is to write the import code yourself by using the Microsoft ActiveX Data Objects (ADO) Stream interface. You can find sample code in the Microsoft Product Support Services (PSS) article about accessing and modifying SQL Server BLOB data by using the ADO Stream object.
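As a rough illustration of the textcopy.exe step, the following Python sketch only assembles the command line; it does not run the tool. The switches (/S, /U, /P, /D, /T, /C, /W, /F, /I, /O) are written from memory of the tool's usage text, so check them against the output of textcopy /? before relying on them. The server, database, table, column, and file names are hypothetical.

```python
def build_textcopy_args(server, login, password, database,
                        table, column, where, path, direction="in"):
    """Build an argument list for textcopy.exe.

    The switch names are an assumption to verify with `textcopy /?`.
    direction="in" copies the file into the column, "out" copies it back out.
    """
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    return [
        "textcopy",
        "/S", server,      # SQL Server name
        "/U", login,       # login
        "/P", password,    # password
        "/D", database,    # database
        "/T", table,       # table that holds the image column
        "/C", column,      # image column to fill
        "/W", where,       # WHERE clause that selects exactly one row
        "/F", path,        # file to read from or write to
        "/I" if direction == "in" else "/O",
    ]


# Hypothetical names throughout: MYSERVER, DocStore, Documents, DocBody.
args = build_textcopy_args(
    server="MYSERVER", login="sa", password="secret", database="DocStore",
    table="Documents", column="DocBody", where="where DocId = 1",
    path=r"C:\docs\report.doc",
)
```

The list form can be handed to subprocess.run(args) on a machine where textcopy.exe is on the path, which avoids quoting problems with the WHERE clause.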
In addition, you can move binary data into SQL Server in chunks. For a detailed description of this method, see the PSS article about using ADO to retrieve and update SQL Server text fields. Chunking binary data lets you store pieces of the data in the database, which is especially useful when you control the data format. For example, if you need only bytes 1,000 to 1,010 of the data, reading a chunk is much faster than using the ADO Stream interface, because it greatly reduces the amount of data that SQL Server has to retrieve from disk. This technique is often used to store bit masks that represent the on or off flags of an application.
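The byte-range example can be sketched in plain Python. This is an illustration of the idea only, not the ADO code from the PSS article: the document is split into fixed-size chunks (in a database, each chunk would be one row), and a read of bytes 1,000 to 1,010 touches only the chunk that contains them. The chunk size and function names are assumptions made for the example.

```python
from typing import List

CHUNK_SIZE = 1024  # assumed chunk size; pick one that suits the data format


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split a document into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def read_range(chunks: List[bytes], start: int, end: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return bytes start..end-1, fetching only the chunks that overlap the range."""
    if end <= start:
        return b""
    first = start // chunk_size          # index of the first chunk needed
    last = (end - 1) // chunk_size       # index of the last chunk needed
    needed = b"".join(chunks[first:last + 1])
    offset = start - first * chunk_size  # position of `start` inside `needed`
    return needed[offset:offset + (end - start)]


document = bytes(range(256)) * 16        # 4,096 bytes of sample data
chunks = split_into_chunks(document)
flags = read_range(chunks, 1000, 1010)   # only chunks[0] is read
```

With a 1,024-byte chunk size, the ten requested bytes all sit in the first chunk, so the other three chunks are never touched.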
SQL Server 2000 comes with sample code that shows how to move binary data in chunks. To view the code, go to the \Program Files\Microsoft SQL Server\80\Tools\DevTools\Samples\ado folder on the drive where the sample code from the SQL Server 2000 CD is installed. Unpack the executable file and find the Samples subdirectory under the Visual Basic directory. In the Employee example, pay attention to how the code uses the FillDataFields() function.
To index Word documents, both SQL Server 7.0 and SQL Server 2000 provide a full-text search component. This component uses a mix of technologies to index large text and image columns. When you perform a full-text search, you must specify the file type that the image column contains and the filter needed to extract information from the binary data. For more information about using full-text indexes, see the related topics in SQL Server Books Online and read David Jones's article "Building a Better Search Engine," posted on the SQL Server Magazine website in July 2000. Note that indexing Word documents does not automatically generate a set of relational tables containing the keywords in the documents. The full-text index does, however, let you include the Word documents in your searches. The following is one workable method for retrieving keywords from the data:
Use OLE Automation to read the user-defined keywords from the document. When the document is loaded, store these keywords in relational tables.
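The relational side of this method can be sketched as follows. The example assumes the keywords have already been read from each document (the OLE Automation step is not shown) and uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere. The table and column names are hypothetical.

```python
import sqlite3


def build_keyword_store(docs):
    """Load {file_name: [keywords]} into two related tables.

    sqlite3 is a stand-in for SQL Server; the schema is illustrative only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        " doc_id INTEGER PRIMARY KEY,"
        " file_name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE doc_keywords ("
        " doc_id INTEGER NOT NULL REFERENCES documents(doc_id),"
        " keyword TEXT NOT NULL,"
        " PRIMARY KEY (doc_id, keyword))"
    )
    for file_name, keywords in docs.items():
        cur = conn.execute(
            "INSERT INTO documents (file_name) VALUES (?)", (file_name,)
        )
        # Lowercase and de-duplicate so each keyword is stored once per document.
        unique = {k.lower() for k in keywords}
        conn.executemany(
            "INSERT INTO doc_keywords (doc_id, keyword) VALUES (?, ?)",
            [(cur.lastrowid, k) for k in unique],
        )
    conn.commit()
    return conn


def find_documents(conn, keyword):
    """Return the file names of documents tagged with the keyword."""
    rows = conn.execute(
        "SELECT d.file_name FROM documents d"
        " JOIN doc_keywords k ON k.doc_id = d.doc_id"
        " WHERE k.keyword = ? ORDER BY d.file_name",
        (keyword.lower(),),
    )
    return [row[0] for row in rows]


conn = build_keyword_store({
    "budget2001.doc": ["Budget", "Finance"],
    "plan.doc": ["budget", "Roadmap"],
})
matches = find_documents(conn, "BUDGET")
```

Once the keywords live in an ordinary table, they can be joined to any other relational data, which is what the full-text index alone does not provide.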