HubbleDotNet open-source full-text search database project: creating a full-text index for existing database tables or views (I), append only mode

Last Update:2018-12-07 Source: Internet

Author: User


HubbleDotNet makes it easy to create full-text indexes for existing tables or views in a database; the manual work takes no more than 5 minutes. I will explain how to create full-text indexes for existing data tables over several articles. This article describes how to create a full-text index in append only mode.

Before creating a full-text index for an existing table or view, we first need to create a database in HubbleDotNet. For how to create a database, see "HubbleDotNet open-source full-text search database project: create and delete a database".

After creating the hubbledotnet database, you can create a full-text index for existing tables or views in the relational database.

The following uses a news database as an example to create a full-text index.

Open the query analyzer, right-click the News database node, and select Create Table.

Create a full-text index for Chinese news

Configure basic information for the HubbleDotNet data table

I will first demonstrate how to create a full-text index for Chinese news. Following the prompts on the page, enter the HubbleDotNet table name (cnews here), enter the directory where the full-text index will be stored, and select the database adapter. Because my relational database is SQL Server 2005, I select the SQL Server 2005 adapter. This adapter applies to SQL Server 2005 and later versions.

Then configure the connection string of the relational database. Click the button below it to test whether the connection string is correct, then click Next to go to the next step.

Select Index Mode

In this step we need to select the index mode.

Because the full-text index is being created from an existing data table, select "Build index from exist table".

In the text box below it, enter the name of the actual data table or view in the relational database. Enter news here.

There are two options for the incremental mode.

Append only mode. This mode is suitable when data only grows and is never modified. More precisely, it can be used as long as the full-text index fields are never modified. It consumes less memory than the append, delete, and update mode, and it is faster. To use this mode, the corresponding data table or view in the relational database must have a docid field with a unique index (preferably a clustered index). The field should be auto-incrementing, or you must at least make sure that values inserted later are larger than those inserted earlier.

Append, delete, and update mode. This mode supports adding, deleting, and modifying data. It uses more memory than append only mode (4 more bytes per record). In this mode, the corresponding data table or view in the relational database must not have a field named docid, but it must have an int type ID field, which can have any name except "docid". How to create an index when the table has a non-int type primary key field will be explained in a later article.
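The docid requirement of append only mode can be checked before choosing that mode. The following is a minimal sketch, assuming the docid values have already been read from the table in insertion order. The function name `check_append_only_docids` is hypothetical and is not part of HubbleDotNet.

```python
from typing import Iterable


def check_append_only_docids(docids: Iterable[int]) -> bool:
    """Return True if every docid is strictly larger than the one before it.

    Strictly increasing values are also unique, so this covers both the
    uniqueness and the ordering requirement described for append only mode.
    """
    previous = None
    for docid in docids:
        if previous is not None and docid <= previous:
            return False
        previous = docid
    return True


if __name__ == "__main__":
    print(check_append_only_docids([1, 2, 3, 7]))  # gaps between values are fine
    print(check_append_only_docids([1, 2, 2, 5]))  # a duplicate value fails
    print(check_append_only_docids([1, 5, 3]))     # a later, smaller value fails
```

If the check fails for a table, the append, delete, and update mode is the safer choice.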

Next we use the append only mode. The corresponding data table in the relational database contains the docid field that this mode requires, along with the title, content, time, and URL fields configured in the field setting step.

After configuring this step, click Next to go to the field setting step.

Note that in versions 8.3.0 and earlier, a TCP closed error may occur if the data table contains certain special data types. This is a bug; please upgrade to version 8.3.0.1 or later. For how to upgrade, see "HubbleDotNet open-source full-text search database project: how to upgrade". After the upgrade, a proper message is displayed instead. I will elaborate on the handling of special types in a later article.

Configure index fields

HubbleDotNet automatically lists all the fields of the table. Here we set the title and content fields as full-text index fields, that is, tokenized fields. Because this is Chinese news, we choose PanGuSegment as the analyzer for both fields.

For the time field, we select an untokenized index, which is a single-value index.

The URL field is not indexed, so we select None for it.
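The choices made in this step can be summarized as a small data structure. This is only an illustrative sketch: the dictionary layout, the key names, and the helper `full_text_fields` are hypothetical and do not reflect HubbleDotNet's own configuration format.

```python
# Hypothetical summary of the field settings chosen in this step.
# The keys and value names are illustrative, not HubbleDotNet syntax.
FIELD_SETTINGS = {
    "title":   {"index": "tokenized",   "analyzer": "PanGuSegment"},
    "content": {"index": "tokenized",   "analyzer": "PanGuSegment"},
    "time":    {"index": "untokenized", "analyzer": None},
    "url":     {"index": "none",        "analyzer": None},
}


def full_text_fields(settings):
    """Return the names of the fields that are searchable as full text."""
    return [name for name, cfg in settings.items() if cfg["index"] == "tokenized"]


if __name__ == "__main__":
    print(full_text_fields(FIELD_SETTINGS))
```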

For the data types of HubbleDotNet, see "HubbleDotNet open-source full-text search database project: data and index types of data tables".

The checkbox to the left of each field is used to delete fields: select the checkboxes of the fields you want to remove and click Delete. If you are not deleting any fields, the checkbox has no effect.

After completing this step, click Next to enter the last step.

Complete Index

This step lists the table creation statement so that you can perform a final check. If you are sure there is no problem, click Finish.

A prompt is then displayed.

If you plan to start indexing immediately, select Yes.

The rebuild table interface is displayed.

Click Rebuild to create the full-text index.

After the full-text index is created, we can optimize it.

After optimization, you can search. (You can also search without optimizing, but performance will be slower.)

Next, let's see how to search.

Search for Chinese news

Example 1

Search for all records whose title matches the two keywords "Beijing" and "university", sorted in descending order by score.

The parameters following each word component have the following meanings:

The first parameter is the weight of the word component, which is 5000 here.

The second parameter is the position of the word component in the input search sentence. For example, "Beijing" starts at position 0 and "university" starts at position 2.

Top 10: output the first 10 matching records.
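The way the weight and position parameters attach to each word component can be sketched as a small string builder. The `word^weight^position` notation and the overall statement shape below are assumptions about the HubbleDotNet query syntax, and the helper names are hypothetical. The words are not escaped, so this is an illustration rather than production code.

```python
def build_match_expression(components):
    """Join (word, weight, position) tuples into one match expression.

    The caret-separated form is an assumed notation for attaching the
    weight and the position to each word component.
    """
    return " ".join(
        f"{word}^{weight}^{position}" for word, weight, position in components
    )


def build_title_query(table, components, top=10):
    """Build a statement that matches the title and sorts by score."""
    expression = build_match_expression(components)
    return (
        f"select top {top} * from {table} "
        f"where title match '{expression}' order by score desc"
    )


if __name__ == "__main__":
    # Weights and positions taken from the explanation above.
    components = [("Beijing", 5000, 0), ("university", 5000, 2)]
    print(build_title_query("cnews", components))
```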

Example 2

Search for all records whose title or content matches the "Beijing" and "university" keywords, sorted in descending order by score.

Here the title field is followed by the parameter 2, which gives the title field a weight of 2. This is how a weight is assigned to a field.

Between 0 to 9 means that records 0 through 9 are output. This can be used for paging.
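For paging, the first and last record numbers of a page follow directly from the page index and the page size. A minimal sketch, assuming zero-based page indexes and an inclusive range as in the "between 0 to 9" example; the function names are hypothetical.

```python
def page_range(page_index, page_size):
    """Return the inclusive (first, last) record numbers for a page."""
    if page_index < 0 or page_size <= 0:
        raise ValueError("page_index must be >= 0 and page_size must be > 0")
    first = page_index * page_size
    last = first + page_size - 1
    return first, last


def between_clause(page_index, page_size):
    """Format the range in the 'between X to Y' form used in the example."""
    first, last = page_range(page_index, page_size)
    return f"between {first} to {last}"


if __name__ == "__main__":
    print(between_clause(0, 10))  # first page of 10 records
    print(between_clause(2, 10))  # third page of 10 records
```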

Example 3

Search for all records whose titles contain both the "Beijing" and "university" keywords, sorted in descending order by score.

The contains search can be used for exact matching. Here contains returns far fewer records than match, because only records that contain both the words "Beijing" and "university" are output.

Example 4

Search for all records whose title matches the keywords "Beijing" and "university" and whose time is later than January 1, 2007 and earlier than August 16, 2007, sorted by time in descending order.

Example 5

Search for all records whose title matches the two keywords "Beijing" and "university" and whose time is later than January 1, 2007 and earlier than August 16, 2007, sorted in descending order by time and then by score.

That is, records are sorted by time, and among records with the same time, those with higher scores are ranked first.
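The filtering and two-level ordering described in Examples 4 and 5 can be illustrated on in-memory records. This sketch only mimics the result ordering in plain Python; the sample records, their scores, and the function name `filter_and_sort` are made up for illustration.

```python
from datetime import date

# Made-up sample records; scores are arbitrary illustration values.
RECORDS = [
    {"title": "a", "time": date(2007, 3, 1), "score": 120},
    {"title": "b", "time": date(2007, 8, 20), "score": 900},
    {"title": "c", "time": date(2007, 3, 1), "score": 450},
    {"title": "d", "time": date(2007, 6, 15), "score": 80},
    {"title": "e", "time": date(2006, 12, 31), "score": 700},
]


def filter_and_sort(records, start, end):
    """Keep records with start < time < end, newest first, then best score."""
    in_range = [r for r in records if start < r["time"] < end]
    # Sorting on (time, score) in reverse gives time descending, and for
    # equal times, score descending.
    return sorted(in_range, key=lambda r: (r["time"], r["score"]), reverse=True)


if __name__ == "__main__":
    result = filter_and_sort(RECORDS, date(2007, 1, 1), date(2007, 8, 16))
    print([r["title"] for r in result])  # ['d', 'c', 'a']
```

Records "a" and "c" share the same time, so the one with the higher score ("c") comes first, while "b" and "e" fall outside the time range.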


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
