Contents
- Table structure to be indexed
- Auxiliary trigger table (TriggerTable)
- Set TableInfo
- Create an update trigger
- Create a delete trigger
- Incremental Synchronization
- Execute synchronization operations
- Synchronization Process
- Impact of triggers on Performance
- Instantiation of TableSynchronization
- Trigger synchronization:
- Get synchronization progress:
- Stop Synchronization
Author: eaglet
Please credit the source when reprinting.
For most applications, full-text search is only one feature among many. Many systems are designed without full-text search in mind, and most of their search functions are implemented with the database's LIKE operator. As data volume and the number of users grow, LIKE queries can no longer meet site-search requirements for either performance or functionality. HubbleDotNet offers these users a loosely coupled integration path: in less than an hour you can build the back-end part of full-text search for an existing system, without changing the existing database table structure and without writing much code. Automatic synchronization with the data in existing tables is an important part of this solution. This document describes how to configure a full-text index over an existing table and keep it synchronized with that table.
Applicable version: HubbleDotNet 0.9.6.0
Home: http://www.hubbledotnet.com/
CodePlex: http://hubbledotnet.codeplex.com/
Applicability: This article applies to both kinds of passive-mode index, Append Only and Updatable.
Function introduction:
An existing data table can only be indexed with a passive-mode index.
For details, see "Create full-text indexes for existing tables or views of the database (1): Append Only mode" and "HubbleDotNet open-source full-text search database project: create full-text indexes for existing tables of the database (2): Updatable mode".
Before version 0.9, a passive-mode index (IndexOnly = true) could not be synchronized automatically. No trigger mechanism linked the HubbleDotNet full-text index to the database, so HubbleDotNet had no way of knowing when rows were added, changed, or deleted, and you had to synchronize with the database manually. For details, see "Synchronize data with existing tables or views through programs".
Starting with version 0.9.6.0, HubbleDotNet provides a mechanism for automatic synchronization with existing database tables. You only need to call a class provided by Hubble.SQLClient to make HubbleDotNet synchronize with the database, and you decide when to trigger synchronization and how often. HubbleDotNet does not yet provide scheduled synchronization tasks, because the requirements are not clear enough: different projects need different trigger times and cycles. Some users want to synchronize at night, for example, and others once every five minutes. Because these needs differ so much, the current version leaves the decision of when to trigger synchronization to the user. Triggering it is very simple and takes only three lines of code.
The following describes how to set automatic synchronization.
Append only mode
In Append only mode, automatic synchronization is simple: set TableSynchronization to true in TableInfo, and that is all.
Synchronization Process
As shown in the figure, when the user calls the Synchronize method of the TableSynchronization class provided by Hubble.SQLClient, the method first checks whether a synchronization is already in progress. If so, it returns False; wait until that synchronization has completed, then call the method again to trigger a new one. Once synchronization is triggered, the HubbleDotNet server scans the database for new records that have not yet been indexed and indexes them. When indexing is complete, the index is optimized according to the optimization policy specified by the user.
The HubbleDotNet query analyzer (QueryAnalyzer) provides a user interface for triggering synchronization, and its source code also serves as sample code for reference.
The following is a simple example. The table structure is as follows:
The synchronization settings are as follows:
Execute synchronization operations
Start Synchronization
Parameter description:
Step: the number of records read from the database in each batch during synchronization. For example, with Step set to 5000 and 12000 records to synchronize, the server indexes in three batches: 5000 records in each of the first two and 2000 in the last. Optimization runs once, over all synchronized records, after the last batch has been indexed. If you are unsure what to choose, leave the value unchanged; 5000 performed well in my tests.
Optimize option: the optimization policy. The default is Minimum. If the index files are large, Minimum optimization takes a long time; in that case you can choose Middle instead.
Click Start to start synchronization. Click Stop to stop it.
Synchronization completed
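The Step arithmetic described above can be checked with a few lines of C#. This is only an illustration of how the batch sizes work out; the class and method names (SyncBatches, BatchSizes) are made up for this sketch and are not part of HubbleDotNet.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: SyncBatches and BatchSizes are hypothetical names,
// not part of the HubbleDotNet API.
public static class SyncBatches
{
    // Returns the size of each batch when `total` records are read
    // `step` records at a time.
    public static List<int> BatchSizes(int total, int step)
    {
        if (step <= 0)
            throw new ArgumentOutOfRangeException("step", "Step must be positive.");

        var sizes = new List<int>();
        for (int remaining = total; remaining > 0; remaining -= step)
        {
            // A full batch, or whatever is left for the last one.
            sizes.Add(Math.Min(step, remaining));
        }
        return sizes;
    }

    public static void Main()
    {
        // The example from the text: 12000 records, Step = 5000.
        Console.WriteLine(string.Join(", ", BatchSizes(12000, 5000)));
        // prints "5000, 5000, 2000"
    }
}
```

A smaller Step means more round trips to the database, and a larger one means more records held in memory per batch; the sketch only shows the counting, not that trade-off.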
Updatable Mode
Updatable mode covers source tables that receive inserts, updates, and deletes. The update and delete operations make this mode more complicated: we need to create an auxiliary table to implement it.
The following is an example:
Table structure to be indexed
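Auxiliary trigger table (TriggerTable)
-- Auxiliary trigger table for the EnglishNews index
create table HBTrigger_EnglishNews
(
    Serial bigint identity(1,1) not null primary key, -- order in which changes were logged
    Id int not null,                                  -- DocId of the changed row
    Opr char(16),                                     -- 'Update' or 'Delete'
    Fields nvarchar(4000) null                        -- changed fields (updates only)
)
GO
create index ITriggerOprSerial on HBTrigger_EnglishNews (Opr, Serial)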
The auxiliary table must be created with exactly the structure shown above; only the table name can be chosen freely.
The Id field holds the value of the DocId field in the index table. If DocId is bigint in the index table, you must declare Id as bigint as well.
The Opr field tells HubbleDotNet whether the change is an update or a delete.
The Fields field is used only for updates. It tells HubbleDotNet which fields the update operation modified.
If your database is not SQL Server, create an equivalent auxiliary trigger table using the syntax of that database.
Note: The auxiliary trigger table must be in the same database as the main table. Because Opr and Serial are queried together, we recommend creating the composite index shown in the SQL statement above.
Set TableInfo
Set TableSynchronization to true, as shown in the following figure, and specify the name of the auxiliary trigger table.
Create an update trigger
If your existing table receives update operations, you must create an update trigger. When data is updated, the update trigger writes the Id of the updated row and the names of the changed fields to the auxiliary trigger table.
Every field of type Tokenized or Untokenized must be handled in the update trigger. Sample code for the update trigger follows:
Create Trigger HBTrigger_EnglishNews_Update
On EnglishNews
for Update
As
DECLARE @updateFields nvarchar(4000)
set @updateFields = ''
if Update(GroupId)
begin
    set @updateFields = @updateFields + 'GroupId,'
end
if Update(SiteId)
begin
    set @updateFields = @updateFields + 'SiteId,'
end
if Update(Time)
begin
    set @updateFields = @updateFields + 'Time,'
end
if Update(Title)
begin
    set @updateFields = @updateFields + 'Title,'
end
if Update(Content)
begin
    set @updateFields = @updateFields + 'Content,'
end
if @updateFields <> ''
begin
    insert into HBTrigger_EnglishNews select id, 'Update', @updateFields from Inserted
end
Here, EnglishNews is the name of the main table, and HBTrigger_EnglishNews is the name of the auxiliary trigger table.
GroupId, SiteId, and so on are the Untokenized and Tokenized fields of the main table. The trigger records which of these fields changed when a row is updated.
Do not copy this code verbatim for your own table. Replace GroupId, SiteId, Time, Title, and Content with the indexed fields of your table.
Likewise, if your database is not SQL Server, create the trigger using the trigger syntax of that database.
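To make the format of the Fields column concrete: the sample update trigger appends each changed field name followed by a comma, so an update that touches Title and Content logs the string 'Title,Content,' with a trailing comma. The C# sketch below shows one way such a value could be split back into field names. It illustrates the data format only; how HubbleDotNet itself reads the column is not described in this article, and TriggerFields and Parse are hypothetical names.

```csharp
using System;

// Illustration only: TriggerFields and Parse are hypothetical names,
// not part of the HubbleDotNet API.
public static class TriggerFields
{
    // Splits a Fields value such as "Title,Content," into field names.
    // The trailing comma written by the trigger yields an empty entry,
    // which RemoveEmptyEntries drops.
    public static string[] Parse(string fields)
    {
        if (string.IsNullOrEmpty(fields))
            return new string[0];

        return fields.Split(new char[] { ',' },
                            StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        foreach (string name in Parse("Title,Content,"))
        {
            Console.WriteLine(name);
        }
    }
}
```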
Create a delete trigger
If your existing table receives delete operations, you must create a delete trigger. When data is deleted, the delete trigger writes the Id of the deleted row to the auxiliary trigger table.
Create Trigger HBTrigger_EnglishNews_Delete
On EnglishNews
for Delete
As
insert into HBTrigger_EnglishNews select id, 'Delete', '' from Deleted
Incremental Synchronization
Inserted (incremental) data is synchronized in the same way in Updatable mode as in Append only mode, and it does not go through triggers. So if your table mostly receives inserts, you need not worry about triggers slowing them down, because no trigger fires on insert.
Execute synchronization operations
Synchronization is started, stopped, and monitored in the same way as in Append only mode, and the Step and Optimize option parameters have the same meaning as described there.
Synchronization Process
Impact of triggers on Performance
Triggers affect the performance of update and delete operations only. Because a trigger records just the Id of the changed row and the names of the changed fields, not the changed content itself, the impact is limited. If even this overhead is unacceptable for your application, the only alternative is the manual approach described in "Synchronize data with existing tables or views through programs".
Automatic and timed synchronization through background tasks
You can trigger synchronization in either of two ways.
The first is a background task; see "Automatically synchronize or optimize indexes using background tasks".
The second is to write a program that triggers synchronization. The following describes how to do this.
Program call
See FormTableSynchronization.cs in QueryAnalyzer.
Add a reference to Hubble.SQLClient.
Instantiation of TableSynchronization
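TableSynchronization _TableSync;
// Optimization policy applied after indexing
TableSynchronization.OptimizeOption option = TableSynchronization.OptimizeOption.Minimum;
// Number of records read per batch (taken here from a numeric control on the form)
int step = (int)numericUpDownStep.Value;
HubbleConnection conn = new HubbleConnection(connectString);
conn.Open();
// Pass the open connection; QueryAnalyzer itself passes its shared DataAccess.Conn here
_TableSync = new TableSynchronization(conn, TableName, step, option);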
Trigger synchronization:
_TableSync.Synchronize();
This method returns True if synchronization was triggered successfully. It returns False if a synchronization is already in progress.
Get synchronization progress:
double progress = _TableSync.GetProgress();
This method returns the synchronization progress as a percentage. A return value of 100 means synchronization has completed.
Stop Synchronization
_TableSync.Stop();
If the current table is being synchronized, the synchronization operation is terminated.
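Put together, the calls above can be sketched as a small console program that triggers one synchronization and polls until it finishes. This is a minimal sketch rather than the QueryAnalyzer code. It uses only the constructor and methods shown in this article. The namespace in the using directive is assumed from the assembly name Hubble.SQLClient. The connection string and table name are read from the command line, the one-second polling interval and the Step value of 5000 are arbitrary choices, and conn.Close() is assumed to exist by analogy with ADO.NET connections.

```csharp
using System;
using System.Threading;
using Hubble.SQLClient; // namespace assumed from the assembly name

public static class SyncOnce
{
    // Usage: SyncOnce <connectString> <tableName>
    public static int Main(string[] args)
    {
        if (args.Length < 2)
        {
            Console.Error.WriteLine("usage: SyncOnce <connectString> <tableName>");
            return 2;
        }

        HubbleConnection conn = new HubbleConnection(args[0]);
        conn.Open();
        try
        {
            TableSynchronization sync = new TableSynchronization(
                conn, args[1], 5000, TableSynchronization.OptimizeOption.Minimum);

            // False means a synchronization is already in progress.
            if (!sync.Synchronize())
            {
                Console.WriteLine("Synchronization is already running; try again later.");
                return 1;
            }

            // Poll until the reported progress reaches 100 percent.
            double progress = sync.GetProgress();
            while (progress < 100)
            {
                Console.WriteLine("Progress: {0:F1}%", progress);
                Thread.Sleep(1000); // polling interval: arbitrary choice
                progress = sync.GetProgress();
            }

            Console.WriteLine("Synchronization completed.");
            return 0;
        }
        finally
        {
            // Assumption: Close() behaves as on other ADO.NET-style connections.
            conn.Close();
        }
    }
}
```

A program like this can be run from an operating-system scheduler, for example nightly or every five minutes. A long-running service would probably also want a timeout on the polling loop and a call to Stop() when it is exceeded.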