TrieTree Service: Component Composition and Functions

Source: Internet
Author: User
Tags: log4net

The previous article gave a general overview of the TrieTree service. If you have not yet tried the service after downloading it, that is fine: this article walks through how to configure and use it step by step.

The TrieTree service is composed of several major components, each described below.

The Dictionary component is the core library. It provides the basic data definitions, the configuration definitions, the data structure representation, and POSType (see the Pangu part-of-speech definitions). Because TrieTree loads its data into memory, the design of this component directly determines memory usage and query performance.

The Dictionary.Providers component supplies the custom data providers (DataProvider). You can think of a provider as a dictionary data loader: for example, the built-in PanguDictProvider loads dictionaries in Pangu's own dict format. The loaders of the TrieTree service are highly configurable, and you select the ones you need in the configuration file, as shown below:
<DictionaryService>
  <Provider name="pangu_dict" uri="F:\Dropbox\research\NLP\TrieTreeService\DictionaryService.UnitTest\Data\panguDict.dct" type="BluePrint.Dictionary.Providers.PanguDictProvider, BluePrint.Dictionary.Providers" />
  <Provider name="IKdict" uri="F:\Dropbox\research\NLP\TrieTreeService\DictionaryService.UnitTest\Data\IKdict.dic" type="BluePrint.Dictionary.Providers.TxtFileProvider, BluePrint.Dictionary.Providers" />
</DictionaryService>

TxtFileProvider loads the IKdict.dic dictionary file from IKAnalyzer. After the service starts in debug mode, the console displays log prompts as the dictionaries load.

Because TrieTree uses log4net's ColoredConsoleAppender, the prompts appear in different colors. The log shows the loading time of pangu_dict and IKdict; these names come from the name attribute of each Provider element in app.config. TrieTree also supports loading MongoDB-based dictionaries. That is not covered in this article because it involves more complicated MongoDB configuration and some additional concepts; I will consider covering it in a later tutorial.
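To make the config-driven loading concrete, here is a minimal Python sketch of the idea, not the service's actual .NET code. It reads Provider elements, picks a loader by the type attribute, and reports the load time under each provider's name. The shortened uri and type values, the LOADERS registry, and the stand-in loader functions are all assumptions made for illustration.

```python
import time
import xml.etree.ElementTree as ET

# Hypothetical config fragment mirroring the <DictionaryService> section;
# the uri and type values are shortened placeholders, not the real ones.
CONFIG = """
<DictionaryService>
  <Provider name="pangu_dict" uri="panguDict.dct" type="PanguDictProvider" />
  <Provider name="IKdict" uri="IKdict.dic" type="TxtFileProvider" />
</DictionaryService>
"""


def load_pangu(uri):
    # Stand-in loader: a real one would parse the Pangu dict file.
    return []


def load_txt(uri):
    # Stand-in loader: a real one would read the text dictionary file.
    return []


# Assumed registry that maps the "type" attribute to a loader function.
LOADERS = {"PanguDictProvider": load_pangu, "TxtFileProvider": load_txt}


def load_all(config_xml):
    """Run every configured provider and return {name: (entries, seconds)}."""
    results = {}
    root = ET.fromstring(config_xml.strip())
    for node in root.findall("Provider"):
        name = node.get("name")
        loader = LOADERS[node.get("type")]
        start = time.perf_counter()
        entries = loader(node.get("uri"))
        elapsed = time.perf_counter() - start
        results[name] = (entries, elapsed)
        # The log line is keyed by the provider's name attribute.
        print(f"{name}: loaded {len(entries)} entries in {elapsed:.3f}s")
    return results
```

Calling load_all(CONFIG) prints one timing line per provider, in the order the providers appear in the configuration.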

The DictionaryService component is the host of the TrieTree service. It contains the Windows service implementation and the Windows service installer. It is a console program with two running modes: debug mode and service mode. Debug mode runs directly in the console and prints log4net-based log output, which makes debugging and setting breakpoints easier. Service mode runs as a Windows service and is intended for testing and production environments. Because it is a console program, the mode is selected by a command-line parameter: -i installs the Windows service, -u uninstalls it, and -c starts console mode.
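The parameter-based mode switch can be sketched as follows. This is a hedged Python illustration of the dispatch pattern only; the real program is a .NET console application whose source is not shown here. The flag letters follow the text (written in lowercase), while the function names and mode strings are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that accepts exactly one of -i, -u, or -c."""
    parser = argparse.ArgumentParser(prog="DictionaryService")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-i", dest="mode", action="store_const",
                       const="install", help="install the Windows service")
    group.add_argument("-u", dest="mode", action="store_const",
                       const="uninstall", help="uninstall the Windows service")
    group.add_argument("-c", dest="mode", action="store_const",
                       const="console", help="run in console (debug) mode")
    return parser


def select_mode(argv):
    """Return the mode name chosen by the command-line arguments."""
    return build_parser().parse_args(argv).mode
```

Making the three flags mutually exclusive means that passing two of them at once is rejected instead of silently choosing one.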

Those are the three core components of the TrieTree service. I also want to introduce a very useful additional component, DictionaryQuery.

Although it is called a query analyzer, it is not in the same league as the SQL query analyzer, so there is no need to compare the two. It serves two purposes: testing whether the TrieTree service is running, and checking the state of words in the dictionary after it has loaded. You can also use the POS filter on the right to narrow the results; selecting multiple types combines them with an OR relationship. For example, with the place name and person name types selected, searching for "Shanghai" returns "Shanghai" with frequency 251 and type Place Name (A_NS). If the word cannot be found, the message "no proper word found" is displayed in red.
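The lookup described above can be illustrated with a small trie whose word-ending nodes carry a frequency and a part-of-speech tag. This is a sketch under stated assumptions, not the Dictionary component's real data structure: the class names are hypothetical, each entry holds a single POS tag for simplicity, and "A_NR" is assumed here as the tag for person names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entry:
    word: str
    frequency: int
    pos: str  # part-of-speech tag, e.g. "A_NS" for a place name


@dataclass
class Node:
    children: Dict[str, "Node"] = field(default_factory=dict)
    entry: Optional[Entry] = None  # set only on nodes where a word ends


class Trie:
    def __init__(self):
        self.root = Node()

    def insert(self, word, frequency, pos):
        """Add a word, creating one node per character along the way."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, Node())
        node.entry = Entry(word, frequency, pos)

    def lookup(self, word, pos_filter=None):
        """Return the entry for word, or None if it is absent or filtered out.

        pos_filter is a set of tags combined with OR: the entry matches
        if its tag is any one of them. None or an empty set means no filter.
        """
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return None
        entry = node.entry
        if entry is None:
            return None
        if pos_filter and entry.pos not in pos_filter:
            return None
        return entry
```

After trie.insert("Shanghai", 251, "A_NS"), the call trie.lookup("Shanghai", {"A_NS", "A_NR"}) returns the entry, while trie.lookup("Shanghai", {"A_NR"}) returns None because the only selected type does not match.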

You can also select the matching method: forward maximum matching, reverse maximum matching, or full matching. These need no further explanation. Note that the dictionary service must be running before you start DictionaryQuery, and you must point it at the port of the TrieTree service you configured. The default port is 7010. The figure shows dict://127.0.0.1:7010 configured; note that the dictionary service URI starts with dict.
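For readers who have not met these matching methods before, the following Python sketch shows the textbook versions. How the TrieTree service implements them internally is not shown in this article, so treat the function names, the plain set used as a dictionary, and the single-character fallback as assumptions.

```python
def forward_max_match(text, words):
    """Scan left to right, taking the longest dictionary word at each position."""
    max_len = max(map(len, words), default=1)
    result, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing matches.
            if size == 1 or piece in words:
                result.append(piece)
                i += size
                break
    return result


def reverse_max_match(text, words):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    max_len = max(map(len, words), default=1)
    result, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in words:
                result.append(piece)
                j -= size
                break
    result.reverse()  # pieces were collected from the end backwards
    return result


def full_match(text, words):
    """Treat the whole input as one candidate word (exact lookup)."""
    return text in words
```

The two directions can disagree. With the dictionary {"研究", "研究生", "生命", "起源"} and the input "研究生命起源", forward matching yields 研究生 / 命 / 起源, while reverse matching yields 研究 / 生命 / 起源.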
