Sharing a Chinese word segmentation and search tool for ASP.NET

Source: Internet
Author: User
jieba is a Chinese word segmentation library for Python. Someone has ported it to the .NET platform, where it can completely replace the combination of Lucene.Net and the Pangu segmenter.

I am writing this because in an interview yesterday I was asked how I would implement keyword search on a website. I only mentioned SQL fuzzy queries, SQL statement optimization, and caching. I had come across keyword segmentation before, but the .NET platform has no mature segmentation and search library. Java has Lucene, and although Lucene has been ported to .NET, the port is updated slowly. When I was learning Python I had noticed its word segmentation and word cloud libraries, so I wondered whether a Python segmentation library had been ported to .NET, and went looking for a port of Python's jieba.
The original project introduction: jieba Chinese word segmentation, .NET version: jieba.NET
The segmenter most commonly used on the .NET platform is Pangu, but it has not been updated for a long time. The most obvious gap is the built-in dictionary: jieba's dictionary has 500,000 entries and Pangu's has 170,000, which leads to noticeably different segmentation results. In addition, for out-of-vocabulary words, jieba "uses an HMM model based on the word-forming ability of Chinese characters, with the Viterbi algorithm", and the results look good.
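To make the HMM and Viterbi idea concrete, here is a minimal sketch of how a character-level tagger can turn a sentence into words. Each character is tagged B (begin), M (middle), E (end) or S (single-character word), and Viterbi picks the most likely tag sequence. Everything in it is an assumption for illustration: the class name HmmSegmenterSketch, the ToyEmit function, and all probabilities are made up, and this is not jieba.NET's trained model or its actual code.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only: toy probabilities, NOT jieba.NET's trained model.
public static class HmmSegmenterSketch
{
    private const string States = "BMES";  // Begin, Middle, End, Single
    private const double MinLog = -1e9;    // stands in for log(0)

    // Hypothetical start log-probabilities: a sentence starts with B or S.
    private static readonly double[] Start =
        { Math.Log(0.6), MinLog, MinLog, Math.Log(0.4) };

    // Hypothetical transition log-probabilities, Trans[from, to].
    private static readonly double[,] Trans =
    {
        //  to B           to M           to E           to S
        { MinLog,        Math.Log(0.3), Math.Log(0.7), MinLog        }, // from B
        { MinLog,        Math.Log(0.4), Math.Log(0.6), MinLog        }, // from M
        { Math.Log(0.5), MinLog,        MinLog,        Math.Log(0.5) }, // from E
        { Math.Log(0.5), MinLog,        MinLog,        Math.Log(0.5) }, // from S
    };

    // Toy emission model (an assumption): '了' strongly prefers being a
    // single-character word; every other character is equally likely in any state.
    public static double ToyEmit(char state, char ch)
    {
        if (ch == '了') return state == 'S' ? Math.Log(0.9) : Math.Log(0.03);
        return Math.Log(0.25);
    }

    // Viterbi decoding: returns the most likely B/M/E/S tag for each character.
    public static string Tag(string text, Func<char, char, double> emitLog)
    {
        if (string.IsNullOrEmpty(text)) return string.Empty;
        int n = text.Length;
        var score = new double[n, 4];  // best log-probability ending in state s at position t
        var back = new int[n, 4];      // previous state on that best path

        for (int s = 0; s < 4; s++)
            score[0, s] = Start[s] + emitLog(States[s], text[0]);

        for (int t = 1; t < n; t++)
        {
            for (int s = 0; s < 4; s++)
            {
                double best = double.NegativeInfinity;
                int arg = 0;
                for (int p = 0; p < 4; p++)
                {
                    double cand = score[t - 1, p] + Trans[p, s];
                    if (cand > best) { best = cand; arg = p; }
                }
                score[t, s] = best + emitLog(States[s], text[t]);
                back[t, s] = arg;
            }
        }

        // A complete segmentation must end on E or S.
        int last = score[n - 1, 2] >= score[n - 1, 3] ? 2 : 3;
        var tags = new char[n];
        for (int t = n - 1; t >= 0; t--)
        {
            tags[t] = States[last];
            last = back[t, last];
        }
        return new string(tags);
    }

    // Cuts the text after every character tagged E or S.
    public static List<string> Split(string text, string tags)
    {
        var words = new List<string>();
        int begin = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (tags[i] == 'E' || tags[i] == 'S')
            {
                words.Add(text.Substring(begin, i - begin + 1));
                begin = i + 1;
            }
        }
        return words;
    }

    public static void Main()
    {
        const string text = "来到了";
        string tags = Tag(text, ToyEmit);
        Console.WriteLine("{0} -> {1}", tags, string.Join("/", Split(text, tags)));
    }
}
```

With these toy numbers the three characters are tagged BES, so the sentence is cut into 来到 and 了. A real model learns its start, transition, and emission probabilities from a tagged corpus, and is only used for the stretches of text that the dictionary cannot cover.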

We can search for it and install it directly from the NuGet Package Manager in VS2013.

In the comments there, someone said that the sentence "工信处女干事每月经过下属科室都要亲口交代24口交换机等技术性器件的安装工作" makes a good segmentation test and that jieba splits it well, so I tried it:


    using System;
    using JiebaNet.Segmenter;

    class Program
    {
        static void Main()
        {
            var segmenter = new JiebaSegmenter();
            // The same test sentence is used for every mode.
            const string text = "工信处女干事每月经过下属科室都要亲口交代24口交换机等技术性器件的安装工作";
            Console.WriteLine("Original sentence: {0}", text);

            // Full mode: list every word the dictionary can find.
            var segments1 = segmenter.Cut(text, cutAll: true);
            Console.WriteLine("[Full mode]: {0}", string.Join("/", segments1));

            // The default is exact mode.
            var segments2 = segmenter.Cut(text);
            Console.WriteLine("[Exact mode]: {0}", string.Join("/", segments2));

            // The default is exact mode, which also uses the HMM model.
            var segments3 = segmenter.Cut(text);
            Console.WriteLine("[New word recognition]: {0}", string.Join("/", segments3));

            // Search engine mode.
            var segments4 = segmenter.CutForSearch(text);
            Console.WriteLine("[Search engine mode]: {0}", string.Join("/", segments4));

            // Same call as exact mode, shown here to check ambiguity resolution.
            var segments5 = segmenter.Cut(text);
            Console.WriteLine("[Ambiguity resolution]: {0}", string.Join("/", segments5));

            Console.Read();
        }
    }

Result: apart from full mode, every mode segments the sentence in the order we would naturally read it.
