jieba.NET: A Chinese Word Segmentation and Search Tool for ASP.NET

Source: Internet
Author: User

jieba is a Chinese word segmentation library for Python. Someone has ported it to the .NET platform, and in my view the port can fully replace the combination of Lucene.Net and the PanGu segmenter.

I am writing this because in an interview yesterday I was asked how I would implement keyword search on a website. I only mentioned SQL fuzzy queries, SQL statement optimization, and caching. I had come across keyword segmentation before, but the .NET platform has no mature segmentation and search library. Java has Lucene, and although Lucene has been ported to .NET, the port is updated slowly. When I learned Python earlier I had noticed its word segmentation and word cloud tools, so I wondered whether a Python segmentation library had been ported to .NET, and went looking for ports of Python's jieba library.
From the original introduction, "jieba Chinese word segmentation, .NET version: jieba.NET":
The most common word segmentation component on the .NET platform is PanGu, but it has not been updated for a long time. The most obvious gap is the built-in dictionary: jieba's dictionary has 500,000 entries and PanGu's has 170,000, which leads to noticeably different segmentation results. In addition, for out-of-vocabulary words, jieba "uses an HMM model based on the word-forming ability of Chinese characters, together with the Viterbi algorithm", and the results look good.
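
To make the quoted description a little more concrete, here is a minimal Python sketch of Viterbi decoding over the four character tags B, M, E and S (begin, middle, or end of a multi-character word, or a single-character word). It is not the code of jieba or jieba.NET. The probability tables are made-up toy values covering only three characters, whereas a real model is trained on a corpus, and all function names are my own.

```python
import math

STATES = "BMES"   # Begin / Middle / End of a word, or Single-character word
MIN_LOGP = -1e9   # stand-in for log(0)
# States allowed to precede each state: a word must end before the next starts.
PREV = {"B": "ES", "M": "BM", "E": "BM", "S": "ES"}

def log_table(table):
    return {key: math.log(p) for key, p in table.items()}

# Toy probabilities, invented for illustration only (assumption, not trained).
START_P = log_table({"B": 0.6, "S": 0.4})
TRANS_P = {
    "B": log_table({"M": 0.3, "E": 0.7}),
    "M": log_table({"M": 0.4, "E": 0.6}),
    "E": log_table({"B": 0.5, "S": 0.5}),
    "S": log_table({"B": 0.5, "S": 0.5}),
}
EMIT_P = {
    "B": log_table({"我": 0.10, "杭": 0.6, "研": 0.1}),
    "M": log_table({"我": 0.05, "杭": 0.1, "研": 0.1}),
    "E": log_table({"我": 0.05, "杭": 0.1, "研": 0.6}),
    "S": log_table({"我": 0.70, "杭": 0.1, "研": 0.1}),
}

def viterbi(obs):
    """Return the most likely BMES tag list for the characters in obs."""
    scores = [{s: START_P.get(s, MIN_LOGP) + EMIT_P[s].get(obs[0], MIN_LOGP)
               for s in STATES}]
    back = [{}]
    for t in range(1, len(obs)):
        scores.append({})
        back.append({})
        for s in STATES:
            # Best predecessor among the states that may come before s.
            prev = max(PREV[s],
                       key=lambda p: scores[t - 1][p] + TRANS_P[p].get(s, MIN_LOGP))
            scores[t][s] = (scores[t - 1][prev] + TRANS_P[prev].get(s, MIN_LOGP)
                            + EMIT_P[s].get(obs[t], MIN_LOGP))
            back[t][s] = prev
    # The last character must close a word, so only E or S may end the path.
    tags = [max("ES", key=lambda s: scores[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        tags.append(back[t][tags[-1]])
    tags.reverse()
    return tags

def segment(text):
    """Cut text into words at every E or S tag."""
    if not text:
        return []
    words, start = [], 0
    for i, tag in enumerate(viterbi(text)):
        if tag in "ES":
            words.append(text[start:i + 1])
            start = i + 1
    return words

print(segment("我杭研"))  # ['我', '杭研']
```

With these toy numbers, 杭 is most likely to begin a word and 研 to end one, so the two characters are joined into a single word even though no dictionary contains it. That is the general idea behind recognizing new words this way.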

Source code on GitHub: https://github.com/anderscui/jieba.net
The package can be found and installed directly from the NuGet Package Manager in VS2013.

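In the comments, someone said that the sentence "工信处女干事每月经过下属科室都要亲口交代24口交换机等技术性器件的安装工作" (roughly: "Every month the female officer of the Industry and Information Office must personally brief each subordinate section on installing 24-port switches and other technical components") makes a good segmentation test and that the library splits it well, so I tried it myself: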

using System;
using JiebaNet.Segmenter;

var segmenter = new JiebaSegmenter();

// Roughly: "Every month the female officer of the Industry and Information Office must
// personally brief each subordinate section on installing 24-port switches and other technical components."
var text = "工信处女干事每月经过下属科室都要亲口交代24口交换机等技术性器件的安装工作";
Console.WriteLine("Original search sentence: {0}", text);

// Full mode: list every dictionary word found in the sentence
var segments1 = segmenter.Cut(text, cutAll: true);
Console.WriteLine("[Full mode]: {0}", string.Join("/ ", segments1));

// Exact mode is the default
var segments2 = segmenter.Cut(text);
Console.WriteLine("[Exact mode]: {0}", string.Join("/ ", segments2));

// The default is exact mode, and the HMM model is also used
var segments3 = segmenter.Cut(text);
Console.WriteLine("[New word recognition]: {0}", string.Join("/ ", segments3));

// Search engine mode
var segments4 = segmenter.CutForSearch(text);
Console.WriteLine("[Search engine mode]: {0}", string.Join("/ ", segments4));

var segments5 = segmenter.Cut(text);
Console.WriteLine("[Disambiguation]: {0}", string.Join("/ ", segments5));

Console.Read();

Result: apart from full mode, every mode segments the sentence in the order and grouping we would use when reading it.
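
Why full mode reads differently from the other modes can be seen in a toy Python sketch. It assumes the approach the Python jieba project describes for itself: build a graph of all dictionary words found in the sentence, then pick the most probable path through it. The dictionary, the frequencies, and the corpus size below are invented, the function names are my own, and only the first six characters of the test sentence are used. This is not jieba.NET's implementation.

```python
import math

# Hypothetical word counts and corpus size, for illustration only.
FREQ = {"工信处": 30, "处女": 200, "女干事": 30, "干事": 500,
        "工": 5000, "信": 5000, "处": 5000, "女": 5000, "干": 5000, "事": 5000}
TOTAL = 1_000_000

def build_dag(text):
    """Map each start index to the end indexes of dictionary words starting there."""
    dag = {}
    for i in range(len(text)):
        ends = [j for j in range(i + 1, len(text) + 1) if text[i:j] in FREQ]
        dag[i] = ends or [i + 1]  # an unknown character stands alone
    return dag

def cut_full(text):
    """Full mode: every multi-character dictionary word, overlaps included."""
    dag = build_dag(text)
    words, covered = [], 0  # covered: how far the emitted words already reach
    for i in range(len(text)):
        multi = [j for j in dag[i] if j - i > 1]
        if multi:
            words.extend(text[i:j] for j in multi)
            covered = max(covered, max(multi))
        elif i >= covered:
            words.append(text[i])
            covered = i + 1
    return words

def cut_exact(text):
    """Exact mode: the single most probable path, found right to left."""
    dag, n = build_dag(text), len(text)
    best = {n: (0.0, n)}  # index -> (best log probability from here, next index)
    for i in range(n - 1, -1, -1):
        best[i] = max((math.log(FREQ.get(text[i:j], 1) / TOTAL) + best[j][0], j)
                      for j in dag[i])
    words, i = [], 0
    while i < n:
        j = best[i][1]
        words.append(text[i:j])
        i = j
    return words

def cut_for_search(text):
    """Search mode: exact mode, plus shorter dictionary words inside long words."""
    words = []
    for word in cut_exact(text):
        for size in (2, 3):
            if len(word) > size:
                words.extend(word[k:k + size]
                             for k in range(len(word) - size + 1)
                             if word[k:k + size] in FREQ)
        words.append(word)
    return words

sample = "工信处女干事"
print(cut_full(sample))        # ['工信处', '处女', '女干事', '干事']
print(cut_exact(sample))       # ['工信处', '女干事']
print(cut_for_search(sample))  # ['工信处', '干事', '女干事']
```

Full mode reports overlapping candidates such as 处女 that no reader would pick, while exact mode commits to one non-overlapping reading. Search engine mode keeps that reading and adds the shorter words inside it, which helps a keyword index match a query for 干事 against text containing 女干事.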
