Jieba ("Stutter") Chinese Word Segmentation ~ ~ ~ Use Cases + Demo


Common tips (being updated): http://www.cnblogs.com/dunitian/p/4822808.html#skill

Skills overview (being updated): http://www.cnblogs.com/dunitian/p/5493793.html

Online Demo: http://cppjieba-webdemo.herokuapp.com

Full demo: https://github.com/dunitian/tempcode/tree/master/2016-09-05

Two caveats first. Jieba does not deduplicate its segmentation results, so we have to do that ourselves. The dictionary files also have to be configured, or set to be copied to the bin output directory.
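Since the segmenter can return the same word more than once, the results need a deduplication pass before they are joined. The sketch below shows one way to do that while keeping first-occurrence order. `DedupeKeepOrder` and the sample word list are hypothetical names and data made up for illustration; they are not part of jieba.NET.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentDedupDemo
{
    // Hypothetical helper: drops repeated words, keeping the first occurrence of each.
    public static List<string> DedupeKeepOrder(IEnumerable<string> words)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (var word in words)
        {
            // HashSet.Add returns false when the word has already been seen.
            if (seen.Add(word))
            {
                result.Add(word);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Stand-in for segmenter output (assumed data, not a real jieba.NET result).
        var segmented = new[] { "time", "and", "time", "style", "and" };
        Console.WriteLine(string.Join(",", DedupeKeepOrder(segmented))); // prints "time,and,style"
    }
}
```

The helper class in this post uses LINQ's `Distinct()` for the same purpose; the explicit loop just makes the ordering visible.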

Example application scenarios (everyone already knows about search, so let's look at something else):

——————————————————————————————————————————————————

Down to business. Here is a set of unofficial community statistics (for the original version, not the .NET port):

The .NET ports of IKAnalyzer and Pangu segmentation have not been updated for years, so I chose Jieba. (The name also fits the spirit of word segmentation: isn't stuttering itself a way of splitting up speech?)

Here's a quick demo:

1. Add the package reference first:

2. Dictionary settings:

3. A simple wrapper helper class:

    using System.Linq;
    using JiebaNet.Segmenter;
    using System.Collections.Generic;

    namespace LoTLib.Word.Split
    {
        #region Type
        public enum JiebaTypeEnum
        {
            /// <summary>
            /// Precise mode: the most basic and natural mode; tries to cut the sentence as precisely as possible; suitable for text analysis
            /// </summary>
            Default,
            /// <summary>
            /// Full mode: scans out every word that can be formed; faster, but cannot resolve ambiguity
            /// </summary>
            CutAll,
            /// <summary>
            /// Search engine mode: re-slices long words on top of precise mode to improve recall; suitable for search engine segmentation
            /// </summary>
            CutForSearch,
            /// <summary>
            /// Precise mode, without HMM
            /// </summary>
            Other
        }
        #endregion

        /// <summary>
        /// Jieba segmentation
        /// </summary>
        public static partial class WordSplitHelper
        {
            /// <summary>
            /// Gets the collection of words after segmentation
            /// </summary>
            /// <param name="objStr"></param>
            /// <param name="type"></param>
            /// <returns></returns>
            public static IEnumerable<string> GetSplitWords(string objStr, JiebaTypeEnum type = JiebaTypeEnum.Default)
            {
                var jieba = new JiebaSegmenter();
                switch (type)
                {
                    case JiebaTypeEnum.Default:
                        return jieba.Cut(objStr);                // precise mode, with HMM
                    case JiebaTypeEnum.CutAll:
                        return jieba.Cut(objStr, cutAll: true);  // full mode
                    case JiebaTypeEnum.CutForSearch:
                        return jieba.CutForSearch(objStr);       // search engine mode
                    default:
                        return jieba.Cut(objStr, false, false);  // precise mode, without HMM
                }
            }

            /// <summary>
            /// Gets the segmented words as a single string
            /// </summary>
            /// <param name="objStr"></param>
            /// <param name="type"></param>
            /// <returns></returns>
            public static string GetSplitWordStr(this string objStr, JiebaTypeEnum type = JiebaTypeEnum.Default)
            {
                var words = GetSplitWords(objStr, type);
                // No result: return an empty string
                if (words == null || words.Count() < 1)
                {
                    return string.Empty;
                }
                words = words.Distinct(); // words are sometimes duplicated, so we deduplicate them ourselves
                return string.Join(",", words); // adjust the return format to your own needs
            }
        }
    }

The call is simple:

    string str = "Bootstrap-datetimepicker further follow ~ ~ ~ Start time and end time style display";

    Console.WriteLine("\nPrecise mode, with HMM:\n");
    Console.WriteLine(str.GetSplitWordStr());

    Console.WriteLine("\nFull mode:\n");
    Console.WriteLine(str.GetSplitWordStr(JiebaTypeEnum.CutAll));

    Console.WriteLine("\nSearch engine mode:\n");
    Console.WriteLine(str.GetSplitWordStr(JiebaTypeEnum.CutForSearch));

    Console.WriteLine("\nPrecise mode, without HMM:\n");
    Console.WriteLine(str.GetSplitWordStr(JiebaTypeEnum.Other));

    Console.ReadKey();

Result:

--------------------------
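The helper's comments describe search engine mode as re-slicing long words on top of precise mode to improve recall. The sketch below illustrates that idea under stated assumptions: for every word longer than two characters it also emits the 2- and 3-character substrings found in a dictionary, then the word itself. `Reslice`, the tiny dictionary, and the input words are hypothetical; this shows the general technique and is not jieba.NET's actual implementation.

```csharp
using System;
using System.Collections.Generic;

public static class SearchModeSketch
{
    // Hypothetical re-slicing step: for each word from precise mode, emit the
    // 2- and 3-character substrings that are dictionary words, then the word itself.
    public static List<string> Reslice(IEnumerable<string> preciseWords, ISet<string> dictionary)
    {
        var result = new List<string>();
        foreach (var word in preciseWords)
        {
            foreach (var size in new[] { 2, 3 })
            {
                // Only slice words that are strictly longer than the substring size.
                if (word.Length <= size) continue;
                for (var i = 0; i + size <= word.Length; i++)
                {
                    var gram = word.Substring(i, size);
                    if (dictionary.Contains(gram)) result.Add(gram);
                }
            }
            result.Add(word);
        }
        return result;
    }

    public static void Main()
    {
        // Assumed toy dictionary and precise-mode output, for illustration only.
        var dictionary = new HashSet<string> { "中国", "科学", "学院", "科学院" };
        var preciseWords = new[] { "中国科学院", "计算" };
        Console.WriteLine(string.Join(",", Reslice(preciseWords, dictionary)));
    }
}
```

The extra short words are what let a search for 科学 match a document that only contains 中国科学院, which is why this mode trades precision for recall.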

Someone might ask: what about keyword extraction? Don't worry, read on:

The dictionary this approach relies on is idf.txt
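To show why keyword extraction needs an IDF dictionary, here is a minimal TF-IDF sketch: each word is scored by its frequency in the text multiplied by its IDF weight, and the highest scores win. `TopKeywords`, the weight table, and `defaultIdf` (the fallback for words missing from the table) are assumptions made for illustration. A real extractor also filters stop words and, as in this post, can restrict results by part of speech.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TfIdfSketch
{
    // Looks up a word's IDF weight, falling back to defaultIdf when it is missing.
    private static double Weight(IDictionary<string, double> idf, string word, double defaultIdf)
    {
        double weight;
        return idf.TryGetValue(word, out weight) ? weight : defaultIdf;
    }

    // Scores each distinct word as (count / total words) * IDF weight and
    // returns the `count` highest-scoring words; ties are broken alphabetically.
    public static List<string> TopKeywords(
        IList<string> words, IDictionary<string, double> idf, double defaultIdf, int count)
    {
        if (words.Count == 0) return new List<string>();
        return words
            .GroupBy(w => w)
            .Select(g => new
            {
                Word = g.Key,
                Score = (double)g.Count() / words.Count * Weight(idf, g.Key, defaultIdf)
            })
            .OrderByDescending(x => x.Score)
            .ThenBy(x => x.Word, StringComparer.Ordinal)
            .Take(count)
            .Select(x => x.Word)
            .ToList();
    }

    public static void Main()
    {
        // Assumed segmenter output and assumed IDF weights (not taken from idf.txt).
        var words = new List<string> { "jieba", "segment", "the", "text", "the", "jieba", "the" };
        var idf = new Dictionary<string, double>
        {
            { "the", 0.1 }, { "jieba", 8.0 }, { "segment", 5.0 }, { "text", 3.0 }
        };
        Console.WriteLine(string.Join(",", TopKeywords(words, idf, 4.0, 2))); // prints "jieba,segment"
    }
}
```

Note that "the" is the most frequent word in the sample but ranks last, because its low IDF weight marks it as common everywhere.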

A quick word about Constants:

Result:

Full helper class (see GitHub for the latest version): https://github.com/dunitian/TempCode/tree/master/2016-09-05

    using System.Linq;
    using JiebaNet.Segmenter;
    using System.Collections.Generic;
    using JiebaNet.Analyser;

    namespace LoTLib.Word.Split
    {
        #region Segmentation type
        public enum JiebaTypeEnum
        {
            /// <summary>
            /// Precise mode: the most basic and natural mode; tries to cut the sentence as precisely as possible; suitable for text analysis
            /// </summary>
            Default,
            /// <summary>
            /// Full mode: scans out every word that can be formed; faster, but cannot resolve ambiguity
            /// </summary>
            CutAll,
            /// <summary>
            /// Search engine mode: re-slices long words on top of precise mode to improve recall; suitable for search engine segmentation
            /// </summary>
            CutForSearch,
            /// <summary>
            /// Precise mode, without HMM
            /// </summary>
            Other
        }
        #endregion

        /// <summary>
        /// Jieba segmentation
        /// </summary>
        public static partial class WordSplitHelper
        {
            #region Common
            /// <summary>
            /// Gets the collection of words after segmentation
            /// </summary>
            /// <param name="objStr"></param>
            /// <param name="type"></param>
            /// <returns></returns>
            public static IEnumerable<string> GetSplitWords(string objStr, JiebaTypeEnum type = JiebaTypeEnum.Default)
            {
                var jieba = new JiebaSegmenter();
                switch (type)
                {
                    case JiebaTypeEnum.Default:
                        return jieba.Cut(objStr);                // precise mode, with HMM
                    case JiebaTypeEnum.CutAll:
                        return jieba.Cut(objStr, cutAll: true);  // full mode
                    case JiebaTypeEnum.CutForSearch:
                        return jieba.CutForSearch(objStr);       // search engine mode
                    default:
                        return jieba.Cut(objStr, false, false);  // precise mode, without HMM
                }
            }

            /// <summary>
            /// Extracts the collection of keywords from an article
            /// </summary>
            /// <param name="objStr"></param>
            /// <returns></returns>
            public static IEnumerable<string> GetArticleKeywords(string objStr)
            {
                var idf = new TfidfExtractor();
                return idf.ExtractTags(objStr, 10, Constants.NounAndVerbPos); // nouns and verbs
            }

            /// <summary>
            /// Returns the words joined into a single string
            /// </summary>
            /// <param name="words"></param>
            /// <returns></returns>
            public static string JoinKeywords(IEnumerable<string> words)
            {
                // No result: return an empty string
                if (words == null || words.Count() < 1)
                {
                    return string.Empty;
                }
                words = words.Distinct(); // words are sometimes duplicated, so we deduplicate them ourselves
                return string.Join(",", words); // adjust the return format to your own needs
            }
            #endregion

            #region Extension methods
            /// <summary>
            /// Gets the segmented words as a single string
            /// </summary>
            /// <param name="objStr"></param>
            /// <param name="type"></param>
            /// <returns></returns>
            public static string GetSplitWordStr(this string objStr, JiebaTypeEnum type = JiebaTypeEnum.Default)
            {
                var words = GetSplitWords(objStr, type);
                return JoinKeywords(words);
            }

            /// <summary>
            /// Extracts the article keywords as a single string
            /// </summary>
            /// <param name="objStr"></param>
            /// <returns></returns>
            public static string GetArticleKeywordStr(this string objStr)
            {
                var words = GetArticleKeywords(objStr);
                return JoinKeywords(words);
            }
            #endregion
        }
    }

  

Related links for Jieba Chinese word segmentation:

https://github.com/fxsjy/jieba

https://github.com/anderscui/jieba.NET

http://cppjieba-webdemo.herokuapp.com
