C# Chinese Word Segmentation Learning Materials

Source: Internet
Author: User
Introduction to the SharpICTCLAS Word Segmentation System (9): Dictionary Expansion

Introduction to the SharpICTCLAS Word Segmentation System (8): Other

Introduction to the SharpICTCLAS Word Segmentation System (7): OptimumSegment

Introduction to the SharpICTCLAS Word Segmentation System (6): Segment

Introduction to the SharpICTCLAS Word Segmentation System (5): NShortPath-2

Introduction to the SharpICTCLAS Word Segmentation System (4): NShortPath-1

Introduction to the SharpICTCLAS Word Segmentation System (3): DynamicArray

Introduction to the SharpICTCLAS Word Segmentation System (2): Initial Word Segmentation

Introduction to the SharpICTCLAS Word Segmentation System (1): Reading the Dictionary

Porting ICTCLAS to the C# Platform

ICTCLAS Word Segmentation System Code (2)

The Inscrutable ICTCLAS Word Segmentation System Code (1)

Other Resources

1 Name: ShootSearch Chinese Word Segmentation Component (C#, open source)
Address: http://gforge.osdn.net.cn/frs?group_id=96
Rating: A complete, usable Chinese word segmentation component. It supports mixed recognition of Chinese, English, and digits. Thanks to the developers for sharing it. Personal names are recognized with a simple "surname + given name" method, which is not reliable. Segmentation is based on the forward maximum matching algorithm, so accuracy is not very high. The architecture and code quality of the component as a whole are mediocre. Still, given the current state of .NET development in China, reaching this level and releasing the result as open source is no small achievement.

2 Name: Mini Word Segmenter (Java, open source)
Address: http://sourceforge.net/projects/wordsegment/
Rating: It is only a demo and handles only the segmentation of Chinese text. Even so, it is a good prototype, and its architecture and design ideas are worth learning from. It also comes with detailed design documents in Chinese, which makes it a rare reference.

3 Name: Fundamentals of Chinese Information Processing
Address: http://ccl.pku.edu.cn/doubtfire/Course/Chinese%20Information%20Processing/2002_2003_1.htm
Rating: A graduate course in linguistics from Peking University's Chinese Department. It is a must-read for anyone working on Chinese word segmentation, and it also provides good corpus material.

Other scattered materials are of little value. If you study the three resources above thoroughly, the word segmentation program you develop should be reasonably accurate. So far, I haven't finished reading "Fundamentals of Chinese Information Processing". To go further, I suspect I will have to visit the "data structure self-testing website" first and build a solid foundation in algorithms.

4 Name: KTDictSeg
Address: http://www.cnblogs.com/eaglet/tag/%e5%88%86%e8%af%8d/

5. http://www.cnblogs.com/kwklover/archive/2007/03/19/679327.html
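
The rating of the ShootSearch component above mentions the forward maximum matching algorithm and its limited accuracy. As a rough illustration only, here is a minimal sketch of that technique in Python. It is not taken from ShootSearch or any other component listed here; the function name `forward_max_match`, the `max_word_len` parameter, and the toy dictionary are all assumptions made for this example.

```python
def forward_max_match(text, dictionary, max_word_len=None):
    """Greedy segmentation: at each position, take the longest
    dictionary word that starts there (hypothetical helper)."""
    if max_word_len is None:
        # Longest entry in the dictionary bounds the search window.
        max_word_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"研究", "研究生", "生命", "命", "的", "起源"}
print(forward_max_match("研究生命的起源", toy_dictionary))
# prints ['研究生', '命', '的', '起源']
```

The intended reading of this sentence is 研究 / 生命 / 的 / 起源 ("study the origin of life"), but the greedy longest-match step commits to 研究生 ("graduate student") first and cannot recover. This kind of error is one reason a purely forward-maximum-matching segmenter tends to have limited accuracy.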

Reprinted from: http://ruyu108.blog.163.com/blog/static/10123108200992262747545/
