HanLP: basic knowledge of Chinese natural language processing

Source: Internet
Author: User
Tags: solr

Definition of natural language processing:

Natural language processing (NLP) is an interdisciplinary field spanning computer science, artificial intelligence, and linguistics. Although language is only one part of artificial intelligence (AI also includes computer vision, for example), it is a unique part. Many creatures on this planet have visual systems superior to those of humans, but only humans have such a high-level language.

The goal of natural language processing is to allow computers to process or "understand" natural language in order to accomplish meaningful tasks, such as booking tickets, shopping, or question answering (QA). Fully understanding and expressing language is extremely difficult, and perfect language understanding is equivalent to achieving artificial intelligence.

Levels involved in natural language processing:

There are two sources of input: speech and text. The first level is therefore speech recognition, OCR, or word segmentation (skipping word segmentation rules out syntactic analysis, but many applications can still work directly at the character level).

The next level is morphology. To cite the definition in "Statistical Natural Language Processing": morphology (also known as "lexical morphology") is a branch of linguistics that studies the internal structure of words and consists of two parts, inflection and word formation. Because words have phonological, syntactic, and semantic features, morphology sits at the intersection of phonology, syntax, and semantics, so it is a subject that every linguist should pay close attention to.
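To make the word segmentation level concrete, here is a minimal sketch of dictionary-based forward maximum matching, one of the simplest segmentation strategies, next to plain character-level splitting. It is an illustration only and not how HanLP segments text; the function name `forward_max_match` and the dictionary `TOY_DICT` are hypothetical.

```python
def forward_max_match(text, dictionary):
    """Greedy left-to-right segmentation: at each position, take the
    longest dictionary word; fall back to a single character."""
    max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, far smaller than any real lexicon.
TOY_DICT = {"自然", "语言", "自然语言", "处理", "有趣"}

sentence = "自然语言处理很有趣"
print(forward_max_match(sentence, TOY_DICT))  # ['自然语言', '处理', '很', '有趣']
print(list(sentence))                         # character-level tokens
```

The greedy strategy is fast but can choose the wrong split when a long word overlaps the start of the next word, which is one reason practical segmenters use graph-based or statistical models instead.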

HanLP natural language processing development kit:

People who work in big data are certainly no strangers to natural language processing. The most widely used Chinese natural language processing tool on GitHub is HanLP. The initial version of HanLP was developed at the beginning of 2014 and was open-sourced on GitHub in March. In 2015 it was integrated into DKNLP by Dakuai Search (大快搜索, literally "big fast search"). Dakuai has since open-sourced the DKNLP technology and merged it as a whole into the HanLP project. HanLP has now reached version V1.50.

Advantages of HanLP natural language processing technology:

HanLP supports Chinese word segmentation (N-shortest-path segmentation, CRF segmentation, index segmentation, user-defined dictionaries), part-of-speech (POS) tagging, named entity recognition (Chinese personal names, transliterated personal names, Japanese personal names, place names, organization names), keyword extraction, automatic summarization, phrase extraction, pinyin conversion, Simplified/Traditional Chinese conversion, text recommendation, and dependency parsing (MaxEnt dependency parsing, neural network dependency parsing). A Lucene plugin is also available and is compatible with Solr and Elasticsearch.
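The shortest-path idea behind the segmenters listed above can be sketched as follows. This is a simplified illustration, assuming a toy word-cost table: it finds only the single cheapest path through the word graph (not the N best), and the names `shortest_path_segment`, `TOY_COSTS`, and `unknown_cost` are hypothetical rather than part of HanLP's API.

```python
import math


def shortest_path_segment(text, word_cost, unknown_cost=10.0):
    """Segment text by finding the cheapest path through the word graph.

    Positions 0..len(text) are graph nodes; every dictionary word that
    spans text[i:j] is an edge i -> j weighted by its cost. Single
    characters not in the dictionary get a high fallback cost so that a
    path always exists.
    """
    n = len(text)
    max_len = max((len(w) for w in word_cost), default=1)
    best = [math.inf] * (n + 1)  # best[j]: minimal cost to segment text[:j]
    back = [0] * (n + 1)         # back[j]: start index of the last word
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            word = text[i:j]
            if word in word_cost:
                cost = word_cost[word]
            elif j - i == 1:
                cost = unknown_cost  # unknown single character
            else:
                continue             # not an edge in the word graph
            if best[i] + cost < best[j]:
                best[j] = best[i] + cost
                back[j] = i
    # Walk the back-pointers from the end to recover the words.
    tokens = []
    j = n
    while j > 0:
        i = back[j]
        tokens.append(text[i:j])
        j = i
    tokens.reverse()
    return tokens


# Hypothetical costs (lower is better), e.g. negative log frequencies.
TOY_COSTS = {"研究": 2.0, "研究生": 3.0, "生命": 2.0, "起源": 2.0}

print(shortest_path_segment("研究生命起源", TOY_COSTS))  # ['研究', '生命', '起源']
```

A greedy longest-match segmenter would start with "研究生" here and strand the character "命"; scoring whole paths avoids that. Keeping the N cheapest paths instead of one gives later stages, such as named entity recognition, several candidate segmentations to choose from.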

HanLP application areas:

HanLP is widely used on platforms such as Lucene, Solr, Elasticsearch, Hadoop, Android, and Resin. Many open source authors have built extensions and derivative projects on top of it, and it has been wrapped or ported to Python, C#, R, JavaScript, and other languages.

