[NLP] Overview of Natural Language Understanding

Source: Internet
Author: User

Language is one of the defining traits that distinguish humans from other animals. Natural language is the spoken (voice) and written (text) language that people use to communicate with one another, as distinct from formal or artificial languages such as logic and programming languages.

1. Language and language understanding

Language is the natural medium of human communication. It includes spoken and written language as well as body language (such as sign language and semaphore). More formally, a language is a collection of representations, conventions, and rules for conveying information. A language consists of sentences, each sentence is made up of words, and grammatical and semantic rules must be followed when composing sentences and larger texts. Language is made up of speech sounds, vocabulary, and grammar, and speech and writing are its two basic attributes. Without spoken and written languages such as English, Chinese, French, and German, full and effective communication between people would be unimaginable. Language evolves along with human society and with humans themselves. Modern languages allow anyone with normal language ability to convey thoughts, feelings, and skills to others.

To study natural language understanding, we first need a basic picture of how natural language is put together.
Language is a lexical and grammatical system that combines sound and meaning, and it is the material form in which thought is realized. Language is a symbolic system, but it differs from other symbolic systems.

The basic unit of language is the word. Governed by grammar, words form meaningful, understandable sentences, and sentences in turn combine in certain ways to form longer texts. Vocabulary can be divided into words and idioms. Idioms are fixed combinations of words, such as Chinese set phrases. Words are themselves composed of morphemes: the Chinese word for "teacher" (教师), for example, is composed of the morphemes "teach" (教) and "master" (师).

Grammar is the set of laws by which a language is organized. Grammatical rules govern how morphemes form words and how words form phrases and sentences. Language takes shape within these strict constraints. The rules for building words from morphemes are called word-formation rules, such as teach (教) + master (师) -> teacher (教师). A word can also take different forms: singular, plural, negative, affirmative, and so on. The rules that produce these forms are called inflection rules, such as teacher (教师) + plural marker (们) -> teachers (教师们). Inflection and word formation together are called morphology. The other part of grammar is syntax, which can itself be divided into two parts: phrase construction and sentence construction. Phrase-construction rules describe how words combine into phrases, such as red + pencil -> red pencil. Here "red" is an adjective that modifies "pencil", and combining it with the noun "pencil" yields a new noun phrase. Sentence-construction rules describe how words or phrases are assembled into sentences. "I am a student of computer science" is a sentence built according to such rules (in the original text, the rules of Chinese).
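The word-formation and phrase-construction rules described above can be sketched in a few lines of Python. This is only a toy illustration: the lexicon, the category tags (`ADJ`, `NOUN`, `NP`), and the function names are assumptions made for this example, and English morphemes stand in for the Chinese ones.

```python
# Toy lexicon: each word is tagged with a part of speech (hypothetical tags).
LEXICON = {"red": "ADJ", "pencil": "NOUN", "student": "NOUN"}

# Phrase-construction rules: a pair of categories combines into a new category.
PHRASE_RULES = {("ADJ", "NOUN"): "NP"}


def form_word(*morphemes):
    """Word formation: join morphemes into a single word."""
    return "".join(morphemes)


def build_phrase(left, right):
    """Phrase construction: combine two words if a rule licenses the pair."""
    key = (LEXICON[left], LEXICON[right])
    if key not in PHRASE_RULES:
        raise ValueError(f"no rule combines {key[0]} with {key[1]}")
    return PHRASE_RULES[key], f"{left} {right}"


print(form_word("teach", "er"))       # teacher
print(build_phrase("red", "pencil"))  # ('NP', 'red pencil')
```

A pair that no rule covers, such as two nouns in this tiny rule set, raises an error, which mirrors the idea that grammar restricts which combinations are allowed.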

On the other hand, language is a combination of sound and meaning, and every word has its own phonetic form. The pronunciation of a word consists of one or more syllables, syllables are made up of phonemes, and phonemes are divided into vowels and consonants. Natural languages do not involve many phonemes: a single language usually has only a few dozen. A phoneme is the smallest unit of speech, produced by a single articulatory action.
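The word -> syllable -> phoneme hierarchy can be pictured as a nested list. The sketch below is illustrative only: the vowel set and the rough transcription of "pencil" are simplifying assumptions, not a real phonetic inventory.

```python
# Hypothetical, heavily simplified vowel inventory (illustration only).
VOWELS = {"a", "e", "i", "o", "u"}

# A word's pronunciation: a list of syllables, each a list of phonemes.
# This rough transcription of "pencil" is an assumption, not IPA.
pencil = [["p", "e", "n"], ["s", "i", "l"]]


def classify(phoneme):
    """Label a phoneme as a vowel or a consonant."""
    return "vowel" if phoneme in VOWELS else "consonant"


def describe(word_syllables):
    """Attach a vowel/consonant label to every phoneme, syllable by syllable."""
    return [[(p, classify(p)) for p in syllable] for syllable in word_syllables]


for syllable in describe(pencil):
    print(syllable)
```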

So far there is no single, authoritative definition of language understanding; definitions vary with the perspective from which the problem is viewed. From a microscopic perspective, language understanding is a mapping from natural language to a machine-internal representation. From a macroscopic perspective, it is the ability of a machine to perform certain language functions that humans expect of it. These functions include answering questions, summarizing material, paraphrasing in different words, and translating between languages.
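As a minimal sketch of the "mapping" view, the code below turns sentences of one fixed shape ("X is a Y.") into an internal triple and then uses the stored triples to answer yes/no questions. The sentence pattern, the `is_a` relation name, and both function names are assumptions made for this illustration; real systems handle far more than a single pattern.

```python
import re


def understand(sentence):
    """Map 'X is a Y.' to an internal (subject, relation, object) triple."""
    m = re.fullmatch(r"(\w+) is an? (\w+)\.?", sentence.strip(), re.IGNORECASE)
    if m is None:
        return None  # sentence is outside the single supported pattern
    return (m.group(1).lower(), "is_a", m.group(2).lower())


def answer(facts, question):
    """Answer 'Is X a Y?' by looking the triple up in the stored facts."""
    m = re.fullmatch(r"is (\w+) an? (\w+)\?", question.strip(), re.IGNORECASE)
    if m is None:
        return "unknown"
    triple = (m.group(1).lower(), "is_a", m.group(2).lower())
    return "yes" if triple in facts else "unknown"


facts = {understand("Tom is a student.")}
print(answer(facts, "Is Tom a student?"))  # yes
print(answer(facts, "Is Tom a teacher?"))  # unknown
```

The triple is the machine-internal form, and question answering is one of the expected functions carried out on it.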

However, understanding natural language is a very difficult task. It is not easy to build a computer system that can understand even a few fragmentary phrases, because a number of extremely complex encoding and decoding problems lie in between. A computer system capable of understanding natural language needs contextual knowledge as well as a reasoning process that works on that knowledge and information. Natural language poses not only semantic, grammatical, and phonetic problems but also problems of ambiguity. In particular, the difficulty of natural language understanding stems from three factors: the complexity of the target representation, the diversity of mapping types, and the varying degrees of interaction among the elements of the source expression.

Natural language understanding is an interdisciplinary field that draws on linguistics, logic, physiology, psychology, computer science, and mathematics, and that aims to make machines understand spoken or written language. Language communication is knowledge-based communication.

2. The concept and definition of natural language processing

Natural language processing is the technology of using computers to process and apply people's spoken and written natural language. It is a frontier discipline that spans linguistics, mathematics, computer science, and cybernetics, and it is an important branch of artificial intelligence and intelligent science. It is also one of the earliest and most active research fields in artificial intelligence.

Natural language processing covers two aspects: natural language understanding and natural language generation. A natural language understanding system transforms natural language into a form that computer programs can more easily handle and understand. A natural language generation system converts computer data into natural language.

3. Overview of research fields in natural language processing

  • Optical character recognition (OCR)
  • Speech recognition
  • Machine translation
  • Automatic summarization
  • Syntactic analysis (parsing)
  • Text categorization
  • Information retrieval
  • Information extraction
  • Information filtering
  • Natural language generation
  • Chinese word segmentation
  • Speech synthesis
  • Question answering systems

4. The levels of the natural language understanding process

Although language appears as a string of written symbols or a stream of sound, internally it is a hierarchical structure, as the composition of language makes clear. A written sentence is built up as morphemes -> words or word forms -> phrases or sentences, while a spoken sentence is built up as phonemes -> syllables -> phonetic words -> phonetic sentences, and every level is subject to grammatical rules. The process of analyzing and understanding language should therefore also be hierarchical. Many modern linguists divide this process into five levels: phonetic analysis, lexical analysis, syntactic analysis, semantic analysis, and pragmatic analysis.
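The layered process described above can be sketched as a pipeline in which each stage consumes the output of the previous one. The toy example below covers only three of the levels (lexical, syntactic, and semantic) for written text. The lexicon, the single accepted sentence pattern, and all function names are assumptions made for illustration.

```python
# Hypothetical toy lexicon mapping words to part-of-speech tags.
LEXICON = {"the": "DET", "a": "DET", "student": "NOUN", "book": "NOUN", "reads": "VERB"}


def lexical_analysis(text):
    """Split text into words and tag each one (UNK if not in the lexicon)."""
    words = text.lower().rstrip(".").split()
    return [(w, LEXICON.get(w, "UNK")) for w in words]


def syntactic_analysis(tagged):
    """Accept only the pattern DET NOUN VERB DET NOUN and group it into phrases."""
    tags = [tag for _, tag in tagged]
    if tags != ["DET", "NOUN", "VERB", "DET", "NOUN"]:
        raise ValueError(f"unsupported sentence pattern: {tags}")
    words = [word for word, _ in tagged]
    return {"subject": words[0:2], "verb": words[2], "object": words[3:5]}


def semantic_analysis(tree):
    """Reduce the parse to a (predicate, agent, theme) triple using the head nouns."""
    return (tree["verb"], tree["subject"][-1], tree["object"][-1])


def understand(text):
    """Run the three stages in order, each feeding the next."""
    return semantic_analysis(syntactic_analysis(lexical_analysis(text)))


print(understand("The student reads a book."))  # ('reads', 'student', 'book')
```

Keeping each level in its own function makes the dependency explicit: the syntactic stage cannot run without tagged words, and the semantic stage cannot run without a parse.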

Reference
Artificial Intelligence and Its Applications (Cai Zixing, Xu Guangyou)
