[Language Processing and Python] 9.1 grammar features

Source: Internet
Author: User
Tags nltk

For greater flexibility, we change the way we treat grammatical categories such as S, NP, and V. Instead of atomic labels, we decompose them into dictionary-like structures whose features can take on a range of values.

9.1 grammar features

Start with a simple example and store features and their values in a dictionary.

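>>> kim = {'CAT': 'NP', 'ORTH': 'Kim', 'REF': 'k'}
>>> chase = {'CAT': 'V', 'ORTH': 'chased', 'REL': 'chase'}

CAT is the grammatical category and ORTH is the orthography (spelling). REF gives the referent of kim, while REL gives the relation expressed by chase. In rule-based grammars, such pairings of features and values are called feature structures.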

You can also add features as needed.

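>>> chase['AGT'] = 'sbj'
>>> chase['PAT'] = 'obj'

AGT is the agent role and PAT is the patient role. The values 'sbj' and 'obj' are placeholders that are filled in once the verb combines with its subject and object.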

For example, we want to deal with the sentence Kim chased Lee.

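>>> sent = "Kim chased Lee"
>>> tokens = sent.split()
>>> lee = {'CAT': 'NP', 'ORTH': 'Lee', 'REF': 'l'}
>>> def lex2fs(word):
...     # Look up the feature structure whose spelling matches the word.
...     for fs in [kim, lee, chase]:
...         if fs['ORTH'] == word:
...             return fs
...
>>> subj, verb, obj = lex2fs(tokens[0]), lex2fs(tokens[1]), lex2fs(tokens[2])
>>> verb['AGT'] = subj['REF']  # the agent of 'chase' is Kim
>>> verb['PAT'] = obj['REF']   # the patient of 'chase' is Lee
>>> for k in ['ORTH', 'REL', 'AGT', 'PAT']:
...     print("%-5s => %s" % (k, verb[k]))
...
ORTH  => chased
REL   => chase
AGT   => k
PAT   => l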

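The same approach can be applied to other verbs, with different features where needed. For example:

>>> surprise = {'CAT': 'V', 'ORTH': 'surprised', 'REL': 'surprise', 'SRC': 'sbj', 'EXP': 'obj'}

Here the subject plays the role of source (SRC) and the object plays the role of experiencer (EXP).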
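The role assignment can be made generic so that one piece of code handles both chase and surprise. The following is a minimal sketch in plain Python. The helper name bind_roles, and the convention that role features hold the placeholder 'sbj' or 'obj', are assumptions made for illustration and are not part of NLTK.

```python
# Feature dictionaries as used in this section.
kim = {'CAT': 'NP', 'ORTH': 'Kim', 'REF': 'k'}
lee = {'CAT': 'NP', 'ORTH': 'Lee', 'REF': 'l'}
chase = {'CAT': 'V', 'ORTH': 'chased', 'REL': 'chase',
         'AGT': 'sbj', 'PAT': 'obj'}
surprise = {'CAT': 'V', 'ORTH': 'surprised', 'REL': 'surprise',
            'SRC': 'sbj', 'EXP': 'obj'}

def bind_roles(verb, subj, obj):
    """Hypothetical helper: return a copy of verb in which every
    'sbj'/'obj' placeholder is replaced by the matching referent."""
    referents = {'sbj': subj['REF'], 'obj': obj['REF']}
    fixed = ('CAT', 'ORTH', 'REL')  # never treated as role placeholders
    return {feat: (val if feat in fixed else referents.get(val, val))
            for feat, val in verb.items()}

chased = bind_roles(chase, kim, lee)
surprised = bind_roles(surprise, kim, lee)
print(chased['AGT'], chased['PAT'])
print(surprised['SRC'], surprised['EXP'])
```

Because bind_roles returns a new dictionary, the lexical entries chase and surprise keep their placeholders and can be reused for other sentences.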

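Syntactic agreement

The morphological properties of a verb co-vary with the syntactic properties of its subject noun phrase. This co-variance is called agreement.

For example, the following string is ungrammatical (marked with an asterisk) because a plural subject is combined with a singular verb:

*the dogs runs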

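One way to handle agreement is to expand the grammar so that every category comes in a singular and a plural version, as shown below. This works, but it is cumbersome: every production that involves agreement has to be duplicated.

Original grammar:

(7)
S -> NP VP
NP -> Det N
VP -> V
Det -> 'this'
N -> 'dog'
V -> 'runs'

Expanded grammar:

(8)
S -> NP_SG VP_SG
S -> NP_PL VP_PL
NP_SG -> Det_SG N_SG
NP_PL -> Det_PL N_PL
VP_SG -> V_SG
VP_PL -> V_PL
Det_SG -> 'this'
Det_PL -> 'these'
N_SG -> 'dog'
N_PL -> 'dogs'
V_SG -> 'runs'
V_PL -> 'run'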

To avoid this explosive increase, we can use attributes and constraints.
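To get a feeling for how quickly the category-splitting approach grows, one can count how many atomic copies of a single production are needed as agreement features are added. The sketch below is illustrative only: the feature inventory (NUM, PER, GND with the listed values) and the helper split_production are assumptions, not part of NLTK.

```python
from itertools import product

# Assumed feature inventory, for illustration only.
FEATURES = {
    'NUM': ['sg', 'pl'],
    'PER': ['1', '2', '3'],
    'GND': ['fem', 'masc', 'neut'],
}

def split_production(lhs, rhs, agreeing, feature_names):
    """Expand one production into atomic-category copies, one copy per
    combination of feature values. Categories listed in `agreeing`
    receive a suffix such as _SG or _PL_3_FEM."""
    copies = []
    for values in product(*(FEATURES[name] for name in feature_names)):
        suffix = '_' + '_'.join(v.upper() for v in values)

        def tag(cat):
            return cat + suffix if cat in agreeing else cat

        copies.append((tag(lhs), [tag(cat) for cat in rhs]))
    return copies

by_num = split_production('S', ['NP', 'VP'], {'NP', 'VP'}, ['NUM'])
by_all = split_production('S', ['NP', 'VP'], {'NP', 'VP'},
                          ['NUM', 'PER', 'GND'])

for lhs, rhs in by_num:
    print(lhs, '->', ' '.join(rhs))
print(len(by_num), len(by_all))
```

With number alone the production is doubled; with the three assumed features, the same production needs 2 x 3 x 3 = 18 copies.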

Use attributes and constraints

Det[NUM=sg] -> 'this'
Det[NUM=pl] -> 'these'
N[NUM=sg] -> 'dog'
N[NUM=pl] -> 'dogs'
V[NUM=sg] -> 'runs'
V[NUM=pl] -> 'run'

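We can then use the variable ?n as the value of NUM, so that a single production covers both the singular and the plural case:

S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
VP[NUM=?n] -> V[NUM=?n]

Within one production, ?n must take the same value everywhere it occurs, which is what enforces agreement.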
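The productions of this section can be tried out directly. The following sketch assumes NLTK 3, where FeatureGrammar.fromstring and FeatureChartParser are available. The grammar string combines the lexical and phrasal productions discussed here, and the helper name count_parses is hypothetical.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

# Feature-based grammar: ?n forces NUM to match within each production.
grammar = FeatureGrammar.fromstring("""
% start S
S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
VP[NUM=?n] -> V[NUM=?n]
Det[NUM=sg] -> 'this'
Det[NUM=pl] -> 'these'
N[NUM=sg] -> 'dog'
N[NUM=pl] -> 'dogs'
V[NUM=sg] -> 'runs'
V[NUM=pl] -> 'run'
""")
parser = FeatureChartParser(grammar)

def count_parses(sentence):
    """Hypothetical helper: number of parse trees for a sentence."""
    return len(list(parser.parse(sentence.split())))

for s in ['this dog runs', 'these dogs run',
          'this dogs run', 'these dogs runs']:
    print(s, '->', count_parses(s))
```

Sentences in which determiner, noun, and verb agree in number should receive one parse, while the mismatched ones should receive none.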

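However, some words, such as the determiner the, are compatible with both singular and plural. There are two ways to represent this, and the second is clearly simpler than the first.

First, list the words under both values of NUM:

Det[NUM=sg] -> 'the' | 'some' | 'several'
Det[NUM=pl] -> 'the' | 'some' | 'several'

Second, leave the value of NUM underspecified by using a variable:

Det[NUM=?n] -> 'the' | 'some' | 'several'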
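The reason an underspecified entry agrees with either number can be illustrated with a toy version of unification over flat feature dictionaries. This is a simplified sketch and not NLTK's algorithm: it ignores variables and nested values, and the function name unify_flat is hypothetical.

```python
def unify_flat(a, b):
    """Merge two flat feature dicts.
    Return None if a feature present in both has conflicting values."""
    merged = dict(a)
    for feat, val in b.items():
        if feat in merged and merged[feat] != val:
            return None
        merged[feat] = val
    return merged

the = {'CAT': 'Det'}                 # NUM left unspecified
this = {'CAT': 'Det', 'NUM': 'sg'}   # NUM fixed to singular
needs_pl = {'NUM': 'pl'}             # requirement coming from a plural noun

print(unify_flat(the, needs_pl))
print(unify_flat(this, needs_pl))
```

The entry for the carries no NUM value, so it picks up pl (or sg) from its context, whereas the entry for this clashes with a plural requirement and unification fails.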

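The following feature-based grammar illustrates most of the ideas introduced so far:

>>> nltk.data.show_cfg('grammars/book_grammars/feat0.fcfg')
% start S
S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> N[NUM=?n]
NP[NUM=?n] -> PropN[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
NP[NUM=pl] -> N[NUM=pl]
VP[TENSE=?t, NUM=?n] -> IV[TENSE=?t, NUM=?n]
VP[TENSE=?t, NUM=?n] -> TV[TENSE=?t, NUM=?n] NP
Det[NUM=sg] -> 'this' | 'every'
Det[NUM=pl] -> 'these' | 'all'
Det -> 'the' | 'some' | 'several'
PropN[NUM=sg] -> 'Kim' | 'Jody'
N[NUM=sg] -> 'dog' | 'girl' | 'car' | 'child'
N[NUM=pl] -> 'dogs' | 'girls' | 'cars' | 'children'
IV[TENSE=pres, NUM=sg] -> 'disappears' | 'walks'
TV[TENSE=pres, NUM=sg] -> 'sees' | 'likes'
IV[TENSE=pres, NUM=pl] -> 'disappear' | 'walk'
TV[TENSE=pres, NUM=pl] -> 'see' | 'like'
IV[TENSE=past] -> 'disappeared' | 'walked'
TV[TENSE=past] -> 'saw' | 'liked'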

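The following code shows how to parse a sentence with this grammar. With trace=2, the parser prints the chart edges as it builds them (the exact trace format varies between NLTK versions):

>>> tokens = 'Kim likes children'.split()
>>> from nltk import load_parser
>>> cp = load_parser('grammars/book_grammars/feat0.fcfg', trace=2)
>>> trees = list(cp.parse(tokens))  # NLTK 3; older releases used cp.nbest_parse(tokens)
|.Kim .like.chil.|
|[----]    .    .| PropN[NUM='sg'] -> 'Kim' *
|[----]    .    .| NP[NUM='sg'] -> PropN[NUM='sg'] *
|[---->    .    .| S[] -> NP[NUM=?n] * VP[NUM=?n] {?n: 'sg'}
|.    [----]    .| TV[NUM='sg', TENSE='pres'] -> 'likes' *
|.    [---->    .| VP[NUM=?n, TENSE=?t] -> TV[NUM=?n, TENSE=?t] * NP[] {?n: 'sg', ?t: 'pres'}
|.    .    [----]| N[NUM='pl'] -> 'children' *
|.    .    [----]| NP[NUM='pl'] -> N[NUM='pl'] *
|.    .    [---->| S[] -> NP[NUM=?n] * VP[NUM=?n] {?n: 'pl'}
|.    [---------]| VP[NUM='sg', TENSE='pres'] -> TV[NUM='sg', TENSE='pres'] NP[] *
|[==============]| S[] -> NP[NUM='sg'] VP[NUM='sg'] *

If the grammar cannot parse the input, trees is empty. Otherwise it contains one or more parse trees, depending on whether the input is syntactically ambiguous.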

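Finally, we can inspect the resulting parse trees:

>>> for tree in trees: print(tree)
(S[]
  (NP[NUM='sg'] (PropN[NUM='sg'] Kim))
  (VP[NUM='sg', TENSE='pres']
    (TV[NUM='sg', TENSE='pres'] likes)
    (NP[NUM='pl'] (N[NUM='pl'] children))))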

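Terminology

Simple values such as sg and pl are called atomic, meaning they cannot be decomposed into subparts. A special case of atomic values is Boolean values, which simply specify whether a property is true or false.

For example, the Boolean feature AUX distinguishes auxiliary verbs such as can from main verbs. The production below marks can as an auxiliary:

V[TENSE=pres, AUX=+] -> 'can'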

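Feature values can also be complex. For instance, we can group the agreement features together as parts of a single category and make that bundle the value of the feature AGR.

Such a structure is usually displayed as an attribute value matrix (AVM):

[POS = N           ]
[                  ]
[AGR = [PER = 3   ]]
[      [NUM = pl  ]]
[      [GND = fem ]]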
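NLTK's FeatStruct class can hold nested structures of this kind. The following is a brief sketch, assuming NLTK 3; the variable names are illustrative, and the values (N, 3, pl, fem) follow the attribute value matrix from the NLTK book.

```python
from nltk import FeatStruct

# POS has an atomic value; AGR is itself a feature structure.
noun = FeatStruct(POS='N', AGR=FeatStruct(PER=3, NUM='pl', GND='fem'))
print(noun['AGR']['NUM'])

# Unification succeeds when the values are compatible
# and returns None when they clash.
wants_plural = FeatStruct(AGR=FeatStruct(NUM='pl'))
wants_singular = FeatStruct(AGR=FeatStruct(NUM='sg'))
print(noun.unify(wants_plural) == noun)
print(noun.unify(wants_singular))
```

Unifying noun with a structure that only asks for plural number adds no new information, so the result equals noun itself, whereas a request for singular number conflicts with AGR and fails.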

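With complex feature values, the grammar can be restructured so that agreement is stated once, over the whole AGR bundle:

S -> NP[AGR=?n] VP[AGR=?n]
NP[AGR=?n] -> PropN[AGR=?n]
VP[TENSE=?t, AGR=?n] -> Cop[TENSE=?t, AGR=?n] Adj
Cop[TENSE=pres, AGR=[NUM=sg, PER=3]] -> 'is'
PropN[AGR=[NUM=sg, PER=3]] -> 'Kim'
Adj -> 'happy'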
