[Language Processing and Python] 9.1 Grammatical features

Last Update:2013-11-15 Source: Internet

Author: User

Tags nltk


For greater flexibility, we change how we treat grammatical categories such as S, NP, and V. Instead of atomic labels, we decompose them into dictionary-like structures whose features can take on a range of values.

9.1 Grammatical features

Start with a simple example and store features and their values in a dictionary.

>>> kim = {'CAT': 'NP', 'ORTH': 'Kim', 'REF': 'k'}
>>> chase = {'CAT': 'V', 'ORTH': 'chased', 'REL': 'chase'}

CAT is the grammatical category and ORTH is the orthography (spelling). REF gives the referent of a noun phrase, and REL gives the relation expressed by a verb. In rule-based grammars, such pairings of features and values are called feature structures.

You can also add features as needed.

>>> chase['AGT'] = 'sbj'
>>> chase['PAT'] = 'obj'

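AGT is the agent role, filled by the subject. PAT is the patient role, filled by the object. The values assigned here are placeholders until the verb combines with its arguments.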

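For example, suppose we want to process the sentence Kim chased Lee. We bind the verb's agent role to the subject and its patient role to the object by linking to the REF feature of the relevant noun phrase.

>>> sent = "Kim chased Lee"
>>> tokens = sent.split()
>>> lee = {'CAT': 'NP', 'ORTH': 'Lee', 'REF': 'l'}
>>> def lex2fs(word):
...     # Look up the feature structure whose spelling matches the token.
...     for fs in [kim, lee, chase]:
...         if fs['ORTH'] == word:
...             return fs
>>> subj, verb, obj = lex2fs(tokens[0]), lex2fs(tokens[1]), lex2fs(tokens[2])
>>> verb['AGT'] = subj['REF']  # agent of 'chase' is Kim
>>> verb['PAT'] = obj['REF']   # patient of 'chase' is Lee
>>> for k in ['ORTH', 'REL', 'AGT', 'PAT']:
...     print("%-5s => %s" % (k, verb[k]))
ORTH  => chased
REL   => chase
AGT   => k
PAT   => l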

The same approach carries over to other verbs, with further features added as needed. For example:

>>> surprise = {'CAT': 'V', 'ORTH': 'surprised', 'REL': 'surprise', 'SRC': 'sbj', 'EXP': 'obj'}
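The role binding can be made generic so that it works for any verb entry, whichever role names it uses. The following is a minimal sketch, not part of NLTK: the helper name bind_roles is hypothetical, and it assumes that every role placeholder is spelled 'sbj' or 'obj'. The lexical entries are redefined so that the snippet runs on its own.

```python
def bind_roles(verb, subj, obj):
    """Return a copy of `verb` with 'sbj'/'obj' placeholders replaced by referents."""
    referents = {'sbj': subj['REF'], 'obj': obj['REF']}
    # Only placeholder values are swapped; every other feature passes through.
    return {feat: referents.get(val, val) for feat, val in verb.items()}

kim = {'CAT': 'NP', 'ORTH': 'Kim', 'REF': 'k'}
lee = {'CAT': 'NP', 'ORTH': 'Lee', 'REF': 'l'}
chase = {'CAT': 'V', 'ORTH': 'chased', 'REL': 'chase',
         'AGT': 'sbj', 'PAT': 'obj'}
surprise = {'CAT': 'V', 'ORTH': 'surprised', 'REL': 'surprise',
            'SRC': 'sbj', 'EXP': 'obj'}

chased = bind_roles(chase, kim, lee)
surprised = bind_roles(surprise, kim, lee)
print(chased['AGT'], chased['PAT'])        # k l
print(surprised['SRC'], surprised['EXP'])  # k l
```

Returning a copy leaves the lexical entry itself untouched, so the same verb can be reused for another sentence.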

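Syntactic agreement

The morphological form of a verb co-varies with the properties of its subject noun phrase. This co-variation is called agreement.

For example, the following is ungrammatical (marked with *), because a plural subject is paired with a singular verb form:

*the dogs runs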

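We can handle this by expanding the grammar, as the following example shows, but the approach quickly becomes cumbersome.

Starting grammar:

(7)
S -> NP VP
NP -> Det N
VP -> V
Det -> 'this'
N -> 'dog'
V -> 'runs'

Expanded grammar, with every category split into a singular and a plural version:

(8)
S -> NP_SG VP_SG
S -> NP_PL VP_PL
NP_SG -> Det_SG N_SG
NP_PL -> Det_PL N_PL
VP_SG -> V_SG
VP_PL -> V_PL
Det_SG -> 'this'
Det_PL -> 'these'
N_SG -> 'dog'
N_PL -> 'dogs'
V_SG -> 'runs'
V_PL -> 'run'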

To avoid this explosive increase, we can use attributes and constraints.
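To see how quickly the number of productions grows, the splitting can be done mechanically. The sketch below is an illustration only: it covers just the three non-lexical productions, the helper name expand is hypothetical, and the person feature PER with three values is an assumption added to show the trend.

```python
from itertools import product

# Non-lexical productions of the starting grammar.
productions = [
    ('S', ['NP', 'VP']),
    ('NP', ['Det', 'N']),
    ('VP', ['V']),
]

def expand(productions, features):
    """Copy every production once per combination of feature values."""
    combos = list(product(*features.values()))
    expanded = []
    for lhs, rhs in productions:
        for combo in combos:
            suffix = '_' + '_'.join(combo)
            # S itself stays unsplit; all other symbols get the suffix.
            new_lhs = lhs if lhs == 'S' else lhs + suffix
            expanded.append((new_lhs, [sym + suffix for sym in rhs]))
    return expanded

num_only = expand(productions, {'NUM': ['SG', 'PL']})
num_per = expand(productions, {'NUM': ['SG', 'PL'], 'PER': ['1', '2', '3']})

for lhs, rhs in num_only:
    print(lhs, '->', ' '.join(rhs))
print(len(productions), len(num_only), len(num_per))  # 3 6 18
```

Each extra feature multiplies the count by its number of values, which is why atomic category names do not scale.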

Use attributes and constraints

Det[NUM=sg] -> 'this'
Det[NUM=pl] -> 'these'
N[NUM=sg] -> 'dog'
N[NUM=pl] -> 'dogs'
V[NUM=sg] -> 'runs'
V[NUM=pl] -> 'run'

We can then use the variable ?n over values of NUM to state the agreement constraints:

S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
VP[NUM=?n] -> V[NUM=?n]
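The effect of a variable such as ?n is that it must receive the same value everywhere it occurs in one production. A minimal sketch of that check follows. It is not how NLTK implements it: the function name match and the tuple representation are assumptions, and the children are assumed to be fully specified.

```python
def match(pattern, children):
    """Bind variables such as '?n' consistently across a right-hand side.

    pattern:  list of (category, features); values may be variables
    children: list of (category, features) with concrete values
    Returns the bindings dict, or None if categories or values clash.
    """
    if len(pattern) != len(children):
        return None
    bindings = {}
    for (p_cat, p_feats), (c_cat, c_feats) in zip(pattern, children):
        if p_cat != c_cat:
            return None
        for name, wanted in p_feats.items():
            actual = c_feats.get(name)
            if wanted.startswith('?'):
                # First use binds the variable; later uses must agree.
                if bindings.setdefault(wanted, actual) != actual:
                    return None
            elif wanted != actual:
                return None
    return bindings

rhs = [('NP', {'NUM': '?n'}), ('VP', {'NUM': '?n'})]
ok = match(rhs, [('NP', {'NUM': 'sg'}), ('VP', {'NUM': 'sg'})])
bad = match(rhs, [('NP', {'NUM': 'sg'}), ('VP', {'NUM': 'pl'})])
print(ok)   # {'?n': 'sg'}
print(bad)  # None
```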

However, some words are not particular about number: a determiner such as the combines with both singular and plural nouns. There are two ways to represent this, and the second is clearly simpler than the first.

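First, list each determiner twice, once per number:

Det[NUM=sg] -> 'the' | 'some' | 'any'
Det[NUM=pl] -> 'the' | 'some' | 'any'

Second, leave NUM underspecified so that it agrees with whatever noun the determiner combines with:

Det[NUM=?n] -> 'the' | 'some' | 'any'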
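Underspecification can be pictured as merging two feature dictionaries, where a missing feature is compatible with any value. The sketch below assumes flat dictionaries; the helper name unify_flat and the sample entries are hypothetical.

```python
def unify_flat(a, b):
    """Merge two flat feature dicts; return None if a shared feature clashes."""
    merged = dict(a)
    for name, value in b.items():
        # setdefault fills in a missing feature, otherwise returns the old value.
        if merged.setdefault(name, value) != value:
            return None
    return merged

the = {}               # NUM left unspecified
this = {'NUM': 'sg'}   # singular only
dogs = {'NUM': 'pl'}

print(unify_flat(the, dogs))   # {'NUM': 'pl'}
print(unify_flat(this, dogs))  # None
```

The underspecified determiner simply takes over the noun's number, whereas the singular determiner clashes with a plural noun.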

The following code demonstrates most of the ideas described in this chapter so far:

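>>> nltk.data.show_cfg('grammars/book_grammars/feat0.fcfg')
% start S
# ###################
# Grammar Productions
# ###################
# S expansion productions
S -> NP[NUM=?n] VP[NUM=?n]
# NP expansion productions
NP[NUM=?n] -> N[NUM=?n]
NP[NUM=?n] -> PropN[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
NP[NUM=pl] -> N[NUM=pl]
# VP expansion productions
VP[TENSE=?t, NUM=?n] -> IV[TENSE=?t, NUM=?n]
VP[TENSE=?t, NUM=?n] -> TV[TENSE=?t, NUM=?n] NP
# ###################
# Lexical Productions
# ###################
Det[NUM=sg] -> 'this' | 'every'
Det[NUM=pl] -> 'these' | 'all'
Det -> 'the' | 'some' | 'several'
PropN[NUM=sg] -> 'Kim' | 'Jody'
N[NUM=sg] -> 'dog' | 'girl' | 'car' | 'child'
N[NUM=pl] -> 'dogs' | 'girls' | 'cars' | 'children'
IV[TENSE=pres, NUM=sg] -> 'disappears' | 'walks'
TV[TENSE=pres, NUM=sg] -> 'sees' | 'likes'
IV[TENSE=pres, NUM=pl] -> 'disappear' | 'walk'
TV[TENSE=pres, NUM=pl] -> 'see' | 'like'
IV[TENSE=past] -> 'disappeared' | 'walked'
TV[TENSE=past] -> 'saw' | 'liked'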

The following code shows how to parse a sentence:

If the grammar cannot parse the input, trees is empty. Otherwise it contains one or more parse trees, depending on whether the input is syntactically ambiguous.

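>>> tokens = 'Kim likes children'.split()
>>> from nltk import load_parser
>>> cp = load_parser('grammars/book_grammars/feat0.fcfg', trace=2)
>>> trees = list(cp.parse(tokens))
|.Kim .like.chil.|
|[----]    .    .| PropN[NUM='sg'] -> 'Kim' *
|[----]    .    .| NP[NUM='sg'] -> PropN[NUM='sg'] *
|[---->    .    .| S[] -> NP[NUM=?n] * VP[NUM=?n] {?n: 'sg'}
|.    [----]    .| TV[NUM='sg', TENSE='pres'] -> 'likes' *
|.    [---->    .| VP[NUM=?n, TENSE=?t] -> TV[NUM=?n, TENSE=?t] * NP[] {?n: 'sg', ?t: 'pres'}
|.    .    [----]| N[NUM='pl'] -> 'children' *
|.    .    [----]| NP[NUM='pl'] -> N[NUM='pl'] *
|.    .    [---->| S[] -> NP[NUM=?n] * VP[NUM=?n] {?n: 'pl'}
|.    [---------]| VP[NUM='sg', TENSE='pres'] -> TV[NUM='sg', TENSE='pres'] NP[] *
|[==============]| S[] -> NP[NUM='sg'] VP[NUM='sg'] *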

Finally, you can check the analysis tree:

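>>> for tree in trees: print(tree)
(S[]
  (NP[NUM='sg'] (PropN[NUM='sg'] Kim))
  (VP[NUM='sg', TENSE='pres']
    (TV[NUM='sg', TENSE='pres'] likes)
    (NP[NUM='pl'] (N[NUM='pl'] children))))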

Terminology

Simple values such as sg and pl are usually called atomic, because they cannot be decomposed into subparts. A special case of atomic values is Boolean values, which specify only whether a property is true or false.

For example, the feature AUX marks auxiliary verbs:

V[TENSE=pres, AUX=+] -> 'can'
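In a dictionary representation, a Boolean feature maps naturally onto True and False. The lexicon below is an illustrative assumption rather than part of any NLTK grammar file; it shows how the feature separates auxiliaries from main verbs.

```python
# Hypothetical mini-lexicon: AUX is True for auxiliaries, False otherwise.
lexicon = {
    'can':   {'CAT': 'V', 'TENSE': 'pres', 'AUX': True},
    'may':   {'CAT': 'V', 'TENSE': 'pres', 'AUX': True},
    'walks': {'CAT': 'V', 'TENSE': 'pres', 'AUX': False},
    'likes': {'CAT': 'V', 'TENSE': 'pres', 'AUX': False},
}

# Select the words whose AUX feature is true.
auxiliaries = sorted(word for word, fs in lexicon.items() if fs['AUX'])
print(auxiliaries)  # ['can', 'may']
```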

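Sometimes we group agreement features (such as person, number, and gender) together as a distinguished part of a category, as the value of a feature AGR. AGR then has a complex value rather than an atomic one.

Such a structure can be displayed as an attribute value matrix (AVM):

[POS = N           ]
[                  ]
[AGR = [PER = 3   ]]
[      [NUM = pl  ]]
[      [GND = fem ]]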
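An AVM corresponds directly to a nested dictionary, in which a complex value is itself a dictionary. The sketch below assumes the values N, 3, pl, and fem for the matrix, and the helper name get_path is hypothetical.

```python
# Nested dictionary standing in for the AVM: AGR has a complex value.
avm = {'POS': 'N', 'AGR': {'PER': 3, 'NUM': 'pl', 'GND': 'fem'}}

def get_path(fs, path):
    """Follow a sequence of feature names through nested feature structures."""
    for name in path:
        fs = fs[name]
    return fs

print(get_path(avm, ('POS',)))        # N
print(get_path(avm, ('AGR', 'NUM')))  # pl
```

A sequence of feature names such as ('AGR', 'NUM') picks out one value inside the structure, however deeply it is nested.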

When features have complex values, the grammar can be restructured accordingly:

S -> NP[AGR=?n] VP[AGR=?n]
NP[AGR=?n] -> PropN[AGR=?n]
VP[TENSE=?t, AGR=?n] -> Cop[TENSE=?t, AGR=?n] Adj
Cop[TENSE=pres, AGR=[NUM=sg, PER=3]] -> 'is'
PropN[AGR=[NUM=sg, PER=3]] -> 'Kim'
Adj -> 'happy'
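With agreement features bundled under AGR, a single variable covers number and person at once. In dictionary terms this amounts to comparing the whole AGR value. The sketch below assumes fully specified entries; the helper agr_matches and the plural entry for are are hypothetical additions for contrast.

```python
kim = {'CAT': 'PropN', 'AGR': {'NUM': 'sg', 'PER': 3}}
is_ = {'CAT': 'Cop', 'TENSE': 'pres', 'AGR': {'NUM': 'sg', 'PER': 3}}
are = {'CAT': 'Cop', 'TENSE': 'pres', 'AGR': {'NUM': 'pl', 'PER': 3}}

def agr_matches(subject, verb):
    # One comparison checks every agreement feature bundled under AGR.
    return subject['AGR'] == verb['AGR']

print(agr_matches(kim, is_))  # True
print(agr_matches(kim, are))  # False
```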

