July Algorithm - December Machine Learning Online Class - Lesson 18 Notes - Conditional Random Fields (CRF)
July Algorithm (julyedu.com), December Machine Learning Online Class study notes, http://www.julyedu.com
1. Log-linear model
The odds of an event is the ratio of the probability that the event occurs to the probability that it does not occur, i.e. odds = p / (1 - p).
1.1 General form of the log-linear model
Let x be a sample and y a possible label of x. The general form of the log-linear model is P(y|x; w) = exp(Σ_j w_j F_j(x, y)) / Z(x, w), where the F_j are feature functions, the w_j are their weights, and Z(x, w) is the normalizing factor. Logistic/softmax regression is the special case obtained with a particular choice of feature functions.
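A minimal sketch of this general form (the function and argument names below are illustrative, not from the original notes; features is a list of feature functions F_j and labels is the set of candidate labels):

import math

def log_linear_prob(x, y, labels, features, w):
    # P(y|x; w) = exp(sum_j w_j * F_j(x, y)) / Z(x, w)
    def score(y_):
        return sum(w_j * F_j(x, y_) for w_j, F_j in zip(w, features))
    Z = sum(math.exp(score(y_)) for y_ in labels)   # normalizer over all candidate labels
    return math.exp(score(y)) / Z

Softmax regression is recovered by choosing features of the form F_j(x, y) = x_d * [y == c], one per input dimension d and class c.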
Selection of feature functions, e.g. in natural language processing (a concrete sketch follows this list):
1. The feature functions can be chosen almost arbitrarily; features may even overlap;
2. Each feature depends on the part of speech of the current word and at most on the parts of speech of the adjacent words;
3. A feature may, however, depend on all of the words (the model stays a chain because only adjacent tags are coupled).
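As an illustration of how freely feature functions can be chosen, here is a hypothetical local feature for POS tagging (the tag names and the rule itself are made up for illustration):

def f_example(prev_tag, tag, words, i):
    # Fires when a determiner is followed by a noun whose surface form ends in 's'.
    # It looks only at the two adjacent tags but may read the whole word sequence.
    return 1.0 if prev_tag == "DT" and tag == "NN" and words[i].endswith("s") else 0.0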
Part-of-speech (POS) tagging
1. This is structured prediction.
2. The tags of adjacent words influence each other; they are not independent.
2. Linear-chain conditional random field
2.1 A linear-chain CRF can be expressed as a log-linear model
Given the parameters w, how do we estimate the probability P(y|x, w)?
Let x = (x_1, ..., x_n) be a sequence of n words and y = (y_1, ..., y_n) the corresponding part-of-speech tags.
Each feature F_j(x, y) is made up of a number of local sub-features along the chain: F_j(x, y) = Σ_i f_j(y_{i-1}, y_i, x, i).
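A sketch of this decomposition and of the resulting sequence score w·F(x, y), reusing local features like f_example above (the helper names and the <START> tag are illustrative assumptions):

def global_feature(f_j, x, y):
    # F_j(x, y) = sum_i f_j(y_{i-1}, y_i, x, i); a dummy <START> tag handles the first position
    tags = ["<START>"] + list(y)
    return sum(f_j(tags[i - 1], tags[i], x, i - 1) for i in range(1, len(tags)))

def sequence_score(x, y, features, w):
    # w . F(x, y): the unnormalized log-score of tag sequence y for sentence x
    return sum(w_j * global_feature(f_j, x, y) for w_j, f_j in zip(w, features))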
2.2 Parameter Training
Two difficulties in inference:
1. Given x and w, how do we find the most likely tag sequence y?
2. Given x and w, how is P(y|x, w) itself computed?
2.3 State transition matrix
The global feature F_j(x, y) can be replaced by the sum of its local features over positions, so the total score becomes a sum of per-position scores g_k(u, v) = Σ_j w_j f_j(u, v, x, k), where u and v are the tags of words k-1 and k.
2.3.1 Using forward scores to select the maximum-scoring tag sequence
Let U(k, v) be the forward score: the maximum score over all taggings of the first k words that tag word k as v (when the score is normalized it becomes a probability). It satisfies the recursion U(k, v) = max_u [U(k-1, u) + g_k(u, v)].
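A minimal sketch of this recursion (Viterbi-style decoding), assuming a local score function g(k, u, v) = Σ_j w_j f_j(u, v, x, k) as defined above; the function and variable names are illustrative:

def viterbi(x, tag_set, g):
    # Returns the highest-scoring tag sequence for sentence x.
    # g(k, u, v): local score of tagging word k as v when word k-1 is tagged u.
    n = len(x)
    U = {v: g(0, "<START>", v) for v in tag_set}        # U(0, v)
    back = []
    for k in range(1, n):
        prev, U, ptr = U, {}, {}
        for v in tag_set:
            u_best = max(prev, key=lambda u: prev[u] + g(k, u, v))
            U[v] = prev[u_best] + g(k, u_best, v)       # U(k, v) = max_u U(k-1, u) + g_k(u, v)
            ptr[v] = u_best
        back.append(ptr)
    v = max(U, key=U.get)                               # best final tag
    path = [v]
    for ptr in reversed(back):                          # trace the argmax pointers backwards
        v = ptr[v]
        path.append(v)
    return list(reversed(path))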
2.3.2 Derivation of the state transition matrix form
Time complexity: O(n), i.e. linear in the sentence length n.
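The same forward idea with max replaced by sum computes the normalizer Z(x, w) by n matrix products, which is what makes the cost linear in n (a sketch under the same assumed local score g as above; numpy is used only for the matrix form):

import numpy as np

def partition_function(n, tags, g):
    # Z = sum over all tag sequences of exp(total score), via forward matrix products.
    # M_k[u, v] = exp(g(k, u, v)) is the state transition matrix at position k.
    tags = list(tags)
    alpha = np.array([np.exp(g(0, "<START>", v)) for v in tags])    # forward vector at k = 0
    for k in range(1, n):
        M = np.array([[np.exp(g(k, u, v)) for v in tags] for u in tags])
        alpha = alpha @ M              # alpha_k[v] = sum_u alpha_{k-1}[u] * M_k[u, v]
    return float(alpha.sum())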
3. Parameter training
Given a set of training samples (x, y), find the weight vector w that maximizes the conditional log-likelihood, i.e. w* = argmax_w Σ_(x,y) log P(y|x, w).
Method: find the stationary point of the logarithmic objective function.
Objective function (per sample): L(w) = Σ_j w_j F_j(x, y) - log Z(x, w), whose partial derivative is ∂L/∂w_j = F_j(x, y) - Σ_{y'} P(y'|x, w) F_j(x, y').
Here the prime in y' does not denote a derivative, it is only a tick: y and y' represent two different tag sequences.
Finally, use gradient ascent to learn the parameters.
The y_i are not independent of each other; they are connected along the chain.
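A brute-force sketch of one gradient-ascent step on a single training pair, reusing the global_feature and sequence_score helpers sketched in 2.1 (enumerating all y' is only feasible for tiny examples; real CRF training computes the expectation with the forward-backward algorithm):

import itertools, math

def gradient_step(x, y, tag_set, features, w, lr=0.1):
    # One step of gradient ascent on log P(y|x; w) for one training pair (x, y).
    n = len(x)
    candidates = list(itertools.product(tag_set, repeat=n))          # every possible y'
    scores = [sequence_score(x, s, features, w) for s in candidates]
    m = max(scores)
    probs = [math.exp(s - m) for s in scores]
    Z = sum(probs)
    probs = [p / Z for p in probs]                                    # P(y'|x; w)
    new_w = []
    for j, f_j in enumerate(features):
        observed = global_feature(f_j, x, y)                          # F_j(x, y)
        expected = sum(p * global_feature(f_j, x, s) for p, s in zip(probs, candidates))
        new_w.append(w[j] + lr * (observed - expected))               # ascend the gradient
    return new_w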
4. Undirected graphical model (UGM): Markov random field / Markov network
A directed graphical model is also known as a Bayesian network (Directed Graphical Model, DGM, Bayesian Network).
Probabilistic graphical models thus split into probabilistic directed graphical models and probabilistic undirected graphical models.
4.1 From Bayesian network to Markov random field
To convert a Bayesian network into a Markov random field (moralization):
Directly connect the common parents of each child node, then remove all arrow directions.
This moral-graph construction does not preserve the information completely: some conditional independence relations are destroyed.
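A sketch of the moralization step, assuming the Bayesian network is given as a mapping from each node to the set of its parents (the representation is an illustrative choice):

from itertools import combinations

def moralize(parents):
    # Build the moral (undirected) graph of a Bayesian network.
    # parents: dict mapping each node to the set of its parent nodes.
    edges = set()
    for child, ps in parents.items():
        for p in ps:
            edges.add(frozenset((p, child)))      # drop the arrow directions
        for a, b in combinations(ps, 2):
            edges.add(frozenset((a, b)))          # "marry" the common parents
    return edges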
4.2 Properties of MRF
1. Pairwise Markov property
2. Local Markov property
3. Global Markov property
The above three properties are equivalent (for strictly positive distributions).
4.3 Cliques and maximal cliques
Definition: In an undirected graph G, a subgraph S in which every two nodes are joined by an edge is called a clique of G.
Maximal clique: if C is a clique of G and no further node of G can be added to C while keeping it a clique, then C is called a maximal clique of G.
In the figure the maximal cliques are {1,2,3}, {2,3,4}, {3,5}; being maximal has nothing to do with the number of nodes,
only with whether it is still possible to add another node of G while keeping the subset a clique.
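These maximal cliques can be recovered programmatically. The edge list below is a reconstruction consistent with the cliques quoted above (the original figure is not reproduced here), and networkx.find_cliques enumerates the maximal cliques:

import networkx as nx

# A graph whose maximal cliques are {1,2,3}, {2,3,4}, {3,5}
G = nx.Graph([(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 5)])
print(sorted(sorted(c) for c in nx.find_cliques(G)))   # [[1, 2, 3], [2, 3, 4], [3, 5]]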
4.4 Hammersley-Clifford theorem
The joint distribution of a UGM can be written as a product of potential functions, one over the random variables of each maximal clique: P(X) = (1/Z) Π_C ψ_C(X_C), where Z is the normalizing constant;
this operation is called the factorization of the UGM.
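A toy sketch of this factorization for binary variables (the data structures and the choice of potentials are illustrative placeholders, only to show the product-over-maximal-cliques structure):

from itertools import product

def joint_from_potentials(variables, cliques, psi):
    # P(assignment) = (1/Z) * prod over maximal cliques C of psi[C](x_C)
    def unnorm(assign):
        p = 1.0
        for c in cliques:
            p *= psi[c](tuple(assign[v] for v in c))
        return p
    states = [dict(zip(variables, s)) for s in product([0, 1], repeat=len(variables))]
    Z = sum(unnorm(s) for s in states)                 # normalizing constant
    return lambda assign: unnorm(assign) / Z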
A linear-chain conditional random field can be used for sequence labeling and similar problems.
Summary
A conditional random field can be expressed as a log-linear model.
Loosely speaking, the linear-chain conditional random field can be regarded as a generalization of the hidden Markov model, and the hidden Markov model can be regarded as a special case of the linear-chain conditional random field.
Disadvantage: the parameters are learned by supervised learning, and parameter learning is slow.