July Algorithm - December Machine Learning Online Class - Lesson 18 Notes - Conditional Random Field (CRF)


July Algorithm (julyedu.com), December Machine Learning Online Class study notes: http://www.julyedu.com

1. Log-linear models

The odds of an event is the ratio of the probability that the event occurs to the probability that it does not occur.
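In symbols (the standard definition; the formula did not survive extraction): if the event occurs with probability p, then

$$ \text{odds} = \frac{p}{1-p}, \qquad \operatorname{logit}(p) = \log\frac{p}{1-p}. $$

Logistic regression models this log-odds (logit) as a linear function of the features.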

1.1 General form of the log-linear model

Let x be a sample and y a possible label of x; logistic/softmax regression fits this form with a particular choice of features.
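The general form itself was an image in the original notes; the standard form, which the rest of these notes follow, is

$$ P(y \mid x; w) = \frac{\exp\Big(\sum_j w_j F_j(x, y)\Big)}{Z(x, w)}, \qquad Z(x, w) = \sum_{y'} \exp\Big(\sum_j w_j F_j(x, y')\Big), $$

where the F_j(x, y) are feature functions and the w_j are their weights. Logistic and softmax regression correspond to particular choices of the feature functions.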

Choosing the feature functions (example: natural language processing / part-of-speech tagging):

1. The feature functions can be chosen almost arbitrarily; they may even overlap with one another (see the sketch after this list).

2. Each feature depends on the current word's part of speech and, at most, on the parts of speech of the adjacent words.

3. A feature may, however, depend on all the words of the sentence (doing so is what turns the model into a chain).
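To make these points concrete, here is a minimal Python sketch of indicator-style feature functions in the f_j(y_{i-1}, y_i, x, i) pattern used later in these notes; the function names and tag labels are made up for the illustration, not taken from the lesson.

```python
# Minimal sketch: indicator feature functions for linear-chain POS tagging.
# Each local feature looks at the previous tag, the current tag, the whole
# sentence x, and the current position i -- the f_j(y_{i-1}, y_i, x, i) pattern.

def f_prev_det_now_noun(y_prev, y_curr, x, i):
    """Fires when a determiner is followed by a noun (adjacent-tag feature)."""
    return 1.0 if y_prev == "DET" and y_curr == "NOUN" else 0.0

def f_capitalized_propn(y_prev, y_curr, x, i):
    """Fires when a capitalized, non-initial word is tagged as a proper noun.
    Note it may consult any word of x, not just the current one."""
    return 1.0 if i > 0 and x[i][0].isupper() and y_curr == "PROPN" else 0.0

def f_suffix_ing_verb(y_prev, y_curr, x, i):
    """Fires when a word ending in '-ing' is tagged as a verb.
    Overlaps with other word-shape features -- overlap is allowed."""
    return 1.0 if x[i].endswith("ing") and y_curr == "VERB" else 0.0

FEATURES = [f_prev_det_now_noun, f_capitalized_propn, f_suffix_ing_verb]

def global_feature(j, x, y):
    """Global feature F_j(x, y): the sum of the local feature over all positions."""
    f = FEATURES[j]
    return sum(f(y[i - 1] if i > 0 else "<START>", y[i], x, i)
               for i in range(len(x)))
```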

POS tagging:

1. This is a structured prediction problem.

2. The tags of adjacent words influence each other; they are not independent.

2. Linear-chain conditional random fields

2.1 A linear-chain CRF can be expressed as a log-linear model

Given the parameters, how do we estimate the probability P(y | x, w)?

Let x = (x_1, ..., x_n) denote a sequence of n words and y = (y_1, ..., y_n) the corresponding sequence of parts of speech.

Each global feature F_j(x, y) is made up of a number of local sub-features, one per position in the sequence.
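Written out (a reconstruction of the standard linear-chain formulas the missing equations referred to): each global feature decomposes over positions as

$$ F_j(x, y) = \sum_{i=1}^{n} f_j(y_{i-1}, y_i, x, i), $$

so the linear-chain CRF distribution is

$$ P(y \mid x; w) = \frac{1}{Z(x, w)} \exp\Big( \sum_j w_j \sum_{i=1}^{n} f_j(y_{i-1}, y_i, x, i) \Big). $$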

2.2 Inference given the parameters

Two difficulties of inference once the parameters are given:

1. Given x and w, how do we compute which tag sequence y is the most likely?

2. Given x and w, how do we compute P(y | x, w) itself (this requires the normalizer Z(x, w))?

2.3 State relation matrix

Each global feature can be replaced by the sum of the corresponding local feature over all positions.

2.3.1 Using forward scores to select the highest-scoring tag sequence

The forward score α_k(v) denotes the maximum score over labelings of the first k words in which word k is tagged v (when the score is normalized this becomes a probability), i.e.:
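The recursion itself was an image in the original; the standard max-score (Viterbi-style) forward recursion it refers to is

$$ \alpha_{k+1}(u) = \max_{v} \Big[ \alpha_k(v) + \sum_j w_j\, f_j(v, u, x, k+1) \Big], $$

and the score of the best overall tag sequence is max_v α_n(v). Replacing the max by a sum of exponentiated scores gives the normalizer Z(x, w) instead.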

2.3.2 Derivation via the state relation matrix

Time complexity: O(n) in the sentence length (each position contributes a |Y| x |Y| matrix of transition scores, so the full computation is O(n |Y|^2)).
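A minimal numpy sketch of both computations from section 2.2, assuming the per-position transition scores g_i(v, u) = Σ_j w_j f_j(v, u, x, i) have already been collected into matrices; the function and array names are mine, not the lesson's.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along an axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)

def decode_and_log_partition(scores):
    """scores: array of shape (n, T, T), where scores[i, v, u] is the total
    log score sum_j w_j * f_j(v, u, x, i) of tagging word i with u when word
    i-1 is tagged v (row 0 of scores[0] plays the role of the <START> tag).

    Returns the best tag sequence (difficulty 1, Viterbi) and log Z(x, w)
    (difficulty 2, forward algorithm). Both run in O(n * T^2).
    """
    n, T, _ = scores.shape

    # Viterbi: max-score forward pass with backpointers.
    alpha = scores[0, 0, :].copy()             # alpha_1(u), starting from <START>
    back = np.zeros((n, T), dtype=int)
    for i in range(1, n):
        cand = alpha[:, None] + scores[i]      # cand[v, u] = alpha(v) + g_i(v, u)
        back[i] = cand.argmax(axis=0)
        alpha = cand.max(axis=0)
    best = [int(alpha.argmax())]
    for i in range(n - 1, 0, -1):              # follow the backpointers
        best.append(int(back[i, best[-1]]))
    best.reverse()

    # Forward algorithm: the same recursion with logsumexp instead of max.
    log_a = scores[0, 0, :].copy()
    for i in range(1, n):
        log_a = _logsumexp(log_a[:, None] + scores[i], axis=0)
    log_Z = float(_logsumexp(log_a, axis=0))

    return best, log_Z
```

The Viterbi pass alone answers difficulty 1; log Z is what is needed to evaluate P(y | x, w) and, later, the gradient in section 3.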

3. Parameter training

Given a set of training samples (x, y), find the weight vector w, i.e. the parameters, that maximizes an objective of the following form.

Method: find a stationary point of the logarithmic objective function.

Objective function:
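The formula was an image in the original; the standard (unregularized) conditional log-likelihood and its gradient, in the primed notation explained just below, are

$$ L(w) = \sum_{(x, y) \in D} \Big[ \sum_j w_j F_j(x, y) - \log Z(x, w) \Big], $$

$$ \frac{\partial L}{\partial w_{j'}} = \sum_{(x, y) \in D} \Big[ F_{j'}(x, y) - \sum_{y'} P(y' \mid x; w)\, F_{j'}(x, y') \Big]. $$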

Here the prime (') is not a derivative, only a marker: j and j' may take different values, and y and y' denote two different label sequences.

Finally, use gradient ascent to learn the parameters.

The label variables y_i are not independent of each other; they are linked to one another.
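A brute-force sketch of one gradient-ascent step on this objective, enumerating every label sequence to compute the expectation (feasible only for tiny tag sets and short sentences; in practice the expectation is computed with the forward-backward algorithm). The helper names and signatures are mine:

```python
import itertools
import math

def log_linear_score(w, features, x, y):
    """sum_j w_j F_j(x, y) for one (sentence, tag-sequence) pair."""
    return sum(w_j * F_j(x, y) for w_j, F_j in zip(w, features))

def gradient_step(w, features, data, tags, lr=0.1):
    """One gradient-ascent step on the conditional log-likelihood.
    data: list of (x, y) pairs; tags: the finite tag set; features: list of
    global feature functions F_j(x, y)."""
    grad = [0.0] * len(w)
    for x, y in data:
        # Empirical feature counts: F_j(x, y) on the observed labeling.
        for j, F_j in enumerate(features):
            grad[j] += F_j(x, y)
        # Expected feature counts under P(y' | x; w), by brute-force enumeration.
        all_y = list(itertools.product(tags, repeat=len(x)))
        scores = [log_linear_score(w, features, x, y2) for y2 in all_y]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        Z = sum(weights)
        for y2, wt in zip(all_y, weights):
            for j, F_j in enumerate(features):
                grad[j] -= (wt / Z) * F_j(x, y2)
    # Gradient-ascent update of the parameters.
    return [w_j + lr * g_j for w_j, g_j in zip(w, grad)]
```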

4. Undirected graphical models (UGM): Markov random field / Markov network

Directed graphical models are also known as Bayesian networks (Directed Graphical Models, DGM; Bayesian Networks).

Probabilistic directed graphical models vs. probabilistic undirected graphical models.

4.1 Markov random fields

From Bayesian networks to Markov random fields:

Directly connect the common parents of every child node, then remove all the arrows.

This conventional construction does not preserve all information: some conditional independence relations are destroyed.

4.2 Properties of MRFs

1. Pairwise Markov property

2. Local Markov property

3. Global Markov property

The above three properties are equivalent.

4.3 Cliques and maximal cliques

Definition: Let S be a subgraph of an undirected graph G. If every pair of nodes in S is connected by an edge, then S is called a clique of G.

Maximal clique: if C is a clique of G and no further node of G can be added to C while keeping it a clique, then C is called a maximal clique of G.

In the figure, the maximal cliques are {1,2,3}, {2,3,4}, {3,5}. Being maximal has nothing to do with the number of nodes; it only requires that no further node of G can be added while keeping the subgraph a clique.

4.4 Hammersley-Clifford theorem

The joint distribution of a UGM takes the form of a product of functions (potentials) of the random variables on the maximal cliques; this operation is called the factorization of the UGM.
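In symbols (the standard statement; the formula was an image in the original): with strictly positive potentials ψ_C, the factorization reads

$$ P(x) = \frac{1}{Z} \prod_{C} \psi_C(x_C), \qquad Z = \sum_{x} \prod_{C} \psi_C(x_C), $$

where the product runs over the maximal cliques C of the graph and x_C is the restriction of x to the variables in clique C.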

A linear-chain conditional random field can be used for tagging and similar problems.


Summary

A conditional random field can be expressed as a log-linear model.

Loosely speaking, a linear-chain conditional random field can be regarded as a generalization of the hidden Markov model, and the hidden Markov model can be regarded as a special case of the linear-chain conditional random field.

Disadvantage: the parameters are estimated by supervised learning, and parameter learning is slow.
