Machine Learning Algorithm Note 1_2: Classification and Logistic Regression


  1. Form:

    Use the sigmoid function: g(z) = \frac{1}{1 + e^{-z}}
    Its derivative is g'(z) = g(z)(1 - g(z))
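As a minimal sketch of the function and its derivative (the names `sigmoid` and `sigmoid_grad` are mine, and NumPy is assumed):

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative g'(z) = g(z) * (1 - g(z))."""
    g = sigmoid(z)
    return g * (1.0 - g)
```

The product form of the derivative is what makes the gradient of the log-likelihood below come out so cleanly.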
    Assume the hypothesis is h_θ(x) = g(θ^T x) = \frac{1}{1 + e^{-θ^T x}}, and that:

    P(y = 1 | x; θ) = h_θ(x)
    P(y = 0 | x; θ) = 1 - h_θ(x)

    which can be written compactly as p(y | x; θ) = (h_θ(x))^y (1 - h_θ(x))^{1-y}
    If there are m training samples, the likelihood function takes the form:

    L(θ) = \prod_{i=1}^{m} p(y^{(i)} | x^{(i)}; θ) = \prod_{i=1}^{m} (h_θ(x^{(i)}))^{y^{(i)}} (1 - h_θ(x^{(i)}))^{1 - y^{(i)}}

    Logarithmic form:

    ℓ(θ) = \log L(θ) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_θ(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_θ(x^{(i)})) \right]
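The log-likelihood can be sketched directly from this formula (the function name `log_likelihood` and the use of NumPy are my own choices):

```python
import numpy as np

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i * log h(x_i) + (1 - y_i) * log(1 - h(x_i)) ].

    X: (m, n) design matrix, y: (m,) labels in {0, 1}.
    """
    h = 1.0 / (1.0 + np.exp(-X @ theta))  # h_theta(x) for every sample at once
    return np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))
```

For example, at theta = 0 every sample has h = 0.5, so ℓ(θ) = m · log(0.5).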
    Use the gradient ascent method to find its maximum.
    Derivation (for a single sample):

    \frac{∂ℓ(θ)}{∂θ_j} = (y - h_θ(x)) x_j

    The update rule is:

    θ_j := θ_j + α (y^{(i)} - h_θ(x^{(i)})) x_j^{(i)}
    It can be seen that this rule has the same form as the LMS update rule; however, the hypothesis h_θ(x) is completely different (in logistic regression h_θ(x) is a nonlinear function). The reason for this correspondence is explained in the GLM section.
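The per-sample update above can be sketched as a small stochastic gradient ascent loop (the function name and the hyperparameter defaults are my own illustrative choices, not from the note):

```python
import numpy as np

def fit_logistic_sgd(X, y, alpha=0.1, epochs=200):
    """Stochastic gradient ascent on the log-likelihood:
    theta_j := theta_j + alpha * (y_i - h(x_i)) * x_ij
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        for i in range(m):
            h = 1.0 / (1.0 + np.exp(-X[i] @ theta))  # h_theta(x_i)
            theta += alpha * (y[i] - h) * X[i]       # update with all coordinates at once
    return theta
```

On a toy 1-D problem with an intercept column, e.g. x in {-2, -1, 1, 2} with labels {0, 0, 1, 1}, the fitted theta separates the two classes at h = 0.5.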
    Note: if h_θ(x) uses not the sigmoid but a threshold function:

    g(z) = 1 if z ≥ 0, and g(z) = 0 if z < 0
    then this algorithm is called the perceptron learning algorithm. Although the update rules look similar, it is not at all the same algorithm as logistic regression.
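To make the contrast concrete, here is the same update loop with the hard threshold substituted for the sigmoid (again, the function name and defaults are mine):

```python
import numpy as np

def perceptron_fit(X, y, alpha=1.0, epochs=10):
    """Perceptron rule: identical update form, but h is a hard threshold."""
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in range(X.shape[0]):
            h = 1.0 if X[i] @ theta >= 0 else 0.0  # threshold instead of sigmoid
            theta += alpha * (y[i] - h) * X[i]     # same-looking update rule
    return theta
```

Note that the update is now zero whenever a sample is classified correctly, which is one way the behavior diverges from logistic regression despite the shared form.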
  2. Another method of maximizing likelihood function--Newton approximation method
    • Principle: Suppose we want to find a zero of a function f(θ). We can keep updating θ by:

      θ := θ - \frac{f(θ)}{f'(θ)}
      Its intuitive interpretation is as follows:

      Given an initial point θ_0: if f(θ_0) and its derivative have the same sign, the zero lies to the left of the initial point; otherwise it lies to the right. Update the point to the zero of the tangent line at the current point, then repeat the above step; the zeros of the successive tangent lines keep approaching the zero of the function itself.
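The iteration is only a few lines of code. A minimal sketch (function and parameter names are mine):

```python
def newton_root(f, fprime, theta0, iters=20):
    """Find a zero of f by repeatedly applying theta := theta - f(theta)/f'(theta)."""
    theta = theta0
    for _ in range(iters):
        theta -= f(theta) / fprime(theta)
    return theta
```

For example, applying it to f(θ) = θ² - 2 starting from θ₀ = 1 converges to √2 in a handful of iterations.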
    • Application: In logistic regression we want the maximum of the likelihood function, i.e. the point where the derivative of the log-likelihood is 0, so we can apply Newton's method to ℓ'(θ):

      θ := θ - \frac{ℓ'(θ)}{ℓ''(θ)}
      Since in logistic regression θ is a vector, this is rewritten as:

      θ := θ - H^{-1} \nabla_θ ℓ(θ)

      where H is the Hessian matrix:

      H_{ij} = \frac{∂^2 ℓ(θ)}{∂θ_i ∂θ_j}
      Newton's method tends to converge in fewer iterations than (batch) gradient descent.
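The vector update can be sketched as follows. For the logistic log-likelihood the gradient and Hessian have closed forms, ∇ℓ = X^T(y - h) and H = -X^T diag(h(1-h)) X; the function name and iteration count are my own choices:

```python
import numpy as np

def fit_logistic_newton(X, y, iters=10):
    """Newton's method for logistic regression:
    theta := theta - H^{-1} * grad l(theta)
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        h = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (y - h)                      # gradient of the log-likelihood
        H = -(X.T * (h * (1.0 - h))) @ X          # Hessian: -X^T diag(h(1-h)) X
        theta -= np.linalg.solve(H, grad)         # solve H d = grad instead of inverting H
    return theta
```

Note that on linearly separable data the unregularized likelihood has no finite maximizer and the iteration diverges, so a sketch like this should only be expected to converge on overlapping-class data.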
