Machine Learning (2): Linear Regression

First of all, our teacher is genuinely bad at teaching this class.

The slides are nothing but unexplained formulas, with no account of what any of them are actually doing.

Regression
  1. What is regression

    First of all, regression is a kind of supervised learning. A regression problem tries to predict a continuous output, in contrast to a classification problem, which tries to predict a discrete output.

    For example:
    • Predicting a rate
    • Predicting height
    • ...
  2. Regression model

    Ingredients:
    • Feature \(x\)
    • Predicted value \(y\)
    • Training set \((x_i, y_i)\)
    • Learning algorithm
    • Regression function \(f\)

    Linear regression:

    \[ f(x) = \omega_0 + \sum_{i=1}^{m} \omega_i x_i \]

    Vectorized form (add \(x_0 = 1\) to represent the intercept term):

    \[ f(x) = w^T x \]

    Generalizing (allowing arbitrary basis functions \(\phi_j\), not just polynomial ones):

    \[ y(x, w) = \sum_{j=0}^{m-1} \omega_j \phi_j(x) = w^T \phi(x) \]
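
    As a minimal sketch (my addition, not from the original notes) of this generalized model, assuming a polynomial basis \(\phi_j(x) = x^j\) and illustrative, unfitted weights:

    ```python
    import numpy as np

    def poly_basis(x, m):
        """Polynomial basis: phi_j(x) = x**j for j = 0..m-1 (phi_0 = 1 is the intercept)."""
        return np.array([x ** j for j in range(m)])

    def predict(x, w):
        """Generalized linear model: y(x, w) = w^T phi(x)."""
        return w @ poly_basis(x, len(w))

    w = np.array([1.0, 2.0, 0.5])  # illustrative weights, not fitted to anything
    print(predict(3.0, w))         # 1 + 2*3 + 0.5*3**2 = 11.5
    ```

    Note that the model stays linear in \(w\) even when \(\phi\) is nonlinear in \(x\); that is what keeps the fitting problem tractable.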

  3. The essence of the problem

    Breaking the problem down:
    • Define an objective function
      • using the training set (the real data)
      • that measures the difference between the predicted value \(f(x)\) and the true output \(y\), to be minimized
    • Determine the parameters \(w\) of the model

    Objective function (cost function):

    \[ J(w) = \frac{1}{2} \sum_{i=1}^{n} \left( f(x_i) - y_i \right)^2 \]

    The task is then to find the \(w\) that minimizes \(J(w)\).
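
    A quick sketch of this cost function in NumPy (my addition; the toy data below is made up for illustration):

    ```python
    import numpy as np

    def cost(w, X, y):
        """J(w) = 1/2 * sum_i (f(x_i) - y_i)^2, with f(x) = w^T x.
        X holds one sample per row; its first column is x_0 = 1 (the intercept)."""
        residuals = X @ w - y
        return 0.5 * residuals @ residuals

    X = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])       # toy data generated by y = 1 + x
    y = np.array([2.0, 3.0, 4.0])
    print(cost(np.array([1.0, 1.0]), X, y))  # perfect fit -> 0.0
    ```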

Solving the regression
  1. Gradient descent

    Strategy:
      • Assign \(w\) a random initial value
      • Change the \(w_j\) so that \(J(w)\) keeps getting smaller
      • Descend in the direction opposite to the gradient

    The gradient is the vector along which the directional derivative at a given point is largest; in other words, the function changes fastest in that direction, and its rate of change there is greatest.

    As an example:

    When climbing a mountain, the path perpendicular to the contour lines is the steepest.

    The update rule:

    \[ \omega_j^{t} = \omega_j^{t-1} - \alpha \frac{\partial}{\partial \omega_j} J(w) \]

    \[ \frac{\partial}{\partial \omega_j} J(w) = \sum_{i=1}^{n} \left( f(x_i) - y_i \right) x_{i,j} \]

    All of the \(w_j\) are updated simultaneously; \(\alpha\) is the learning rate (the update step size).

    Two variants (sketched in code below):
      • Batch gradient descent
        • Every update uses all of the data
        • Iterations are slow when the sample is large
      • Stochastic gradient descent
        • Each update uses only one sample
        • Faster iterations and more effective on large samples; also known as online learning
    See also:
      • Andrew Ng's "Machine Learning" course study notes (III): multivariate linear regression and polynomial regression
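
    A minimal sketch of both variants (my addition; the learning rate, iteration counts, and toy data are illustrative assumptions, not tuned values):

    ```python
    import numpy as np

    def batch_gd(X, y, alpha=0.01, iters=2000):
        """Batch gradient descent: every update uses all n samples."""
        w = np.zeros(X.shape[1])          # initial value for w
        for _ in range(iters):
            grad = X.T @ (X @ w - y)      # sum_i (f(x_i) - y_i) * x_i
            w -= alpha * grad             # all w_j updated simultaneously
        return w

    def sgd(X, y, alpha=0.01, epochs=1000, seed=0):
        """Stochastic gradient descent: each update uses a single sample."""
        rng = np.random.default_rng(seed)
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for i in rng.permutation(len(y)):   # visit samples in random order
                w -= alpha * (X[i] @ w - y[i]) * X[i]
        return w

    X = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])       # first column is x_0 = 1
    y = np.array([2.0, 3.0, 4.0])    # toy data generated by y = 1 + x
    print(batch_gd(X, y))   # approaches [1. 1.]
    print(sgd(X, y))        # also approaches [1. 1.] on this noise-free data
    ```

    Because this toy data fits the model exactly, SGD settles at the optimum; with noisy data and a fixed \(\alpha\) it would only hover near the minimum, which is why the step size is usually decayed over time.
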
  2. The normal equations

    In matrix form:

    \[ J(w) = \frac{1}{2} \sum_{i=1}^{n} \left( f(x_i) - y_i \right)^2 = \frac{1}{2} (Xw - y)^T (Xw - y) \]

    Take the derivative and set it to zero:

    \[ \frac{\partial}{\partial w} J(w) = X^T (Xw - y) = 0 \]

    Solving for \(w\):

    \[ w = (X^T X)^{-1} X^T y \]
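
    A sketch of this closed-form solution (my addition; same illustrative toy data as above):

    ```python
    import numpy as np

    X = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0]])       # first column is x_0 = 1 (intercept)
    y = np.array([2.0, 3.0, 4.0])    # toy data generated by y = 1 + x

    # Solve the linear system (X^T X) w = X^T y; numerically this is
    # preferable to forming the inverse (X^T X)^{-1} explicitly.
    w = np.linalg.solve(X.T @ X, X.T @ y)
    print(w)  # -> [1. 1.]
    ```

    In practice, np.linalg.lstsq(X, y, rcond=None) is more robust still, since it also handles the case where \(X^T X\) is singular.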

  3. Which method is better?

    Gradient descent                  | Normal equations
    ----------------------------------|------------------------------------
    Needs a learning rate             | No learning rate needed
    Needs many iterations             | Solved in one step
    \(O(kn^2)\)                       | \(O(n^3)\)
    Works well when \(n\) is large    | Very slow when \(n\) is very large
    Data must be normalized           | No normalization needed

    Conclusion:

    Use the normal equations when the problem is small in scale, and gradient descent when the scale is large.

Supplementary links

Matrix vector derivatives for machine learning
