Common Machine Learning Algorithms: Principles + Practice, Series 1 (Logistic Regression)

Source: Internet
Author: User

Logistic Regression

Logistic regression extends the linear model $y = f(x) = b_0 + b_1 x$ and is generally used for binary classification, where the output label is $C \in \{0, 1\}$ (C stands for "classification"). Put simply, we need a rule relating $y$ to $C$: for which values of $y$ is $C = 0$, and for which is $C = 1$? The output $y$ of ordinary linear regression is a real number anywhere between negative and positive infinity, whereas $C$ takes only the two values 0 and 1. The sigmoid function maps a value in $(-\infty, +\infty)$ into the interval $(0, 1)$: $y = 1/(1 + e^{-z})$, where $z = w^T x$, $x$ is the vector of independent variables, and $w$ is the weight vector. Let $p$ denote the probability that the class is 1, that is, $p = y$.

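The log-odds (the logarithm of the ratio between the probability of class 1 and the probability of class 0) is

$$\ln\frac{p}{1-p} = \ln\frac{1/(1+e^{-z})}{1 - 1/(1+e^{-z})} = \ln e^{z} = z = w^T x,$$

because $1 - p = e^{-z}/(1+e^{-z})$. Equivalently, $p = 1/(1+e^{-z}) = e^{z}/(1+e^{z})$.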

In other words, we first compute the log-odds $z$ from the weight vector $w$, and then convert it into a probability $p$ between 0 and 1.
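The two-step computation just described can be sketched in a few lines of Python. The weights and the sample below are made-up values for illustration, and the 0.5 decision threshold is an assumption that the text does not state explicitly.

```python
import math

def sigmoid(z):
    """Map a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Inverse of the sigmoid: ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical weights and one sample; the leading 1.0 in x is the bias input.
w = [-1.0, 2.0, 0.5]
x = [1.0, 0.8, -0.4]

z = sum(wi * xi for wi, xi in zip(w, x))  # z = w^T x, the log-odds
p = sigmoid(z)                            # probability that the class is 1
label = 1 if p >= 0.5 else 0              # assumed threshold: p >= 0.5, i.e. z >= 0

print(z, p, label, log_odds(p))
```

Applying `log_odds` to `p` recovers `z` up to rounding, which is the identity ln(p/(1-p)) = z in numeric form.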

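The parameters are estimated by maximum likelihood. Let the training set consist of feature data $x = \{x_1, x_2, \dots, x_m\}$ and the corresponding class labels $y = \{y_1, y_2, \dots, y_m\}$, and write the model as $h_\theta(x) = 1/(1 + e^{-\theta^T x})$, where $\theta$ plays the role of the weight vector $w$ above. For a single sample, the posterior probability takes the standard form:

$$P(y \mid x; \theta) = h_\theta(x)^{y} \, \bigl(1 - h_\theta(x)\bigr)^{1-y}$$

So the likelihood function is:

$$L(\theta) = \prod_{i=1}^{m} h_\theta(x_i)^{y_i} \, \bigl(1 - h_\theta(x_i)\bigr)^{1-y_i}$$

The log-likelihood is:

$$\ell(\theta) = \sum_{i=1}^{m} \Bigl[ y_i \ln h_\theta(x_i) + (1 - y_i) \ln\bigl(1 - h_\theta(x_i)\bigr) \Bigr]$$

Fitting the logistic regression model is then equivalent to solving:

$$\theta^{*} = \arg\max_{\theta} \ell(\theta) = \arg\min_{\theta} \bigl(-\ell(\theta)\bigr)$$

Gradient descent on $-\ell(\theta)$ (equivalently, gradient ascent on $\ell(\theta)$) uses the partial derivatives, where $x_{ij}$ is the $j$-th feature of sample $i$:

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \bigl(y_i - h_\theta(x_i)\bigr) \, x_{ij}$$

With step size $\alpha$, $\theta$ is then iterated until it converges:

$$\theta_j := \theta_j + \alpha \sum_{i=1}^{m} \bigl(y_i - h_\theta(x_i)\bigr) \, x_{ij}$$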
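As a sanity check on this derivation, the sketch below evaluates the standard logistic log-likelihood and its gradient, sum over samples of (y_i - h_i) * x_ij, and compares the gradient with a finite-difference estimate. The toy data, the starting value of theta, and all function names are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """h_theta(x) = sigmoid(theta^T x)."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i ln h_i + (1 - y_i) ln(1 - h_i) ]."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return total

def gradient(theta, X, y):
    """dl/dtheta_j = sum_i (y_i - h_i) * x_ij."""
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        err = yi - predict(theta, xi)
        for j, xij in enumerate(xi):
            grad[j] += err * xij
    return grad

# Made-up toy data; the first column is a constant 1.0 for the intercept.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, -1.0], [1.0, 2.0]]
y = [0, 1, 0, 1]
theta = [0.1, -0.2]

analytic = gradient(theta, X, y)

# Central finite differences as an independent check of the formula.
eps = 1e-6
numeric = []
for j in range(len(theta)):
    up, down = list(theta), list(theta)
    up[j] += eps
    down[j] -= eps
    numeric.append((log_likelihood(up, X, y) - log_likelihood(down, X, y)) / (2 * eps))

print(analytic)
print(numeric)
```

The two printed vectors should agree to several decimal places, which supports the "label minus prediction, times feature" form of the gradient.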

The gradient points in the direction in which a function increases fastest, so moving along the gradient climbs toward a maximum, and moving against it descends toward a minimum. Writing the gradient as t, the gradient ascent iteration is:

weights = weights + alpha * t

(gradient descent uses weights = weights - alpha * t instead). Here alpha is the step size, and weights is the weight matrix we are solving for, that is, the matrix of regression coefficients. Intuitively, if alpha is small enough and every step moves a distance proportional to alpha in the steepest direction t, the iterates eventually approach the extremum. The iteration repeats until a stopping condition is met, such as a maximum number of iterations or reaching a specified value (for example, an error tolerance).
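To make the update rule and the two stopping conditions concrete, here is a minimal sketch that applies it to a simple concave function rather than to logistic regression. The objective, the step size 0.1, the tolerance, and the function names are all assumptions chosen for illustration.

```python
def gradient_ascent(grad, weights, alpha=0.1, max_iter=1000, tol=1e-8):
    """Repeat weights = weights + alpha * t until a stopping condition holds."""
    iteration = 0
    for iteration in range(1, max_iter + 1):       # stop 1: iteration budget
        t = grad(weights)
        weights = [w + alpha * tj for w, tj in zip(weights, t)]
        if max(abs(alpha * tj) for tj in t) < tol:  # stop 2: step became tiny
            break
    return weights, iteration

# Toy concave objective f(w) = -(w0 - 3)^2 - (w1 + 1)^2, with its maximum at (3, -1).
def grad_f(w):
    return [-2.0 * (w[0] - 3.0), -2.0 * (w[1] + 1.0)]

weights, n_iter = gradient_ascent(grad_f, [0.0, 0.0])
print(weights, n_iter)
```

With this step size the iterates settle near (3, -1) well before the iteration budget is used up. A much larger alpha would overshoot, which is why the step has to be small enough.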

A worked example in Python proceeds in three steps:

1. Load the training data and labels, stored as matrices.

2. Define the sigmoid function.

3. Run gradient ascent.
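The three steps above might look like the following sketch. It is not the article's original code: the data set is a synthetic stand-in (two Gaussian clusters), plain lists of rows replace matrices to avoid dependencies, the step size alpha = 0.001 is an assumed value, and the names `load_data`, `grad_ascent`, and `accuracy` are hypothetical.

```python
import math
import random

def load_data(n=100, seed=0):
    """Step 1: build the feature rows and the label list.

    Synthetic stand-in data: each row is [1.0, x1, x2], where the
    constant 1.0 carries the intercept term.
    """
    rng = random.Random(seed)
    data, labels = [], []
    for i in range(n):
        label = i % 2
        center = 1.5 if label == 1 else -1.5
        data.append([1.0, rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        labels.append(label)
    return data, labels

def sigmoid(z):
    """Step 2: the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))

def grad_ascent(data, labels, alpha=0.001, max_cycles=500):
    """Step 3: batch gradient ascent on the log-likelihood."""
    n_features = len(data[0])
    weights = [1.0] * n_features          # coefficients start at 1
    for _ in range(max_cycles):           # fixed number of full passes
        grad = [0.0] * n_features
        for row, label in zip(data, labels):
            h = sigmoid(sum(w * x for w, x in zip(weights, row)))
            error = label - h             # true label minus prediction
            for j, x in enumerate(row):
                grad[j] += error * x      # accumulates X^T (y - h)
        weights = [w + alpha * g for w, g in zip(weights, grad)]
    return weights

def accuracy(weights, data, labels):
    """Fraction of rows classified correctly with a 0.5 threshold."""
    hits = 0
    for row, label in zip(data, labels):
        p = sigmoid(sum(w * x for w, x in zip(weights, row)))
        hits += int((p >= 0.5) == (label == 1))
    return hits / len(labels)

data, labels = load_data()
weights = grad_ascent(data, labels)
print(weights, accuracy(weights, data, labels))
```

Every pass touches every row, so the cost per iteration grows linearly with the size of the data set.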

In this example the stopping condition is a fixed 500 iterations. At each iteration the gradient is computed from the difference between the true labels and the predicted labels, and the regression coefficients are corrected in proportion to that difference. Note that the coefficients are all initialized to 1 and then revised step by step.

The version above runs 500 iterations, and every iteration uses the entire data matrix. That is acceptable for a small sample, but training becomes expensive when the sample is very large. One improvement is to stop using all the data in each update and instead use a single randomly chosen row, or a few rows. The step size alpha can also be adjusted as training proceeds (in the first algorithm alpha is fixed). The classification quality is often unchanged or even better, while the training cost drops substantially.
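A minimal sketch of this stochastic variant follows, using one randomly chosen row per update. The decay schedule for alpha, the number of passes, the synthetic data, and the names `make_data` and `stoc_grad_ascent` are assumptions for illustration, not values taken from the article.

```python
import math
import random

def sigmoid(z):
    """Numerically stable sigmoid, using p = e^z / (1 + e^z) for z < 0."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def make_data(n=100, seed=0):
    """Synthetic stand-in data: rows are [1.0, x1, x2] around two centers."""
    rng = random.Random(seed)
    data, labels = [], []
    for i in range(n):
        label = i % 2
        center = 1.5 if label == 1 else -1.5
        data.append([1.0, rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        labels.append(label)
    return data, labels

def stoc_grad_ascent(data, labels, n_passes=20, seed=0):
    """Stochastic gradient ascent: each update uses a single row."""
    rng = random.Random(seed)
    m, n = len(data), len(data[0])
    weights = [1.0] * n
    for j in range(n_passes):
        order = list(range(m))
        rng.shuffle(order)                      # visit rows in random order
        for i, idx in enumerate(order):
            # Assumed schedule: alpha shrinks over time but stays above 0.01.
            alpha = 4.0 / (1.0 + j + i) + 0.01
            row = data[idx]
            h = sigmoid(sum(w * x for w, x in zip(weights, row)))
            error = labels[idx] - h             # scalar error for this row only
            weights = [w + alpha * error * x for w, x in zip(weights, row)]
    return weights

data, labels = make_data()
weights = stoc_grad_ascent(data, labels)

# Training accuracy with a 0.5 threshold, which is equivalent to w^T x >= 0.
correct = sum(
    int((sum(w * x for w, x in zip(weights, row)) >= 0) == (label == 1))
    for row, label in zip(data, labels)
)
acc = correct / len(labels)
print(weights, acc)
```

Each update costs a single row instead of a full pass over the data. The constant 0.01 in the schedule keeps later samples from being ignored once alpha has decayed.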
