Logistic regression is a classification method obtained by mapping the output of linear regression into the interval (0, 1). It is suitable for binary classification, and along with the predicted class it can also give the probability of that class.
The linear regression model is as follows (with the intercept absorbed into w):

    z = w·x
The sigmoid function maps z into the (0, 1) interval as follows:

    π(x) = σ(w·x) = 1 / (1 + exp(-w·x))
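As a quick sketch of this mapping (NumPy assumed; the function name is my own):

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the (0, 1) interval."""
    return 1.0 / (1.0 + np.exp(-z))
```

For example, sigmoid(0) is exactly 0.5, and large positive or negative inputs saturate toward 1 and 0.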
Then the following conditional probabilities are obtained:

    P(Y = 1 | x) = π(x)
    P(Y = 0 | x) = 1 - π(x)
So given samples (x_i, y_i), i = 1, …, m, the likelihood function is

    L(w) = ∏_{i=1}^{m} [π(x_i)]^{y_i} [1 - π(x_i)]^{1 - y_i}
The log-likelihood function is

    ℓ(w) = Σ_{i=1}^{m} [ y_i log π(x_i) + (1 - y_i) log(1 - π(x_i)) ]
Next, taking the derivative with respect to w gives:

    ∂ℓ/∂w = Σ_{i=1}^{m} (y_i - π(x_i)) x_i
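This gradient can be computed in vectorized form. A minimal sketch assuming NumPy, with X of shape (m, n) and labels y in {0, 1} (the function names are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood_grad(w, X, y):
    """Gradient of the log-likelihood: sum_i (y_i - sigmoid(w.x_i)) * x_i."""
    preds = sigmoid(X @ w)    # predicted probabilities, shape (m,)
    return X.T @ (y - preds)  # gradient, shape (n,)
```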
If a cost function is defined instead, the resulting equation differs only by a minus sign, as follows:

    J(w) = -ℓ(w),    ∂J/∂w = -Σ_{i=1}^{m} (y_i - π(x_i)) x_i
The next step is to solve for the parameter w. If the likelihood function is used, the optimization problem becomes gradient ascent; if the cost function is used, it becomes gradient descent. The two are equivalent.
——————————————————————————————————————————————————————————————————
The above is the formula derivation for solving w; in real-world problems, w is found numerically with gradient descent (or gradient ascent; below, gradient descent).
Gradient descent is the same algorithm used for linear regression; here the objective being optimized is the logistic regression cost derived above. The iteration formula is as follows:

    w := w - α ∂J/∂w = w + α Σ_{i=1}^{m} (y_i - π(x_i)) x_i

where α is the learning rate.
Each iteration adjusts w in the direction of the gradient: initialize w, then repeat the update above until convergence. Note that the gradient is a sum over all data points, so every iteration traverses the entire data set. This is fine when the sample size is small, but when it is large the cost per iteration is too high. An improved method is stochastic gradient descent: update w with one sample at a time, drawing samples randomly and making several passes over the data.
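A minimal batch training sketch of this update (NumPy; the function name, learning rate, and iteration count are arbitrary choices of mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_batch(X, y, alpha=0.1, n_iters=500):
    """Batch update: every iteration uses all m data points."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        w += alpha * X.T @ (y - sigmoid(X @ w))  # full-data gradient step
    return w
```

Note that each iteration costs O(m·n) for m samples and n features, which is the complexity problem described above.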
A question may arise: why sample randomly? Because the data may have a cyclic pattern, and iterating over the samples in a fixed order can then cause w to oscillate. Random sampling avoids this.
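A stochastic variant matching this description: each update uses one sample, and the order is reshuffled on every pass over the data (again a sketch; the names and hyperparameters are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sgd(X, y, alpha=0.1, n_epochs=200, seed=0):
    """Stochastic update: one randomly ordered sample per step."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    w = np.zeros(n)
    for _ in range(n_epochs):
        for i in rng.permutation(m):  # reshuffle to avoid cyclic oscillation
            w += alpha * (y[i] - sigmoid(X[i] @ w)) * X[i]
    return w
```

Each step here costs only O(n), at the price of a noisier path toward the optimum.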
Resources:
[1] Li Hang, Statistical Learning Methods, 1st edition, March 2012
[2] Peter Harrington, Machine Learning in Action, 1st edition, 2013