See http://blog.csdn.net/acdreamers/article/details/27365941 for the original article (in Chinese).
Logistic regression is a probabilistic nonlinear regression model used to study the relationship between a binary outcome and a set of influencing factors. The typical question is whether a certain outcome occurs given certain factors; in medicine, for example, predicting from a patient's symptoms whether the patient has a certain disease.
Before explaining the theory of logistic regression, we start with the LR classifier, that is, the Logistic Regression Classifier.
For a binary classification problem, the learned LR classifier is a set of weights w0, w1, ..., wn. When a test sample is input, the weights and the test data are combined linearly to obtain

    z = w0 + w1*x1 + w2*x2 + ... + wn*xn

where x1, ..., xn are the features of the sample.
This value is then passed through the sigmoid function, whose form is

    σ(z) = 1 / (1 + e^(-z))

Since the range of the sigmoid function is (0, 1), the most basic LR classifier is suited to classifying between two kinds of targets.
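The linear score and sigmoid described above can be sketched in a few lines of Python; the weights and the test sample below are made up for illustration.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, x):
    """Compute z = w0 + w1*x1 + ... + wn*xn, then pass it through the sigmoid.
    weights[0] is the intercept w0; x holds the n features of one sample."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return sigmoid(z)

# Hypothetical learned weights and a hypothetical test sample.
w = [-1.0, 2.0, 0.5]
x = [1.0, 2.0]
p = predict_proba(w, x)   # here z = -1 + 2*1 + 0.5*2 = 2.0
```

A sample is classified as positive when this probability exceeds 0.5, which is exactly when the linear score z is positive.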
So the key problem of logistic regression is how to obtain this set of weights. This is done with maximum likelihood estimation.
Formally, the logistic regression model is as follows. Consider a vector x = (x1, ..., xn) of n independent variables, and let P(y=1|x) be the conditional probability of the event occurring given the observations. Then the logistic regression model can be expressed as

    P(y=1|x) = π(x) = 1 / (1 + e^(-g(x)))

where g(x) = w0 + w1*x1 + ... + wn*xn. The function π is called the logistic function.
The probability of the event not occurring under the same conditions is then

    P(y=0|x) = 1 - π(x) = 1 / (1 + e^(g(x)))

So the ratio of the probability that the event occurs to the probability that it does not is

    P(y=1|x) / P(y=0|x) = e^(g(x))

This ratio is called the odds of experiencing the event, written as odds for short. Taking the logarithm of the odds gives

    ln(odds) = g(x) = w0 + w1*x1 + ... + wn*xn
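The identity that the log-odds equal the linear score g(x) can be checked numerically; the weights and the sample below are made up for illustration.

```python
import math

def g(w, x):
    """Linear predictor g(x) = w0 + w1*x1 + ... + wn*xn."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def pi(w, x):
    """Logistic function: P(y=1|x) = 1 / (1 + e^(-g(x)))."""
    return 1.0 / (1.0 + math.exp(-g(w, x)))

# Hypothetical weights and a hypothetical sample.
w = [0.3, -0.7, 1.2]
x = [2.0, 0.5]

p = pi(w, x)
odds = p / (1.0 - p)        # P(y=1|x) / P(y=0|x)
log_odds = math.log(odds)   # should equal g(x) exactly
```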
You can see that logistic regression revolves around the logistic function. Next, we discuss how to use maximum likelihood estimation to find the parameters of the classifier.
Assume there are m independent observation samples with observed values y1, ..., ym, where each yi is 0 or 1. Let pi = P(yi=1 | xi) be the probability of obtaining yi = 1 under the given conditions; in the same way, the probability of yi = 0 is P(yi=0 | xi) = 1 - pi. So the probability of a single observed value yi is

    P(yi) = pi^yi * (1 - pi)^(1 - yi)

Because the observation samples are independent of each other, their joint distribution is the product of the marginal distributions, giving the likelihood function

    L(w) = ∏ pi^yi * (1 - pi)^(1 - yi)

Our goal is to find the parameter estimate that maximizes this likelihood function. Taking the logarithm gives

    ln L(w) = Σ [ yi*ln(pi) + (1 - yi)*ln(1 - pi) ] = Σ [ yi*g(xi) - ln(1 + e^(g(xi))) ]
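The log-likelihood above is straightforward to compute directly; the weights and toy data below are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, X, y):
    """ln L(w) = sum_i [ yi*g(xi) - ln(1 + e^(g(xi))) ],
    where g(xi) = w0 + w1*xi1 + ... + wn*xin."""
    total = 0.0
    for xi, yi in zip(X, y):
        gx = w[0] + sum(wj * xij for wj, xij in zip(w[1:], xi))
        total += yi * gx - math.log(1.0 + math.exp(gx))
    return total

# Hypothetical weights and a toy dataset with one feature per sample.
w = [0.1, 0.5]
X = [[1.0], [2.0], [3.0]]
y = [0, 1, 1]

ll = log_likelihood(w, X, y)
```

Maximizing this quantity over w yields the maximum likelihood estimate.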
Taking the partial derivative with respect to each of the n+1 parameters yields n+1 equations. For example, for the parameter wk: since

    ∂pi/∂wk = pi * (1 - pi) * xik

we get

    ∂ln L/∂wk = Σ (yi - pi) * xik = 0

There are n+1 such equations in total, so the problem is now transformed into solving the system formed by these n+1 equations.
These equations are nonlinear and generally have no closed-form solution, so we use the Newton-Raphson iterative method to solve them.
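A minimal sketch of the Newton-Raphson iteration for this system, assuming a model with a single feature plus an intercept; the toy dataset and the iteration count are made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_logistic(X, y, iters=10):
    """Newton-Raphson for a 2-parameter logistic model (intercept + 1 feature).
    Solves the score equations sum_i (yi - pi) * xik = 0 by iterating
    w <- w + H^(-1) * gradient, where H = sum_i pi*(1 - pi) * xi * xi^T
    (the negated Hessian of the log-likelihood)."""
    w = [0.0, 0.0]
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the 2x2 matrix H
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + w[1] * xi)
            r = yi - p
            g0 += r
            g1 += r * xi
            s = p * (1.0 - p)
            h00 += s
            h01 += s * xi
            h11 += s * xi * xi
        # Solve the 2x2 system H * delta = gradient by Cramer's rule.
        det = h00 * h11 - h01 * h01
        w[0] += (h11 * g0 - h01 * g1) / det
        w[1] += (h00 * g1 - h01 * g0) / det
    return w

X = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # toy feature values (illustrative)
y = [0, 0, 1, 0, 1, 1]               # toy labels; not linearly separable
w = newton_logistic(X, y)
```

At convergence the gradient Σ (yi - pi)*xik vanishes, which is exactly the system of equations derived above. Note that if the data were perfectly linearly separable, the maximum likelihood estimate would not exist and the iteration would diverge.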