Copyright notice: This is an original article. Source: http://blog.csdn.net/programmer_wei/article/details/52072939
Logistic regression is a very common model in machine learning that is widely used in real production environments. Despite its name, it is a classic classification model, not a regression model. This article introduces the principle of the logistic regression model, the method of parameter estimation, and the derivation of the formulas.

Model Building
Before introducing logistic regression, let's briefly review linear regression. The main idea of linear regression is to fit a straight line to historical data and use that line to predict new data. For details on linear regression, see one of my previous articles.
We know that the formula for linear regression is as follows:
$$z = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots + \theta_n x_n = \theta^T x$$
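As a minimal sketch of the formula above, the following Python computes $z$ for a single sample. The function name `linear_score` and the sample numbers are illustrative assumptions, not part of the original article. The sketch treats `theta[0]` as the intercept $\theta_0$, which corresponds to an implicit feature $x_0 = 1$.

```python
def linear_score(theta, x):
    """Compute z = theta_0 + theta_1*x_1 + ... + theta_n*x_n.

    theta holds n+1 values (theta[0] is the intercept); x holds n features.
    """
    if len(theta) != len(x) + 1:
        raise ValueError("theta must have exactly one more entry than x")
    # Prepend x_0 = 1 so that the sum equals theta^T x.
    augmented = [1.0] + list(x)
    return sum(t * xi for t, xi in zip(theta, augmented))


theta = [0.5, 2.0, -1.0]    # illustrative parameter values
x = [3.0, 4.0]              # illustrative feature values
z = linear_score(theta, x)  # 0.5 + 2.0*3.0 - 1.0*4.0 = 2.5
print(z)
```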
Logistic regression is also built on linear regression (it belongs to the family of generalized linear models). Its formula is as follows:
$$h_\theta(x) = \frac{1}{1+e^{-z}} = \frac{1}{1+e^{-\theta^T x}}$$
Here $y = \frac{1}{1+e^{-x}}$ is called the sigmoid function. We can see that the logistic regression algorithm feeds the result of the linear function into the sigmoid function.
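The mapping from the linear score to $h_\theta(x)$ can be sketched in a few lines of Python. The names `sigmoid` and `hypothesis` are illustrative, and branching on the sign of `z` is an implementation choice made here (not something the article describes) so that `math.exp` never receives a large positive argument and overflows.

```python
import math


def sigmoid(z):
    """Return 1 / (1 + e^(-z)) without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z).
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """Return h_theta(x) = sigmoid(theta^T x), with theta[0] as the intercept."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)


print(sigmoid(0.0))                              # 0.5
print(hypothesis([0.5, 2.0, -1.0], [3.0, 4.0]))  # sigmoid(2.5)
```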
The sigmoid function graph is as follows:
We can see that the output of the sigmoid function lies in the interval (0,1) and equals 0.5 at the midpoint, where the input is 0. This makes the meaning of the previous formula $h_\theta(x)$ easy to understand: because its output lies in (0,1), it can be read as the probability that the data belongs to a certain class. For example:
$h_\theta(x) < 0.5$ indicates that the current data belongs to class A;
$h_\theta(x) > 0.5$ indicates that the current data belongs to class B.
So we can interpret the sigmoid output $h_\theta(x)$ as the estimated probability that a sample belongs to class B, with $1 - h_\theta(x)$ as the probability that it belongs to class A.
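The decision rule described above can be sketched as follows. The function names, the parameter values, and the two sample inputs are hypothetical. The article only covers the strict cases below and above 0.5, so sending an output of exactly 0.5 to class B is an assumption made here.

```python
import math


def sigmoid(z):
    """Return 1 / (1 + e^(-z)) without overflowing for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(theta, x):
    """Return h_theta(x), read here as the probability of class B."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)


def predict_class(theta, x, threshold=0.5):
    """Return "B" if h_theta(x) reaches the threshold, otherwise "A"."""
    # Assumption: an output exactly equal to the threshold goes to class B.
    return "B" if predict_proba(theta, x) >= threshold else "A"


theta = [-1.0, 0.5, 0.25]                # hypothetical parameters
print(predict_class(theta, [4.0, 2.0]))  # z = 1.5, so h > 0.5: prints B
print(predict_class(theta, [1.0, 0.0]))  # z = -0.5, so h < 0.5: prints A
```

In practice the threshold is a tunable choice; 0.5 is used here only because it is the value the text uses.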
With the formula above, the next thing we need to do is estimate the parameters $\theta$.