This article is reposted from http://blog.csdn.net/cyh_24/article/details/50359055
The motivation for writing this blog comes from the following Weibo post:
I was quite shaken when I saw this post, because if Rickjin were to interview me, I think I would fail miserably: I basically could not answer the questions he asked. Learning from this bitter lesson, I decided that in the future my understanding of an algorithm should not stay on the surface; I should at least dig one level deeper and try to see a little farther.
For people learning machine learning, logistic regression can be considered an introductory algorithm. The algorithm itself is not complex, but precisely for this reason, many people tend to overlook some of its intrinsic essence.
In this blog, I intend to summarize some of the questions Rickjin asked:
1. LR principle
2. Mathematical derivation of LR solution
3. Regularization of LR
4. Why LR works better than linear regression for classification
5. The relationship between LR and maximum entropy (maxent) models
6. Parallelization of the LR model
Although logistic regression carries "regression" in its name, its real identity is a binary classifier. Having introduced its surname (regression), let's introduce its given name: logistic. The name comes from the logistic distribution.
Logistic distribution
Let X be a continuous random variable. We say X obeys the logistic distribution if X has the following distribution function and density function:
F(x) = P(X \le x) = \frac{1}{1 + e^{-(x-\mu)/\gamma}}, \qquad f(x) = F'(x) = \frac{e^{-(x-\mu)/\gamma}}{\gamma \left(1 + e^{-(x-\mu)/\gamma}\right)^2}
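As a quick sanity check of the two formulas above, here is a minimal Python sketch (function names are my own, not from the original post) that evaluates F and f and confirms numerically that f is the derivative of F:

```python
import math

def logistic_cdf(x, mu=0.0, gamma=1.0):
    """Distribution function F(x) = 1 / (1 + exp(-(x - mu)/gamma))."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / gamma))

def logistic_pdf(x, mu=0.0, gamma=1.0):
    """Density f(x) = exp(-(x - mu)/gamma) / (gamma * (1 + exp(-(x - mu)/gamma))**2)."""
    z = math.exp(-(x - mu) / gamma)
    return z / (gamma * (1.0 + z) ** 2)

# F is symmetric about mu, so F(mu) = 0.5.
print(logistic_cdf(0.0))  # 0.5

# Central finite difference of F should match f at any point.
eps = 1e-6
numeric = (logistic_cdf(1.0 + eps) - logistic_cdf(1.0 - eps)) / (2 * eps)
print(abs(numeric - logistic_pdf(1.0)) < 1e-8)  # True
```

With mu = 0 and gamma = 1, F(x) reduces to the familiar sigmoid function 1/(1+e^{-x}) that appears in logistic regression itself.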