0-Background
When defining the cost function of logistic regression, we cannot simply reuse the linear regression form: doing so makes the cost function non-convex, and gradient descent then has difficulty converging to the global optimum.
1-Linear regression cost function:
The cost function of linear regression is:
$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(y^{(i)} - h_{\theta}(x^{(i)})\right)^{2}$$
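As a concrete illustration, the squared-error cost above takes only a few lines of NumPy. This is a minimal sketch; the names `linear_cost`, `X`, `y`, and `theta` are my own, not from any particular library:

```python
import numpy as np

def linear_cost(theta, X, y):
    """Squared-error cost J(theta) = (1/2m) * sum((y - X @ theta)^2)."""
    m = len(y)
    residuals = y - X @ theta  # y^(i) - h_theta(x^(i)) for every sample
    return (residuals @ residuals) / (2 * m)

# Three samples with a bias column; a theta that fits them exactly has zero cost.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
theta = np.array([0.5, 2.0])
y = X @ theta
print(linear_cost(theta, X, y))        # 0.0
print(linear_cost(theta, X, y + 1.0))  # 0.5: each residual is -1
```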
The practical meaning of the linear regression cost function is the squared error. Logistic regression is different: its prediction function $h_{\theta}(x)$ is non-linear. If we carry the linear regression cost function over to logistic regression directly, the resulting $J(\theta)$ is in general a non-convex function, that is, it has many local optima that are not necessarily the global optimum. We therefore want to construct a convex, bowl-shaped function as the cost function of logistic regression.
2-Logistic regression cost function:
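The non-convexity of the squared-error cost under a sigmoid hypothesis can be checked numerically: a convex function never has a negative central second difference. Below is a sketch on a single sample ($x = 1$, $y = 0$, values chosen purely for illustration), where the check fails near $\theta = 2$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def squared_error_cost(theta, x=1.0, y=0.0):
    # Squared-error cost of the logistic hypothesis on one sample (x, y).
    return 0.5 * (sigmoid(theta * x) - y) ** 2

# Central second difference around theta = 2; for a convex function this
# would be >= 0, but here it comes out negative, so the cost is not convex.
h = 0.1
d2 = (squared_error_cost(2 - h) - 2 * squared_error_cost(2)
      + squared_error_cost(2 + h))
print(d2 < 0)  # True
```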
Following the method of maximum likelihood estimation, the likelihood function of logistic regression is:
$$L(\theta) = \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)};\theta) = \prod_{i=1}^{m} \left(h_{\theta}(x^{(i)})\right)^{y^{(i)}} \left(1 - h_{\theta}(x^{(i)})\right)^{1 - y^{(i)}}$$
where m is the number of samples. Taking the logarithm gives:
$$l(\theta) = \log L(\theta) = \sum_{i=1}^{m}\left(y^{(i)}\log h_{\theta}(x^{(i)}) + (1 - y^{(i)})\log\left(1 - h_{\theta}(x^{(i)})\right)\right)$$
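The log-likelihood $l(\theta)$ translates directly into code. A minimal sketch with NumPy (names are my own; the numerical clipping a production implementation would add before the logarithms is omitted):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(theta, X, y):
    """l(theta) = sum_i [ y_i*log(h(x_i)) + (1 - y_i)*log(1 - h(x_i)) ]."""
    p = sigmoid(X @ theta)  # h_theta(x^(i)) for every sample
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# With theta = 0 every prediction is 0.5, so l = m * log(0.5).
X = np.array([[1.0, 0.5], [1.0, -1.0], [1.0, 2.0], [1.0, 0.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
print(log_likelihood(np.zeros(2), X, y))  # 4 * log(0.5), about -2.7726
```

Maximizing this quantity (or, equivalently, minimizing its negative divided by m) yields the usual cross-entropy cost of logistic regression.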