1. Background knowledge
In the recently concluded Tmall Big Data S1 competition, logistic regression was a commonly used algorithm that gave good results.
(1) Regression
First, what is regression? Suppose we have two classes of data, each with 500 points. When we plot these points, there is a boundary between the two groups. Drawing this boundary curve (it is quite likely non-linear) is regression. We use a large amount of data to find this boundary and fit its expression. For new data, we then use the boundary as the dividing line to perform classification. The figure below shows two groups of data from a dataset I plotted, with a line between them.
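The idea of using a line as the dividing rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two point clouds are randomly generated (they are not the dataset from the figure), the helper names make_two_classes and classify are hypothetical, and the boundary x + y = 0 is picked by hand rather than fitted.

```python
import random

def make_two_classes(n_per_class=500, seed=0):
    """Generate two hypothetical 2-D point clouds, one per class."""
    rng = random.Random(seed)
    class0 = [(rng.gauss(-2.0, 1.0), rng.gauss(-2.0, 1.0)) for _ in range(n_per_class)]
    class1 = [(rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)) for _ in range(n_per_class)]
    return class0, class1

def classify(point, w0, w1, w2):
    """Return 1 if the point is on the positive side of w0 + w1*x + w2*y = 0, else 0."""
    x, y = point
    return 1 if w0 + w1 * x + w2 * y > 0 else 0

class0, class1 = make_two_classes()

# Hand-picked boundary x + y = 0 (an assumption; in practice the
# coefficients are what regression has to find from the data).
W0, W1, W2 = 0.0, 1.0, 1.0

correct = sum(classify(p, W0, W1, W2) == 0 for p in class0)
correct += sum(classify(p, W0, W1, W2) == 1 for p in class1)
accuracy = correct / (len(class0) + len(class1))
print(f"accuracy with the hand-picked line: {accuracy:.3f}")
```

Here the coefficients are fixed in advance. Finding them from the data is the part that regression actually solves.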
(2) sigmoid function
The figure above shows two groups of data separated by a boundary. How do we find an expression for that boundary? This is where the sigmoid function comes in. Its shape is roughly an S-shaped curve rising from 0 to 1, and its formula is sigmoid(z) = 1 / (1 + e^(-z)).
Let the feature values of the dataset be x1, x2, x3, .... We want to find their regression coefficients, so we set z = w1*x1 + w2*x2 .... The sigmoid function is used to avoid an abrupt jump from 0 to 1: because the target takes values between 0 and 1, the polynomial built from x1, x2, ... has to be squeezed into that same range.
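As a minimal sketch of the two steps just described, the Python below computes z as the weighted sum of the features and passes it through the sigmoid function. The function names, the example weights and features, and the 0.5 threshold are assumptions chosen for illustration; they do not come from the competition data.

```python
import math

def sigmoid(z):
    """Squash any real number z into the range 0 to 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an equivalent form so exp() never overflows.
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_probability(weights, features):
    """Compute z = w1*x1 + w2*x2 + ... and map it through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

# Hypothetical regression coefficients and one sample's feature values.
weights = [0.5, -1.2]
features = [2.0, 1.0]

p = predict_probability(weights, features)
# Assumed decision rule: call it class 1 when the output exceeds 0.5.
label = 1 if p > 0.5 else 0
print(p, label)
```

Because sigmoid(0) is 0.5, thresholding the output at 0.5 is the same as asking whether z is positive, that is, which side of the boundary z = 0 the sample falls on.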