Using Python's Theano library, we write a logistic regression for binary classification; the dataset used can be downloaded here.
We know that logistic regression applies a nonlinear function on top of a multivariate linear function, and the nonlinear function commonly used is the sigmoid. We interpret the sigmoid output as the probability that the corresponding label is 1, so the parameters to be learned are the linear weight coefficients and the intercept (bias).
h(x) = w·x + b
g(x) = 1 / (1 + exp(-h(x))) = 1 / (1 + exp(-w·x - b))
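As a quick illustration, here is a minimal NumPy sketch of these two formulas; the weights, bias, and feature vector below are made-up values for demonstration only, not taken from the dataset used later.

    import numpy

    def sigmoid(z):
        # g(z) = 1 / (1 + exp(-z))
        return 1.0 / (1.0 + numpy.exp(-z))

    w = numpy.array([0.5, -1.0])   # illustrative weights
    b = 0.1                        # illustrative bias (intercept)
    x = numpy.array([2.0, 1.0])    # one feature vector

    h = numpy.dot(w, x) + b        # linear part h(x) = w.x + b
    p = sigmoid(h)                 # interpreted as P(y = 1 | x; w, b)
    print(p)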
Then the probability that the label is 1 can be expressed as:
P(y = 1 | x; w, b) = g(x)
and the probability of a known example (x, y) is:
P(y | x; w, b) = g(x)^y · (1 - g(x))^(1 - y)
So the training objective is to maximize the likelihood of the known data, that is, the product of the above probabilities over all training examples. However, because such a product is inconvenient both computationally and in terms of numerical precision, we usually take the logarithm of the likelihood; for a single example the log-probability is:
log P = y·log(g(x)) + (1 - y)·log(1 - g(x))
This looks a lot like cross-entropy; summing it over the training data gives the log-likelihood. Putting a minus sign in front gives the negative log-likelihood, and solving for the parameters means finding the values that minimize this negative log-likelihood. The method commonly used is gradient descent.
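To make the minimization concrete, here is a minimal NumPy sketch of batch gradient descent on the mean negative log-likelihood. The toy data, learning rate, and iteration count are made up for illustration; the gradient expressions (X^T(g - y)/m for the weights, mean(g - y) for the bias) follow from differentiating the formula above.

    import numpy

    def sigmoid(z):
        return 1.0 / (1.0 + numpy.exp(-z))

    def nll(w, b, X, y):
        # mean negative log-likelihood: -(y*log(g) + (1-y)*log(1-g))
        g = sigmoid(numpy.dot(X, w) + b)
        return -numpy.mean(y * numpy.log(g) + (1 - y) * numpy.log(1 - g))

    def gradient_step(w, b, X, y, learning_rate):
        # gradients of the mean NLL: dw = X^T (g - y) / m, db = mean(g - y)
        g = sigmoid(numpy.dot(X, w) + b)
        grad_w = numpy.dot(X.T, g - y) / X.shape[0]
        grad_b = numpy.mean(g - y)
        return w - learning_rate * grad_w, b - learning_rate * grad_b

    # toy data: 4 examples, 2 features (label 1 only when both features are 1)
    X = numpy.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = numpy.array([0, 0, 0, 1])
    w, b = numpy.zeros(2), 0.0
    for _ in range(1000):
        w, b = gradient_step(w, b, X, y, 0.5)
    print(nll(w, b, X, y))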
Below is a Python/Theano implementation of logistic regression for binary classification that finally reports the error rate on the training data; interested readers can take a look. The training data used in the code can be downloaded here.
# -*- coding: utf-8 -*-
"""
Created on Sun Nov 21:37:43 2014
@author: brighthush
example for Logistic Regression
"""
import time
import numpy
import theano
import theano.tensor as T

rng = numpy.random


class LogisticRegression(object):
    def __init__(self, input, n_in):
        # weights and bias, shared so they can be updated in place
        self.W = theano.shared(value=rng.randn(n_in), name='W', borrow=True)
        self.b = theano.shared(value=.10, name='b')
        # P(y = 1 | x) = sigmoid(w.x + b)
        self.p_given_x = 1 / (1 + T.exp(-T.dot(input, self.W) - self.b))
        # predicted label: 1 if the probability exceeds 0.5
        self.y_given_x = self.p_given_x > 0.5
        self.params = [self.W, self.b]

    def negative_log_likelihood(self, y):
        # mean negative log-likelihood plus a small L2 penalty on the weights
        ll = y * T.log(self.p_given_x) + (1 - y) * T.log(1 - self.p_given_x)
        cost = -ll.mean() + 0.01 * (self.W ** 2).sum()
        return cost

    def errors(self, y):
        # fraction of misclassified examples
        return T.mean(T.neq(self.y_given_x, y))


def generate_data():
    # load the data and wrap it in shared variables for minibatch indexing
    x, y = read_data()
    x_shared = theano.shared(numpy.asarray(x, dtype=theano.config.floatX), borrow=True)
    y_shared = theano.shared(numpy.asarray(y, dtype=theano.config.floatX), borrow=True)
    return x_shared, T.cast(y_shared, 'int32')


def sgd_optimization(learning_rate=0.13, n_epochs=1000, batch_size=100):
    train_x, train_y = generate_data()
    n_batches = train_x.get_value(borrow=True).shape[0] / batch_size

    index = T.lscalar()
    x = T.matrix('x')
    y = T.ivector('y')

    lr = LogisticRegression(x, train_x.get_value().shape[1])
    cost = lr.negative_log_likelihood(y)

    print 'compile function test_model...'
    test_model = theano.function(
        inputs=[index],
        outputs=lr.errors(y),
        givens={x: train_x[index * batch_size: (index + 1) * batch_size],
                y: train_y[index * batch_size: (index + 1) * batch_size]})

    # gradients of the cost with respect to the parameters
    g_W = T.grad(cost=cost, wrt=lr.W)
    g_b = T.grad(cost=cost, wrt=lr.b)
    updates = [(lr.W, lr.W - learning_rate * g_W),
               (lr.b, lr.b - learning_rate * g_b)]

    print 'compile function train_model...'
    train_model = theano.function(
        inputs=[index],
        outputs=cost,
        updates=updates,
        givens={x: train_x[index * batch_size: (index + 1) * batch_size],
                y: train_y[index * batch_size: (index + 1) * batch_size]})

    best_train_error = numpy.inf
    start_time = time.clock()
    for epoch in xrange(n_epochs):
        for minibatch_index in xrange(n_batches):
            batch_cost = train_model(minibatch_index)
            # print 'iterator %d %lf' % (epoch * n_batches + minibatch_index + 1, batch_cost)
        train_errors = [test_model(i) for i in xrange(n_batches)]
        train_error = numpy.mean(train_errors)
        if best_train_error > train_error:
            best_train_error = train_error
        print 'epoch %d, best_train_error %lf, train_error %lf' % (epoch, best_train_error, train_error)
    end_time = time.clock()
    print 'cost %d' % (end_time - start_time)


def read_data():
    print 'loading data...'
    data = numpy.loadtxt('.\\titanic.dat', delimiter=',', skiprows=8)
    x = []
    y = []
    for i in xrange(data.shape[0]):
        x.append(data[i, :data.shape[1] - 1])
        # map the -1/+1 labels in the file to 0/1
        if data[i, -1] == -1.0:
            y.append(0)
        else:
            y.append(1)
    x = numpy.array(x)
    y = numpy.array(y)
    print '%d examples, %d columns every row' % (data.shape[0], data.shape[1])
    # normalize the features to [0, 1]
    feature_min = x.min(0)
    feature_max = x.max(0)
    x = x - numpy.array(feature_min)
    x = x / numpy.array(feature_max - feature_min)
    print x.min(0), x.max(0)
    return numpy.array(x), numpy.array(y)


if __name__ == '__main__':
    sgd_optimization()
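As a usage note, predictions on new rows can be obtained by compiling a small Theano function once the model is trained. The sketch below is only illustrative: it assumes sgd_optimization is modified to return the trained lr object (the code above does not do this), and the made-up feature row must have the same number of columns as the training data.

    # a minimal sketch -- assumes sgd_optimization() is changed to `return lr`
    import numpy
    import theano
    import theano.tensor as T

    lr = sgd_optimization()  # hypothetical: returns the trained LogisticRegression

    # rebuild the probability expression on a fresh symbolic matrix,
    # reusing the trained shared variables lr.W and lr.b
    x_new = T.matrix('x_new')
    p = 1 / (1 + T.exp(-T.dot(x_new, lr.W) - lr.b))
    predict = theano.function(inputs=[x_new], outputs=p > 0.5)

    # made-up feature row; column count must match the training data
    rows = numpy.asarray([[0.2, 0.5, 0.1]], dtype=theano.config.floatX)
    print(predict(rows))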
Logistic Regression for Binary Classification