I. Generalized linear model
A generalized linear model should satisfy three assumptions:
The first assumption is that, given x and the parameter theta, y follows a distribution from the exponential family with natural parameter eta. The second assumption is that, given x, the goal is to output the conditional expectation of T(y) given x; this T(y) is usually just y itself, although there are cases where it is not. The third assumption is that the natural parameter eta is linear in the input: eta = theta^T x.
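In symbols, following the standard GLM formulation, the three assumptions read:

$$y \mid x; \theta \sim \text{ExponentialFamily}(\eta)$$

$$h(x) = E[\,T(y) \mid x\,]$$

$$\eta = \theta^T x$$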
II. The exponential family

As mentioned above, the exponential family is defined here. Distributions that can be written in the following form constitute the exponential family:

$$p(y; \eta) = b(y)\exp\left(\eta^T T(y) - a(\eta)\right)$$
where a, b, and T are functions: T(y) is the sufficient statistic, a(eta) is the log partition function, and b(y) is the base measure.
III. Derivation of the logistic function
Logistic regression assumes that P(y|x) follows the Bernoulli distribution with mean phi:

$$p(y; \phi) = \phi^{y}(1-\phi)^{1-y}, \qquad y \in \{0, 1\}$$
Our goal is to model the parameter phi for a given x, that is, to obtain phi as a function of x; how to choose this model is the question. We now rewrite the posterior probability, which satisfies the Bernoulli distribution, in the form of a member of the exponential family, and from that rewriting we derive phi's model as a function of x.
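Carrying out this transformation step by step (the standard derivation):

$$p(y; \phi) = \phi^{y}(1-\phi)^{1-y} = \exp\left(y\log\frac{\phi}{1-\phi} + \log(1-\phi)\right)$$

Matching this against $b(y)\exp(\eta^T T(y) - a(\eta))$ gives $T(y) = y$, $b(y) = 1$, $\eta = \log\frac{\phi}{1-\phi}$, and $a(\eta) = -\log(1-\phi) = \log(1+e^{\eta})$. Inverting the expression for eta yields

$$\phi = \frac{1}{1+e^{-\eta}}$$

and substituting the third assumption $\eta = \theta^T x$ gives $\phi = \frac{1}{1+e^{-\theta^T x}}$.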
Now we have phi's model in terms of x, with parameters theta, and we can write down the hypothesis for our classifier:

$$h_\theta(x) = P(y = 1 \mid x; \theta) = \frac{1}{1 + e^{-\theta^T x}}$$
This means that once we obtain the parameter theta, for a given x we can compute the probability that y = 1, and the probability that y = 0 follows as its complement, so the problem is solved. What remains is how to solve for the parameter theta.
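As a quick illustration, here is a minimal MATLAB sketch of evaluating the hypothesis; the values of theta and x are made up purely for demonstration:

theta = [2; -1; 0.5];                 % made-up parameters (last entry acts as the bias)
x     = [0.3; 0.8; 1];                % made-up input with an appended constant feature
pY1   = 1 / (1 + exp(-(theta' * x))); % P(y = 1 | x; theta), here about 0.574
pY0   = 1 - pY1;                      % P(y = 0 | x; theta)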
IV. Objective function and gradient
Now that we know the form of the logistic regression model, we use maximum likelihood estimation to estimate theta and obtain the optimal parameters.
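Concretely, the log-likelihood over m training examples and its partial derivatives are:

$$\ell(\theta) = \sum_{i=1}^{m} \left[ y^{(i)}\log h_\theta(x^{(i)}) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right) \right]$$

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$$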
Now that we have the derivative of the objective function, the parameter update formula can be obtained by gradient ascent on the log-likelihood (equivalently, steepest descent on the negative log-likelihood):

$$\theta_j := \theta_j + \alpha \sum_{i=1}^{m} \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$$
It is also possible to use Newton's method to find the optimum, which makes use of the Hessian matrix:

$$H = \nabla_\theta^2\, \ell(\theta) = -\sum_{i=1}^{m} h_\theta(x^{(i)})\left(1-h_\theta(x^{(i)})\right) x^{(i)} (x^{(i)})^T$$

The parameter update formula of Newton's method is

$$\theta := \theta - H^{-1}\,\nabla_\theta\, \ell(\theta)$$
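In MATLAB, a single Newton update could look like the following sketch; the variable names are assumptions (X is the n-by-m input matrix with one example per column, y the m-by-1 label vector):

h     = 1 ./ (1 + exp(-(theta' * X)));   % 1-by-m predicted probabilities
grad  = X * (y' - h)';                   % gradient of the log-likelihood
H     = -X * diag(h .* (1 - h)) * X';    % Hessian (negative definite)
theta = theta - H \ grad;                % Newton step

For a large number of examples m, one would scale the columns of X by h.*(1-h) directly rather than forming the m-by-m diagonal matrix.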
Of course, you can also use other optimization algorithms: BFGS, L-BFGS, and so on.
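For instance, one possible way to hand the problem to MATLAB's built-in quasi-Newton (BFGS-type) optimizer fminunc is sketched below; logisticCostNeg is a hypothetical helper, not part of this post's code, that returns the negative log-likelihood and its gradient for minimisation:

% X (n-by-m, one example per column) and y (m-by-1) are assumed to exist.
options = optimset('GradObj', 'on', 'MaxIter', 100);
theta0  = 0.005 * randn(size(X, 1), 1);
theta   = fminunc(@(t) logisticCostNeg(t, X, y), theta0, options);

function [cost, grad] = logisticCostNeg(theta, X, y)
    % Negative log-likelihood and its gradient, for a minimiser.
    h    = 1 ./ (1 + exp(-(theta' * X)));
    cost = -sum(y' .* log(h) + (1 - y') .* log(1 - h));
    grad = -X * (y' - h)';
end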
V. MATLAB experiment

The experiment uses the MNIST database, keeping only the data for the handwritten digits 0 and 1, and solves for the parameters with the gradient-ascent update derived above.
%%======================================================================
%% STEP 0: Initialise constants and parameters
%  Here we define and initialise some constants which allow your code
%  to be used more generally on any arbitrary input.
%  We also initialise some parameters used for tuning the model.

inputSize  = 28 * 28 + 1; % Size of input vector (MNIST images are 28x28, plus a bias term)
numClasses = 2;           % Number of classes (the images fall into 2 classes: digits 0 and 1)
% lambda = 1e-4;          % Weight decay parameter (unused here)

%%======================================================================
%% STEP 1: Load data
%  In this section, we load the input and output data.
%  For logistic regression on MNIST pixels, the input data is the images
%  and the output data is the labels.
%
%  Change the filenames if you've saved the files under different names.
%  On some platforms, the files might be saved as
%  train-images.idx3-ubyte / train-labels.idx1-ubyte.

images = loadMNISTImages('mnist/train-images-idx3-ubyte');
labels = loadMNISTLabels('mnist/train-labels-idx1-ubyte');
index  = (labels == 0 | labels == 1);   % keep only the digits 0 and 1
images = images(:, index);
labels = labels(index);
inputData = [images; ones(1, size(images, 2))]; % append a constant bias feature

%%======================================================================
%% STEP 2: Implement logisticCost
%  Implement the cost function in logisticCost.m.
%
% [cost, grad] = logisticCost(theta, inputSize, inputData, labels);

%%======================================================================
%% STEP 4: Learning parameters
%  Once you have verified that your gradients are correct, you can start
%  training your logistic regression code using logisticTrain.

options.maxIter = 100;
options.alpha   = 0.1;
options.method  = 'Grad';
theta = logisticTrain(inputData, labels, options);

% Although we only use 100 iterations here to train a classifier for the
% MNIST data set, in practice, training for more iterations is usually
% beneficial.

%%======================================================================
%% STEP 5: Testing
%  You should now test your model against the test images.
%  logisticPredict (in logisticPredict.m) should return predictions
%  given the model parameters and the input data.

images = loadMNISTImages('mnist/t10k-images-idx3-ubyte');
labels = loadMNISTLabels('mnist/t10k-labels-idx1-ubyte');
index  = (labels == 0 | labels == 1);
images = images(:, index);
labels = labels(index);
inputData = [images; ones(1, size(images, 2))];

[pred] = logisticPredict(theta, inputData);

acc = mean(labels(:) == pred(:));
fprintf('Accuracy: %0.3f%%\n', acc * 100);

% Accuracy is the proportion of correctly classified images.
% After 100 iterations, the results for our implementation were:
%
% Accuracy: 92.200%
%
% If your values are too low (accuracy less than 0.91), you should check
% your code for errors and make sure you are training on the
% entire data set of 60000 28x28 training images
% (unless you modified the loading code, this should be the case).
function [modelTheta] = logisticTrain(inputData, labels, options)
% Train a logistic regression model by maximising the log-likelihood.

if ~exist('options', 'var')
    options = struct;
end
if ~isfield(options, 'maxIter')
    options.maxIter = 400;
end
if ~isfield(options, 'method')
    options.method = 'Newton';
end
if ~isfield(options, 'alpha')
    options.alpha = 0.01;
end

theta   = 0.005 * randn(size(inputData, 1), 1); % random initialisation
iter    = 1;
maxIter = options.maxIter;
alpha   = options.alpha;
method  = options.method;

fprintf('iter\tstep length\n');

while iter <= maxIter
    h = sigmoid(theta' * inputData);   % predicted probabilities, 1 x m
    % cost = sum(labels' .* log(h) + (1 - labels') .* log(1 - h), 2) / size(inputData, 2);
    grad = inputData * (labels' - h)'; % gradient of the log-likelihood
    if strcmp(method, 'Grad')
        steps = alpha .* grad;         % gradient ascent step
%     else
%         H = inputData * diag(h) * diag(1 - h) * inputData'; % negated Hessian
%         steps = alpha .* (H \ grad);                        % damped Newton step
    end
    theta = theta + steps;
    stepLength = sum(steps.^2) / size(steps, 1);
    fprintf('%d\t%f\n', iter, stepLength);
    if abs(stepLength) < 1e-9          % stop once updates become negligible
        break;
    end
    iter = iter + 1;
end

modelTheta = theta;

    function z = sigmoid(x)
        z = 1 ./ (1 + exp(-1 .* x));
    end
end
function [pred] = logisticPredict(theta, data)
% theta - model parameters trained using logisticTrain
% data  - the N x M input matrix, where each column data(:, i) corresponds
%         to a single test example
%
% Your code should produce the prediction vector pred, where pred(i) is
% the predicted label (0 or 1) for example i.

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: compute pred using theta, assuming the labels are 0 and 1.

% sigmoid(theta' * data) > 0.5 is equivalent to theta' * data > 0
pred = (theta' * data) > 0;

%% -----------------------------------------------------------------
end
To be continued .....