UFLDL Study Notes and Programming Assignments: Softmax Regression
UFLDL has put out a new tutorial that is better than the previous one: it starts from the basics, is systematically organized, and comes with programming exercises.
People in a high-quality deep learning group say you can use it to learn DL directly, without having to dig into other machine learning algorithms first.
So I recently started working through it. The tutorial, paired with MATLAB programming exercises, is a great combination.
The address of the new tutorial is: http://ufldl.stanford.edu/tutorial/
The page for this lesson: http://ufldl.stanford.edu/tutorial/supervised/SoftmaxRegression/
Softmax regression is an extension of logistic regression:
logistic regression is usually used as a two-class classifier,
while softmax regression serves as a multi-class classifier.
Mathematically, logistic regression is just the special case of softmax regression with k = 2, as the tutorial points out.
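Concretely, with K classes and one parameter vector $\theta^{(k)}$ per class, the model the tutorial defines is

$$P(y = k \mid x; \theta) = \frac{\exp\left(\theta^{(k)\top} x\right)}{\sum_{j=1}^{K} \exp\left(\theta^{(j)\top} x\right)},$$

and with K = 2 and the convention $\theta^{(2)} = 0$ this collapses to the familiar logistic sigmoid:

$$P(y = 1 \mid x; \theta) = \frac{\exp\left(\theta^{(1)\top} x\right)}{\exp\left(\theta^{(1)\top} x\right) + 1} = \frac{1}{1 + \exp\left(-\theta^{(1)\top} x\right)}.$$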
The tutorial's derivation of the partial derivatives of the objective function with respect to the parameters is also clear.
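The objective and gradient it arrives at, in the tutorial's notation over m examples, are

$$J(\theta) = -\sum_{i=1}^{m} \sum_{k=1}^{K} 1\left\{ y^{(i)} = k \right\} \log \frac{\exp\left(\theta^{(k)\top} x^{(i)}\right)}{\sum_{j=1}^{K} \exp\left(\theta^{(j)\top} x^{(i)}\right)}$$

$$\nabla_{\theta^{(k)}} J(\theta) = -\sum_{i=1}^{m} x^{(i)} \left( 1\left\{ y^{(i)} = k \right\} - P\left(y^{(i)} = k \mid x^{(i)}; \theta\right) \right)$$

These two formulas are what the code below implements, loop by loop.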
For the programming assignment, my unfamiliarity with MATLAB led me into a lot of pitfalls.
It took a long time, and I only managed an implementation using for loops.
This time I really felt how poor for-loop performance can be: 200 iterations took more than an hour.
This model is also more complex than the previous two exercises.
I'll paste the first version of the code here; later I'll work out a vectorized implementation and try again.
The following code is softmax_regression.m.
function [f, g] = softmax_regression_vec(theta, X, y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %           In minFunc, theta is reshaped to a long vector, so we need to
  %           resize it to an n-by-(num_classes-1) matrix.
  %           Recall that we assume theta(:, num_classes) = 0.
  %   X     - The examples stored in a matrix.
  %           X(i,j) is the i'th coordinate of the j'th example.
  %   y     - The label for each example.  y(j) is the j'th example's label.
  %
  m = size(X, 2);
  n = size(X, 1);

  % theta comes in as theta(:), a single long column vector;
  % reshape it back into an n x (num_classes-1) matrix.
  theta = reshape(theta, n, []);
  num_classes = size(theta, 2) + 1;

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

  h = theta' * X;                  % h(k,i) = theta_k' * x_i
  a = exp(h);
  a = [a; ones(1, size(a, 2))];    % extra row of ones for the fixed theta_K = 0
  b = sum(a, 1);                   % per-example normalizer

  % objective: negative log-likelihood of each example's true class
  for i = 1:m
    for j = 1:num_classes
      if y(i) ~= j
        continue;
      end
      f = f + log(a(j, i) / b(i));
    end
  end
  f = -f;

  % gradient: g(:,j) = sum_i x_i * (P(y_i = j | x_i) - 1{y_i = j})
  for j = 1:num_classes - 1
    for i = 1:m
      if y(i) == j
        flag = 1;
      else
        flag = 0;
      end
      g(:, j) = g(:, j) + X(:, i) * (a(j, i) / b(i) - flag);
    end
  end

  g = g(:);  % make gradient a vector for minFunc
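For context, this function is not called directly; the starter code hands it to minFunc as a callback and passes the data through as extra arguments. A rough sketch of that call (the struct field names like train.X and train.y are from my recollection of the starter script and may differ):

% hypothetical driver snippet, mirroring the UFLDL starter script
n = size(train.X, 1);                        % feature dimension (including the intercept row)
num_classes = 10;                            % 10 classes for the handwritten digits
theta0 = rand(n, num_classes - 1) * 0.001;   % small random init; theta_K is fixed at 0

options = struct('MaxIter', 200, 'Display', 'iter');
theta = minFunc(@softmax_regression_vec, theta0(:), options, train.X, train.y);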
The running result is as follows:

[screenshot of the run output]
The old tutorial (http://deeplearning.stanford.edu/wiki/index.php/Exercise:Softmax_Regression)
also has a softmax programming assignment, likewise on recognizing handwritten digits.
It states the expected accuracy:
Our implementation achieved an accuracy of 92.6%. If your model's accuracy is significantly less (less than 91%), check your code, ensure that you are using the trained weights, and that you are training your model on the full 60000 training images. Conversely, if your accuracy is too high (99-100%), ensure that you have not accidentally trained your model on the test set as well.
That is to say, in terms of accuracy, my code holds up.
The next step is to figure out a vectorized implementation and speed it up.
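As a starting point, here is the kind of vectorization I have in mind: build the whole probability matrix in one shot and replace the indicator loop with a sparse one-hot matrix. This is a minimal, untested sketch written from the formulas above (it assumes y is a 1 x m row vector of labels in 1..num_classes, as in the loop version):

function [f, g] = softmax_regression_vec(theta, X, y)
  % vectorized sketch: same interface as the loop version above
  m = size(X, 2);
  n = size(X, 1);
  theta = reshape(theta, n, []);
  num_classes = size(theta, 2) + 1;

  h = [theta' * X; zeros(1, m)];          % append the fixed theta_K = 0 row
  h = bsxfun(@minus, h, max(h, [], 1));   % subtract each column's max for numerical stability
  p = exp(h);
  p = bsxfun(@rdivide, p, sum(p, 1));     % p(k,i) = P(y_i = k | x_i)

  % objective: pick out the probability of each example's true class
  idx = sub2ind(size(p), y, 1:m);
  f = -sum(log(p(idx)));

  % gradient: one-hot indicator matrix minus predicted probabilities
  ind = full(sparse(y, 1:m, 1, num_classes, m));
  g = -X * (ind - p)';                    % n x num_classes
  g = g(:, 1:num_classes - 1);            % drop the column for the fixed theta_K
  g = g(:);                               % flatten for minFunc

This trades the double loop for a handful of matrix operations, so the per-iteration cost should drop to a few large BLAS calls instead of m * num_classes scalar updates.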
If you have any good ideas, please share them with us!
Linger
Link: http://blog.csdn.net/lingerlanlan/article/details/38410123