I. Algorithm Implementation
From the theory in the previous post, we know the cost function and the update rule for solving linear regression with gradient descent.
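For reference, these are the two formulas the code below implements (restated here from the standard linear regression setup, since the original only points back to the theory post):

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2, \qquad h_\theta(x) = \theta^T x$$

$$\theta_j := \theta_j - \frac{\alpha}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} \quad \text{(all } \theta_j \text{ updated simultaneously)}$$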
Cost function:
function J = computeCost(X, y, theta)
  m = length(y);                % number of training examples
  predictions = X * theta;      % m x 1 vector of predicted values
  J = 1/(2*m) * (predictions - y)' * (predictions - y);
end
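A quick sanity check of computeCost (a sketch; it assumes the ex1data1.txt file described in the next section, and the expected value is the well-known result for that dataset):

data = load('ex1data1.txt');               % two columns: x and y
X = [ones(size(data, 1), 1), data(:, 1)];  % prepend a column of ones for theta0
y = data(:, 2);
computeCost(X, y, [0; 0])                  % should print approximately 32.07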
Gradient descent function:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  % X is an m x (n+1) matrix, y is m x 1, theta is (n+1) x 1
  % alpha is the learning rate, num_iters is the number of iterations
  m = length(y);                    % number of training examples
  J_history = zeros(num_iters, 1);  % records how the cost changes over the iterations
  for iter = 1:num_iters
    % note: sum() must wrap the whole elementwise product, not just the residuals
    temp1 = theta(1) - (alpha/m) * sum((X*theta - y) .* X(:, 1));
    temp2 = theta(2) - (alpha/m) * sum((X*theta - y) .* X(:, 2));
    theta(1) = temp1;               % the temps make the update simultaneous
    theta(2) = temp2;
    J_history(iter) = computeCost(X, y, theta);
  end
end
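Note that temp1/temp2 hard-code the two-parameter case (n = 1). A fully vectorized update that handles any number of feature columns is a common alternative (a sketch, not the code from the original post):

% inside the for loop, replacing the temp1/temp2 block:
theta = theta - (alpha/m) * X' * (X*theta - y);  % updates every theta_j at once, simultaneously

For n = 1 this computes exactly the same update, and it keeps working unchanged when more feature columns are added to X.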
II. Data Visualization
The algorithm gives us the hypothesis h(x), but we also need to visualize the data: (1) draw a scatter plot of the training set together with the fitted line; (2) draw a three-dimensional surface of J(theta), with J as the Z axis, theta0 as the X axis, and theta1 as the Y axis; (3) draw a contour map of the surface from (2); (4) draw a plot of the cost history to check the learning rate.
1. Scatter plot + fitted line
Description: the file ex1data1.txt contains two columns of data, each representing one dimension: the first column is x and the second column is y. Use Octave to draw a scatter plot of it. The data is formatted as follows:
6.1101, 17.592
5.5277, 9.1302
8.5186, 13.662
7.0032, 11.854
5.8598, 6.8233
8.3829, 11.886
...
Answer:
(1) data = load('ex1data1.txt');  % read the file into the variable data
(2) x = data(:, 1); y = data(:, 2);  % split the two columns into x and y
(3) X = [ones(size(x, 1), 1), x];  % prepend a column of ones to x (for theta0)
(4) plot(x, y, 'rx', 'markersize', 4);  % scatter plot: x vector on the X axis, y vector on the Y axis, each point drawn as a red 'x' of size 4
(5) axis([4 24 -5 25]);  % set the limits of the X and Y axes
(6) xlabel('x');  % label the X axis
(7) ylabel('y');  % label the Y axis
To overlay the fitted line, first run [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters); and then:
hold on;  % keep the scatter plot so the line is drawn on top of it
plot(X(:, 2), X * theta);  % the fitted straight line
That presents the linear regression result; the next two plots visualize J(theta).
2. Surface plot
Description: the data is the same as in the previous question. We want to plot the cost function over a grid of (theta0, theta1) values, first as a 3D surface and then as a contour plot, so the computeCost function from Section I must already be defined.
Implementation:
(1) theta0_vals = linspace(-10, 10, 100);  % 100 values from -10 to 10
(2) theta1_vals = linspace(-1, 4, 100);  % 100 values from -1 to 4
(3) J_vals = zeros(length(theta0_vals), length(theta1_vals));  % J_vals(i, j) will hold the cost for (theta0_vals(i), theta1_vals(j))
(4) compute J_vals for every (theta0, theta1) pair:
for i = 1:length(theta0_vals)
  for j = 1:length(theta1_vals)
    t = [theta0_vals(i); theta1_vals(j)];
    J_vals(i, j) = computeCost(X, y, t);
  end
end
(5) figure;  % open a new figure
(6) surf(theta0_vals, theta1_vals, J_vals');  % X axis is theta0, Y axis is theta1, Z axis is the cost; the transpose is needed because surf expects rows of Z to run along the Y axis
(7) xlabel('\theta_0'); ylabel('\theta_1');  % axis labels; the resulting surface can be rotated interactively
3. Contour Plot
Implementation (theta0_vals, theta1_vals, and J_vals are computed exactly as in steps (1)-(4) of the surface plot above):
(1) figure;  % open a new figure
(2) contour(theta0_vals, theta1_vals, J_vals', logspace(-2, 3, 20));  % contour lines at 20 levels spaced logarithmically between 10^-2 and 10^3; same transpose as for surf
(3) xlabel('\theta_0'); ylabel('\theta_1');
To mark the (theta0, theta1) found by gradient descent on the contour map:
hold on;
plot(theta(1), theta(2), 'rx', 'markersize', 10, 'linewidth', 2);
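As a side note (not in the original post), the double loop that fills J_vals can also be written without explicit indices using meshgrid and arrayfun; this is just a sketch of an equivalent formulation:

[T0, T1] = meshgrid(theta0_vals, theta1_vals);  % 100 x 100 grids of theta0 and theta1 values
J_grid = arrayfun(@(t0, t1) computeCost(X, y, [t0; t1]), T0, T1);
contour(theta0_vals, theta1_vals, J_grid, logspace(-2, 3, 20));  % no transpose needed: meshgrid already puts theta1 along the rows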
4. Draw a plot to check whether the learning rate is reasonable
The gradientDescent function returns the J_history vector, which records the value of the cost function after every iteration. So we only need to put the iteration number on the X axis and the cost on the Y axis:
(1) [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters);
(2) figure;
(3) plot(1:length(J_history), J_history, '-b', 'linewidth', 2);
(4) xlabel('Number of iterations');
(5) ylabel('Cost J');
We can also draw the curves for several alpha values on a single graph to compare how the cost changes under each learning rate:
(1) alpha = 0.01; [theta, J1] = gradientDescent(X, y, zeros(2, 1), alpha, num_iters);
(2) alpha = 0.03; [theta, J2] = gradientDescent(X, y, zeros(2, 1), alpha, num_iters);
(3) alpha = 0.1;  [theta, J3] = gradientDescent(X, y, zeros(2, 1), alpha, num_iters);
(4) figure; hold on;  % hold on keeps all three curves on the same axes
(5) plot(1:numel(J1), J1, '-b', 'linewidth', 2);
(6) plot(1:numel(J2), J2, '-r', 'linewidth', 2);
(7) plot(1:numel(J3), J3, '-k', 'linewidth', 2);
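One small addition (not in the original): label the axes and curves so the three learning rates can be told apart on the shared figure:

xlabel('Number of iterations');
ylabel('Cost J');
legend('\alpha = 0.01', '\alpha = 0.03', '\alpha = 0.1');

As a rule of thumb, a larger alpha makes the cost drop faster, but once alpha is too large the cost oscillates or grows instead of converging; if a curve trends upward, reduce the learning rate.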