In his machine learning course, Andrew Ng introduces regression analysis as the first topic under supervised learning. The programming assignment asks you to implement linear regression in MATLAB, and it consists mainly of two parts: computing the cost and running gradient descent.
The cost is computed with the following formula:

J(theta) = (1 / (2m)) * sum_{i=1..m} ( h_theta(x^(i)) - y^(i) )^2

Here h_theta(x) is the predicted value and y is the actual value. The objective is to make the gap between the predicted and actual values as small as possible by training the parameters theta, which leads to the gradient descent iteration:

theta_j := theta_j - alpha * (1/m) * sum_{i=1..m} ( h_theta(x^(i)) - y^(i) ) * x_j^(i)

In this exercise there are only two parameters, theta0 and theta1, but the same update rule extends directly to any number of parameters.
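As a quick sanity check of the cost formula, here is a minimal sketch that evaluates J for two different theta vectors. The tiny data set below is made up purely for illustration and is not part of the assignment:

% Toy data set (hypothetical values, only to illustrate the formula):
% x is the input feature, y is the target, m = 3 training examples
x = [1.0; 2.0; 3.0];
y = [1.5; 2.5; 3.5];
m = length(y);

X = [ones(m, 1), x];          % add the intercept column
theta = [0; 0];               % start with both parameters at zero

% J(theta) = (1/(2m)) * sum((X*theta - y).^2)
J = sum((X * theta - y).^2) / (2 * m);
fprintf('Cost for theta = [0; 0]:   %f\n', J);   % (1.5^2 + 2.5^2 + 3.5^2)/6, about 3.4583

theta = [0.5; 1];             % predictions become 1.5, 2.5, 3.5
J = sum((X * theta - y).^2) / (2 * m);
fprintf('Cost for theta = [0.5; 1]: %f\n', J);   % exactly 0 for this toy data

The second theta fits the toy data perfectly, so its cost is zero; any other choice gives a strictly larger J, which is exactly the gap gradient descent tries to shrink.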
Once the formulas and the underlying principle are understood, implementing them in MATLAB is not difficult.
Below is the ex1.m script from Exercise 1. The two main functions to complete are computeCost() and gradientDescent().
%% Machine Learning Online Class - Exercise 1: Linear Regression

%  Instructions
%  ------------
%  This file contains code that helps you get started on the
%  linear exercise. You will need to complete the following functions
%  in this exercise:
%
%     warmUpExercise.m
%     plotData.m
%     gradientDescent.m
%     computeCost.m
%     gradientDescentMulti.m
%     computeCostMulti.m
%     featureNormalize.m
%     normalEqn.m
%
%  For this exercise, you will not need to change any code in this file,
%  or any other files other than those mentioned above.
%
%  x refers to the population size in 10,000s
%  y refers to the profit in $10,000s

%% Initialization
clear; close all; clc

%% ==================== Part 1: Basic Function ====================
% Complete warmUpExercise.m
fprintf('Running warmUpExercise ... \n');
fprintf('5x5 Identity Matrix: \n');
warmUpExercise()

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ======================= Part 2: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% =================== Part 3: Gradient Descent ===================
fprintf('Running Gradient Descent ...\n')

X = [ones(m, 1), data(:, 1)]; % Add a column of ones to x
theta = zeros(2, 1);          % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

% compute and display initial cost
computeCost(X, y, theta)

% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);

% print theta to screen
fprintf('Theta found by gradient descent: ');
fprintf('%f %f \n', theta(1), theta(2));

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:, 2), X * theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n', ...
    predict1 * 10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n', ...
    predict2 * 10000);

fprintf('Program paused. Press enter to continue.\n');
pause;

%% ============= Part 4: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i, j) = computeCost(X, y, t);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';
% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as contours spaced logarithmically between 0.01 and 100
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
Below is computeCost.m. Note that the cost is computed with a vectorized expression; a loop gives the same result, but it is much slower when the data set is large.
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

% Loop version (works, but slow on large data sets):
% for j = 1:m
%     J = J + (X(j, :) * theta - y(j, :))^2;
% end
% J = J / (2 * m);

% Vectorized version:
J = sum((X * theta - y).^2) / (2 * m);

% =============================================================

end
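A quick way to check computeCost.m from the MATLAB prompt, assuming ex1data1.txt is in the current folder (this mirrors what ex1.m does before running gradient descent):

data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y);
X = [ones(m, 1), X];                  % add the intercept term

J = computeCost(X, y, zeros(2, 1));   % cost at theta = [0; 0]
fprintf('Cost at theta = [0; 0]: %f\n', J);   % the exercise expects roughly 32.07

If this value comes out wrong, the usual culprit is forgetting the column of ones or dividing by m instead of 2m.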
Below is gradientDescent.m:
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %

    % Loop version:
    % k1 = 0;
    % k2 = 0;
    % for j = 1:m
    %     k1 = k1 + (X(j, :) * theta - y(j, :)) * X(j, 1);
    %     k2 = k2 + (X(j, :) * theta - y(j, :)) * X(j, 2);
    % end
    % theta(1) = theta(1) - alpha * (1/m) * k1;
    % theta(2) = theta(2) - alpha * (1/m) * k2;

    % Vectorized version:
    theta = theta - alpha * (1/m) * (X' * (X * theta - y));

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
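One useful habit (not required by the assignment) is to plot the returned J_history to confirm that the cost actually decreases. A minimal sketch, assuming X, y, alpha, and iterations are already set up as in ex1.m:

theta = zeros(2, 1);
[theta, J_history] = gradientDescent(X, y, theta, alpha, iterations);

% The cost should fall steadily and flatten out near the minimum
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Number of iterations');
ylabel('Cost J');
title('Convergence of gradient descent');

% If alpha is too large, the curve oscillates or blows up;
% if alpha is too small, convergence is very slow.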
Writing the code myself gave me a better understanding of this topic. Exercise 1 is about regression, and linear regression is the simplest case. When the number of features increases, the problem becomes regression in a higher-dimensional space, but the key is still to understand what theta, X, and y represent, how gradient descent updates the parameters, and what the objective is.
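Because computeCost and gradientDescent are fully vectorized, the same code handles more than one feature without modification. A minimal sketch: the data file name follows the multi-variable part of the exercise, and the scaling step is a simplified stand-in for what featureNormalize.m is meant to do.

data = load('ex1data2.txt');          % e.g. house size, number of bedrooms, price
X = data(:, 1:2); y = data(:, 3);
m = length(y);

% Normalize each feature so gradient descent converges faster
mu = mean(X); sigma = std(X);
X = (X - mu) ./ sigma;                % requires MATLAB R2016b+ implicit expansion

X = [ones(m, 1), X];                  % intercept term, so theta is now 3 x 1
theta = zeros(3, 1);

alpha = 0.01; num_iters = 400;
[theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters);

The only changes from the single-variable case are the extra feature columns and the size of theta; the vectorized update X' * (X * theta - y) works for any number of parameters.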
The first programming assignment passed, and it was also a good chance to brush up on MATLAB.