Gradient Descent optimized linear regression

Source: Internet
Author: User

First, the theory
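In brief, the code in the third section fits a single-variable linear regression model to the data by batch gradient descent. The hypothesis, cost function, and update rule it implements are:

h_\theta(x) = \theta_0 + \theta_1 x

J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

\theta_j := \theta_j - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \quad (simultaneously for j = 0, 1, with x_0^{(i)} = 1)

where m is the number of training examples and \alpha is the learning rate.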

Second, the data set
The training set is ex1data1.txt: one training example per line in comma-separated x,y format, where the first column is a city's population (in units of 10,000) and the second column is the profit in that city (in units of $10,000). The first rows are:

6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483

The rest of the file continues in the same two-column format.
Third, the code implementation
clear all; clc;
data = load('ex1data1.txt');
X = data(:, 1);
y = data(:, 2);
m = length(y);            % number of training examples
plot(X, y, 'rx');         % scatter plot of the raw data

%% =================== Part 3: Gradient descent ===================
fprintf('Running gradient descent...\n');

% Why add a column of ones: so that when computing J, theta(1) is multiplied by 1
X = [ones(m, 1), data(:, 1)];   % add a column of ones to X
theta = zeros(2, 1);            % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

% Compute and display the initial cost
computeCost(X, y, theta)

% Run gradient descent
[theta, J_history] = gradientDescent(X, y, theta, alpha, iterations);

hold on;                        % keep the previous plot visible
plot(X(:, 2), X * theta, '-');
legend('Training data', 'Linear regression');
hold off;                       % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n', predict1 * 10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n', predict2 * 10000);

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% Initialize J_vals to a matrix of zeros
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i, j) = computeCost(X, y, t);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes would be flipped
J_vals = J_vals';

% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals);
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot: plot J_vals as contours spaced logarithmically
% (logspace(-2, 3, 20) sets the range and spacing of the contour levels)
figure;
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20));
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
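One simple check of the gradient descent settings (alpha = 0.01, 1500 iterations) that the script above does not include is to plot the recorded cost history; a minimal sketch, assuming the script has just been run so that J_history is in the workspace:

figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);  % cost per iteration, as saved by gradientDescent
xlabel('Iteration');
ylabel('Cost J(\theta)');

If the curve is not monotonically decreasing, the learning rate alpha is too large for this data.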

...................

function J = computeCost(X, y, theta)
m = length(y);   % number of training examples
J = 0;
for i = 1:m
    J = J + (theta(1) * X(i,1) + theta(2) * X(i,2) - y(i))^2;
end
% Dividing by 2m: the 2 cancels the coefficient produced by differentiating the square
% when updating the parameters, and the m keeps J from growing with the number of examples
J = J / (2 * m);
end
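The same cost can be computed without the explicit loop. A minimal vectorized sketch (the name computeCostVectorized is only an illustrative label, not part of the original post), assuming X already contains the column of ones:

function J = computeCostVectorized(X, y, theta)
% Vectorized squared-error cost: J = (1 / 2m) * sum of squared residuals
m = length(y);
errors = X * theta - y;            % m-by-1 vector of h_theta(x_i) - y_i
J = (errors' * errors) / (2 * m);
end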

......

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
m = length(y);                    % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % Partial derivatives of J with respect to theta(1) and theta(2),
    % reset at the start of every iteration
    j_1 = 0;
    j_2 = 0;
    for i = 1:m
        j_1 = j_1 + theta(1) * X(i,1) + theta(2) * X(i,2) - y(i);
        j_2 = j_2 + (theta(1) * X(i,1) + theta(2) * X(i,2) - y(i)) * X(i,2);
    end
    % The 1/m factor of J is not applied inside the loop above,
    % because dividing once after the sum is enough
    j_1 = j_1 / m;
    j_2 = j_2 / m;

    % Simultaneous update of both parameters
    % (both derivatives were computed before either parameter changed)
    theta(1) = theta(1) - alpha * j_1;
    theta(2) = theta(2) - alpha * j_2;

    % Save the cost J on every iteration
    J_history(iter) = computeCost(X, y, theta);
end
end
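For reference, the same update can be written in vectorized form, where both partial derivatives come from a single matrix product. A minimal sketch (gradientDescentVectorized is an illustrative name, not from the original post); it should produce the same theta and J_history as the loop version above:

function [theta, J_history] = gradientDescentVectorized(X, y, theta, alpha, num_iters)
m = length(y);                    % number of training examples
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
    errors = X * theta - y;                        % m-by-1 residuals h_theta(x_i) - y_i
    theta = theta - (alpha / m) * (X' * errors);   % simultaneous update of theta(1) and theta(2)
    J_history(iter) = computeCost(X, y, theta);    % save the cost on every iteration
end
end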
Fourth, the results of running the code

Running the script prints the initial cost and the two predictions: the estimated profit for a population of 35,000 and for a population of 70,000 (each prediction is scaled by 10,000). It also produces three figures: the training data with the fitted regression line, a surface plot of the cost J over (\theta_0, \theta_1), and a contour plot of J with the learned \theta marked by a red cross.

