Programming Assignment 1: Linear Regression

Source: Internet
Author: User

Warm-up Exercise

Following the instructions, type this code in the warmupExercise.m file:

A = eye(5);
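For reference, the same warm-up in NumPy (illustrative only, not part of the assignment) is just as short:

```python
import numpy as np

# NumPy equivalent of the Octave warm-up: build a 5x5 identity matrix
A = np.eye(5)
print(A)
```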

Computing the Cost (for One Variable)

The cost function for one variable is given by:

J(θ0, θ1) = 1/(2m) * ∑i=1~m (hθ(x(i)) − y(i))²

We can implement it in the computeCost.m file with these steps:

function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.

predictions = X * theta;           % calculate the hypothesis/prediction vector
sqrErrors = (predictions - y).^2;  % calculate the squared error for every element of the prediction vector
J = 1/(2*m) * sum(sqrErrors);      % sum the squared-error vector to get the cost function value

% =========================================================================

end
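As a quick cross-check outside of Octave/MATLAB, the same formula can be sketched in NumPy (names here are illustrative, not part of the assignment):

```python
import numpy as np

def compute_cost(X, y, theta):
    """J = 1/(2m) * sum((X @ theta - y)^2), mirroring computeCost.m."""
    m = y.size
    errors = X @ theta - y          # h_theta(x_i) - y_i for every example
    return np.sum(errors ** 2) / (2 * m)

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # intercept column plus one feature
y = np.array([1.0, 2.0, 3.0])
# With theta = [0, 0] every prediction is 0, so J = 1/(2*3) * (1 + 4 + 9) = 7/3
print(compute_cost(X, y, np.zeros(2)))
```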


Note: calculating the cost function is useful for plotting the figure, but the cost itself is not used in gradient descent, because taking the derivative turns the squared term into a multiplication.

Gradient Descent (for One Variable)

By the formula for gradient descent:

θj = θj − α * (∂/∂θj J(θ0, ..., θn)) = θj − α * (1/m) * ∑i=1~m (hθ(x(i)) − y(i)) * xj(i)   (update θ0 through θn simultaneously)

In Octave, one line of code can accomplish the task, since Octave supports v .* M, where v is a column vector with m elements and M is a matrix with m rows; the product broadcasts v across every column of M. (MATLAB, at the time of writing, did not support this implicit broadcasting):

theta = theta - alpha/m * sum((X*theta - y) .* X, 1)';
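NumPy supports the same implicit broadcasting, so a single update step can be sketched there as well (an illustrative translation, including the 1/m factor from the formula):

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])
theta = np.zeros((2, 1))
alpha = 0.1
m = y.shape[0]

# (X @ theta - y) is m-by-1; multiplying it by the m-by-n matrix X
# broadcasts the error column across every feature column, just like .* in Octave
theta = theta - alpha / m * np.sum((X @ theta - y) * X, axis=0, keepdims=True).T
print(theta)
```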

In MATLAB, we can implement it with the steps below. This method isn't good because it doesn't support the case where the number of features n is larger than 1. I also can't explain it in much detail, since I'm still not good at linear algebra or MATLAB (I just kept debugging and coming up with ways to construct the result I wanted...); I need to improve in these two subjects later.

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters

    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    prediction = X * theta;
    pre_y = prediction - y;
    pre2 = [pre_y pre_y];   % duplicate the error column to match the two columns of X
    theta = theta - alpha/m * sum(pre2 .* X, 1)';
    % fprintf('%f %f \n', theta(1), theta(2));

    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);

end

end
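The same loop can be sketched in NumPy in a way that works for any number of features (an illustrative translation, not the assignment's code):

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Simultaneous-update gradient descent, mirroring gradientDescent.m."""
    m = y.size
    for _ in range(num_iters):
        errors = X @ theta - y                      # h_theta(x_i) - y_i
        theta = theta - alpha / m * (X.T @ errors)  # theta_j -= alpha/m * sum_i errors_i * x_ij
    return theta

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
theta = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=2000)
# The data lie exactly on y = x, so theta should approach [0, 1]
print(theta)
```

Using X.T @ errors computes every component of the gradient at once, which avoids the [pre_y pre_y] column-duplication trick and generalizes beyond two columns.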

Feature Normalization

Actually there are 3 values we need to return from this function: the mean value mu, sigma, and the normalized X matrix. (Even though X_norm is the first return value of the function, it's calculated last. -_-!!!)

For the mean value, it's easy to implement:

mu = mean(X, 1);

For sigma, it's also easy to implement:

sigma = std(X, 0, 1);
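One detail worth noting: std(X, 0, 1) computes the sample standard deviation of each column, normalizing by m−1. In NumPy that corresponds to ddof=1, which this sketch assumes:

```python
import numpy as np

X = np.array([[8.0, 1.0, 6.0], [3.0, 5.0, 7.0], [4.0, 9.0, 2.0]])
mu = X.mean(axis=0)            # like mean(X, 1): column means
sigma = X.std(axis=0, ddof=1)  # like std(X, 0, 1): divide by m-1, not m
print(mu)     # [5. 5. 5.]
print(sigma)  # approximately [2.6458 4.0000 2.6458]
```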

For the normalized X matrix, it was very hard to understand what result the exercise wants; I tried many possibilities but still could not match the expected output. Finally I found their test cases:

>> featureNormalize([1 2 3]')
ans =
    -1
     0
     1

>> featureNormalize([1 2 3; 6 4 2]')
ans =
    -1     1
     0     0
     1    -1

>> featureNormalize([8 1 6; 3 5 7; 4 9 2])
ans =
    1.1339   -1.0000    0.3780
   -0.7559         0    0.7559
   -0.3780    1.0000   -1.1339

>> featureNormalize([1 2 3 1; 6 4 2 0; 11 3 3 9; 4 9 8 8]')
ans =
   -0.78335    1.16190    1.09141   -1.46571
    0.26112    0.38730   -0.84887    0.78923
    1.30558   -0.38730   -0.84887    0.33824
   -0.78335   -1.16190    0.60634    0.33824

My first attempt in MATLAB:

X_norm = [X(:,1)/sigma(1) X(:,2)/sigma(2)];

It returns:

>> featureNormalize([1 2 3]')
Attempted to access X(:,2); index out of bounds because size(X)=[3,1].

Error in featureNormalize (line ...)
X_norm = [X(:,1)/sigma(1) X(:,2)/sigma(2)];

It looks like the function I wrote only supports matrices with exactly 2 columns, but it should support the multiple-variable case. So I came up with a version that uses the number of features n:

n = size(sigma, 2);
for i = 1:n,
    X_norm(:, n) = X(:, n) / sigma(1, n);
end

It returns:

>> featureNormalize([1 2 3]')
ans =
     1
     2
     3

>> featureNormalize([1 2 3; 6 4 2]')
ans =
     1     3
     2     2
     3     1

The result seems very close to the test-case result. What is missing? The mean value!!!
I added it immediately with great hope:

n = size(sigma, 2);
for i = 1:n,
    X_norm(:, n) = (X(:, n) - mu(:, n)) / sigma(1, n);
end

Check the result:

>> featureNormalize([1 2 3]')
ans =
    -1
     0
     1

>> featureNormalize([1 2 3; 6 4 2]')
ans =
     1     1
     2     0
     3    -1

>> featureNormalize([8 1 6; 3 5 7; 4 9 2])
ans =
    8.0000    1.0000    0.3780
    3.0000    5.0000    0.7559
    4.0000    9.0000   -1.1339

Why was only the last column correct? The misuse of i and n... After fixing the loop index, we finally have:

function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%
mu = mean(X, 1);
sigma = std(X, 0, 1);
n = size(sigma, 2);
for i = 1:n,
    X_norm(:, i) = (X(:, i) - mu(:, i)) / sigma(1, i);
end

% ============================================================

end
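The per-column loop can also be replaced by broadcasting. Here is a NumPy sketch of the same normalization, checked against the simplest test case (illustrative only, not the assignment's code):

```python
import numpy as np

def feature_normalize(X):
    """Return (X_norm, mu, sigma); each column gets mean 0 and sample std 1."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)   # matches MATLAB's std(X, 0, 1)
    X_norm = (X - mu) / sigma       # broadcasting replaces the explicit loop
    return X_norm, mu, sigma

X_norm, mu, sigma = feature_normalize(np.array([[1.0], [2.0], [3.0]]))
print(X_norm.ravel())  # [-1. 0. 1.]
```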

All the returned results are the same as the test cases! Perfect!!!

Computing the Cost (for Multiple Variables)

It is actually the same as the cost function for one variable, since that implementation is already vectorized. Omitted...
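A quick NumPy sketch (with hypothetical example values) shows why nothing changes with more features: the vectorized expression never assumes a particular number of columns.

```python
import numpy as np

def compute_cost_multi(X, y, theta):
    """Same vectorized cost: X is m-by-(n+1), theta has n+1 entries."""
    m = y.size
    errors = X @ theta - y
    return np.sum(errors ** 2) / (2 * m)

X = np.array([[1.0, 2.0, 3.0], [1.0, 4.0, 5.0]])  # intercept plus two features
y = np.array([6.0, 10.0])
# theta = [1, 1, 1] predicts exactly [6, 10], so the cost is 0
print(compute_cost_multi(X, y, np.array([1.0, 1.0, 1.0])))
```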

