%% Method One: gradient descent
x = load('E:\workstation\data\ex3x.dat');
y = load('E:\workstation\data\ex3y.dat');

x = [ones(size(x,1),1) x];
meanx = mean(x);    % per-column mean
sigmax = std(x);    % per-column standard deviation
x(:,2) = (x(:,2) - meanx(2)) ./ sigmax(2);
x(:,3) = (x(:,3) - meanx(3)) ./ sigmax(3);

figure
itera_num = 100;            % number of iterations to run
sample_num = size(x,1);     % number of training samples
alpha = [0.01, 0.03, 0.1, 0.3, 1, 1.3];  % candidate learning rates, each roughly 3x the last
plotstyle = {'b', 'r', 'g', 'k', 'b--', 'r--'};

theta_grad_descent = zeros(size(x(1,:)));
for alpha_i = 1:length(alpha)     % try each learning rate to see which works best
    theta = zeros(size(x,2),1);   % initialize theta to zero
    Jtheta = zeros(itera_num,1);
    for i = 1:itera_num           % iterate itera_num times at learning rate alpha(alpha_i)
        Jtheta(i) = (1/(2*sample_num)) .* (x*theta - y)' * (x*theta - y);  % cost history
        grad = (1/sample_num) .* x' * (x*theta - y);
        theta = theta - alpha(alpha_i) .* grad;
    end
    plot(0:49, Jtheta(1:50), char(plotstyle(alpha_i)), 'LineWidth', 2)  % char() converts the cell entry to a string
    hold on

    if (1 == alpha(alpha_i))      % experimentally, alpha = 1 works best; keep its theta
        theta_grad_descent = theta;
    end
end
legend('0.01','0.03','0.1','0.3','1','1.3');
xlabel('Number of iterations')
ylabel('Cost function')

% prediction
price_grad_descend = theta_grad_descent' * [1, (1650 - meanx(2))/sigmax(2), (3 - meanx(3))/sigmax(3)]'

%% Method Two: normal equations
x = load('E:\workstation\data\ex3x.dat');
y = load('E:\workstation\data\ex3y.dat');
x = [ones(size(x,1),1) x];

theta_norequ = inv(x'*x) * x' * y
price_norequ = theta_norequ' * [1 1650 3]'
Multivariate Linear Regression Practice
The problem: given 47 training samples, where the target y is the house price and the two input features are the living area of the house and the number of bedrooms, learn a linear model from the training data and predict the price of a house with an area of 1650 and 3 bedrooms.
Background:
dot(a, b): returns the inner product of vectors a and b.
From linear regression theory, the cost function of the system is:

J(theta) = (1/(2m)) * sum_{i=1}^{m} (h_theta(x^(i)) - y^(i))^2

Its vector form is:

J(theta) = (1/(2m)) * (X*theta - y)' * (X*theta - y)
When solving for the parameters by gradient descent, the update rule for each parameter is:

theta_j := theta_j - alpha * (1/m) * sum_{i=1}^{m} (h_theta(x^(i)) - y^(i)) * x_j^(i)

This also has a vector form (which is what the program implements):

theta := theta - (alpha/m) * X' * (X*theta - y)
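The vectorized update rule above can be sketched in NumPy (the post's code is MATLAB; this helper and its defaults are illustrative, not from the original):

```python
import numpy as np

def gradient_descent(X, y, alpha=1.0, iters=100):
    """Batch gradient descent for linear regression (illustrative helper).

    X is the (m, n) design matrix with a leading column of ones,
    y the (m,) target vector; alpha and iters mirror the post's settings.
    """
    m, n = X.shape
    theta = np.zeros(n)
    J = np.zeros(iters)  # cost history, one entry per iteration
    for i in range(iters):
        r = X @ theta - y                        # residual X*theta - y
        J[i] = (r @ r) / (2 * m)                 # J = 1/(2m) (X theta - y)'(X theta - y)
        theta = theta - (alpha / m) * (X.T @ r)  # theta := theta - alpha/m * X'(X theta - y)
    return theta, J
```

Tracking the cost history J per iteration is exactly what lets the post compare learning rates by plotting the descent curves.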
The two methods should produce the same prediction, which serves as a sanity check.
Method One: Gradient Descent
1. Load and normalize the training data:
x = load('E:\workstation\data\ex3x.dat');
y = load('E:\workstation\data\ex3y.dat');
x = [ones(size(x,1),1) x];
meanx = mean(x);    % per-column mean
sigmax = std(x);    % per-column standard deviation
x(:,2) = (x(:,2) - meanx(2)) ./ sigmax(2);
x(:,3) = (x(:,3) - meanx(3)) ./ sigmax(3);
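The normalization step can be sketched in NumPy (the data rows here are hypothetical stand-ins for ex3x.dat, used only to illustrate the standardization):

```python
import numpy as np

# Hypothetical stand-in rows for ex3x.dat: [living area, bedrooms] per house.
X_raw = np.array([[2104.0, 3.0], [1600.0, 3.0], [2400.0, 3.0], [1416.0, 2.0]])

# Prepend the intercept column, as in x = [ones(size(x,1),1) x].
X = np.hstack([np.ones((X_raw.shape[0], 1)), X_raw])

# Standardize each non-intercept column: subtract its mean, divide by its std.
mu = X.mean(axis=0)
sigma = X.std(axis=0, ddof=1)  # ddof=1 matches MATLAB's sample std()
X[:, 1:] = (X[:, 1:] - mu[1:]) / sigma[1:]
```

Note that mu and sigma must be kept: any new query point (such as [1650, 3] later) has to be normalized with the training statistics before applying theta.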
2. Gradient descent iteration
The key point is choosing an appropriate learning rate.
The goal of this part is to pick a good learning rate alpha. You do this by making an initial selection, running gradient descent, observing the cost function, and adjusting the learning rate accordingly. Recall that the cost function is defined as:

J(theta) = (1/(2m)) * sum_{i=1}^{m} (h_theta(x^(i)) - y^(i))^2
The cost function can also be written in the vectorized form:

J(theta) = (1/(2m)) * (X*theta - y)' * (X*theta - y)

where X is the m-by-n design matrix whose rows are the training examples and y is the vector of target values.
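The vectorized cost can be sketched in NumPy (an illustrative equivalent of the MATLAB expression, not code from the post):

```python
import numpy as np

def cost(X, y, theta):
    """Vectorized cost: J(theta) = 1/(2m) * (X theta - y)'(X theta - y)."""
    m = y.shape[0]
    r = X @ theta - y  # residual vector
    return (r @ r) / (2 * m)
```

The inner product of the residual with itself replaces the explicit sum over training examples.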
Plotting the descent curve for each learning rate shows which one converges fastest; with the best rate (alpha = 1), the predicted value is:
price_grad_descend = theta_grad_descent' * [1, (1650 - meanx(2))/sigmax(2), (3 - meanx(3))/sigmax(3)]'
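The prediction step, normalizing the raw query with the training statistics before applying theta, can be sketched as (function name and all values here are illustrative):

```python
import numpy as np

def predict(theta, query, mu, sigma):
    """Price for a raw query (e.g. [1650, 3]), normalized exactly as the
    training features were; mu and sigma are the training means and stds
    of the non-intercept features. Illustrative helper, not from the post."""
    q = (np.asarray(query, dtype=float) - mu) / sigma
    return np.concatenate(([1.0], q)) @ theta  # prepend the intercept term
```

Forgetting to reuse the training mu and sigma here is a common bug: the learned theta only makes sense in the normalized feature space.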
Method Two: Normal Equations (direct solution)
x = load('E:\workstation\data\ex3x.dat');
y = load('E:\workstation\data\ex3y.dat');
x = [ones(size(x,1),1) x];
theta_norequ = inv(x'*x) * x' * y   % feature normalization is not needed here
price_norequ = theta_norequ' * [1 1650 3]'
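The closed-form solution can be sketched in NumPy (the data here is a hypothetical stand-in; in the post X and y come from ex3x.dat / ex3y.dat):

```python
import numpy as np

# Hypothetical stand-in data for the house-price matrices.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# Closed form theta = (X'X)^(-1) X'y. np.linalg.solve avoids forming the
# explicit inverse, which is cheaper and numerically safer than inv().
theta = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the linear system directly (MATLAB's `(x'*x) \ (x'*y)`) is generally preferred over `inv()` for the same reason.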