Linear regression learning notes
Operating system: CentOS 7.3.1611 x64
Python version: 2.7.5
sklearn version: 0.18.2
TensorFlow version: 1.2.1
Linear regression is a statistical analysis method, based on regression analysis in mathematical statistics, for determining the quantitative relationship between two or more variables; it is very widely used. Its general form is y = w'x + e, where e is a normally distributed error term with mean 0.
Based on the number of independent variables, linear regression can be divided into simple (one-variable) linear regression and multiple linear regression.
Among regression models, simple (single-variable) regression is the simplest and most robust, but it often cannot describe the behavior of complex systems, so prediction based on multiple regression is more common. Traditional multiple regression models are generally linear. Because some variables may be insignificant and the explanatory variables may be correlated with one another, the normal equations of the regression can become seriously ill-conditioned, which affects the stability of the regression equation. A basic problem in multiple linear regression is therefore to find the "optimal" regression equation.
Simple (one-variable) linear regression
When a regression analysis involves only one independent variable and one dependent variable, and their relationship can be approximated by a straight line, it is called simple (one-variable) linear regression. The expression is as follows:
Y = a + b*X + e
Here a is the intercept, b is the slope of the line, and e is the error term. The equation predicts the value of the target variable (Y) from a given predictor variable (X).
With a = 1, b = 2 and e = 0.1, the line is Y = 1 + 2 * X + 0.1; a quick way to plot it is sketched below.
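A minimal sketch to draw this line (not from the original post; the x-range of 0 to 10 is chosen arbitrarily):

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Sketch: plot Y = 1 + 2*X + 0.1 over an arbitrary x-range
import numpy as np
import matplotlib.pyplot as plt

X = np.linspace(0, 10, 100)
Y = 1 + 2 * X + 0.1
plt.plot(X, Y)
plt.xlabel("X")
plt.ylabel("Y")
plt.show()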
Common application scenarios:
Simple commodity price prediction, cost estimation, etc.
Solving simple linear regression with sklearn
Sample Code:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# version : Python 2.7.5

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = 10 * rng.rand(30)
Y = 1 + 2 * X + rng.randn(30)
#print X
#print Y

model = LinearRegression(fit_intercept=True)
model.fit(X[:, np.newaxis], Y)

xfit = np.linspace(0, 20, 100)
yfit = model.predict(xfit[:, np.newaxis])

plt.scatter(X, Y)
plt.plot(xfit, yfit)
plt.show()
Github address of the Code:
https://github.com/mike-zhang/pyExamples/blob/master/algorithm/LinearRegression/lr_sklearn_test1.py
Running the script displays a scatter plot of the samples together with the fitted straight line.
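As a small addition (not part of the original script), the fitted slope and intercept can be printed after model.fit(...) and compared with the true values 2 and 1:

# Inspect the fitted parameters of the LinearRegression model above
print "slope     :", model.coef_[0]      # close to 2
print "intercept :", model.intercept_    # close to 1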
Solving linear regression with TensorFlow
Sample Code:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Python version     : 2.7.5
# TensorFlow version : 1.2.1

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

N = 200        # number of sample points
trainNum = 30  # number of training iterations

# Data generated from y = w * x + b with w = 3.0, b = 0.9 plus Gaussian noise
X = np.linspace(-1, 1, N)
Y = 3.0 * X + np.random.standard_normal(X.shape) * 0.3 + 0.9
X = X.reshape([N, 1])
Y = Y.reshape([N, 1])

# Plot the expected (true) line over the samples
plt.scatter(X, Y)
plt.plot(X, 3.0 * X + 0.9)
plt.show()

# Build the model
inputX = tf.placeholder(dtype=tf.float32, shape=[None, 1])
outputY = tf.placeholder(dtype=tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([1, 1], stddev=0.01))
B = tf.Variable(tf.random_normal([1], stddev=0.01))
pred = tf.matmul(inputX, W) + B
loss = tf.reduce_sum(tf.pow(pred - outputY, 2))
train = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
tf.summary.scalar("loss", loss)
merged = tf.summary.merge_all()
init = tf.global_variables_initializer()

# Train
with tf.Session() as sess:
    sess.run(init)
    for i in range(trainNum):
        sess.run(train, feed_dict={inputX: X, outputY: Y})
        predArr, lossArr = sess.run([pred, loss], feed_dict={inputX: X, outputY: Y})
        # print "lossArr :", lossArr
        # print "predArr :", predArr
        summary_str = sess.run(merged, feed_dict={inputX: X, outputY: Y})
        WArr, bArr = sess.run([W, B])
        print(WArr, bArr)

# Plot the fitted line
plt.scatter(X, Y)
plt.plot(X, WArr * X + bArr)
plt.show()
Github address of the Code:
https://github.com/mike-zhang/pyExamples/blob/master/algorithm/LinearRegression/lr_tensorflow_test1.py
The running effect is as follows:
(array([[ 0.4075802]], dtype=float32), array([ 0.35226884], dtype=float32))
(array([[ 0.75750935]], dtype=float32), array([ 0.56450701], dtype=float32))
(array([[ 1.06031227]], dtype=float32), array([ 0.69184995], dtype=float32))
(array([[ 1.32233584]], dtype=float32), array([ 0.76825565], dtype=float32))
(array([[ 1.54907179]], dtype=float32), array([ 0.81409913], dtype=float32))
(array([[ 1.7452724]], dtype=float32), array([ 0.84160519], dtype=float32))
(array([[ 1.91505003]], dtype=float32), array([ 0.85810882], dtype=float32))
(array([[ 2.06196308]], dtype=float32), array([ 0.868011], dtype=float32))
(array([[ 2.18909097]], dtype=float32), array([ 0.87395233], dtype=float32))
(array([[ 2.29909801]], dtype=float32), array([ 0.8775171], dtype=float32))
(array([[ 2.39428997]], dtype=float32), array([ 0.87965596], dtype=float32))
(array([[ 2.47666216]], dtype=float32), array([ 0.8809393], dtype=float32))
(array([[ 2.54794097]], dtype=float32), array([ 0.88170928], dtype=float32))
(array([[ 2.60962057]], dtype=float32), array([ 0.88217127], dtype=float32))
(array([[ 2.66299343]], dtype=float32), array([ 0.88244849], dtype=float32))
(array([[ 2.70917845]], dtype=float32), array([ 0.88261479], dtype=float32))
(array([[ 2.7491436]], dtype=float32), array([ 0.88271457], dtype=float32))
(array([[ 2.78372645]], dtype=float32), array([ 0.88277447], dtype=float32))
(array([[ 2.81365204]], dtype=float32), array([ 0.88281041], dtype=float32))
(array([[ 2.8395474]], dtype=float32), array([ 0.88283193], dtype=float32))
(array([[ 2.8619554]], dtype=float32), array([ 0.88284487], dtype=float32))
(array([[ 2.88134551]], dtype=float32), array([ 0.88285261], dtype=float32))
(array([[ 2.89812446]], dtype=float32), array([ 0.88285726], dtype=float32))
(array([[ 2.91264367]], dtype=float32), array([ 0.88286006], dtype=float32))
(array([[ 2.92520738]], dtype=float32), array([ 0.88286173], dtype=float32))
(array([[ 2.93607926]], dtype=float32), array([ 0.88286275], dtype=float32))
(array([[ 2.94548702]], dtype=float32), array([ 0.88286334], dtype=float32))
(array([[ 2.95362759]], dtype=float32), array([ 0.8828637], dtype=float32))
(array([[ 2.9606719]], dtype=float32), array([ 0.88286394], dtype=float32))
(array([[ 2.96676755]], dtype=float32), array([ 0.88286406], dtype=float32))
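The printed (W, b) pairs converge towards the generating values 3.0 and 0.9. As an optional sanity check (an addition, not part of the original script), the closed-form least-squares fit from numpy.polyfit on the same kind of data gives the values the training loop is heading for:

# Sanity-check sketch: generate the same kind of data and compute the
# closed-form least-squares slope/intercept with a degree-1 polyfit.
import numpy as np

N = 200
X = np.linspace(-1, 1, N)
Y = 3.0 * X + np.random.standard_normal(X.shape) * 0.3 + 0.9
w_ls, b_ls = np.polyfit(X, Y, 1)
print(w_ls, b_ls)   # close to 3.0 and 0.9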
Multiple linear regression
When a regression analysis involves two or more independent variables, and the relationship between the dependent variable and the independent variables is linear, it is called multiple linear regression.
The expression is as follows:
Y = a0 + a1 * X1 + a2 * X2 + ... + an * Xn + e
Where,
(a0, a1, a2, a3, ..., an) is the unknown parameter vector.
(X1, X2, X3, ..., Xn) are the explanatory variables, which may be fixed (designed) or random.
e is the random error term.
This equation predicts the value of the target variable (Y) from a given predictor vector (X1, X2, X3, ..., Xn).
When a0 = 1, a1 = 2, a2 = 3, e = 0.1, the equation is as follows:
Y = 1 + 2 * X1 + 3 * X2 + 0.1
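Before turning to sklearn, here is a minimal NumPy-only sketch (my addition, with a small noise term standing in for e) of the classical least-squares solution obtained from the normal equations a = (X'X)^-1 X'y; it is exactly these normal equations that can become ill-conditioned when the explanatory variables are correlated, as noted earlier:

# Normal-equations sketch for Y = 1 + 2*X1 + 3*X2 + noise
import numpy as np

rng = np.random.RandomState(1)
N = 100
X1 = 10 * rng.rand(N)
X2 = 10 * rng.rand(N)
Y = 1 + 2 * X1 + 3 * X2 + 0.1 * rng.randn(N)

A = np.column_stack([np.ones(N), X1, X2])       # design matrix with intercept column
coef = np.linalg.solve(A.T.dot(A), A.T.dot(Y))  # solves (A'A) coef = A'Y
print(coef)                                     # approximately [1, 2, 3]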
Solving multiple linear regression with sklearn
Sample Code:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# version : Python 2.7.5

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
N = 10
X = np.array(N * [10 * rng.rand(2)])
b = [2, 3]
Y = 1 + np.matmul(X, b) + rng.randn(N)
print X
print Y

model = LinearRegression()
model.fit(X, Y)

xfit = np.array(10 * [10 * rng.rand(2)])
yfit = model.predict(xfit)
print "xfit :"
print xfit
print "yfit :"
print yfit
Github address of the Code:
https://github.com/mike-zhang/pyExamples/blob/master/algorithm/LinearRegression/lr_sklearn_test2.py
The running effect is as follows:
[[ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]
 [ 4.17022005 7.20324493]]
[ 30.42200315 29.87720628 31.81558253 28.6486362 32.69498666 30.188968 31.26921399 30.70080452 32.41228283 28.89003419]
xfit :
[[ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]
 [ 1.40386939 1.98101489]]
yfit :
[ 12.7586356 12.7586356 12.7586356 12.7586356 12.7586356 12.7586356 12.7586356 12.7586356 12.7586356 12.7586356]
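One thing worth noting (an observation of mine, not from the original post): np.array(N * [10 * rng.rand(2)]) repeats a single random point N times, which is why every row of X, and every prediction, is identical in the output above. A small variant sketch that draws an independent point per sample lets the model actually recover the coefficients:

# Variant sketch: independent random rows so the design matrix has full rank
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
N, b = 10, [2, 3]
X = 10 * rng.rand(N, 2)                  # one independent random point per sample
Y = 1 + np.matmul(X, b) + rng.randn(N)   # Y = 1 + 2*X1 + 3*X2 + noise
model = LinearRegression().fit(X, Y)
print "coef_      :", model.coef_        # close to [2, 3]
print "intercept_ :", model.intercept_   # close to 1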
Okay, that's all. I hope it will help you.
Github address of the examples: https://github.com/mike-zhang/pyExamples