Linear regression learning notes


Operating System: CentOS7.3.1611 _ x64

Python version: 2.7.5

Sklearn version: 0.18.2

Tensorflow version: 1.2.1

Linear regression is a statistical analysis method that uses regression analysis to determine the quantitative relationship between two or more variables, and it is widely used. The model is y = w'x + e, where e is a normally distributed error term with mean 0.

Based on the number of independent variables, linear regression can be divided into one-dimensional (simple) linear regression and multiple linear regression.

Among regression models, univariate regression is the simplest and most robust, but it often cannot describe the behavior of a complex system, so prediction techniques based on multivariate regression are more common. Traditional multivariate regression models are generally linear. Because some variables may be insignificant and the independent variables may be correlated with each other, the normal equations of the regression can become seriously ill-conditioned, which affects the stability of the regression equation. A basic problem in multiple linear regression is therefore finding the "optimal" regression equation.
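The ill-conditioning mentioned above can be made concrete with NumPy: when two predictors are strongly correlated, the condition number of the normal-equation matrix X'X explodes, and small perturbations of the data produce large changes in the fitted coefficients. A minimal sketch (the variable names and the noise scale are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)
n = 100

# Two roughly independent predictors: well-conditioned normal equations.
x1 = rng.rand(n)
x2 = rng.rand(n)
X_good = np.column_stack([np.ones(n), x1, x2])

# x3 is almost a copy of x1: strongly correlated predictors.
x3 = x1 + 1e-6 * rng.randn(n)
X_bad = np.column_stack([np.ones(n), x1, x3])

cond_good = np.linalg.cond(X_good.T.dot(X_good))
cond_bad = np.linalg.cond(X_bad.T.dot(X_bad))

print(cond_good)  # moderate: least-squares solution is stable
print(cond_bad)   # enormous: tiny noise swings the coefficients wildly
```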

One-dimensional linear regression

In regression analysis, if only one independent variable and one dependent variable are included and the relationship between them can be approximated by a straight line, it is called one-dimensional linear regression analysis. The expression is as follows:

Y = a + b*X + e

Here a is the intercept, b is the slope of the line, and e is the error term. This equation can predict the value of the target variable (Y) from a given predictor variable (X).

When a = 1, b = 2, e = 0.1, the line is as follows (Y = 1 + 2 * X + 0.1):

Common application scenarios:

Simple predictions such as commodity prices, cost estimates, etc.

Use sklearn to solve the one-dimensional linear regression problem

Sample Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# version : Python 2.7.5

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = 10 * rng.rand(30)
Y = 1 + 2 * X + rng.randn(30)
#print X
#print Y

model = LinearRegression(fit_intercept=True)
model.fit(X[:, np.newaxis], Y)

xfit = np.linspace(0, 20, 100)
yfit = model.predict(xfit[:, np.newaxis])

plt.scatter(X, Y)
plt.plot(xfit, yfit)
plt.show()
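After fitting, the estimated slope and intercept can be read from the model's coef_ and intercept_ attributes; a minimal sketch reusing the same synthetic data (the tolerances in the comments are rough expectations, not exact values):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
X = 10 * rng.rand(30)
Y = 1 + 2 * X + rng.randn(30)  # true slope 2, true intercept 1

model = LinearRegression(fit_intercept=True)
model.fit(X[:, np.newaxis], Y)

print(model.coef_[0])    # typically close to 2
print(model.intercept_)  # typically close to 1
```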

Github address of the Code:

https://github.com/mike-zhang/pyExamples/blob/master/algorithm/LinearRegression/lr_sklearn_test1.py

The running effect is as follows:

TensorFlow for Linear Regression

Sample Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# python version : 2.7.5
# tensorflow version : 1.2.1

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

N = 200        # number of sample points
trainNum = 30  # number of training iterations

# formula : y = w * x + b
X = np.linspace(-1, 1, N)
Y = 3.0 * X + np.random.standard_normal(X.shape) * 0.3 + 0.9
X = X.reshape([N, 1])
Y = Y.reshape([N, 1])

# expected graph
plt.scatter(X, Y)
plt.plot(X, 3.0 * X + 0.9)
plt.show()

# build the model
inputX = tf.placeholder(dtype=tf.float32, shape=[None, 1])
outputY = tf.placeholder(dtype=tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([1, 1], stddev=0.01))
b = tf.Variable(tf.random_normal([1], stddev=0.01))
pred = tf.matmul(inputX, W) + b
loss = tf.reduce_sum(tf.pow(pred - outputY, 2))
train = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
tf.summary.scalar("loss", loss)
merged = tf.summary.merge_all()
init = tf.global_variables_initializer()

# train
with tf.Session() as sess:
    sess.run(init)
    for i in range(trainNum):
        sess.run(train, feed_dict={inputX: X, outputY: Y})
        predArr, lossArr = sess.run([pred, loss], feed_dict={inputX: X, outputY: Y})
        #print "lossArr:", lossArr
        #print "predArr:", predArr
        summary_str = sess.run(merged, feed_dict={inputX: X, outputY: Y})
        WArr, bArr = sess.run([W, b])
        print(WArr, bArr)

# predicted graph
plt.scatter(X, Y)
plt.plot(X, WArr * X + bArr)
plt.show()

Github address of the Code:

https://github.com/mike-zhang/pyExamples/blob/master/algorithm/LinearRegression/lr_tensorflow_test1.py

The running effect is as follows:

(array([[ 0.4075802]], dtype=float32), array([ 0.35226884], dtype=float32))
(array([[ 0.75750935]], dtype=float32), array([ 0.56450701], dtype=float32))
(array([[ 1.06031227]], dtype=float32), array([ 0.69184995], dtype=float32))
(array([[ 1.32233584]], dtype=float32), array([ 0.76825565], dtype=float32))
(array([[ 1.54907179]], dtype=float32), array([ 0.81409913], dtype=float32))
(array([[ 1.7452724]], dtype=float32), array([ 0.84160519], dtype=float32))
(array([[ 1.91505003]], dtype=float32), array([ 0.85810882], dtype=float32))
(array([[ 2.06196308]], dtype=float32), array([ 0.868011], dtype=float32))
(array([[ 2.18909097]], dtype=float32), array([ 0.87395233], dtype=float32))
(array([[ 2.29909801]], dtype=float32), array([ 0.8775171], dtype=float32))
(array([[ 2.39428997]], dtype=float32), array([ 0.87965596], dtype=float32))
(array([[ 2.47666216]], dtype=float32), array([ 0.8809393], dtype=float32))
(array([[ 2.54794097]], dtype=float32), array([ 0.88170928], dtype=float32))
(array([[ 2.60962057]], dtype=float32), array([ 0.88217127], dtype=float32))
(array([[ 2.66299343]], dtype=float32), array([ 0.88244849], dtype=float32))
(array([[ 2.70917845]], dtype=float32), array([ 0.88261479], dtype=float32))
(array([[ 2.7491436]], dtype=float32), array([ 0.88271457], dtype=float32))
(array([[ 2.78372645]], dtype=float32), array([ 0.88277447], dtype=float32))
(array([[ 2.81365204]], dtype=float32), array([ 0.88281041], dtype=float32))
(array([[ 2.8395474]], dtype=float32), array([ 0.88283193], dtype=float32))
(array([[ 2.8619554]], dtype=float32), array([ 0.88284487], dtype=float32))
(array([[ 2.88134551]], dtype=float32), array([ 0.88285261], dtype=float32))
(array([[ 2.89812446]], dtype=float32), array([ 0.88285726], dtype=float32))
(array([[ 2.91264367]], dtype=float32), array([ 0.88286006], dtype=float32))
(array([[ 2.92520738]], dtype=float32), array([ 0.88286173], dtype=float32))
(array([[ 2.93607926]], dtype=float32), array([ 0.88286275], dtype=float32))
(array([[ 2.94548702]], dtype=float32), array([ 0.88286334], dtype=float32))
(array([[ 2.95362759]], dtype=float32), array([ 0.8828637], dtype=float32))
(array([[ 2.9606719]], dtype=float32), array([ 0.88286394], dtype=float32))
(array([[ 2.96676755]], dtype=float32), array([ 0.88286406], dtype=float32))

Multiple linear regression

In regression analysis, if two or more independent variables are included and the relationship between the dependent variable and the independent variables is linear, it is called multiple linear regression.

The expression is as follows:

Y = a0 + a1 * X1 + a2 * X2 + ... + an * Xn + e

Where,

(a0, a1, a2, a3, ..., an) is an unknown parameter vector.

(X1, X2, X3, ..., Xn) are the explanatory variables, which can be fixed (designed) or random.

e is a random error term.

This equation can predict the value of the target variable (Y) from a given predictor vector (X1, X2, X3, ..., Xn).

When a0 = 1, a1 = 2, a2 = 3, e = 0.1, the equation is as follows:

Y = 1 + 2 * X1 + 3 * X2 + 0.1
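A least-squares fit of such a model can also be sketched directly with NumPy by solving the normal equations over a design matrix; here the data is synthetic, generated from the coefficients above (the sample size and noise scale are arbitrary choices):

```python
import numpy as np

rng = np.random.RandomState(0)
N = 100

# Synthetic data from Y = 1 + 2*X1 + 3*X2 + noise
X1 = rng.rand(N)
X2 = rng.rand(N)
Y = 1 + 2 * X1 + 3 * X2 + 0.1 * rng.randn(N)

# Design matrix with a column of ones for the intercept a0
A = np.column_stack([np.ones(N), X1, X2])

# Least-squares solution of A * coef ~ Y (lstsq is numerically
# safer than inverting A'A directly)
coef = np.linalg.lstsq(A, Y, rcond=None)[0]
print(coef)  # approximately [1, 2, 3]
```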

Use sklearn to solve multiple linear regression problems

Sample Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# version : Python 2.7.5

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)
N = 10
X = np.array(N * [10 * rng.rand(2)])  # N copies of one random 2-d sample
b = [2, 3]
Y = 1 + np.matmul(X, b) + rng.randn(N)
print X
print Y

model = LinearRegression()
model.fit(X, Y)

xfit = np.array(10 * [10 * rng.rand(2)])
yfit = model.predict(xfit)
print "xfit :"
print xfit
print "yfit :"
print yfit

Github address of the Code:

https://github.com/mike-zhang/pyExamples/blob/master/algorithm/LinearRegression/lr_sklearn_test2.py

The running effect is as follows:

[[ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]
 [ 4.17022005  7.20324493]]
[ 30.42200315  29.87720628  31.81558253  28.6486362   32.69498666
  30.188968    31.26921399  30.70080452  32.41228283  28.89003419]
xfit :
[[ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]
 [ 1.40386939  1.98101489]]
yfit :
[ 12.7586356  12.7586356  12.7586356  12.7586356  12.7586356  12.7586356
  12.7586356  12.7586356  12.7586356  12.7586356]

 

Okay, that's all. I hope it will help you.
