Machine learning: How to use least squares in Python

Source: Internet
Author: User

We say "use" rather than "implement" because Python's libraries already implement the algorithm for us, so we only need to learn how to call them. As our skills and experience grow, we can also write our own implementations when the library routines no longer meet our requirements.

So, what is the least squares method?

Definition: The least squares method is a mathematical optimization technique that finds the function best matching the data by minimizing the sum of squared errors.

Function: With the least squares method, unknown values can be estimated easily, in such a way that the sum of squared errors between the estimated values and the actual data is minimized.

Principle: The position of the line is determined by the criterion of "minimum residual sum of squares" (in mathematical statistics, a residual is the difference between an actual observed value and its estimated value).
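To make the criterion concrete, here is a minimal sketch that computes the residual sum of squares for a candidate line. The sample data is the data used in the example code later in this article; the helper name residual_sum_of_squares and the two candidate lines are assumptions made only for illustration.

```python
import numpy as np

# Sample data taken from the article's example code
Xi = np.array([6.19, 2.51, 7.29, 7.01, 5.7, 2.66, 3.98, 2.5, 9.1, 4.2])
Yi = np.array([5.25, 2.83, 6.41, 6.71, 5.1, 4.23, 5.05, 1.98, 10.5, 6.3])

def residual_sum_of_squares(k, b, x, y):
    """Sum of squared residuals for the candidate line y = k*x + b (hypothetical helper)."""
    residuals = y - (k * x + b)  # observed value minus estimated value
    return float(np.sum(residuals ** 2))

# Two hypothetical candidate lines; the smaller value is the better fit
rss_a = residual_sum_of_squares(1.0, 20.0, Xi, Yi)
rss_b = residual_sum_of_squares(0.9, 0.83, Xi, Yi)
print("line a:", rss_a)
print("line b:", rss_b)
```

Least squares looks for the pair (k, b) that makes this number as small as possible.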

Mathematical formula:
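For a straight line y = kx + b, the least squares criterion is commonly written as follows. The names k and b match the code further down; Q is a symbol chosen here purely for illustration.

```latex
Q(k, b) = \sum_{i=1}^{n} \bigl( y_i - (k x_i + b) \bigr)^2 ,
\qquad
(\hat{k}, \hat{b}) = \arg\min_{k,\, b} Q(k, b)
```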

Basic idea: For a simple (one-variable) linear regression model, suppose n pairs of observations (X1,y1), (X2,y2), ..., (Xn,yn) are drawn from the population. Infinitely many curves can be fitted to these n points in the plane. Linear regression requires the sample regression function to fit this set of values as well as possible, that is, the line should pass as close to the center of the sample data as possible. The criterion for choosing the best-fitting line is therefore that the total fitting error (the residual sum of squares) is minimized.
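For the one-variable case, the minimizing line has a well-known closed form: the slope is the ratio of the centered cross products to the centered squares of x, and the intercept follows from the two means. The sketch below applies that textbook formula to the article's sample data and cross-checks it against np.polyfit; it is an illustration, not the method used by the code in this article.

```python
import numpy as np

# Sample data taken from the article's example code
Xi = np.array([6.19, 2.51, 7.29, 7.01, 5.7, 2.66, 3.98, 2.5, 9.1, 4.2])
Yi = np.array([5.25, 2.83, 6.41, 6.71, 5.1, 4.23, 5.05, 1.98, 10.5, 6.3])

x_mean = Xi.mean()
y_mean = Yi.mean()

# Closed-form least squares estimates for y = k*x + b
k = np.sum((Xi - x_mean) * (Yi - y_mean)) / np.sum((Xi - x_mean) ** 2)
b = y_mean - k * x_mean

# Cross-check with NumPy's degree-1 polynomial fit (returns slope, then intercept)
k_ref, b_ref = np.polyfit(Xi, Yi, 1)

print("closed form:", k, b)
print("np.polyfit: ", k_ref, b_ref)
```

Because b = y_mean - k * x_mean, the fitted line always passes through the point (x_mean, y_mean), which is the sense in which it runs through the center of the sample data.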

The implementation code is as follows, with detailed comments in the code:

## Least squares method
import numpy as np  ## scientific computing library
import scipy as sp  ## algorithm library built on top of numpy
import matplotlib.pyplot as plt  ## plotting library
from scipy.optimize import leastsq  ## import the least squares routine

'''
Set the sample data; real data would need to be preprocessed here
'''
## Sample data (Xi,Yi), which must be converted to array (list) form
Xi=np.array([6.19,2.51,7.29,7.01,5.7,2.66,3.98,2.5,9.1,4.2])
Yi=np.array([5.25,2.83,6.41,6.71,5.1,4.23,5.05,1.98,10.5,6.3])

'''
Set up the fitting function and the deviation function
How to decide the shape of the function:
1. First plot the sample data
2. Choose the functional form (line, parabola, sine/cosine, etc.) from the approximate shape of the plot
'''

## Function to fit, func: specifies the shape of the function
def func(p,x):
    k,b=p
    return k*x+b

## Deviation function: x and y are arrays that correspond one-to-one to Xi and Yi above
def error(p,x,y):
    return func(p,x)-y

'''
Main part, with some notes
1. leastsq returns a tuple: the first element is the solution, the second is an integer status flag (ier), not a cost value
2. Per the SciPy documentation, a flag of 1, 2, 3 or 4 means a solution was found
3. Example: Para=>(array([ 0.61349535, 1.79409255]), 3)
4. The first element holds as many values as there are parameters to solve for
'''

# Initial values of k and b; they can be set arbitrarily. Experiments show that p0 can change the second return value, Para[1]
p0=[1,20]

# Pack the arguments of error other than p into args (required)
Para=leastsq(error,p0,args=(Xi,Yi))

# Read the result
k,b=Para[0]
print("k=",k,"b=",b)
print("Cost:"+str(Para[1]))  # despite the label, Para[1] is the status flag ier
print("The fitted line for the solution is:")
print("y="+str(round(k,2))+"x+"+str(round(b,2)))

'''
Plot to see how well the line fits.
Matplotlib does not support Chinese by default; Chinese labels need extra font settings.
If errors occur, switch the labels to English.
'''

# Plot the sample points
plt.figure(figsize=(8,6))  ## figure size 8:6
plt.scatter(Xi,Yi,color="green",label="Sample data",linewidth=2)

# Plot the fitted line
x=np.linspace(0,12,100)  ## 100 evenly spaced points between 0 and 12
y=k*x+b  ## the fitted function
plt.plot(x,y,color="red",label="Fitted line",linewidth=2)
plt.legend(loc='lower right')  # draw the legend
plt.show()
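The second value returned by leastsq is an integer status flag, so it does not report how good the fit is. One way to obtain the actual sum of squared residuals is to pass full_output=True, which makes leastsq return five values; the residuals at the solution are then available under the "fvec" key of the info dictionary. The sketch below assumes the same data and deviation function as above; the variable names p_opt and sum_sq are chosen here for illustration.

```python
import numpy as np
from scipy.optimize import leastsq

# Sample data taken from the article's example code
Xi = np.array([6.19, 2.51, 7.29, 7.01, 5.7, 2.66, 3.98, 2.5, 9.1, 4.2])
Yi = np.array([5.25, 2.83, 6.41, 6.71, 5.1, 4.23, 5.05, 1.98, 10.5, 6.3])

def error(p, x, y):
    """Residuals of the line y = k*x + b against the observations."""
    k, b = p
    return k * x + b - y

# full_output=True returns (solution, covariance, info dict, message, status flag)
p_opt, cov_x, infodict, mesg, ier = leastsq(
    error, [1, 20], args=(Xi, Yi), full_output=True
)

# infodict["fvec"] holds the residuals evaluated at the solution
sum_sq = float(np.sum(infodict["fvec"] ** 2))

print("k, b =", p_opt)
print("status flag ier =", ier)
print("sum of squared residuals =", sum_sq)
```

A flag of 1, 2, 3 or 4 indicates that a solution was found; sum_sq is the quantity that least squares actually minimizes.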

The results are as follows:

Output Result:

k= 0.900458420439 b= 0.831055638877
Cost:1
The fitted line for the solution is:
y=0.9x+0.83

Drawing results:

Addendum: Only the straight-line case is shown here. Curves are solved in a similar way (another blog post uses a parabola as an example), but curve fitting can lead to overfitting, which will be discussed in a future post.
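As a rough sketch of the curve case, only the fitting function and the number of parameters change. The example below fits a parabola with leastsq. The data is synthetic and hypothetical (generated near y = 2x^2 - 3x + 1 with a fixed random seed), and the function names are chosen here; it is not the example from the other blog post.

```python
import numpy as np
from scipy.optimize import leastsq

# Hypothetical data: points near y = 2x^2 - 3x + 1 with a little noise
rng = np.random.default_rng(0)
x_data = np.linspace(-3, 3, 50)
y_data = 2 * x_data**2 - 3 * x_data + 1 + rng.normal(0, 0.1, x_data.size)

def parabola(p, x):
    """Function to fit: y = a*x^2 + b*x + c."""
    a, b, c = p
    return a * x**2 + b * x + c

def parabola_error(p, x, y):
    """Residuals between the parabola and the observations."""
    return parabola(p, x) - y

# Three parameters now, so the initial guess has three values
p_fit, ier = leastsq(parabola_error, [1.0, 1.0, 1.0], args=(x_data, y_data))
a_fit, b_fit, c_fit = p_fit

print("a, b, c =", a_fit, b_fit, c_fit)
print("status flag ier =", ier)
```

The recovered coefficients should land close to the values used to generate the data, since the noise is small relative to the curve.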
