Note: This article is not a tutorial; it exists only as a record of my day-to-day learning process.
Objective: to fit a blue straight line to the large number of red X-shaped markers shown in the figure.
Basic idea: the univariate linear regression chapter of Andrew Ng's machine learning course on Coursera; the gradient descent method.
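For reference, gradient descent repeatedly updates the parameters in the direction that lowers the cost, theta := theta - alpha * (dJ/dtheta), where alpha is the learning rate; the loop in step 4 below implements exactly this update.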
Implementation:
1. Import the required libraries: Python's scientific computing library NumPy and the plotting library matplotlib are used here.
import numpy as np
import matplotlib.pyplot as plt
2. Generate synthetic data: since no ready-made dataset is at hand for the moment, NumPy's random functions are used to generate the required data.
# Generate data (the value of data_size was not recoverable; 100 is an assumed example)
data_size = 100
X = np.random.randint(0, 500, size=data_size)
y = []
for i in range(data_size):
    # slope and offset are re-drawn for every point, so the data is noisy but roughly linear
    y.append(X[i] * np.random.randint(1, 30) + np.random.randint(100, 5000))
plt.plot(X, y, 'rx')  # plot the raw points as red X markers
3. Define the necessary parameters and functions for fitting
Because x0 in the hypothesis formula is fixed to 1, a column of ones has to be added to the original data as x0, which can be understood as: x = [1, x].
Use numpy.array together with reshape to put the data into a shape that NumPy matrix operations can work with.
The gradient function is the core of the method; the derivation of the concrete formula is not worked through here (a short note on where it comes from follows the code below).
# Prepend the x0 = 1 column and reshape everything for matrix operations
X = np.array([[1, X[i]] for i in range(data_size)]).reshape((data_size, 2))
y = np.array(y).reshape((data_size, 1))
theta = np.zeros((2, 1))

def gradient(x, y, theta):
    y_pred = x.dot(theta)           # predictions h(x) = x . theta
    diff = y_pred - y               # prediction error for each sample
    return x.T.dot(diff) / len(y)   # (1/m) * X^T (X.theta - y)
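For completeness (this note is my addition, stating the standard result from Ng's course rather than a full derivation): the squared-error cost is J(theta) = 1/(2m) * sum((X.theta - y)^2), and differentiating with respect to theta gives the gradient (1/m) * X^T (X.theta - y), which is what the gradient function above returns.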
4. Run the fit
max_iter = 10000         # the original value was not recoverable; 10000 is an assumed example
learning_rate = 0.0001   # if theta blows up, this is too large for the unscaled x values; try a smaller value
for i in range(max_iter):
    theta = theta - learning_rate * gradient(X, y, theta)
5. Draw the line
result_x = np.linspace(0, 500, 50)
result_y = theta[1] * result_x + theta[0]   # y = theta1 * x + theta0
plt.plot(result_x, result_y)                # drawn in matplotlib's default blue
plt.show()
6. Summary
So far, a straight line representing the dataset can be fitted. In the process, however, how to choose the learning rate and how to verify the correctness of the result remain unclear, so the next step is to introduce the cost function.
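As a preview, here is a minimal sketch of that cost function (my own addition, assuming the squared-error cost from Ng's course; it is not code from this article). Evaluating it during training shows whether the learning rate actually makes the loss decrease:

def cost(x, y, theta):
    # J(theta) = 1/(2m) * sum of squared prediction errors (assumed form)
    diff = x.dot(theta) - y
    return np.sum(diff ** 2) / (2 * len(y))

# Example use inside the fitting loop, e.g. every 1000 iterations:
#     if i % 1000 == 0:
#         print(i, cost(X, y, theta))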