For the theory behind these algorithms, refer to other blog posts; simple implementations follow below.
Stochastic gradient descent:
#Coding=utf-8" "Random Gradient descent" "ImportNumPy as NP#Construct Training Datax = Np.arange (0., 10., 0.2) M=len (x) x0= Np.full (M, 1.0) Input_data= Np.vstack ([x0, X]). T#bias B as the first component of a weight vectorTarget_data = 3 * x + 8 +Np.random.randn (m) Max_iter= 10000#Maximum number of iterationsEpsilon = 1e-5#Initialize weight valueW = NP.RANDOM.RANDN (2)#w = Np.zeros (2)Alpha= 0.001#Step Sizediff =0.error= Np.zeros (2) Count= 0#Number of CyclesPrint 'stochastic gradient descent algorithm'. Center (60,'=') whileCount <Max_iter:count+ = 1 forJinchRange (m): diff= Np.dot (w, input_data[j])-TARGET_DATA[J]#training set generation, calculating error values #The biggest feature is that the parameters of the model are updated as you iteratew = W-alpha * diff *Input_data[j]ifNp.linalg.norm (W-error) < epsilon:#to find the norm of two vectors directly through the NP.LINALG packet Break Else: Error=WPrint 'Loop count =%d'% count,'\tw:[%f,%f]'% (W[0], w[1])
#Coding=utf-8"""Batch gradient Descent"""ImportNumPy as NP#Construct Training Datax = Np.arange (0., 10., 0.2) M=len (x) x0= Np.full (M, 1.0) Input_data= Np.vstack ([x0, X]). T#bias B as the first component of a weight vectorTarget_data = 3 * x + 8 +Np.random.randn (m)#Stop ConditionMax_iter = 10000Epsilon= 1e-5#Initialize weight valueW = NP.RANDOM.RANDN (2)#w = Np.zeros (2)Alpha= 0.001#Step Sizediff =0.error= Np.zeros (2) Count= 0#Number of Cycles whileCount <Max_iter:count+ = 1sum_m= Np.zeros (2) forIinchRange (m): dif= (Np.dot (w, input_data[i])-target_data[i]) *Input_data[i] Sum_m= Sum_m +dif" "for J in Range (m): diff = Np.dot (w, input_data[j])-TARGET_DATA[J] # Training set generation, calculation error value W = w-alpha * d IFF * Input_data[j]" "W= W-alpha *sum_mifNp.linalg.norm (W-error) <Epsilon: Break Else: Error=WPrint 'Loop count =%d'% count,'\tw:[%f,%f]'% (W[0], w[1])
Mini-batch gradient descent:
#Coding=utf-8"""Low-volume gradient descent"""ImportNumPy as NPImportRandom#Construct Training Datax = Np.arange (0., 10., 0.2) M=len (x) x0= Np.full (M, 1.0) Input_data= Np.vstack ([x0, X]). T#bias B as the first component of a weight vectorTarget_data = 3 * x + 8 +Np.random.randn (m)#two types of termination conditionsMax_iter = 10000Epsilon= 1e-5#Initialize weight valuenp.random.seed (0) W= NP.RANDOM.RANDN (2)#w = Np.zeros (2)Alpha= 0.001#Step Sizediff =0.error= Np.zeros (2) Count= 0#Number of Cycles whileCount <Max_iter:count+ = 1sum_m= Np.zeros (2) Index= Random.sample (Range (m), Int (Np.ceil (M * 0.2))) forIinchRange (len (input_data)): Dif= (Np.dot (w, input_data[i])-target_data[i]) *Input_data[i] Sum_m= Sum_m +dif w= W-alpha *sum_mifNp.linalg.norm (W-error) <Epsilon: Break Else: Error=WPrint 'Loop count =%d'% count,'\tw:[%f,%f]'% (W[0], w[1])
After enough iterations the weights converge to roughly 8 (the bias) and 3 (the slope), matching the generating function target_data = 3 * x + 8 plus noise:
Loop count = 704 w:[8.025972, 2.982300]
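The iterative estimates can be sanity-checked against the closed-form least-squares solution. The snippet below is only a verification sketch, assuming input_data and target_data from the listings above are still in scope:

# Closed-form least-squares fit for comparison with the iterative estimates.
import numpy as np  # input_data and target_data reused from the listings above

w_ls = np.linalg.lstsq(input_data, target_data, rcond=None)[0]
print('least squares w:[%f, %f]' % (w_ls[0], w_ls[1]))  # expected to be close to [8, 3]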