1-Problem statement
2-Linear regression
3-Theoretical derivation
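As a hedged sketch of the derivation this section refers to, the formulas below are read back from the implementation in section 4 rather than taken from the original text. The squared-error cost with a factor of 1/2 is an assumption (the code only ever computes the residual and its gradient), and the symbols $m$ (number of records), $x^{(i)}$ and $y^{(i)}$ (features and target of record $i$) are introduced here for illustration. The gradient is a plain sum over records with no $1/m$ factor, which matches the summing `reduce` in the code.

```latex
\begin{align}
h_\theta(x) &= \sum_{j=0}^{1} \theta_j x_j \\
J(\theta) &= \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
\frac{\partial J}{\partial \theta_j} &= \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \\
\theta_j &\leftarrow \theta_j - \alpha \, \frac{\partial J}{\partial \theta_j}
\end{align}
```

Iteration stops once $\sum_j \left| \partial J / \partial \theta_j \right| \le 0.01$, or is reported as failed after the iteration limit.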
4-Python/Spark implementation
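# -*- coding: utf-8 -*-
from pyspark import SparkContext

theta = [0, 0]   # model parameters, one per feature column
alpha = 0.001    # learning rate

sc = SparkContext('local')

def func_theta_x(x):
    # Hypothesis h_theta(x); zip stops at len(theta), so the trailing target is ignored
    return sum([i * j for i, j in zip(theta, x)])

def cost(x):
    # Residual for one record: prediction minus target (last column)
    thx = func_theta_x(x)
    return thx - x[-1]

def partial_theta(x):
    # Gradient contribution of one record: residual times each feature
    dif = cost(x)
    return [dif * i for i in x[:-1]]

# Each line holds tab-separated floats: the features followed by the target.
# A list (not a lazy map object) is needed so that x[-1] also works on Python 3.
rdd = sc.textFile('/home/freyr/linearregression.txt') \
        .map(lambda line: [float(v) for v in line.strip().split('\t')]) \
        .cache()  # the RDD is reused on every iteration

maxiter = 400
n_iter = 0
while True:
    # Sum the per-record gradients; the current theta is shipped with each new job
    par_theta = rdd.map(partial_theta) \
                   .reduce(lambda x, y: [i + j for i, j in zip(x, y)])
    for i in range(2):
        theta[i] = theta[i] - alpha * par_theta[i]
    n_iter += 1
    if n_iter <= maxiter:
        # Converged once the L1 norm of the gradient is small enough
        if sum(map(abs, par_theta)) <= 0.01:
            print('I get it!!!')
            print('iter = %s' % n_iter)
            print('theta = %s' % theta)
            break
    else:
        print('Failed ...')
        break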
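To check the update rule without a Spark cluster, the same batch gradient descent can be sketched in plain Python. This is a minimal sketch, not the author's code: the helper names `predict`, `gradient` and `fit` are hypothetical, the sample rows are made up (a constant feature 1.0, a feature x, and the target y = 1 + 2x), and the learning rate of 0.05 was chosen for this toy data rather than taken from the post.

```python
def predict(theta, row):
    # zip stops at len(theta), so the trailing target value is ignored
    return sum(t * v for t, v in zip(theta, row))

def gradient(theta, rows):
    # Sum of residual * feature over all rows (what map + reduce computes in Spark)
    grad = [0.0] * len(theta)
    for row in rows:
        err = predict(theta, row) - row[-1]
        for j, v in enumerate(row[:-1]):
            grad[j] += err * v
    return grad

def fit(rows, alpha=0.001, max_iter=400, tol=0.01):
    # Batch gradient descent; stops when the L1 norm of the gradient is <= tol
    theta = [0.0] * (len(rows[0]) - 1)
    for n_iter in range(1, max_iter + 1):
        grad = gradient(theta, rows)
        theta = [t - alpha * g for t, g in zip(theta, grad)]
        if sum(abs(g) for g in grad) <= tol:
            return theta, n_iter, True
    return theta, max_iter, False

# Made-up sample: columns are [1.0, x, y] with y = 1 + 2 * x
rows = [[1.0, float(x), 1.0 + 2.0 * x] for x in range(5)]
theta, n_iter, converged = fit(rows, alpha=0.05, max_iter=5000)
print(converged, n_iter, theta)
```

On this toy data the fitted parameters should end up close to [1, 2]. As in the Spark version, the gradient is not divided by the number of records, so a usable learning rate shrinks as the data set grows.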
PS: 1. LinearRegression.txt
Spark implementation of linear regression [linear regression / machine learning / Spark]