Catalog: The principle of gradient-boosted trees | Gradient-boosted tree code (Spark Python)
The principle of gradient-boosted trees
to be continued ...
Back to Catalog
Gradient-boosted tree code (Spark Python)
Code and data: https://pan.baidu.com/s/1jHWKG4I  Password: acq1
# -*- coding: utf-8 -*-
from pyspark import SparkConf, SparkContext
from pyspark.mllib.tree import GradientBoostedTrees, GradientBoostedTreesModel
from pyspark.mllib.util import MLUtils

sc = SparkContext('local')

# Load and parse the data file.
data = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")
"""
Each row encodes a labeled sparse feature vector in the format:
    label index1:value1 index2:value2 ...

>>> tempFile.write(b"+1 1:1.0 3:2.0 5:3.0\n-1\n-1 2:4.0 4:5.0 6:6.0")
>>> tempFile.flush()
>>> examples = MLUtils.loadLibSVMFile(sc, tempFile.name).collect()
>>> tempFile.close()
>>> examples[0]
LabeledPoint(1.0, (6,[0,2,4],[1.0,2.0,3.0]))
>>> examples[1]
LabeledPoint(-1.0, (6,[],[]))
>>> examples[2]
LabeledPoint(-1.0, (6,[1,3,5],[4.0,5.0,6.0]))
"""

# Split the data into training and test sets (30% held out for testing).
(trainingData, testData) = data.randomSplit([0.7, 0.3])

# Train a GradientBoostedTrees model.
# Notes: (a) an empty categoricalFeaturesInfo means all features are continuous.
#        (b) use more iterations in practice.
model = GradientBoostedTrees.trainClassifier(trainingData,
                                             categoricalFeaturesInfo={},
                                             numIterations=30)

# Evaluate the model on test instances and compute the test error.
predictions = model.predict(testData.map(lambda x: x.features))
labelsAndPredictions = testData.map(lambda lp: lp.label).zip(predictions)
testErr = labelsAndPredictions.filter(lambda lp: lp[0] != lp[1]).count() \
    / float(testData.count())
print('Test Error = ' + str(testErr))  # Test Error = 0.0
print('Learned classification GBT model:')
print(model.toDebugString())
"""
TreeEnsembleModel classifier with 30 trees

  Tree 0:
    If (feature 434 <= 0.0)
     If (feature ... <= 165.0)
      Predict: -1.0
     Else (feature ... > 165.0)
      Predict: 1.0
    Else (feature 434 > 0.0)
     Predict: 1.0
  Tree 1:
    If (feature 490 <= 0.0)
     If (feature 549 <= 253.0)
      If (feature 184 <= 0.0)
       Predict: 0.4768116880884702
      Else (feature 184 > 0.0)
       Predict: -0.47681168808847024
     Else (feature 549 > 253.0)
      Predict: 0.4768116880884694
    Else (feature 490 > 0.0)
     If (feature 215 <= 251.0)
      Predict: 0.4768116880884701
     Else (feature 215 > 251.0)
      Predict: 0.4768116880884712
  ...
  Tree 29:
    If (feature 434 <= 0.0)
     If (feature 209 <= 4.0)
      Predict: 0.1335953290513215
     Else (feature 209 > 4.0)
      If (feature 372 <= 84.0)
       Predict: -0.13359532905132146
      Else (feature 372 > 84.0)
       Predict: -0.1335953290513215
    Else (feature 434 > 0.0)
     Predict: 0.13359532905132146
"""

# Save and load the model.
model.save(sc, "myGradientBoostingClassificationModel")
sameModel = GradientBoostedTreesModel.load(sc, "myGradientBoostingClassificationModel")
print(sameModel.predict(data.collect()[0].features))  # 0.0
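The LIBSVM format described in the docstring above ("label index1:value1 index2:value2 ...") can be illustrated without a Spark cluster. Below is a minimal pure-Python sketch of parsing one such line; the function name `parse_libsvm_line` is my own for illustration, not part of MLlib, but it mirrors the key behavior of MLUtils.loadLibSVMFile: file indices are 1-based and are converted to 0-based.

```python
def parse_libsvm_line(line):
    """Parse one LIBSVM-format line into (label, indices, values).

    Indices in the file are 1-based; like MLUtils.loadLibSVMFile,
    they are converted to 0-based here.
    """
    parts = line.strip().split()
    label = float(parts[0])            # leading token is the label, e.g. "+1" or "-1"
    indices, values = [], []
    for item in parts[1:]:             # remaining tokens are "index:value" pairs
        idx, val = item.split(":")
        indices.append(int(idx) - 1)   # 1-based file index -> 0-based vector index
        values.append(float(val))
    return label, indices, values

print(parse_libsvm_line("+1 1:1.0 3:2.0 5:3.0"))
# -> (1.0, [0, 2, 4], [1.0, 2.0, 3.0])
```

Note how the result matches the docstring's first example: LabeledPoint(1.0, (6,[0,2,4],[1.0,2.0,3.0])), where 6 is the total feature count inferred from the whole file.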
Back to Catalog
"Spark Mllib crash Treasure" model 07 gradient Lift Tree "gradient-boosted Trees" (Python version)