Unlike a regression tree, which predicts with the mean of the target values in each leaf node, a model tree fits a linear model in each leaf node, so the tree as a whole represents a piecewise linear function. Piecewise linear means the model is made up of multiple linear segments.
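To make "piecewise linear" concrete, here is a minimal sketch of a function built from two linear segments. The name `piecewise_linear`, the breakpoint, and the coefficients are all made up for illustration and are not part of the code in this post.

```python
def piecewise_linear(x, split=0.5, left=(0.0, 2.0), right=(1.5, -1.0)):
    """Evaluate a two-segment piecewise linear function at x.

    Each segment is an (intercept, slope) pair. The defaults are
    illustrative values, not taken from any data set.
    """
    # Pick the segment by comparing x with the breakpoint,
    # then evaluate that segment's line.
    intercept, slope = left if x <= split else right
    return intercept + slope * x
```

A model tree does the same thing, except that the breakpoints come from the tree's splits and the coefficients come from a least-squares fit in each leaf.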
####################
# Model tree
####################
from numpy import *  # provides shape, mat, ones, linalg, power

def linearSolve(dataSet):
    # Solve the linear regression used by model-tree leaf nodes
    m, n = shape(dataSet)
    # Start from all-ones matrices; column 0 of X stays 1 for the intercept
    X = mat(ones((m, n))); Y = mat(ones((m, 1)))
    # X holds all the features, Y holds the target values
    X[:, 1:n] = dataSet[:, 0:n-1]; Y = dataSet[:, -1]
    xTx = X.T * X
    if linalg.det(xTx) == 0.0:
        raise NameError('This matrix is singular, cannot do inverse,\n'
                        'try increasing the second value of ops')
    ws = xTx.I * (X.T * Y)  # regression coefficients
    return ws, X, Y

def modelLeaf(dataSet):
    # Build a model-tree leaf node: return its regression coefficients
    ws, X, Y = linearSolve(dataSet)
    return ws

def modelErr(dataSet):
    # Squared error of the linear fit on this subset of the data
    ws, X, Y = linearSolve(dataSet)
    yHat = X * ws
    return sum(power(Y - yHat, 2))
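The leaf fit above is ordinary least squares on the rows that reach the leaf. As a cross-check, the same coefficients can be obtained with `numpy.linalg.lstsq`, which avoids forming the explicit inverse. This is only a sketch using plain arrays instead of `mat`; `fit_leaf` and `leaf_error` are hypothetical names, not functions from the post's module.

```python
import numpy as np

def fit_leaf(data):
    """Least-squares fit for one leaf; the last column of data is the target."""
    data = np.asarray(data, dtype=float)
    m, n = data.shape
    X = np.ones((m, n))          # column 0 stays 1 for the intercept
    X[:, 1:] = data[:, :n - 1]   # remaining columns hold the features
    y = data[:, -1]
    ws, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return ws, X, y

def leaf_error(data):
    """Sum of squared residuals of the leaf's linear fit."""
    ws, X, y = fit_leaf(data)
    return float(np.sum((y - X @ ws) ** 2))
```

For example, `fit_leaf([[0, 1], [1, 3], [2, 5]])` recovers an intercept near 1 and a slope near 2, since those three points lie on y = 1 + 2x.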
main.py
#!/usr/bin/env python
# coding: utf-8
import regTrees
import matplotlib.pyplot as plt
from numpy import *

if __name__ == '__main__':
    myDat = regTrees.loadDataSet('exp2.txt')
    myMat = mat(myDat)
    # Build a model tree: linear-model leaves, squared-error criterion, ops = (1, 10)
    myTree = regTrees.createTree(myMat, regTrees.modelLeaf, regTrees.modelErr, (1, 10))
    print(myTree)
    regTrees.plotBestFit('exp2.txt')
The result is a two-segment function, split at about 0.28.
The two segments are y = 3.46877 + 1.1852x and y = 0.001698 + 11.96477x.
The data was actually generated from y = 3.5 + 1.0x and y = 0 + 12x with added Gaussian noise, so the fitted coefficients are close to the true ones.
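A comparison like this can be reproduced on synthetic data. The sketch below draws points from the two stated lines, adds Gaussian noise, and fits each side of a fixed split by least squares. The post does not say how exp2.txt was generated beyond the two lines and the noise, so the breakpoint of 0.28, the noise level, the sample size, the seed, and the choice of which line applies on which side of the split are all assumptions made for illustration. The helper names are hypothetical too.

```python
import numpy as np

def make_data(m=500, split=0.28, noise=0.1, seed=0):
    """Sample (x, y) from two lines plus Gaussian noise (assumed setup)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, m)
    # Assumption: y = 0 + 12x below the split, y = 3.5 + 1.0x above it.
    y = np.where(x < split, 12.0 * x, 3.5 + 1.0 * x)
    return x, y + rng.normal(0.0, noise, m)

def fit_line(x, y):
    """Return [intercept, slope] of the least-squares line through (x, y)."""
    X = np.column_stack([np.ones_like(x), x])
    ws, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return ws

x, y = make_data()
low = x < 0.28                         # fit each side of the split separately
ws_low = fit_line(x[low], y[low])
ws_high = fit_line(x[~low], y[~low])
print("below split:", ws_low)
print("above split:", ws_high)
```

Here the split is fixed by hand. In the model tree, the split is instead chosen by searching for the one that minimizes the combined squared error of the two leaf fits.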
Machine learning--model tree