1. Introduction to the Perceptron

The perceptron is an artificial neural network invented by Frank Rosenblatt in 1957 at the Cornell Aeronautical Laboratory. It can be regarded as the simplest form of feedforward neural network, and it is a binary linear classifier. Rosenblatt also gave the corresponding perceptron learning algorithms; commonly used ones include the perceptron learning rule, the least squares method, and gradient descent. For example, the perceptron can use gradient descent to minimize a loss function, find a separating hyperplane that linearly divides the training data, and thereby obtain the perceptron model.

The perceptron is a simple abstraction of a biological nerve cell. The structure of a nerve cell can be broadly divided into dendrites, synapses, the cell body, and the axon. A single nerve cell can be seen as a machine with only two states: excited ("yes") and not excited ("no"). The state of the cell depends on the amount of input signal received from other nerve cells and on the strength (inhibitory or excitatory) of the synapses. When the sum of the signals exceeds a certain threshold, the cell body becomes excited and generates an electrical pulse, which travels along the axon and passes through synapses to other neurons. To simulate this neuronal behavior, the perceptron introduces the corresponding concepts of weights (synapses), a bias (threshold), and an activation function (cell body).
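The weight/bias/activation analogy above can be sketched as a minimal perceptron forward pass. The names and the logical-AND example here are illustrative, not taken from the code later in this post:

```python
import numpy as np

def perceptron_forward(x, w, b):
    """Weighted sum of the inputs plus bias, passed through a step activation."""
    z = np.dot(w, x) + b           # weights play the role of synapses, b the threshold
    return 1 if z >= 0 else 0      # the "cell body" either fires (1) or stays silent (0)

# a hand-picked perceptron computing logical AND
w = np.array([1.0, 1.0])
b = -1.5
print(perceptron_forward(np.array([1, 1]), w, b))   # 1 (fires: 1 + 1 - 1.5 >= 0)
print(perceptron_forward(np.array([1, 0]), w, b))   # 0 (silent: 1 - 1.5 < 0)
```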

In the field of artificial neural networks, the perceptron is also referred to as a single-layer artificial neural network, to distinguish it from the more complex multilayer perceptron. As a linear classifier, the (single-layer) perceptron is the simplest feedforward artificial neural network. Despite its simple structure, the perceptron can learn and solve quite complex problems. Its main intrinsic flaw is that it cannot handle linearly inseparable problems.

2. The Principle of the Perceptron

The perceptron algorithm works much like the linear regression algorithm, except that the prediction function h and the weight update rule differ, and the perceptron algorithm is applied to binary classification.
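For reference, the textbook perceptron update rule can be sketched as follows: for each sample, w ← w + η·(y − h(x))·x, which changes the weights only on misclassified samples. Note that the code later in this post actually uses a sigmoid activation (expit) rather than a hard step, so this sketch illustrates the classic rule rather than the exact method below:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Textbook perceptron learning rule on a linearly separable 2-class set."""
    w = np.zeros(X.shape[1] + 1)              # w[0] is the bias term
    Xb = np.insert(X, 0, 1.0, axis=1)         # prepend x0 = 1 to every sample
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            h = 1 if np.dot(w, xi) >= 0 else 0
            w += lr * (yi - h) * xi           # updates only when h != yi
    return w

# tiny linearly separable example: logical AND
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 0, 1])
w = train_perceptron(X, y)
preds = [1 if np.dot(w, np.insert(xi, 0, 1.0)) >= 0 else 0 for xi in X]
print(preds)  # [0, 0, 0, 1]
```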

Introduction to the dataset

The breast cancer dataset contains 569 instances, each with a diagnosis class and 30 attributes that help predict it. The attributes include the radius (the mean distance from the center to points on the perimeter), the texture (the standard deviation of the gray-scale values), and so on. The classes are WDBC-Malignant and WDBC-Benign. 70% of the dataset is used as the training set and 30% as the test set; both the training set and the test set include the features and the diagnosis class.
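The same dataset also ships with scikit-learn, so the 70/30 split described above can be sketched like this (scikit-learn's copy encodes the diagnosis as 0 = malignant, 1 = benign; the code in this post reads a CSV file instead):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X, y = data.data, data.target              # 569 instances, 30 features each
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.30, random_state=0)  # 70% training, 30% test
print(X.shape)        # (569, 30)
print(train_x.shape)  # (398, 30)
print(test_x.shape)   # (171, 30)
```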

3. Code Implementation and Result Analysis of the Perceptron Algorithm

Code implementation:

import pandas as pd                # read the data with pandas
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing  # only needed for the commented-out MinMaxScaler below
from scipy.special import expit    # expit is the sigmoid (logistic) function, used as the activation

def loaddataset():
    # sklearn's train_test_split could also shuffle and split the data; here
    # the first 70% of the rows are used for training and the last 30% for testing
    df = pd.read_csv("Breast_cancer_data.csv")
    dataarray = np.array(df)
    testratio = 0.3
    datasize = dataarray.shape[0]
    testnum = int(testratio * datasize)
    trainnum = datasize - testnum
    # column 1 holds the diagnosis, columns 2 onward the 30 features; the
    # feature values are read as strings (e.g. "31.48") and must become float
    train_x = np.array(dataarray[0:trainnum, 2:], dtype=float)
    test_x = np.array(dataarray[trainnum:, 2:], dtype=float)
    train_y = dataarray[0:trainnum, 1]
    test_y = dataarray[trainnum:, 1]
    # encode the labels: benign 'B' -> 1, malignant -> 0
    for i in range(trainnum):
        train_y[i] = 1 if train_y[i] == 'B' else 0
    for i in range(testnum):
        test_y[i] = 1 if test_y[i] == 'B' else 0
    return train_x, test_x, train_y.astype(float), test_y.astype(float)

# step activation; the sigmoid expit() is used instead in train_model below
# def sign(inner_product):
#     return 1 if inner_product >= 0 else 0

# learn the model: fit the parameter vector theta
def train_model(train_x, train_y, theta, learning_rate, iteration):
    n = train_x.shape[1]
    j_theta = np.zeros((iteration, 1))                 # loss per iteration (column vector)
    train_x = np.insert(train_x, 0, values=1, axis=1)  # prepend x0 = 1, the bias column
    for i in range(iteration):
        # squared-error loss; dot is the inner product, np.sum reduces the
        # column vector of residuals to a single number
        residual = train_y[:, np.newaxis] - expit(np.dot(train_x, theta))
        j_theta[i] = np.sum(residual ** 2) / 2.0
        # j indexes the j-th attribute; theta for every column is updated,
        # including the bias column (hence n + 1)
        for j in range(n + 1):
            error = train_y[:, np.newaxis] - expit(np.dot(train_x, theta))
            theta[j, 0] += learning_rate * np.dot(train_x[:, j], error.ravel())
    x_iteration = np.linspace(0, iteration, num=iteration)
    plt.plot(x_iteration, j_theta)
    plt.show()
    return theta

def predict(test_x, test_y, theta):  # theta holds the already-learned parameters
    errorcount = 0
    m = test_x.shape[0]
    test_x = np.insert(test_x, 0, values=1, axis=1)  # prepend x0 = 1
    h_theta = expit(np.dot(test_x, theta))
    for i in range(m):
        h_theta[i] = 1 if h_theta[i] > 0.5 else 0    # threshold the sigmoid output
        # test_y[i] must already be 0 or 1, because h_theta[i] is now 0 or 1
        if h_theta[i] != test_y[i]:
            errorcount += 1
    error_rate = float(errorcount) / m
    print("error_rate", error_rate)

# standardization (z-score) method of feature scaling
# note: prefer numpy's vectorized matrix operations over explicit loops
def standardization(x):  # x is the m*n data matrix
    x_average = x.mean(axis=0)  # per-column mean (a 1*n row)
    # per-column standard deviation; the original divided by x.var(axis=0),
    # i.e. the variance, rather than its square root
    sigma = x.std(axis=0)
    # m*n matrix minus 1*n row: broadcasting copies the row across all m rows
    return (x - x_average) / sigma  # element-wise divide

# min-max rescaling method of feature scaling
def rescaling(x):
    x_min = x.min(axis=0)  # per-column minimum (a 1*n row)
    x_max = x.max(axis=0)  # per-column maximum (a 1*n row)
    return (x - x_min) / (x_max - x_min)

if __name__ == '__main__':
    train_x, test_x, train_y, test_y = loaddataset()
    # feature scaling: uncomment one of the variants below
    # scaler = preprocessing.MinMaxScaler()
    # train_x = scaler.fit_transform(train_x)
    # test_x = scaler.fit_transform(test_x)
    # train_x = standardization(train_x)
    # test_x = standardization(test_x)
    # train_x = rescaling(train_x)
    # test_x = rescaling(test_x)
    n = test_x.shape[1] + 1                  # +1 for the bias column x0
    theta = np.zeros((n, 1))
    # theta = np.random.rand(n, 1)           # a random n*1 initialization also works
    # with rescaling, the error rate is 0.017
    theta_new = train_model(train_x, train_y, theta, learning_rate=0.001, iteration=1000)
    predict(test_x, test_y, theta_new)

Results Display and Analysis:

Experiment: classifying the breast cancer dataset with the perceptron

In the experiment, the number of iterations was 1000 and the learning rate was 0.001; the effect of each feature scaling method is compared in Table 3.

Table 3 Classification error rate vs. feature scaling method

Feature scaling           | Standardization | Rescaling
Classification error rate | 0.182           | 0.017

The loss J under the standardization method, as a function of the number of iterations, is shown in Figure 1:

Figure 1 The standardization method

The loss J under the rescaling method, as a function of the number of iterations, is shown in Figure 2:

Figure 2 The rescaling method

Figure 1 shows that, with the iteration count and learning rate left untuned, the standardization run does not reach a good loss value; in Figure 2 the loss decreases steadily and levels off. Comparing the two, the rescaling method clearly outperforms the standardization method here.
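The two scaling methods compared above correspond to scikit-learn's StandardScaler and MinMaxScaler; a minimal sketch on made-up toy data:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

std = StandardScaler().fit_transform(X)  # zero mean, unit variance per column
mm = MinMaxScaler().fit_transform(X)     # each column rescaled into [0, 1]
print(std.mean(axis=0))                  # ~[0. 0.]
print(mm.min(axis=0), mm.max(axis=0))    # [0. 0.] [1. 1.]
```

In practice a scaler should be fit on the training set only and then applied to the test set with transform(); note the commented-out MinMaxScaler lines in the code above instead call fit_transform on both splits, which lets test-set statistics leak into the test scaling.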
