Python sklearn Machine Learning Library Learning Notes (VII): The Perceptron


First, the perceptron

The perceptron, invented by Frank Rosenblatt at the Cornell Aeronautical Laboratory in 1957, was inspired by the workings of the human brain, in which information is processed by neuron cells linked to other neuron cells through synapses.


A neuron can be seen as a computational unit that processes one or more inputs into a single output. A perceptron functions similarly: it accepts one or more inputs, processes them, and returns an output. The perceptron is capable of real-time, error-driven learning; it can update its parameters continually from one training sample at a time rather than using the entire data set at once. Real-time learning makes it possible to handle big data that cannot fit in memory. The perceptron is usually represented by the following diagram:

[Figure: perceptron diagram with input units, a computational unit, and an output unit]

x1, x2, and x3 are input units, each representing one feature. The perceptron usually also represents the constant bias (error) term with an extra input unit, but that unit is generally omitted from diagrams. The circle in the middle is the computational unit, analogous to a neuron's nucleus. The edges connecting the input units to the computational unit are analogous to dendrites. Each edge carries a weight, that is, one parameter. The parameters are easy to interpret: if an explanatory variable is associated with the positive class, its weight is positive; if it is associated with the negative class, its weight is negative. The edge connecting the computational unit to the output unit is analogous to an axon.

Second, the activation function

The perceptron classifies samples by applying an activation function to a linear combination of the explanatory variables and the model parameters, calculated as follows. The linear combination of the explanatory variables and the model parameters is sometimes called the perceptron's preactivation.

y = φ(Σ_i w_i·x_i + b)

Here w_i are the model parameters, b is the constant bias (error) term, and φ(·) is the activation function. Several activation functions are in common use. Rosenblatt's original perceptron used the Heaviside step function (also called the unit step function) as its activation function. The formula is as follows:

g(x) = 1 if x > 0; 0 otherwise

If the weighted sum of the explanatory variables plus the bias term is greater than 0, the activation function returns 1 and the perceptron classifies the sample as positive. Otherwise, the activation function returns 0 and the perceptron classifies the sample as negative. The graph of the step function is a single jump from 0 to 1 at the origin.
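To make this concrete, here is a minimal NumPy sketch of the perceptron's computation with a unit step activation. The weights and bias below are illustrative values of our own, not trained parameters:

import numpy as np

def heaviside(z):
    # Unit step activation: 1 where the preactivation is positive, else 0.
    return np.where(z > 0, 1, 0)

def perceptron_predict(X, w, b):
    # Preactivation is the linear combination w·x + b; the activation maps it to a class.
    return heaviside(X.dot(w) + b)

# Illustrative (untrained) parameters for three features.
w = np.array([0.5, -0.2, 0.1])
b = -0.3
X = np.array([[1.0, 0.5, 2.0],
              [0.1, 0.9, 0.0]])
print(perceptron_predict(X, w, b))  # [1 0]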

Another common activation function is the logistic sigmoid. The gradients of this activation function can be computed efficiently, which matters for the artificial neural network (ANN) algorithms covered later. The formula is as follows:

φ(x) = 1 / (1 + e^(−x))

where x is the weighted sum of the inputs. This model is similar to the logistic regression model of Chapter 4: both apply the logistic function to a linear combination of the explanatory variable values and the model parameters. Although a perceptron with a logistic sigmoid activation function has the same form as logistic regression, the two estimate their parameters differently.

Third, the perceptron learning algorithm

The perceptron learning algorithm begins by setting the weights to zero or to small random values, and then predicts the class of each training sample. The perceptron is an error-driven learning algorithm: if the prediction is correct, the algorithm continues to the next sample; if the prediction is wrong, the algorithm updates the weights and predicts again. The update rule for the weights is as follows:

w_i(t+1) = w_i(t) + α(d_j − y_j(t))·x_{j,i}

For each training sample, the parameter for each explanatory variable is incremented by α(d_j − y_j(t))·x_{j,i}, where d_j is the true class of sample j, y_j(t) is the predicted class of sample j at iteration t, x_{j,i} is the value of the i-th explanatory variable of sample j, and α is a hyperparameter that controls the learning rate. If the prediction is correct, d_j − y_j(t) equals 0, so α(d_j − y_j(t))·x_{j,i} is also 0 and the weight is not updated. If the prediction is wrong, the weight is changed by the product of the learning rate, (d_j − y_j(t)), and the value of the explanatory variable.

This update rule is similar to the weight update rule of gradient descent: the weights are adjusted toward classifying the sample correctly, and the size of each update is controlled by the learning rate. Each pass through all the training samples is called an epoch. If all the samples are classified correctly at the end of an epoch, the algorithm has converged. The learning algorithm is not guaranteed to converge (for example, on a linearly inseparable data set), so it also takes a hyperparameter specifying the maximum number of epochs before the algorithm terminates.
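Putting the pieces together, here is a minimal NumPy sketch of the learning algorithm just described, assuming binary labels in {0, 1}. The function name and default hyperparameter values are illustrative choices, not part of the source:

import numpy as np

def train_perceptron(X, d, alpha=0.1, max_epochs=10):
    # Weights and bias start at zero; alpha is the learning rate.
    w = np.zeros(X.shape[1])
    b = 0.0
    for epoch in range(max_epochs):
        errors = 0
        for x_j, d_j in zip(X, d):
            y_j = 1 if x_j.dot(w) + b > 0 else 0   # predict with the step activation
            if y_j != d_j:
                w += alpha * (d_j - y_j) * x_j     # w_i <- w_i + alpha*(d_j - y_j)*x_{j,i}
                b += alpha * (d_j - y_j)           # the bias receives the same update
                errors += 1
        if errors == 0:                            # an error-free epoch means convergence
            break
    return w, b

# Example call on the cat data used later in these notes:
X = np.array([[0.2, 0.1], [0.4, 0.6], [0.5, 0.2], [0.7, 0.9]])
d = [0, 0, 0, 1]
w, b = train_perceptron(X, d)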

Fourth, binary classification with the perceptron

Let's work through a classification problem. Suppose we want to distinguish kittens from adult cats in a group of cats. The data has only two explanatory variables: the proportion of the day each cat spends sleeping and the proportion of the day it spends grumpy. The training data consists of the following four samples:

Sample   Proportion of day sleeping   Proportion of day grumpy   Class
1        0.2                          0.1                        adult cat (negative)
2        0.4                          0.6                        adult cat (negative)
3        0.5                          0.2                        adult cat (negative)
4        0.7                          0.9                        kitten (positive)

The following scatter plot shows that these samples are linearly separable:

[Figure: scatter plot of the four samples, with the adult cats marked by dots and the kitten marked by an x]

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties

# Font used for the labels in the original notes; adjust the path as needed.
font = FontProperties(fname=r"C:\WINDOWS\FONTS\MSYH.TTC", size=10)

X = np.array([[0.2, 0.1], [0.4, 0.6], [0.5, 0.2], [0.7, 0.9]])
y = [0, 0, 0, 1]

plt.scatter(X[:3, 0], X[:3, 1], marker='.', s=400)  # adult cats
plt.scatter(X[3, 0], X[3, 1], marker='x', s=400)    # kitten
plt.xlabel(u'Proportion of the day spent sleeping', fontproperties=font)
plt.ylabel(u'Proportion of the day spent grumpy', fontproperties=font)
plt.title(u'Kittens and adult cats', fontproperties=font)
plt.show()

Our goal is to train a perceptron that uses the two explanatory variables to identify a cat's type. We use the positive class to denote kittens and the negative class to denote adult cats. The training process can be illustrated with a perceptron network diagram.

(The step-by-step training walkthrough is omitted in the original.)

Fifth, document classification with the perceptron

Scikit-learn provides an implementation of the perceptron. As with the other models we have used, the constructor of the Perceptron class accepts hyperparameter settings. The Perceptron class has fit() and predict() methods, and it also provides a partial_fit() method, which allows the classifier to be trained incrementally on streaming data and then make predictions.
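For instance, here is a minimal sketch of the basic workflow on the cat data from above. The hyperparameter values are illustrative; note that in older scikit-learn versions the iteration hyperparameter is n_iter rather than max_iter:

import numpy as np
from sklearn.linear_model import Perceptron

X = np.array([[0.2, 0.1], [0.4, 0.6], [0.5, 0.2], [0.7, 0.9]])
y = [0, 0, 0, 1]

clf = Perceptron(max_iter=10, eta0=0.1)  # eta0 is the learning rate
clf.fit(X, y)
print(clf.predict(np.array([[0.3, 0.2], [0.8, 0.8]])))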

In the following example, we train a perceptron to classify documents from the 20 Newsgroups data set, a collection of nearly 20,000 news articles drawn from 20 Usenet newsgroups. This data set is commonly used for document classification and clustering experiments, and scikit-learn provides a convenient way to download and read it. We will train a perceptron to recognize three newsgroups: rec.sport.hockey, rec.sport.baseball, and rec.autos. Scikit-learn's perceptron also supports multi-class classification, using the one-versus-all strategy to train a classifier for each class in the training set. We will represent the news articles as TF-IDF-weighted bags of words. The partial_fit() method can be combined with HashingVectorizer to train on large or streaming data with limited memory.
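As an aside, a minimal sketch of that streaming setup might look like the following; the mini-batch generator and its contents are stand-ins of our own, not part of the book's example:

import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import Perceptron

vectorizer = HashingVectorizer(n_features=2**18)  # stateless: no vocabulary to fit
clf = Perceptron()
classes = np.array([0, 1, 2])                     # every label must be declared up front

def mini_batches():
    # Stand-in generator; in practice yield (texts, labels) read incrementally from disk.
    yield (["the goalie made a glove save",
            "the batter hit a home run",
            "the sedan gets good gas mileage"],
           [0, 1, 2])

for texts, labels in mini_batches():
    X_batch = vectorizer.transform(texts)         # hashing trick: single pass, fixed memory
    clf.partial_fit(X_batch, labels, classes=classes)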

First we use fetch_20newsgroups() to download and read the data. As with the other built-in data sets, the function returns an object with data, target, and target_names attributes. We also remove each article's headers, footers, and quoted text, since retaining those explanatory variables would make classification artificially easy. We use TfidfVectorizer to generate the TF-IDF vectors, train the perceptron, and then evaluate it on the test set.
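A sketch of the whole example as described, using current scikit-learn names (the random_state value is an arbitrary choice of ours):

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Perceptron
from sklearn.metrics import classification_report

categories = ['rec.sport.hockey', 'rec.sport.baseball', 'rec.autos']
remove = ('headers', 'footers', 'quotes')

# The data set ships with a standard train/test split.
train = fetch_20newsgroups(subset='train', categories=categories, remove=remove)
test = fetch_20newsgroups(subset='test', categories=categories, remove=remove)

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train.data)  # learn the vocabulary on training data only
X_test = vectorizer.transform(test.data)

clf = Perceptron(random_state=11)
clf.fit(X_train, train.target)
predictions = clf.predict(X_test)
print(classification_report(test.target, predictions))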

Source: "Machine learning with Scikit-learn"

