This article builds an intuition for the most commonly used machine-learning algorithms: no complex theoretical derivations, just illustrations of what these algorithms are and how they are applied. The examples are mainly classification problems.

For each algorithm I watched several videos and picked the clearest and most interesting ones to present, to keep things accessible.

When there is time, I will analyze the individual algorithms in more depth.

Today's algorithms are as follows:

- Decision Tree
- Random Forest
- Logistic Regression
- SVM
- Naive Bayes
- K-Nearest Neighbors
- K-Means
- AdaBoost
- Neural Networks
- Markov Chains

1. Decision Tree

At each node the tree asks a question about some feature; based on the answer, the data is split into two branches, and the questions continue down the tree. The questions are learned from the existing data; when new data comes in, it follows the questions down the tree until it lands in the appropriate leaf.
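
The question-asking idea can be sketched in a few lines of Python; the features (body weight, barking) and the thresholds below are invented for illustration, not learned from real data:

```python
# A hand-built decision tree: each node asks a question about a feature.
# Hypothetical features: body weight in kg, and whether the animal barks.
def classify(weight_kg, barks):
    if weight_kg > 20:      # first question
        return "dog"
    if barks:               # second question, on the lighter branch
        return "dog"
    return "cat"            # light and silent: lands in the cat leaf

print(classify(30, False), classify(4, True), classify(4, False))
```

A real tree learns these questions and thresholds from the training data instead of having them written by hand.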

2. Random Forest

Randomly select data from the source data to form several subsets.

The matrix S is the source data, with rows 1 to n; a, b, c are the features, and the last column C is the category.

From S, m sub-matrices are generated by random sampling.

These m subsets are used to train m decision trees.

New data is put into all m trees, giving m classification results; the category that receives the most votes is taken as the final prediction.
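The whole pipeline above (bootstrap m subsets, train m models, majority vote) can be sketched like this; the one-feature toy dataset and the threshold "stumps" standing in for full decision trees are my own simplifications:

```python
import random
from collections import Counter

# Bootstrap m subsets from the source data, train one model per subset,
# then let the m models vote. The "trees" here are one-feature threshold
# stumps, a deliberate simplification of real decision trees.
random.seed(0)
data = [(x, "A" if x < 5 else "B") for x in range(10)]  # toy labelled data

def train_stump(subset):
    # brute-force the threshold that classifies this subset best
    best_t, best_acc = 0, -1
    for t in range(11):
        acc = sum((x < t) == (label == "A") for x, label in subset)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def bootstrap(rows):
    return [random.choice(rows) for _ in rows]  # sample with replacement

stumps = [train_stump(bootstrap(data)) for _ in range(7)]  # m = 7 trees

def predict(x):
    votes = Counter("A" if x < t else "B" for t in stumps)
    return votes.most_common(1)[0][0]  # the category with the most votes

print(predict(0), predict(9))
```

Because each stump sees a different random subset, their individual thresholds vary, but the majority vote is stable.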

3. Logistic Regression

When the target to predict is a probability, the output must satisfy two constraints: greater than or equal to 0, and less than or equal to 1. A simple linear model cannot guarantee this, because over its domain the predicted value is not confined to that interval; the range goes beyond it.

So it would be better to have a model whose output has this S-shape.

So how do you get such a model?

The model needs to satisfy the two conditions: greater than or equal to 0, and less than or equal to 1.

For "greater than or equal to 0" we could choose an absolute value, a square, or an exponential function; here the exponential is used, since it is always greater than 0.

For "less than or equal to 1" we use division: the numerator is the expression itself and the denominator is the expression plus 1, which is necessarily less than 1.

A little more rearranging gives the logistic regression model.

The corresponding coefficients can then be computed from the source data.

Finally we get the logistic curve.
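
The two conditions can be checked directly in code: an exponential keeps the value above 0, and dividing it by itself plus 1 keeps it below 1, which is exactly the sigmoid that logistic regression uses:

```python
import math

# Condition 1: exp(z) is always > 0. Condition 2: dividing by itself + 1
# keeps the value below 1. Together they give the sigmoid.
def sigmoid(z):
    return math.exp(z) / (math.exp(z) + 1.0)

for z in (-10, 0, 10):
    assert 0.0 < sigmoid(z) < 1.0  # stays inside (0, 1) even for extremes
print(sigmoid(0))  # 0.5
```

Fitting the model means finding the coefficients of the linear expression that goes into z; the sigmoid then turns that score into a probability.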

4. SVM

Support Vector Machine

To separate the two classes, we want to find a hyperplane. The best hyperplane is the one whose margin to the two classes is maximal, where the margin is the distance from the hyperplane to the points nearest to it. In the figure, Z2 > Z1, so the green hyperplane is the better one.

The hyperplane is expressed as a linear equation: for one class the expression is greater than or equal to 1, and for the other class it is less than or equal to -1.

The distance from a point to the plane is calculated with the formula in the diagram.

The expression for the total margin follows; to maximize the margin we need to minimize the denominator, which turns this into an optimization problem.

For example: given three points, find the best hyperplane, with the weight vector constrained to the direction (a, 2a).

Substituting the two support vectors into the equation, with the point (2, 3) giving the value 1 and the other support vector giving the value -1, we can solve for a and the intercept w0, and then obtain the expression for the hyperplane.

Once a is obtained, substituting it back into (a, 2a) gives the support vector.

Substituting a and w0 into the hyperplane equation gives the support vector machine.
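
The worked example can be verified numerically. I assume here that the second support vector is the point (1, 1); the solved values are consistent with that assumption:

```python
# Constrain the weight vector to (a, 2a). The support-vector conditions
#   a*2 + 2a*3 + w0 = +1   (point (2, 3))
#   a*1 + 2a*1 + w0 = -1   (assumed point (1, 1))
# give 8a + w0 = 1 and 3a + w0 = -1, hence 5a = 2.
a = 2.0 / 5.0
w0 = -1.0 - 3.0 * a     # = -11/5
w = (a, 2 * a)          # the solved weight vector (2/5, 4/5)

def decision(x1, x2):
    return w[0] * x1 + w[1] * x2 + w0

print(decision(2, 3))   # approx +1 on one support vector
print(decision(1, 1))   # approx -1 on the other
```

Points with a decision value above 0 fall on one side of the hyperplane, points below 0 on the other.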

5. Naive Bayes

An example application in NLP:

Given a piece of text, return its sentiment classification: is the attitude of the text positive or negative?

To solve this problem, we only need to look at some of the words.

The text is then represented only by some words and their counts.

The original question is: given a sentence, which class does it belong to?

Through Bayes' rule, it becomes a simpler problem that is easier to solve.

The question becomes: what is the probability of this sentence appearing in a given class? And of course, don't forget the other two probabilities in the formula.

For example: the probability of the word "love" appearing in a positive text is 0.1, while in a negative text it is 0.001.
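A minimal sketch of the Bayes-rule classifier; all word probabilities below (including the 0.1 vs 0.001 figures for "love") are illustrative numbers, not estimates from a real corpus:

```python
# Score each class by prior * product of per-word likelihoods, which is
# proportional to P(class | words) by Bayes' rule.
word_probs = {
    "positive": {"love": 0.1, "awful": 0.001},
    "negative": {"love": 0.001, "awful": 0.1},
}
priors = {"positive": 0.5, "negative": 0.5}

def classify(words):
    scores = {}
    for label in priors:
        p = priors[label]
        for w in words:
            p *= word_probs[label].get(w, 0.01)  # small default for unseen words
        scores[label] = p
    return max(scores, key=scores.get)

print(classify(["love"]))  # positive
```

In practice the per-word probabilities are counted from a labelled training corpus, usually with smoothing for unseen words.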

6. K-Nearest Neighbors

K Nearest Neighbours

Given a new data point, look at which category the majority of its k nearest points belong to; the new point is assigned to that category.

For example: to distinguish a cat from a dog using two features, claws and sound. The circles and triangles are already classified; which class does the star represent?

With k = 3, the three lines connect the three nearest points. Circles are the majority, so this star is a cat.
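
The cat/dog example can be sketched as follows; the feature values for claws and sound are made up:

```python
from collections import Counter

# Labelled animals with two made-up features: (claw sharpness, sound pitch).
train = [((1, 9), "cat"), ((2, 8), "cat"), ((3, 9), "cat"),
         ((8, 2), "dog"), ((9, 1), "dog"), ((7, 3), "dog")]

def knn(point, k=3):
    sq_dist = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    nearest = sorted(train, key=lambda row: sq_dist(row[0], point))[:k]
    # majority vote among the k nearest neighbours
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(knn((2, 9)))  # cat
```

There is no training step at all; the work happens at prediction time, by comparing against the stored examples.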

7. K-Means

We want to divide a set of data into three categories; the pink points have large values and the yellow points small values.

First, initialize: here the simplest choice, 3, 2, 1, is taken as the initial centre of each category.

For each of the remaining data points, calculate the distance to the three centres and assign it to the category of the nearest one.

After the categories are assigned, the average of each category is calculated and used as the centre point for the next round.

After a few rounds, when the grouping no longer changes, you can stop.
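
The four steps above, run on a toy 1-D dataset and starting from the initial centres 3, 2, 1 as in the text:

```python
# 1-D k-means following the steps above: assign each point to the nearest
# centre, recompute each centre as its cluster's average, repeat until stable.
data = [1, 2, 3, 8, 9, 10, 23, 24, 25]
centres = [3.0, 2.0, 1.0]  # the simple initial values from the text

while True:
    clusters = [[] for _ in centres]
    for x in data:
        nearest = min(range(len(centres)), key=lambda i: abs(x - centres[i]))
        clusters[nearest].append(x)
    new = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    if new == centres:  # grouping no longer changes -> stop
        break
    centres = new

print(sorted(centres))  # [2.0, 9.0, 24.0]
```

Even though the initial centres 3, 2, 1 are all near the small values, the reassign-and-average loop pulls them apart onto the three natural groups.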

8. AdaBoost

AdaBoost is one of the boosting methods.

Boosting combines several classifiers that do not classify well on their own into a classifier that performs better.

In the figure, each of the two decision trees is not very good taken alone, but when the same data is put through both and the two results are considered together, the credibility increases.

An AdaBoost example: handwriting recognition. On the drawing board, many features can be captured, such as the direction of the starting stroke and the distance between the start and end points.

During training, each feature gets a weight. For example, the written digits 2 and 3 begin very similarly, so this feature contributes little to the classification and its weight will be small.

But this alpha angle is strongly discriminative, so that feature's weight will be large. The final result is a weighted combination of all these feature results.
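
The weighted-vote idea in miniature; the two weak classifiers and their alpha weights below are hand-picked for illustration, not produced by the actual AdaBoost training loop:

```python
# Two weak classifiers on a 2-feature input; the alpha weights reflect how
# reliable each one is. All values are hand-picked for illustration.
def weak_1(x):          # looks at the weakly informative feature
    return 1 if x[0] > 0 else -1

def weak_2(x):          # looks at the strongly informative feature
    return 1 if x[1] > 0 else -1

alphas = [0.2, 1.5]     # small weight for the weak feature, large for the strong

def strong(x):
    score = alphas[0] * weak_1(x) + alphas[1] * weak_2(x)
    return 1 if score > 0 else -1

print(strong((-1, 1)))  # weak_2 outweighs weak_1: classified as +1
```

Real AdaBoost computes each alpha from the weak classifier's error rate and reweights the training examples between rounds; only the final weighted vote is shown here.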

9. Neural Networks

Neural networks are suited to problems where one input may fall into at least two categories.

An NN consists of several layers of neurons and the connections between them.

The first layer is the input layer, and the last layer is the output layer.

Both the hidden layers and the output layer have their own classifiers.

The input is fed into the network and activates the first layer; the computed scores are passed on, activating the next layer of neurons, and so on. The scores on the nodes of the final output layer represent the scores of the various classes; in the example, the input is classified as class 1.

The same input passed to different nodes gives different results, because each node has its own weights and bias.

This is forward propagation.
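
Forward propagation can be sketched directly; the weights and biases below are hand-picked, not trained:

```python
import math

# One hidden layer, forward pass only. Each neuron computes
# sigmoid(w . x + b); the weights and biases are hand-picked, not trained.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

x = [1.0, 0.5]                                       # input layer
hidden = layer(x, [[0.8, -0.4], [0.3, 0.9]], [0.0, -0.1])
output = layer(hidden, [[1.2, -0.7], [-1.2, 0.7]], [0.0, 0.0])
print(output)  # one score per class; the larger score wins
```

Training would adjust these weights and biases by backpropagation; the forward pass shown here is what produces the class scores at prediction time.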

10. Markov Chains

A Markov chain is made up of states and transitions.

For example, build a Markov chain from the sentence "The quick brown fox jumps over the lazy dog".

First, make each word a state, then calculate the transition probabilities between the states.

These are probabilities computed from a single sentence. When you run the statistics over a large amount of text, you get a larger state-transition matrix, for example all the words that can follow "the" and their corresponding probabilities.

In everyday life, the suggestion list of a keyboard input method works on the same principle; the models are just more advanced.
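
Counting the transitions of the example sentence gives the chain directly:

```python
from collections import Counter, defaultdict

# Each word is a state; transition probabilities are counted from the sentence.
text = "The quick brown fox jumps over the lazy dog".lower().split()

counts = defaultdict(Counter)
for a, b in zip(text, text[1:]):
    counts[a][b] += 1  # how often word b follows word a

transitions = {a: {b: n / sum(c.values()) for b, n in c.items()}
               for a, c in counts.items()}

print(transitions["the"])  # 'the' is followed by 'quick' or 'lazy', 50/50
```

Counting bigrams over a large corpus instead of one sentence yields the big transition matrix the text describes, which is what next-word suggestion draws on.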
