A Book List for Getting Started with Machine Learning (Data Mining, Pattern Recognition, etc.)

Source: Internet
Author: User

(Foreword) Yesterday I said I would write a machine learning book list, so today I am writing one. It is aimed at beginners and is very basic: suitable for second- and third-year undergraduates, and of course just as applicable if you are a fourth-year or graduate student who has never looked at machine learning. Whether you study artificial intelligence or do something else, machine learning is a must. Even the GFW uses machine learning, so we should learn the science too.

(Structure of this post) When introducing a subject, listing a pile of books and reviewing them one by one is not much use to a beginner: he does not yet know what any of it is, so a review leaves him with nothing but a bunch of cold lines of text. So I will start with what machine learning is about and what its basic content is.

(What is it for?) Machine learning, a term often used interchangeably with data mining and pattern recognition, has many definitions. In plain language, what machine learning does is this: we have some data (say, your Renren friends and their posts), and we want to process that data to get the information we care about (say, which friends you get along with best). As the example shows, machine learning is really an imitation of human intelligence, and also a route to human-level and higher intelligence.

(What is in it?) What does it basically consist of?

(The rather difficult theory of machine learning; math novices may skip this)

Part one is the underlying theory of machine learning, for example reasoning and planning, approximate computability theory, regularization, boosting theory, kernel methods, and of course the famous statistical learning theory. This part is not for beginners.

First, these theories were distilled from practice; without a grounding in the basic machine learning methods you will not understand them. Second, they demand a lot of mathematics. A first-year calculus course is nowhere near enough: you need a working knowledge of functional analysis, optimization theory, matrix theory, advanced probability theory, stochastic analysis and so on. Third, for most people who just want to use machine learning methods, I feel these theories are beside the point; if all you want is to apply machine learning, you will probably end up treating them as entertainment reading.

(The main methods of machine learning; math novices can read this part too)

Part two is machine learning methods. This is what a beginner should learn, and must learn.

(Classified by the form of the data) By the kind of data being processed, the methods can be divided into the following groups:

1. Supervised learning: your data has already been prepared well, and it is clearly marked which record belongs to which case (that is, the data is labeled).

2. Unsupervised learning: your data is raw, just a pile of numbers, and nobody knows which record belongs to which case.

3. Semi-supervised learning: supervised learning works well but demands a lot of the data, while unsupervised learning works less well but demands little. So we compromise: we first label part of the data, then use supervised learning to label the remaining unlabeled data. If the algorithm's confidence in a label falls below some threshold, that record is handed back for manual labeling.

4. Transfer learning: for example, we have a method for analyzing similarity between books. Can the same method be used to analyze similarity between all users on the Internet? That is transfer learning.

5. Reinforcement learning: learning based on feedback from the environment.

6. "All sorts of other learning": there are many exotic learning methods outside the five common ones above. They are generally clever ideas of one kind or another, backed by whatever mathematical theory the author likes to derive them from. Because they are not very mature, beginners need not dwell on this part.
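As a rough illustration of item 3, here is a minimal Python sketch of one round of that compromise. The nearest-centroid classifier, the confidence formula, the threshold and the toy numbers are all assumptions made for this example; they are not prescribed by the description above.

```python
def centroids(labeled):
    """Mean feature value per class, computed from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_with_confidence(cents, x):
    """Return (label, confidence) for a two-class problem.

    Confidence is an assumed score in [0, 1]: 0 when x is equally far
    from both centroids, 1 when x sits exactly on one centroid.
    """
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    d_best = abs(x - cents[ranked[0]])
    d_next = abs(x - cents[ranked[1]])
    if d_best + d_next == 0:
        return ranked[0], 0.0
    return ranked[0], (d_next - d_best) / (d_next + d_best)


def label_one_round(labeled, unlabeled, threshold=0.5):
    """Auto-label confident points; return the rest for manual labeling."""
    cents = centroids(labeled)
    auto_labeled, needs_human = [], []
    for x in unlabeled:
        label, conf = predict_with_confidence(cents, x)
        if conf >= threshold:
            auto_labeled.append((x, label))
        else:
            needs_human.append(x)
    return auto_labeled, needs_human


# Toy one-dimensional data: a few labeled points and some unlabeled ones.
labeled = [(1.0, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b")]
unlabeled = [1.5, 8.5, 5.2, 0.5]
auto_labeled, needs_human = label_one_round(labeled, unlabeled)
print(auto_labeled)
print(needs_human)
```

In practice the auto-labeled points would be added to the labeled set and the round repeated; a single round is shown to keep the sketch short.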

(What kind of data do we generally deal with?)

The data we deal with generally looks like a table. Put plainly, each record is a vector. (A few days ago I saw someone say that a vector is a direction, which left me speechless. Even a vector in physics need not be pictured as a direction. A vector can be imagined as a geometric object, and that is the basis of analysis, but do not cling rigidly to the geometry.) Since each record is a vector, all of the data lives in a vector space. Thinking about this space takes some capacity for abstraction: it is not necessarily Euclidean space, and may even be a topological space ... You will see why later.
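To make the "each record is a vector" idea concrete, here is a small Python sketch. The column names and numbers are invented for illustration, and Euclidean distance is only one possible way to compare two records.

```python
import math

# Hypothetical table: one row per user, one column per feature.
columns = ["age", "posts_per_week", "friends"]
rows = {
    "alice": [20.0, 5.0, 120.0],
    "bob": [21.0, 4.0, 110.0],
    "carol": [35.0, 0.0, 15.0],
}


def euclidean(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


d_ab = euclidean(rows["alice"], rows["bob"])
d_ac = euclidean(rows["alice"], rows["carol"])
print(d_ab < d_ac)  # alice is closer to bob than to carol in this space
```

Note that the "friends" column dominates the distance here simply because its numbers are larger, which is one reason real pipelines usually rescale features first.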

That is the general picture. Now let's look at some of the basic machine learning methods. Of course, if you have read about more methods than these, do not be surprised; I am only listing the very, very common ones.

(The most common machine learning methods = basic methods × extension methods × application areas)

(Basic methods)

1. Association analysis: the data is a set of sales records, and we want to find out which products are often bought together. There are two main methods. The first is Apriori, which relies mainly on pruning; similar to it are AIS and SETM, where SETM is an association analysis algorithm designed for SQL. The second is FP-growth, which mainly builds a tree and uses that structure to speed up the algorithm. There are also vertical-format mining and array-based methods.

2. Decision trees: there is a mind-reading app that keeps asking you about the person you are thinking of and narrows things down by deduction until it finds that person. It looks magical, but a decision tree can roughly do the job. A decision tree is a tree in which every edge carries a condition; the root is the starting node and the leaves are result nodes. Starting from the root, you keep following the edge whose condition matches your information to the next node, until you reach a leaf, which gives the result. Decision trees are a large class of algorithms, mainly ID3, C4.5 and so on.

3. Perceptron: remember the vector space I just mentioned? Each vector can be represented as a point in space. Suppose we can find a straight line that divides all the points into two parts, one part being class A and the other class B. Then, given a new point, we only need to see which side of the line it falls on to determine its class. The perceptron is a large class of algorithms, too many to list one by one.

4. Support vector machine: an upgraded version of the perceptron. Anyone who has studied functional analysis knows that a complete inner product space is a Hilbert space, and kernel methods operate in Hilbert space. The support vector machine improves the perceptron using the maximum-margin principle and the kernel method, and achieves relatively good results. Support vector machines are also a large class of algorithms.

5. Feedforward neural network (the back-propagation network): an upgraded version of the perceptron. A perceptron is a linear function; if we nest several linear functions inside one another and use nonlinear activations to describe complex surfaces in the vector space, we get better results than the perceptron. Here is a question: what would a hybrid of a support vector machine and a feedforward neural network be?

6. Neural networks: in fact, neural networks include feedforward networks. I listed the feedforward network separately because it is used so much and because it descends from the perceptron. But neural networks themselves are a very, very rich, large and complex class of algorithms. Let me attempt a classification: hierarchical networks, time-delay neural networks, coupled neural networks, self-organizing neural networks, recurrent neural networks (somewhat like time-delay networks, but differing slightly in the treatment of continuous and discrete quantities; the continuous ones are implemented with analog circuits), radial basis function networks (these are actually regularization networks; the general RBF network is the Tikhonov regularization of the feedforward network), ensemble neural networks, fuzzy neural networks, Boltzmann machines (a network that uses simulated annealing), probabilistic neural networks and so on. There is also neural field theory, which requires differential geometry and belongs to the underlying theory of machine learning; beginners can ignore it. There are also people trying to design neural network computers; beginners can ignore that too. Neural networks really are remarkable: even PCA, ICA, LDA (linear discriminant analysis) and LDA (latent Dirichlet allocation) can be learned with neural networks.

7. Statistical decision methods: decision methods designed on the basis of statistical theory. Statistical decision theory is very useful and contributes many methods to machine learning, such as minimax (minimizing the maximum loss), sequential decisions, parameter estimation and so on. Naive Bayes is one of them. This is also a large class of algorithms.

8. Bayesian networks: a method supported by the theory of reasoning and planning.

9. Sequence analysis methods: learning that analyzes sequences. A language is a sequence of words, hence things like hidden Markov models.

10. Logistic regression: if you have studied ecology, the logistic equation and logistic regression will be familiar. In fact it is much the same sort of thing as the perceptron. Both it and the learning of hidden Markov models can use a principle called maximum entropy. The maximum entropy principle can in turn be derived from information theory using the Euler-Lagrange equation of the calculus of variations, which is also an end-of-chapter exercise in Duda's "Pattern Classification".

11. Clustering methods: we have a pile of data and want to know which groups the items fall into. This is also a large class of methods. Commonly used ones are k-means, hierarchical clustering, density-based clustering, model-based clustering, and graph clustering algorithms (including ant colony clustering).

12. Data processing methods: such as principal component analysis (PCA), linear discriminant analysis (LDA), independent component analysis (ICA), and so on.

13. Other: sorry, although I have written down quite a few, I feel the basic methods go beyond these, and summarizing them all is a bit difficult. But the main ones should be as listed. Additions are welcome.
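To give one of the basic methods above a concrete shape, here is a minimal sketch of the perceptron from item 3 in Python. It uses the classic mistake-driven update rule; the learning rate, epoch limit and toy points are assumptions chosen for illustration, and the data is deliberately linearly separable.

```python
def train_perceptron(points, labels, epochs=100, lr=1.0):
    """Learn weights w and bias b so that sign(w . x + b) matches labels.

    points: list of feature tuples; labels: +1 or -1 for each point.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # wrong side of the line (or on it)
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every point is on its correct side
            break
    return w, b


def classify(w, b, x):
    """Report which side of the learned line the point x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Two well-separated groups of points in the plane.
points = [(2.0, 3.0), (3.0, 3.5), (2.5, 4.0),
          (-1.0, -2.0), (-2.0, -1.5), (-1.5, -3.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(points, labels)
print(classify(w, b, (3.0, 3.0)))
print(classify(w, b, (-3.0, -3.0)))
```

If the two classes cannot be separated by a line, this loop never reaches zero mistakes and simply stops after the epoch limit, which is one motivation for the upgraded methods in items 4 and 5.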

(Extension methods)

1. Online: data today keeps arriving and keeps being updated. Because the data is huge, we cannot rerun everything from scratch each time it changes, so we make the algorithm incremental; this is called the online approach. Each of the basic methods above has an online variant that can be looked up.

2. Distributed and parallel: distributed and parallel versions of all the basic methods above, aimed at big data.

3. Correcting overfitting: most of the basic methods above suffer from overfitting. The noise in the data gets fitted too, so the learning result becomes worse. The information we should have obtained is y = x + 1, but what we get is y = (x^100 + 1)/(x^99 + 1) + 1. Obviously the latter is far too elaborate and performs poorly. Most of the basic methods above can be repaired with techniques that correct overfitting. Regularization is a good one.

4. Mixing in all kinds of mathematics: yes, you read that right, all kinds of mathematics get mixed in. Mixing in fuzzy mathematics produces a pile of new methods such as fuzzy SVMs and fuzzy neural networks. Mixing in more general mathematics gives quotient spaces and granular computing. Mixing in Lie groups gives Lie group machine learning. Mixing in differential geometry gives manifold learning. I think all of these can be read as entertainment.
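The overfitting point in item 3 can be sketched numerically. The example below, in plain Python, fits noisy samples of y = x + 1 with a degree-7 polynomial, once by ordinary least squares and once with an L2 (ridge) penalty. The degree, the noise level, the penalty strength and the small linear solver are all assumptions made for the illustration; ridge is only one form of regularization.

```python
import random


def poly_features(x, degree):
    """Map a scalar x to [1, x, x^2, ..., x^degree]."""
    return [x ** k for k in range(degree + 1)]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit(xs, ys, degree, lam):
    """Minimize squared error + lam * ||w||^2 via the normal equations.

    lam = 0 gives plain least squares. For simplicity the intercept
    term is penalized as well.
    """
    X = [poly_features(x, degree) for x in xs]
    d = degree + 1
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(d)]
    return solve(A, b)


def predict(w, x):
    return sum(wk * x ** k for k, wk in enumerate(w))


def train_error(w, xs, ys):
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys))


def sq_norm(w):
    return sum(wk * wk for wk in w)


random.seed(0)
xs = [-1 + 2 * i / 9 for i in range(10)]
ys = [x + 1 + random.gauss(0, 0.1) for x in xs]  # true relation: y = x + 1

w_plain = fit(xs, ys, degree=7, lam=0.0)
w_ridge = fit(xs, ys, degree=7, lam=0.1)
print(sq_norm(w_plain), sq_norm(w_ridge))
```

The unpenalized fit always achieves the lower training error, because it is free to chase the noise; the penalty trades a little training error for smaller, tamer coefficients.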

(Application areas)

1. Applied to graphs, it becomes graph mining.

2. Applied to databases and data warehouses, it becomes data mining.

3. Applied to social networks, it becomes network science.

4. Applied to natural language processing, it becomes statistical natural language processing (there are many inaccuracies here; take it all as entertainment).

5. Applied to your own field, it suddenly becomes ...

...

Part three is applications of machine learning: learn these from actual applications; there is not much more to say.

(Here come the books!)

In fact, I think that if you really have a general understanding of machine learning, which book you use does not matter much. Take the topics above, search Baidu for some blog posts and read those; that may suit you better. Or go and find some papers. Still, I will recommend a few introductory books here.

In fact, if you search for machine learning on Amazon, I have probably read every book on the first page of results, most of those on the second page, and on the third page ... You could do the same ...

I once wrote up this book list: http://blog.renren.com/blog/389867835/847240971?bfrom=011300082

If I have to recommend some:

1. Duda's "Pattern Classification". Personally I actually prefer "Modern Pattern Recognition", but Chinese readers have a strong bias toward foreign books. It is true that plagiarism is a serious problem among domestic books, and many of them are published for the sake of publishing rather than for people to read. But there is no denying that China has many good books too!

2. "Modern Pattern Recognition" (second edition): a really good book. In my personal view it undoubtedly beats Duda's.

3. Mitchell's "Machine Learning".

4. The first and second books need a lot of mathematics. Whether you buy or borrow them, be sure to read the explanation of the mathematical background in the preface and first part, otherwise you will simply get stuck on the math. The third does not need much mathematics, but it covers too little and touches on some machine learning theory only superficially, which is also troublesome. In that respect it is not as good as "Introduction to Machine Learning", which uses the least mathematics, still covers a fair amount, is also thin, and is very readable.

Actually, that is enough. Why? Because a lot of people ask me what to read (because they think I read books and papers all day).

But they keep asking. You recommend them a few, and whether or not they have read those, they ask you to recommend more ...

To tell the truth, learning any new subject is never smooth sailing. If you think that, with no foundation at all, you can polish off a 200-page machine learning book in an afternoon, then I think you ... are overthinking it.

So, put plainly, the book is not what matters. What matters is your determination, whether you really want to learn this subject! If you have a dabbler's mentality, then simply do not bother reading, because a dabbler's resolve is not enough to understand any subject! If you want to learn, grit your teeth, keep your eyes on the page, and read it over and over again.
