Chapter summary
A brief introduction to machine learning.
Chapter 1, Introduction: basic terminology, hypothesis space, inductive preference, development history and current applications
Chapter 1: Introduction
Machine learning studies algorithms that produce a model from data, namely learning algorithms.
Basic terminology
Classification and regression
Simply put, if the prediction is a discrete value, the task is classification; if the prediction is a continuous value, the task is regression.
Generalization
Generalization is the ability of a learned model to apply to new samples.
Hypothesis space
In general, machine learning learns from samples. It is a process of induction, going from the particular to the general, and therefore belongs to inductive learning.
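To make learning from samples concrete, here is a minimal sketch in plain Python. It induces one model with discrete outputs (classification) and one with continuous outputs (regression) from a handful of training samples, then applies each to a sample it has never seen (generalization). The data, the labels "ripe" and "unripe", and the function names are all hypothetical, and the two learners (nearest neighbor and a least-squares line) are chosen only for brevity.

```python
# All data below is made up for illustration.

def nearest_neighbor_classify(train, x):
    """Classification: return the discrete label of the closest training sample."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label

def fit_line(train):
    """Regression: least-squares fit of y = a * x + b to (x, y) pairs."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Classification samples: (feature value, discrete label).
cls_train = [(1.0, "unripe"), (1.5, "unripe"), (3.0, "ripe"), (3.5, "ripe")]

# Regression samples: (x, y), generated here from y = 2x + 1.
reg_train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# Induction: go from specific samples to a general rule.
a, b = fit_line(reg_train)

# Generalization: apply the learned models to samples outside the training set.
predicted_label = nearest_neighbor_classify(cls_train, 3.2)  # a discrete value
predicted_value = a * 10.0 + b                               # a continuous value

print(predicted_label, predicted_value)
```

Neither 3.2 nor 10.0 appears in the training data, so both predictions depend on how well the induced rule generalizes.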
Scientific reasoning has two basic means: induction and deduction. The two can be viewed as opposite processes:
1. Induction: learning from examples, going from the special to the general; it is a process of generalization.
2. Deduction: deriving theorems from axioms, going from the general to the special; it is a process of specialization. [1]
Inductive preference
Inductive preference (inductive bias) is the preference of a machine learning algorithm for certain types of hypotheses during learning.
An effective machine learning algorithm must have an inductive bias; otherwise it cannot produce a definite learning result. For example, if the same thing is sometimes classified as A and sometimes as B, the learning result is obviously meaningless.
Whether the inductive bias of an algorithm matches the problem itself directly determines, most of the time, whether the algorithm can achieve good performance.
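The need for a bias can be sketched with a tiny hypothesis space. The example below is an illustration under assumed data, not taken from the source: the attributes (`color`, `sound`), their values, and the two preference rules are all hypothetical. It enumerates every conjunctive hypothesis, keeps those consistent with the training set, and shows that two different biases pick different hypotheses and therefore disagree on a new sample.

```python
from itertools import product

# Hypothetical attributes and values, chosen only for illustration.
COLORS = ["green", "black"]
SOUNDS = ["dull", "crisp"]
WILDCARD = "*"  # "any value is acceptable for this attribute"

def matches(hypothesis, sample):
    """A hypothesis labels a sample positive if every constraint is satisfied."""
    return all(h == WILDCARD or h == v for h, v in zip(hypothesis, sample))

# Hypothesis space: every combination of (color constraint, sound constraint).
hypothesis_space = list(product(COLORS + [WILDCARD], SOUNDS + [WILDCARD]))

# Made-up training set: (sample, is_positive).
training_set = [
    (("green", "dull"), True),
    (("black", "dull"), False),
    (("black", "crisp"), False),
]

# Keep only the hypotheses that agree with every training sample.
version_space = [
    h for h in hypothesis_space
    if all(matches(h, x) == y for x, y in training_set)
]

def most_specific(hypotheses):
    """Bias 1: prefer the hypothesis with the fewest wildcards."""
    return min(hypotheses, key=lambda h: h.count(WILDCARD))

def most_general(hypotheses):
    """Bias 2: prefer the hypothesis with the most wildcards."""
    return max(hypotheses, key=lambda h: h.count(WILDCARD))

new_sample = ("green", "crisp")
prediction_specific = matches(most_specific(version_space), new_sample)
prediction_general = matches(most_general(version_space), new_sample)

print(version_space)        # [('green', 'dull'), ('green', '*')]
print(prediction_specific)  # False
print(prediction_general)   # True
```

Both surviving hypotheses fit the training data perfectly, so the data alone cannot decide between them. Only the bias makes the prediction for the new sample definite.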
The importance of the bias is reflected in the No Free Lunch (NFL) theorem:
Under the premise that all "problems" are equally likely to occur, or that all problems are equally important, good learning algorithms and bad learning algorithms have the same expected performance. That is, the total error is independent of the learning algorithm. (For the mathematical proof, see pp. 8-9 of the book.)
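The statement can be checked by brute force on a toy problem. The sketch below is not the book's proof; it is a small enumeration under assumed settings: a four-point input space, binary labels, two training points, and two hypothetical learners. It sums the error on the unseen points over every possible target function, and then over a restricted family of targets.

```python
from itertools import product

# Hypothetical toy setting: four inputs, two seen in training, two unseen.
X = [0, 1, 2, 3]
TRAIN_X = [0, 1]
TEST_X = [2, 3]

def majority_learner(train):
    """A 'sensible' bias: predict the most common training label (ties go to 1)."""
    ones = sum(label for _, label in train)
    guess = 1 if 2 * ones >= len(train) else 0
    return lambda x: guess

def contrarian_learner(train):
    """A deliberately 'bad' bias: predict the opposite of the majority learner."""
    majority = majority_learner(train)
    return lambda x: 1 - majority(x)

def total_off_training_error(learner, targets):
    """Sum the number of mistakes on unseen inputs over the given target functions."""
    total = 0
    for labels in targets:
        target = dict(zip(X, labels))
        train = [(x, target[x]) for x in TRAIN_X]
        hypothesis = learner(train)
        total += sum(hypothesis(x) != target[x] for x in TEST_X)
    return total

# Every possible target function on X, each treated as equally likely.
all_targets = list(product([0, 1], repeat=len(X)))
error_majority = total_off_training_error(majority_learner, all_targets)
error_contrarian = total_off_training_error(contrarian_learner, all_targets)
print(error_majority, error_contrarian)  # 16 16

# A specific family of problems: only constant target functions.
constant_targets = [(0, 0, 0, 0), (1, 1, 1, 1)]
print(total_off_training_error(majority_learner, constant_targets))    # 0
print(total_off_training_error(contrarian_learner, constant_targets))  # 4
```

Averaged over all 16 targets the two learners make the same total number of mistakes, but once the targets are restricted to a particular family, the learner whose bias matches that family is clearly better.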
In practice, however, machine learning is usually applied to a specific task, so some learning algorithms are naturally better than others for that task. It is meaningless to ask which learning algorithm is better without reference to the specific problem.
Development history and current applications
The National Science Foundation has highlighted three key technologies for in-depth research and integration in the big data era: machine learning, cloud computing, and crowdsourcing. They provide data analysis, data processing, and data labeling capabilities, respectively.
[1] Chen-Ning Yang's classic speech: The Influence of the Book of Changes on Chinese Culture