Machine Learning Principles, Implementation, and Practice: Introduction to Machine Learning
If a system can improve its performance by executing a process, this is learning. --- Herbert A. Simon
1. What is machine learning?
Machine learning is a discipline in which computers build probabilistic and statistical models from data and use those models to predict and analyze data.
This definition implies the following:
- Machine learning is built on computers and networks.
- Machine learning is driven by data.
- The purpose of machine learning is to predict and analyze data.
- Machine learning is model-centered: it builds, optimizes, and uses models for prediction.
- Machine learning models rest on probability and statistics and draw heavily on both.
- Machine learning is an interdisciplinary field spanning probability theory, statistics, information theory, computational theory, optimization theory, and computer science, and it has gradually developed its own theoretical system and methodology.
2. Machine Learning Objects
The object of machine learning is data. It starts from data, extracts features, abstracts a model, discovers knowledge in the data, and then returns to the analysis and prediction of data. Data takes many forms, including the numbers, text, images, video, and audio found on computers and networks, and combinations of these.
So what kind of data can be abstracted and learned from?
The basic assumption machine learning makes about data is that data of the same kind exhibits statistical regularity. Data of the same kind means data that shares some common property. Because such data follows statistical regularities, it can be handled with probability and statistics: random variables describe the features of the data, and probability distributions describe its statistical regularities.
In practice, each data sample is usually represented as a feature vector:
$ X = (x^{(1)}, x^{(2)}, \dots, x^{(i)}, \dots, x^{(n)})^T $
Data can be discrete or continuous.
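As a minimal sketch of the feature-vector idea, the Python snippet below represents samples as lists of features and estimates an empirical distribution for one discrete feature. The feature names and all values are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical feature names for one sample; all values below are made up.
feature_names = ["floor_area_m2", "num_rooms", "has_garage"]

# X = (x^(1), x^(2), x^(3))^T: one continuous feature followed by two discrete ones.
x = [82.5, 3, 1]

n = len(x)   # dimension of the feature vector
x_2 = x[1]   # x^(2): Python indices start at 0, the notation starts at 1

# A data set is a collection of such vectors, one per sample.
dataset = [
    [82.5, 3, 1],
    [45.0, 1, 0],
    [120.0, 5, 1],
]

# Summary statistic of the continuous feature x^(1).
mean_area = sum(row[0] for row in dataset) / len(dataset)

# Empirical probability distribution of the discrete feature x^(3).
garage_counts = Counter(row[2] for row in dataset)
garage_dist = {value: count / len(dataset) for value, count in garage_counts.items()}

print(dict(zip(feature_names, x)))
print(mean_area, garage_dist)
```

Treating each feature as a random variable in this way is what lets a probability distribution, here a simple frequency table, stand in for the statistical regularity of the data.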
3. The purpose of machine learning
Machine learning is used to predict and analyze data, especially new, previously unseen data.
Its overall goal is to decide what kind of model to learn and how to learn it so that the model predicts and analyzes data accurately, while also keeping learning as efficient as possible.
4. Machine Learning Methods
Machine learning builds statistical models from data in order to predict and analyze data. It comprises supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
Supervised learning starts from a given, finite training data set and assumes that the data is independent and identically distributed (i.i.d.). It further assumes that the model to be learned belongs to a set of functions called the hypothesis space. An evaluation criterion is then applied to select from the hypothesis space the model that gives the best predictions, under that criterion, on both the known training data and unknown test data. The selection of the optimal model is carried out by an algorithm.
The hypothesis space of models, the criterion for model selection, and the algorithm for model learning are the three elements of machine learning, referred to for short as model, strategy, and algorithm.
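One common way to make the three elements concrete is empirical risk minimization, shown here only as an illustrative sketch. The symbols are assumptions introduced for this example and do not come from the text above: $\mathcal{F}$ is the hypothesis space (the model), $L$ is a loss function, and $(x_i, y_i)$, $i = 1, \dots, N$, are the training samples, where the subscript indexes samples rather than vector components. Under these assumptions the strategy selects

$ \hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^{N} L(y_i, f(x_i)) $

and the algorithm is whatever numerical procedure solves this minimization, for example gradient descent or a closed-form solution when one exists.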
Implementing a machine learning method can be summarized in the following steps:
- Obtain a finite training data set.
- Determine the hypothesis space containing all candidate models, that is, the set of models to learn from.
- Determine the criterion for model selection, that is, the learning strategy.
- Implement an algorithm for finding the optimal model, that is, the learning algorithm.
- Use the learning method to select the optimal model.
- Use the learned optimal model to predict or analyze new data.
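The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a general implementation: the training data is made up, the hypothesis space is assumed to be straight lines f(x) = w * x + b, the strategy is mean squared error, and the algorithm is closed-form least squares for a single feature. The function names are hypothetical.

```python
# Step 1: a finite training set of (x, y) pairs (made-up values on the line y = 2x + 1).
train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


# Step 2: hypothesis space = all functions f(x) = w * x + b.
def predict(w, b, x):
    return w * x + b


# Step 3: strategy = mean squared error on the training set.
def empirical_risk(w, b, data):
    return sum((y - predict(w, b, x)) ** 2 for x, y in data) / len(data)


# Step 4: algorithm = closed-form least squares for one feature.
def fit_least_squares(data):
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var_x = sum((x - mean_x) ** 2 for x, _ in data)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Step 5: select the optimal model from the hypothesis space.
w, b = fit_least_squares(train)

# Step 6: use the selected model to predict for a new, unseen input.
y_new = predict(w, b, 4.0)
print(w, b, y_new)  # 2.0 1.0 9.0 for this toy data
```

Swapping the hypothesis space, the loss, or the solver changes the learning method while leaving the six-step structure intact.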
5. Application of machine learning
Over the past 20 years, machine learning has advanced greatly in both theory and application, with many important breakthroughs. It has been applied successfully in many areas of computing, including artificial intelligence, pattern recognition, data mining, natural language processing, speech recognition, image recognition, information retrieval, and bioinformatics.