From just three examples, you should now be familiar with three common types of
machine learning problems: binary classification, multiclass classification, and regression.
All three, however, belong to the category of supervised learning, and in the field of
machine learning, supervised learning is only the tip of the iceberg. Machine learning
is usually divided into four broad categories:
Supervised learning. Most recent successes, such as image recognition and speech recognition, fall into this category.
Unsupervised learning, which requires no labels at all, can be used for data visualization, data compression, data denoising, or simply to better understand the structure of the data at hand. Unsupervised learning is the "bread and butter" of data analysis, and it is often a necessary step before attempting supervised learning. Dimensionality reduction and clustering are well-known categories of unsupervised learning.
Self-supervised learning. This is actually a special case of supervised learning, but it is different enough to deserve its own category. Self-supervised learning is supervised learning without human-annotated labels. There are still labels, since something must supervise the learning process, but they are generated from the input data itself by some heuristic algorithm. You can think of it as supervised learning without a human in the loop; autoencoders are a well-known example. Note that the boundaries between these categories can be blurry.
Reinforcement learning. This field has recently attracted much attention, notably after Google's DeepMind successfully taught computers to play Atari games. In reinforcement learning, the system learns by trial and error, searching for actions that maximize some reward. At present it has found few significant applications beyond games, but it is expected to play a major role in autonomous driving, robotics, resource management, education, and other fields in the future.
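To make the idea of self-supervised learning concrete, here is a minimal sketch (plain Python, with made-up data) of how "labels" can be generated from unlabeled data itself by a simple heuristic, with no human annotation: an autoencoder-style pairing, where the target is the input, and a language-model-style pairing, where the target of each prefix is the next token.

```python
# Sketch: deriving supervision signals from unlabeled data itself,
# as in self-supervised learning. No human annotation is involved.

def autoencoder_pairs(samples):
    # Autoencoder-style: the target of each input is the input itself.
    return [(x, x) for x in samples]

def next_token_pairs(sequence):
    # Language-model-style: the target of each prefix is the next token.
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]

unlabeled = [[0.1, 0.5], [0.3, 0.2]]
print(autoencoder_pairs(unlabeled)[0])           # input and target are identical
print(next_token_pairs(["the", "cat", "sat"]))   # each prefix predicts the next word
```

Both functions turn raw, unlabeled data into (input, target) pairs that an ordinary supervised training loop could consume.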
In this book, we will focus mainly on supervised learning, because it is currently the mainstream; we will look at self-supervised learning in later chapters.
Although supervised learning mostly consists of classification and regression problems, there are some other variants as well:
Sequence generation: for example, given a picture, generate a text caption describing it. Sequence generation can sometimes be reformulated as a series of classification problems, such as repeatedly predicting the next word or token in a sequence.
Syntax tree prediction: given a sentence, predict its decomposition into a syntax tree.
Object detection: given a picture, draw a bounding box around specific objects. This can be expressed as a classification problem (given many candidate boxes, classify the contents of each) or as a joint classification and regression problem (predicting the box coordinates via regression).
Image segmentation: given a picture, draw a pixel-level mask over a specific object.
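To illustrate the joint classification-and-regression formulation of object detection, here is a small sketch (the class names and coordinates are hypothetical) of what a single training target might look like: a discrete class index for the classification part, plus four continuous box coordinates for the regression part.

```python
# Sketch: an object-detection target combines a classification part
# (which class is inside the box) with a regression part (the box coordinates).
CLASSES = ["cat", "dog"]  # hypothetical label set

def make_detection_target(class_name, x_min, y_min, x_max, y_max):
    return {
        "class_index": CLASSES.index(class_name),  # classification target
        "box": [x_min, y_min, x_max, y_max],       # regression target (4 values)
    }

target = make_detection_target("dog", 10, 20, 110, 220)
print(target)
```

A real detection model would predict both parts at once: a class probability distribution and four continuous coordinates per box.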
Classification and regression glossary
Classification and regression involve many specialized terms. You have seen some of them in the previous examples, and you will see more in the following chapters. They all have precise definitions within the scope of machine learning, and you should be familiar with them:
Sample or input: a data point that goes into your model.
Prediction or output: what comes out of your model.
Target: the truth; what your model should ideally have predicted, according to an external source of data.
Prediction error or loss value: a measure of the distance between your model's prediction and the target.
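As a concrete example, one common loss value for regression is the mean squared error between predictions and targets; a minimal sketch:

```python
# Mean squared error: one common way to measure the distance
# between a model's predictions and the targets.
def mean_squared_error(predictions, targets):
    assert len(predictions) == len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

print(mean_squared_error([2.5, 0.0], [3.0, 0.0]))  # → 0.125
```

During training, it is this loss value that gradient descent tries to drive toward zero.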
Classes: the set of possible labels to choose from in a classification problem; for example, when classifying cat and dog pictures, "cat" and "dog" are the two classes.
Label: a specific instance of a class annotation in a classification problem; for example, if a picture is annotated as containing a dog, then "dog" is one of its labels.
Ground-truth or annotations: the correct targets for a dataset, typically collected by human annotators.
Binary classification: a classification task in which each sample belongs to exactly one of two mutually exclusive categories.
Multiclass classification: a classification task in which each sample belongs to one of more than two categories.
Multi-label classification: a classification task in which each sample can be assigned several labels; for example, a picture containing both a cat and a dog should be labeled with both "cat" and "dog".
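A multi-label target is often encoded as a multi-hot vector with one entry per class, where several entries can be 1 at the same time; a small sketch with a hypothetical label set:

```python
LABELS = ["cat", "dog", "bird"]  # hypothetical label set

def multi_hot(labels_present):
    # Multi-label target: unlike a one-hot vector, several entries can be 1.
    return [1 if name in labels_present else 0 for name in LABELS]

print(multi_hot({"cat", "dog"}))  # → [1, 1, 0]
```

Contrast this with multiclass classification, where exactly one entry of the target vector would be 1.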
Scalar regression: a task in which the target is a continuous scalar value; house-price prediction is a good example, since the different prices form a continuous space.
Vector regression: a task in which the target is a set of continuous values, such as a continuous vector. If you are doing regression against multiple values at once, you are doing vector regression.
Mini-batch or batch: a small set of samples that the model processes simultaneously. Batch sizes are often powers of 2, which makes memory allocation on the GPU more efficient. During training, each mini-batch produces one gradient-descent update of the model's weights.
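To illustrate, here is a minimal sketch of splitting a dataset into mini-batches; in a real training loop, each batch would drive one weight update:

```python
# Sketch: iterate over a dataset in mini-batches. During training,
# each yielded batch would be used for one gradient-descent update.
def iter_minibatches(samples, batch_size=4):  # powers of 2 are common in practice
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

data = list(range(10))
batches = list(iter_minibatches(data, batch_size=4))
print(batches)  # → [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Note that the last batch may be smaller than the others when the dataset size is not a multiple of the batch size.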