Machine Learning 01

Source: Internet
Author: User

Given data about a group of objects, each object is described by a set of "attributes", and the value an object takes on an attribute is its "attribute value". The space spanned by the attributes is called the "attribute space", with each attribute forming one coordinate axis. An object's data then corresponds to a point in this space whose coordinates are its attribute values, and the vector for that point is called the object's "feature vector".
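As a minimal sketch of this idea, the snippet below turns attribute records into feature vectors. The attribute names and values are hypothetical, chosen only for illustration; the order of the attribute list is what fixes the coordinate axes.

```python
# Hypothetical attributes; their order defines the axes of the attribute space.
ATTRIBUTES = ["weight_g", "diameter_cm", "sweetness"]


def to_feature_vector(obj: dict) -> list:
    """Map one object's attribute values to a point in the attribute space."""
    return [float(obj[name]) for name in ATTRIBUTES]


# Two made-up objects, each described by the same attributes.
samples = [
    {"weight_g": 150, "diameter_cm": 7.0, "sweetness": 6},
    {"weight_g": 120, "diameter_cm": 6.2, "sweetness": 8},
]

vectors = [to_feature_vector(s) for s in samples]
print(vectors)
```

Every object must supply a value for every attribute, so all feature vectors have the same length and live in the same space.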

A model learned from the feature vectors in this attribute space can then be used to predict the corresponding values for objects seen later.

If the value to be predicted is discrete, the task is called "classification"; if it is continuous, the task is called "regression". In "binary classification", which involves only two categories, one is usually called the "positive class" and the other the "negative class". A task that involves more than two categories is called "multiclass classification".
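The difference between the two kinds of prediction can be sketched with two small models. The choice of a 1-nearest-neighbor classifier and a least-squares line is an assumption made for illustration, not something the text prescribes, and the data is made up.

```python
def nearest_neighbor_label(train, x):
    """Classification: return the label of the closest training vector.

    train is a list of (feature_vector, label) pairs.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(train, key=lambda pair: sq_dist(pair[0], x))[1]


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


# Discrete output: one of two class labels (binary classification).
labeled = [([1.0, 1.0], "positive"), ([5.0, 5.0], "negative")]
label = nearest_neighbor_label(labeled, [1.5, 0.5])

# Continuous output: a real number.
slope, intercept = fit_line([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
prediction = slope * 4.0 + intercept

print(label, prediction)
```

The classifier can only ever answer with a label it saw during training, while the regression model can output any real value.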

Once the model has been learned, using it to make predictions is called "testing", and a sample being predicted is called a "test sample".

"Clustering" means dividing a group of objects into several groups, each of which is called a "cluster". These automatically formed clusters may correspond to underlying concepts, which helps reveal the internal structure of the data and lays a foundation for deeper analysis. In clustering, however, these concepts are not known in advance, and the training samples used in learning usually carry no label information.
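To make this concrete, here is a minimal sketch using k-means, one common clustering algorithm. The text does not name a specific algorithm, so k-means is an assumption, as are the data points and the hand-picked initial centers. Note that no labels appear anywhere in the input.

```python
def kmeans(points, init_centers, iterations=10):
    """Group unlabeled points into len(init_centers) clusters."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[idx].append(p)
        # Update step: move each center to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = [sum(c) / len(members) for c in zip(*members)]
    return centers, clusters


# Made-up, unlabeled 2-D feature vectors.
points = [
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.1],
    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1],
]
centers, clusters = kmeans(points, init_centers=[points[0], points[3]])
print(clusters)
```

The algorithm only produces groups; deciding what concept, if any, each cluster corresponds to is left to whoever inspects the result.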

Depending on whether the training data carries label information, learning tasks can be roughly divided into two categories: "supervised learning" and "unsupervised learning". Classification and regression are representative of the former; clustering is representative of the latter.

The goal of machine learning is for the learned model to work well on "new samples", not just on the training samples. Even for unsupervised tasks such as clustering, we hope that the learned clusters also apply to samples that do not appear in the training set. The ability of a learned model to apply to new samples is called "generalization". A model with strong generalization ability works well across the entire sample space. To obtain such a model, we want the training samples to reflect the characteristics of the whole sample space well. It is usually assumed that all samples in the sample space follow an unknown "distribution" D, and that each sample we obtain is drawn independently from this distribution, that is, the samples are "independent and identically distributed" (i.i.d.). In general, the more training samples we have, the more information we have about D, and the more likely we are to learn a model with strong generalization ability.
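The following sketch illustrates generalization under stated assumptions: the distribution D is a hypothetical mixture of two unit-variance Gaussians, the model is a simple threshold classifier, and the sample sizes are arbitrary. Training and test samples are drawn independently from the same D, and the model is scored only on samples it never saw during training.

```python
import random


def draw_sample(rng):
    """Draw one (x, label) pair independently from a hypothetical D."""
    label = rng.choice(["positive", "negative"])
    mean = 4.0 if label == "positive" else 0.0
    return rng.gauss(mean, 1.0), label


def learn_threshold(train):
    """Learn a decision threshold halfway between the two class means."""
    pos = [x for x, y in train if y == "positive"]
    neg = [x for x, y in train if y == "negative"]
    if not pos or not neg:
        # Too few samples to see both classes: fall back to the overall mean.
        return sum(x for x, _ in train) / len(train)
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(
        ("positive" if x > threshold else "negative") == y for x, y in samples
    )
    return correct / len(samples)


rng = random.Random(0)  # fixed seed so runs are repeatable
test_set = [draw_sample(rng) for _ in range(2000)]  # new, unseen samples

results = {}
for n in (5, 50, 1000):
    train = [draw_sample(rng) for _ in range(n)]
    results[n] = accuracy(learn_threshold(train), test_set)
    print(n, round(results[n], 3))
```

With more training samples the estimated threshold tends to settle near the midpoint of the two class means, so test accuracy tends to become higher and more stable, although any single small training set can happen to do well or badly.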

