Machine Learning Classification

Last Update: 2018-07-13 Source: Internet

Author: User

At a high level, machine learning systems can be categorized along several dimensions:

Whether they are trained with human supervision (supervised, unsupervised, semi-supervised, and reinforcement learning).
Whether they can learn incrementally (online learning versus batch learning).
Whether they work by comparing new data with known data, or by finding patterns in the training data and building a predictive model (instance-based versus model-based learning).

These categories are not mutually exclusive. The sections below describe each category in detail.

Supervised/unsupervised Learning

Machine learning systems differ in how much supervision they receive during training, much as parents differ in how they supervise their children: some watch closely, while others are hands-off. By the amount and type of supervision, there are four categories: supervised, unsupervised, semi-supervised, and reinforcement learning.

Supervised learning

In supervised learning, the training data you provide to the algorithm already includes the expected results, called labels.

A typical supervised learning task is classification. A spam filter is a good example: the training set contains messages together with their labels (spam or not spam), and the trained model classifies new messages so that a filtering policy can be applied.
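The spam example can be sketched in a few lines. This is only an illustration, assuming k-nearest neighbors as the classifier and two made-up numeric features per message (number of links, number of exclamation marks); the feature choice, the data, and the name `classify` are hypothetical and not part of the original article.

```python
import math
from collections import Counter

# Hypothetical labeled training set: (links, exclamation marks) -> label
TRAINING_DATA = [
    ((0, 0), "ham"),
    ((1, 0), "ham"),
    ((0, 1), "ham"),
    ((5, 4), "spam"),
    ((6, 7), "spam"),
    ((4, 6), "spam"),
]


def classify(features, training_data=TRAINING_DATA, k=3):
    """Return the majority label among the k training examples nearest to `features`."""
    by_distance = sorted(
        training_data, key=lambda example: math.dist(example[0], features)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


print(classify((5, 5)))  # a message with many links and exclamation marks
print(classify((0, 2)))  # a message with few of either
```

The labels in `TRAINING_DATA` are what makes this supervised: the algorithm never has to discover what "spam" means, it only compares a new message with examples that are already labeled.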

Another typical task is to predict a target numeric value, such as the price of a car, from a set of features or attributes (such as mileage, brand, and manufacturing date). These features are called predictors, and the task is called regression. To train the system, we need to provide a large number of examples containing both the predictors and the labels (car prices).
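As a rough sketch of regression, the snippet below fits a straight line by ordinary least squares to a single predictor (mileage). The data is made up and deliberately tiny, and the names `fit_line` and `predict_price` are hypothetical; a real system would use many more examples and more predictors.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: mileage (thousands of km) and price (thousands)
mileage = [10, 50, 90, 130]
price = [19, 15, 11, 7]

slope, intercept = fit_line(mileage, price)


def predict_price(thousand_km):
    """Predict the price of a car from its mileage using the fitted line."""
    return slope * thousand_km + intercept


print(predict_price(70))
```

Here the label is a number rather than a class, which is the difference between regression and classification.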

Some regression algorithms can also be used for classification, and vice versa. For example, logistic regression outputs a value that represents the probability of belonging to a given class (for example, a 20% probability of being spam).
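To illustrate how logistic regression yields a probability rather than a hard label, here is a minimal sketch trained by batch gradient descent on one made-up feature (a count of suspicious words). The data, the hyperparameters, and the names `train_logistic` and `spam_probability` are assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit a weight and a bias for one feature by gradient descent on log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        grad_weight = sum(e * x for e, x in zip(errors, xs)) / n
        grad_bias = sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Made-up data: suspicious-word count -> 1 if spam, 0 otherwise
counts = [0, 1, 2, 3, 6, 7, 8, 9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weight, bias = train_logistic(counts, labels)


def spam_probability(count):
    """Estimated probability that a message with `count` suspicious words is spam."""
    return sigmoid(weight * count + bias)


print(round(spam_probability(2), 3), round(spam_probability(8), 3))
```

Turning the probability into a class is then a matter of picking a threshold, for example treating anything above 0.5 as spam.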

The following are some commonly used supervised learning algorithms:

K-nearest Neighbors
Linear Regression
Logistic Regression
Support Vector Machines (SVMs)
Decision Trees and Random Forests
Neural Networks

Unsupervised learning

As you might guess, in unsupervised learning the training data is unlabeled: the system learns on its own, without parental supervision, so to speak. The following are some common unsupervised learning algorithms:

Clustering
- K-means
- Hierarchical Cluster Analysis (HCA)
- Expectation Maximization
Visualization and dimensionality reduction
- Principal Component Analysis (PCA)
- Kernel PCA
- Locally Linear Embedding (LLE)
- t-distributed Stochastic Neighbor Embedding (t-SNE)
Association Rule Learning
- Apriori
- Eclat

For example, suppose you have a lot of data about your blog's visitors. You can run a clustering algorithm to detect groups of visitors with similar behavior. You never tell the algorithm which group a visitor belongs to; it finds the connections on its own. It might find that 40% of your visitors are males who like comic books and mostly read your blog at night, while 20% are young science fiction enthusiasts who usually visit on weekends, and so on. If you use a hierarchical clustering algorithm, it may also subdivide each group into smaller groups. This can help you target your posts at each group of users.
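A minimal sketch of clustering, assuming k-means (Lloyd's algorithm) and two made-up features per visitor (hour of the visit, age). The data, the deterministic choice of starting centroids, and the name `k_means` are assumptions made to keep the example small and repeatable; practical implementations usually pick random starting points and run several restarts.

```python
import math


def k_means(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    # Deterministic start for this sketch: k points spread evenly through the list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up visitors: (hour of visit, age). Note that no labels are given.
visitors = [(22, 35), (23, 40), (21, 38), (10, 19), (11, 22), (9, 20)]

centroids, clusters = k_means(visitors, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The only thing supplied besides the data is the number of groups `k`; which visitor ends up in which group is decided entirely by the algorithm.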

Visualization algorithms take complex, unlabeled data as input and produce a 2D or 3D representation that can be plotted. These algorithms try to preserve as much of the data's structure as they can, for example by keeping separate clusters from overlapping, so that we can understand how the data is organized and perhaps discover previously unknown patterns.

Dimensionality reduction simplifies the data without losing too much information. One way to do this is to merge several correlated features into one. For example, a car's mileage is closely related to its age, so a dimensionality reduction algorithm can merge the two into a single feature that represents the car's wear and tear. This is called feature extraction. Feature extraction is useful because it lets subsequent machine learning algorithms run faster (the data takes up less disk space and memory) and, in some cases, perform more accurately.
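The mileage-and-age example can be sketched with principal component analysis restricted to two features, where the leading direction of the 2x2 covariance matrix has a closed form. The data is made up, the function name `first_principal_component` is hypothetical, and the features are left unscaled for brevity; in practice they would normally be standardized first.

```python
import math


def first_principal_component(xs, ys):
    """Return the direction of greatest variance and each point's score along it."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # Project the centered points onto that direction: two features become one.
    scores = [a * direction[0] + b * direction[1] for a, b in zip(dx, dy)]
    return direction, scores


# Made-up cars: age in years and mileage in units of 10,000 km
age = [1, 2, 3, 4, 5]
mileage = [2.1, 3.9, 6.2, 7.8, 10.0]

direction, wear = first_principal_component(age, mileage)
print(direction)
print(wear)
```

Each car is now described by the single value in `wear` instead of two correlated values, which is the kind of merged feature the paragraph above describes.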

Another important unsupervised task is anomaly detection, such as detecting unusual credit card transactions to prevent fraud, catching manufacturing defects, or automatically removing outliers from a dataset. The model is trained on normal data; when it is then applied to new data, it can tell whether each new instance looks normal or anomalous.
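A very simple way to sketch the train-on-normal-data idea is a z-score rule: learn the mean and standard deviation of normal values, then flag new values that lie too far from the mean. The transaction amounts, the threshold of 3 standard deviations, and the function names are assumptions for illustration; real fraud detection uses far richer models.

```python
import statistics


def fit_normal_profile(normal_values):
    """Summarize normal data by its mean and sample standard deviation."""
    return statistics.mean(normal_values), statistics.stdev(normal_values)


def is_anomaly(value, profile, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, std = profile
    return abs(value - mean) / std > threshold


# Made-up amounts of past transactions known to be normal
normal_amounts = [20, 25, 22, 30, 27, 24, 26, 23]
profile = fit_normal_profile(normal_amounts)

for amount in (28, 500):
    print(amount, is_anomaly(amount, profile))
```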

The last unsupervised task covered here is association rule learning, whose goal is to dig into large amounts of data and discover relationships hidden between attributes. For example, applying association rules to the sales logs of a large supermarket might reveal a buying habit: people who buy barbecue sauce and chips also tend to buy steak. Merchants can then place these items close to each other.
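The idea can be sketched by counting how often pairs of items appear together and deriving two standard measures: support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). This is a brute-force count over made-up baskets, limited to rules between single items; it is not an implementation of Apriori or Eclat, and the thresholds and the name `pair_rules` are assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    item_counts = Counter(item for basket in transactions for item in set(basket))
    pair_counts = Counter(
        pair
        for basket in transactions
        for pair in combinations(sorted(set(basket)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Made-up supermarket baskets
baskets = [
    ["barbecue sauce", "chips", "steak"],
    ["barbecue sauce", "steak"],
    ["chips", "steak", "soda"],
    ["barbecue sauce", "chips", "steak"],
    ["bread", "milk"],
]

rules = pair_rules(baskets)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Algorithms such as Apriori exist because this kind of exhaustive counting becomes too expensive once the number of distinct items and the size of the item sets grow.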

To be continued...

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
