A Tour of Machine Learning Algorithms [repost]

Source: Internet
Author: User

Once we understand the machine learning problem we need to solve, we can think about what data to collect and which algorithms to try. This article tours the most popular machine learning algorithms to give a sense of which methods are available.

There are many algorithms in the machine learning field, and each has many extensions, so it is hard to pick the right algorithm for a specific problem. In this article, I give two ways to think about and group the algorithms you will meet in practice.

Learning Methods

Algorithms can be grouped by how they interact with experience, the environment, or whatever we choose to call the input data. Machine learning and AI textbooks usually begin by considering the learning styles an algorithm can adopt.

Here we discuss only a few major learning styles or learning models, each with some basic examples. This way of organizing algorithms is useful because it forces you to think about the role of the input data and the model preparation process, and then to select the algorithm that best suits your problem in order to get the best results.

  • Supervised learning: The input data is called training data and has a known label or result, for example whether an email is spam, or a stock price at a given time. The model makes predictions and is corrected when those predictions are wrong. Training continues until the model reaches the desired level of accuracy on the training data. Example problems are classification and regression. Example algorithms include logistic regression and back-propagation neural networks.
  • Unsupervised learning: The input data is not labeled and has no known result. The model is prepared by deducing the structure present in the input data. Example problems are association rule learning and clustering. Example algorithms include the Apriori algorithm and k-means.
  • Semi-supervised learning: The input data is a mixture of labeled and unlabeled examples. There is a prediction problem to solve, but the model must also learn the structure of the data. Example problems are classification and regression. Example algorithms are mostly extensions of existing methods that make assumptions about how to model the unlabeled data.
  • Reinforcement learning: The input data is a stimulus from an environment to which the model must respond. Feedback does not come from a teaching process as in supervised learning, but as rewards or punishments from the environment. An example problem is robot control. Example algorithms include Q-learning and temporal difference learning.

When crunching data to model business decisions, you will most often use supervised and unsupervised learning. A current hot topic is semi-supervised learning, for problems such as image classification where the dataset is large but only a small part of the images are labeled. Reinforcement learning is mostly used in robot control and the development of other control systems.
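
To make the reward-driven feedback of reinforcement learning concrete, here is a minimal sketch of tabular Q-learning. The environment is an assumption made up for illustration: a five-state corridor in which the agent starts at the left end and receives a reward of 1 for reaching the right end. The function name and all parameter values are hypothetical, not taken from any library.

```python
import random

def q_learning_corridor(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a toy corridor; reward 1 for reaching the last state."""
    rng = random.Random(seed)
    actions = (-1, +1)  # index 0: step left, index 1: step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            a = rng.randrange(2)  # explore with uniformly random actions
            next_state = min(max(state + actions[a], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate toward reward + discounted best next value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = q_learning_corridor()
# Greedy action index for each non-goal state; 1 means "step right".
greedy = [row.index(max(row)) for row in q[:-1]]
print(greedy)
```

No labels are given to the agent at any point; the preference for stepping right emerges only from the reward at the goal being propagated back through the table.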

Algorithm Similarity

Algorithms can also be grouped by similarity in function or form, for example tree-based methods or neural-network-inspired methods. This is a useful grouping, but it is not perfect, because many algorithms fit just as easily into more than one category. Learning vector quantization, for example, is both a neural-network-inspired method and an instance-based method. Just as no machine learning model is perfect, no classification of the algorithms is perfect either.

In this section, I list the algorithms under the groupings I find most intuitive. The list is not exhaustive in either algorithms or groupings, but it should give readers a general overview. If you know of an algorithm or group that is missing, please leave a comment. Now let's start!

Regression

Regression is concerned with modeling the relationship between variables, and it is a workhorse of statistics. Example algorithms include:

  • Ordinary Least Squares
  • Logistic Regression
  • Stepwise Regression
  • Multivariate adaptive regression splines (MARS)
  • Locally estimated scatterplot smoothing (loess)
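
As a small illustration of the first item in the list, the following sketch fits a straight line by ordinary least squares using the closed-form solution for one input variable. The function name `fit_ols` and the toy data are assumptions chosen for the example.

```python
def fit_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1 with no noise
slope, intercept = fit_ols(xs, ys)
print(slope, intercept)
```

Because the toy data lies exactly on a line, the fit recovers the slope and intercept that generated it; with noisy data the same formulas give the line that minimizes the sum of squared errors.
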
Instance-based methods

Instance-based learning models a decision problem using the instances or examples of training data that are considered important to the model. These methods typically build a database of example data and compare new data against it using a similarity measure to find the best match and make a prediction. For this reason, they are also called winner-take-all methods and memory-based learning. The focus is on the representation of the stored instances and on the similarity measure used between instances.

  • K-nearest neighbour (KNN)
  • Learning vector quantization (LVQ)
  • Self-Organizing Map (SOM)
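
The store-and-compare idea described above can be sketched with k-nearest neighbours. This is a minimal version that assumes Euclidean distance as the similarity measure and a simple majority vote; the function name `knn_predict` and the toy points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (features, label); return the majority label of the k nearest."""
    # The "database" is just the stored training list, sorted by distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.1, 5.0)))
```

There is no training step beyond storing the examples, which is why the choice of representation and distance function matters so much for this family.
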
Regularization Methods

Regularization is an extension made to other methods (usually regression methods) that penalizes model complexity, favoring simpler models that tend to generalize better. I list it separately here because it is popular and powerful.

  • Ridge Regression
  • Least absolute shrinkage and selection operator (LASSO)
  • Elastic net
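
The effect of a complexity penalty is easiest to see in the simplest possible case. The sketch below is ridge regression with a single weight and no intercept, an assumption made to keep the closed form short: minimizing sum((y - w*x)^2) + lam * w^2 gives w = sum(x*y) / (sum(x*x) + lam). The name `fit_ridge_1d` is hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Ridge regression with one weight and no intercept (closed form)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The penalty lam is added to the denominator, shrinking w toward zero.
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated from y = 2x
w_unpenalized = fit_ridge_1d(xs, ys, lam=0.0)
w_penalized = fit_ridge_1d(xs, ys, lam=14.0)
print(w_unpenalized, w_penalized)
```

With no penalty the fit recovers the generating weight; as `lam` grows, the weight is pulled toward zero, which is the sense in which regularization prefers simpler models.
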
Decision Tree Learning

Decision tree methods build a model of decisions based on the actual values of attributes in the data. Decision trees are used for classification and regression problems.

  • Classification and regression tree (CART)
  • Iterative dichotomiser 3 (ID3)
  • C4.5
  • Chi-squared automatic interaction detection (CHAID)
  • Decision stump
  • Random Forest
  • Multivariate adaptive regression splines (MARS)
  • Gradient boosting machines (GBM)
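
A decision stump, a tree with a single split, shows how a decision is derived from the actual values in the data. The sketch below is a simplified version under stated assumptions: one numeric feature, binary labels, only the rule "x > t predicts 1", and at least two distinct feature values. The name `fit_stump` is hypothetical.

```python
def fit_stump(xs, labels):
    """Find the threshold t for which 'x > t -> 1 else 0' makes the fewest errors."""
    best_t, best_errors = None, len(xs) + 1
    points = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2
        errors = sum((x > t) != bool(y) for x, y in zip(xs, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors

xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]
threshold, errors = fit_stump(xs, labels)
print(threshold, errors)
```

Full tree learners such as CART repeat this kind of split search recursively on each resulting subset and usually score splits with an impurity measure rather than a raw error count.
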
Bayesian

Bayesian methods explicitly apply Bayes' theorem to classification and regression problems.

  • Naive Bayes
  • Averaged one-dependence estimators (AODE)
  • Bayesian Belief Network (BBN)
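
The common core of these methods is Bayes' theorem itself. The sketch below applies it to a two-class case; the probabilities are invented for illustration (20% of mail is spam, a given word appears in 50% of spam and 5% of non-spam), and the function name is hypothetical.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """P(class | evidence) for a two-class problem via Bayes' theorem."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence

# Made-up numbers: P(spam) = 0.2, P(word | spam) = 0.5, P(word | not spam) = 0.05.
p_spam_given_word = posterior(0.2, 0.5, 0.05)
print(round(p_spam_given_word, 4))
```

Naive Bayes extends this by multiplying the likelihoods of many features together under the assumption that they are independent given the class.
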
Kernel Methods

The best-known kernel method is the support vector machine (SVM). Kernel methods map the input data into a higher-dimensional space in which some classification and regression problems are easier to model.

  • Support Vector Machines (SVM)
  • Radial Basis Function (RBF)
  • Linear discriminant analysis (LDA)
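
The mapping to a higher dimension can be illustrated with a degree-2 polynomial kernel, chosen here only because its feature map is short enough to write out. The sketch checks that evaluating the kernel in the original 2-D space gives the same number as a dot product in an explicit 3-D feature space. It is not an SVM implementation, and all function names are hypothetical.

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel (x . y)^2, computed in the original 2-D space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def feature_map(x):
    """Explicit 3-D feature map whose dot product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, y)
mapped_value = dot(feature_map(x), feature_map(y))
print(kernel_value, mapped_value)
```

The two values agree up to floating-point rounding, which is the point of a kernel: the method can work as if the data had been mapped to the higher-dimensional space without ever constructing that space.
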
Clustering Methods

Clustering, like regression, names both a class of problems and a class of methods. Clustering methods are usually organized by their modeling approach. All of them use the inherent structure of the data to organize it into groups whose members have as much in common as possible.

  • K-means
  • Expectation maximisation (EM)
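
The following is a minimal sketch of k-means using the standard alternation between assigning points to the nearest centroid and moving each centroid to the mean of its points. The starting centroids are passed in explicitly, an assumption made to keep the example deterministic; real implementations usually pick them randomly or with a seeding heuristic. The function name `kmeans` and the data are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate nearest-centroid assignment and centroid update."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

On this toy data the two well-separated groups are recovered after the first iteration and the centroids stop moving.
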
Association rule learning

Association rule learning extracts rules that best explain the observed relationships between variables in the data. These rules can reveal important associations in large, multidimensional datasets that an organization can put to use.

  • Apriori algorithm
  • Eclat Algorithm
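
Rules of this kind are usually scored by support and confidence. The sketch below computes both for a tiny, invented set of shopping baskets; it shows only the scoring, not the candidate-generation search that Apriori or Eclat perform. Function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
bread_support = support(transactions, {"bread"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(bread_support, round(rule_confidence, 4))
```

Here bread appears in three of four baskets, and two of those three also contain milk, so the rule "bread implies milk" has a confidence of two thirds.
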
Artificial Neural Networks

Artificial neural networks are models inspired by the structure and function of biological neural networks. They are a class of pattern-matching methods commonly used for regression and classification, but the field is very large, with hundreds of algorithms and variants. Some classic, popular methods are listed here (deep learning is covered separately):

  • Perceptron
  • Back-Propagation
  • Hopfield network
  • Self-Organizing Map (SOM)
  • Learning vector quantization (LVQ)
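
The perceptron is the simplest member of this family and also a compact example of a model being corrected whenever its prediction is wrong. The sketch below trains one on the logical AND function; the learning rate, epoch count, and function names are assumptions chosen for the example.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule for two inputs and a binary target."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - pred  # zero when the prediction is already right
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_samples)
print([predict(w, b, x) for x, _ in and_samples])
```

AND is linearly separable, so the rule settles on weights that classify all four inputs correctly. A single perceptron cannot learn a function such as XOR, which is one motivation for multi-layer networks trained with back-propagation.
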
Deep Learning

Deep learning methods are a modern update to artificial neural networks. Compared with traditional neural networks, they build much larger and more complex networks, and many of the methods are concerned with semi-supervised learning problems, in which a large dataset contains very little labeled data.

  • Restricted Boltzmann Machine (RBM)
  • Deep belief networks (DBN)
  • Convolutional Network
  • Stacked auto-encoders
Dimensionality Reduction

Like clustering methods, dimensionality reduction seeks and exploits the inherent structure in the data, but it does so in order to summarize or describe the data using less information. This is useful for visualizing or simplifying data.

  • Principal Component Analysis (PCA)
  • Partial Least Squares Regression (PLS)
  • Sammon Mapping
  • Multidimensional scaling (MDS)
  • Projection Pursuit
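
As a sketch of describing data with less information, the code below finds the first principal component of a small dataset by power iteration on the covariance matrix, then projects each row onto it so that one number stands in for two. Power iteration is one of several ways to compute PCA and is used here only because it needs no external library; the sketch assumes the starting vector is not orthogonal to the leading component. The function name and data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return the leading eigenvector of the covariance matrix and the 1-D projections."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] + [0.0] * (d - 1)  # assumed not orthogonal to the leading component
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projections = [sum(c * x for c, x in zip(row, v)) for row in centered]
    return v, projections

rows = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
component, projections = first_principal_component(rows)
print(component, projections)
```

The toy points lie on the line y = x, so the component points along that diagonal and the single projected coordinate preserves all of the variation in the data.
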
Ensemble methods

Ensemble methods are models composed of multiple weaker models that are trained independently and whose predictions are combined in some way to make the overall prediction. Much effort goes into deciding which types of weak learners to combine and how to combine them. This is a very powerful and popular class of techniques.

  • Boosting
  • Bootstrapped aggregation (bagging)
  • AdaBoost
  • Stacked generalization (blending)
  • Gradient boosting machines (GBM)
  • Random Forest

Figure: an example of fitting with an ensemble (from Wikipedia). Each individual fitted model is shown in gray, and the final combined prediction is shown in red.
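
The combining step can be sketched with a majority vote. The three "models" below are hand-written threshold rules standing in for independently trained weak learners, an assumption made to keep the example short; the function name `majority_vote` is hypothetical.

```python
from collections import Counter

def majority_vote(models, x):
    """Combine the class predictions of several models by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Stand-ins for three weak learners that disagree about where the boundary lies.
models = [
    lambda x: int(x > 3),
    lambda x: int(x > 5),
    lambda x: int(x > 7),
]
print([majority_vote(models, x) for x in (2, 4, 6, 8)])
```

The vote behaves like a single rule with a boundary between the extremes of the individual models. Bagging and random forests train the individual models on resampled data before voting or averaging, while boosting trains them in sequence and weights their votes.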

Other resources

This tour of machine learning algorithms aims to give you a general overview of the algorithms and how they relate to one another.

The following are some other resources. Do not feel overwhelmed. The more algorithms you know about, the better, but it is also useful to understand a few of them in depth.

  • List of machine learning algorithms: a wiki resource. It is comprehensive, but not very well organized.
  • Machine Learning Algorithms category: also a wiki resource, slightly better than the previous one, and sorted alphabetically.
  • CRAN Task View: Machine Learning & Statistical Learning: a list of R packages for machine learning algorithms, useful for seeing what others are using.
  • Top 10 Algorithms in Data Mining: a published article, now also a book, covering the most popular data mining algorithms. It is another grounded list of algorithms that can help you study a few methods in depth.