Common algorithms for machine learning---2016/7/19

Source: Internet
Author: User
Tags: stock prices

Machine learning is a core skill for data analysts who want to advance. This article is a quick tour of the field rather than a deep dive into any single algorithm, so you can get to know machine learning quickly.

--------------------------------------------------------------------------------------------------------------------------------------------------

Once you understand the type of machine learning problem you need to solve, you can start to think about the kind of data you have collected and the machine learning algorithms you can try. The most popular machine learning algorithms are introduced here; touring the main ones is a helpful way to get a general idea of the available methods.

There are many algorithms available, and the difficulty is that each comes in different variants and extensions. This makes it hard to tell what counts as a canonical algorithm. There are two ways to think about and categorize the algorithms you will encounter in this field.

The first groups algorithms by learning style; the second groups them by similarity in form or function. Both are useful.

Learning Style

  An algorithm can model a problem in different ways depending on its interaction with experience, the environment, or whatever we choose to call the input data. Machine learning and AI textbooks commonly start by considering the learning style an algorithm can adopt. There are only a few main learning styles or learning models. Each is introduced below, with a few example algorithms and the problem types they suit.

Supervised learning: The input data is called training data and has a known label or result, such as spam/not-spam or a stock price at a given time. The model's parameters are determined through a training process in which the model is required to make predictions and is corrected when those predictions are wrong.

Unsupervised learning: The input data is not labeled and has no known result. The model is built by deducing the structure present in the input data. Example problems are association rule learning and clustering. Example algorithms include the Apriori algorithm and k-means.

Semi-supervised learning: The input data is a mixture of labeled and unlabeled examples. There is a target prediction problem, but the model must also discover the underlying structure in order to organize the data as well as make predictions. Example problems are classification and regression. Example algorithms are extensions of other flexible methods that make assumptions about how to model the unlabeled data.

Reinforcement learning: The input data is provided to the model as a stimulus from the environment, to which the model must respond. Feedback comes not from a training process, as in supervised learning, but as rewards or punishments from the environment. Example problems are systems control and robot control. Example algorithms include Q-learning and temporal difference learning.
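To make the reward-driven feedback loop concrete, here is a minimal sketch of tabular Q-learning. The corridor environment, the reward of 1 at the last cell, and the values chosen for `alpha`, `gamma`, and the number of episodes are all assumptions made up for illustration; they do not come from the article.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells, numbered 0..4.
# The agent starts in cell 0 and receives a reward of 1 on reaching cell 4.
N_STATES = 5
ACTIONS = (-1, +1)  # step left, step right


def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)  # explore with uniformly random actions
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-terminal cell, pick the action with the larger value.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

The only feedback the agent ever sees is the reward returned by `step`; no labeled "correct action" is supplied, which is the contrast with supervised learning drawn above. After training, the greedy policy moves right in every cell.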

When you work with large amounts of data to model business decisions, you typically use supervised and unsupervised learning. A hot topic at the moment is semi-supervised learning, for example in image classification, where the datasets are large but contain only a handful of labeled examples.

-----------------------------------------------------------------------------------------------------------------------------------------------

Similarity of algorithms

  Algorithms are often grouped by similarity in function or form, for example tree-based methods and neural network methods. This is a useful way of grouping, but it is not perfect. Some algorithms fit just as easily into several categories: learning vector quantization, for instance, is both a neural-network-inspired method and an instance-based method. There are also categories whose name describes both a problem and a class of algorithms, such as regression and clustering. Because of this, you will see different groupings of algorithms in different sources. As with machine learning algorithms themselves, there is no perfect model, only a good enough one.

In this section, many popular machine learning algorithms will be listed in the way I find most intuitive. Although neither the categories nor the algorithms are exhaustive, I think they are representative and contribute to a general understanding of the whole field.

Regression analysis

  Regression models the relationship between variables, and refines that model iteratively using a measure of the error in its predictions. Regression methods are a workhorse of statistics and have been adopted into statistical machine learning. This can be confusing, because "regression" can refer to a class of problems or to a class of algorithms. In fact, regression is a process. Here are some examples:

Ordinary least squares

Logistic regression

Stepwise regression

Multivariate adaptive regression splines (MARS)

Locally estimated scatterplot smoothing (LOESS)
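As a small illustration of the first item in the list, the sketch below fits a straight line by ordinary least squares. It uses the closed-form solution for a single input variable rather than an iterative procedure, and the data points are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution that minimizes the sum of squared prediction errors.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
```

Because the data lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1; with noisy data the same formula gives the line with the smallest total squared error.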

Instance-based learning

  Instance-based learning models a decision problem using instances, or examples, from the training data that are deemed important or required by the model. Such methods typically build a database of examples and then compare new data against it using a similarity measure, in order to find the best match and make a prediction. For this reason, instance-based methods are also known as winner-take-all methods and memory-based learning. The focus of this approach is on the representation of the stored instances and the similarity measure used between them.

k-nearest neighbors (kNN)

Learning vector quantization (LVQ)

Self-organizing map (SOM)
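The "store examples, then compare by similarity" idea can be sketched with k-nearest neighbors. The stored points, the labels, and the choice of Euclidean distance as the similarity measure are assumptions for illustration only.

```python
from collections import Counter
import math


def knn_predict(examples, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples."""
    # Similarity measure: Euclidean distance (smaller means more similar).
    by_distance = sorted(examples, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up stored instances: (features, label).
examples = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

label_near_a = knn_predict(examples, (1.1, 1.0))
label_near_b = knn_predict(examples, (5.0, 4.8))
```

There is no training step at all here: the "model" is just the stored examples plus the distance function, which is why the choice of representation and similarity measure matters so much.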

Regularization method

  Regularization is an extension made to another method (usually a regression method) that penalizes models for their complexity and so favors simpler models, which also tend to generalize better. Regularization methods are listed separately here because they are popular, powerful, and usually simple modifications of other methods.

Ridge regression

Least absolute shrinkage and selection operator (LASSO)

Elastic net
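A one-variable version of ridge regression shows how the penalty works. This is a simplified sketch that assumes no intercept term, made-up data, and an arbitrarily chosen penalty strength.

```python
def ridge_slope(xs, ys, penalty):
    """Slope w that minimizes sum((y - w*x)**2) + penalty * w**2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative to zero gives w * (sxx + penalty) = sxy.
    return sxy / (sxx + penalty)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # made-up data on the line y = 2x

unpenalized = ridge_slope(xs, ys, penalty=0.0)  # plain least squares
penalized = ridge_slope(xs, ys, penalty=14.0)   # the penalty shrinks the slope
```

With a penalty of zero the result is the ordinary least squares slope of 2; with the penalty set to 14 the slope shrinks to 1. The only change to the underlying method is the extra term in the denominator, which is the sense in which regularization is a small modification of another method.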

Decision Tree Learning

  Decision tree methods build a model of decisions based on the actual values of the attributes in the data. Decisions fork along a tree structure until a prediction can be made for a given record. Decision trees are trained on data for classification and regression problems.

Classification and regression tree (CART)

Iterative Dichotomiser 3 (ID3)

C4.5

Chi-squared automatic interaction detection (CHAID)

Decision stump

Random forest

Multivariate adaptive regression splines (MARS)

Gradient boosting machines (GBM)
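The simplest member of this family, the decision stump, makes a single fork on one attribute. The sketch below picks the threshold that misclassifies the fewest examples; the data and the "predict 1 above the threshold" convention are assumptions for illustration.

```python
def fit_stump(values, labels):
    """Find the threshold on one numeric attribute with the fewest errors.

    The stump predicts 1 when value > threshold and 0 otherwise.
    """
    best_threshold, best_errors = None, len(labels) + 1
    points = sorted(set(values))
    # Candidate thresholds: midpoints between neighbouring distinct values.
    for low, high in zip(points, points[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


# Made-up attribute values and class labels.
values = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = [0, 0, 0, 1, 1, 1]
threshold, errors = fit_stump(values, labels)
```

On this data the stump splits at 4.5 with no errors. A full decision tree repeats the same kind of split recursively on each resulting branch, usually with an impurity measure rather than a raw error count.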

Bayesian algorithm

  Bayesian methods explicitly apply Bayes' theorem to problems such as classification and regression.

Naive Bayes

Averaged one-dependence estimators (AODE)

Bayesian belief network (BBN)
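Bayes' theorem itself fits in a few lines. The sketch below computes the probability that a message is spam given that it contains a certain word; all three input probabilities are invented for the example.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(class | evidence) by Bayes' theorem for a two-class problem."""
    # Total probability of seeing the evidence under either class.
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence


# Made-up numbers: 20% of mail is spam; the word appears in 50% of spam
# and in 5% of non-spam messages.
p_spam_given_word = posterior(prior=0.2, likelihood=0.5, likelihood_given_not=0.05)
```

Seeing the word raises the probability of spam from the prior of 0.2 to about 0.71. Naive Bayes applies the same rule to many words at once, under the simplifying assumption that they are independent given the class.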

Kernel function method

  The best-known kernel method is the popular support vector machine, which is really a whole family of methods. Kernel methods are concerned with mapping the input data into a high-dimensional vector space in which some classification or regression problems are easier to solve.

Support vector machine (SVM)

Radial basis function (RBF)

Linear discriminant analysis (LDA)
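The mapping into a higher-dimensional space can be checked numerically. The sketch below uses a degree-2 polynomial kernel on 2-D inputs (chosen here for simplicity; it is not one of the methods listed above) and shows that it gives the same value as an ordinary dot product in an explicit 3-D feature space.

```python
import math


def poly_kernel(x, y):
    """Degree-2 polynomial kernel on 2-D inputs: (x . y) ** 2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def feature_map(x):
    """Explicit 3-D feature space that the kernel above corresponds to."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x, y = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, y)                    # computed in 2-D
mapped_value = dot(feature_map(x), feature_map(y))  # computed in 3-D
```

Both values come out to 121 (up to floating-point rounding). The kernel gets the result without ever building the mapped vectors, which is what makes very high-dimensional feature spaces practical.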

Clustering method

  Like regression, clustering describes both a class of problems and a class of methods. Clustering methods are typically organized by modeling approach, such as centroid-based or hierarchical. All of them use the inherent structure of the data to organize it into groups whose members have as much in common as possible.

k-means

Expectation-maximization (EM)
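A centroid-based method can be sketched with k-means on one-dimensional data. The points, the two starting centroids, and the fixed iteration count are assumptions for illustration; practical implementations usually pick starting centroids at random and stop when assignments no longer change.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

Starting from centroids at 0 and 5, the algorithm settles on centroids 2 and 11, with the points split into the two groups a reader would pick by eye. No labels were used at any point.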

Association Rule Learning

  Association rule learning is a class of algorithms that extract rules that best explain the relationships between variables in the observed data. These rules can uncover important and commercially useful associations in large multidimensional datasets, which an organization can then exploit.

Apriori algorithm

Eclat algorithm
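Two quantities commonly used to score a candidate rule are its support and its confidence. The sketch below computes both for a handful of made-up shopping baskets; it illustrates the scoring only and is not an implementation of Apriori or Eclat.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Made-up shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_support = support(transactions, {"bread", "milk"})
bread_to_milk = confidence(transactions, {"bread"}, {"milk"})
```

Here bread and milk appear together in half of the baskets, and two out of the three baskets that contain bread also contain milk. Algorithms such as Apriori are ways of searching for high-scoring rules without enumerating every possible item combination.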

Artificial neural network

  Artificial neural networks are models inspired by the structure or function of biological neural networks. They are a class of pattern-matching methods commonly used for regression and classification problems, but in fact this huge subfield contains hundreds of algorithms and variations that can solve all kinds of problems. Some classic, popular methods are listed below (deep learning is treated as a separate category here):

Perceptron

Backpropagation

Hopfield network

Self-organizing map (SOM)

Learning vector quantization (LVQ)
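The perceptron, the first item in the list, is small enough to write out in full. This sketch trains a single unit on the logical AND function; the learning rate, epoch count, and zero initial weights are assumptions chosen for the example.

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Train a single perceptron with the classic error-driven update rule."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # 0 when the prediction is correct
            # The weights change only when the prediction was wrong.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


# Logical AND as a tiny labeled training set.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(samples)
```

This is also a compact picture of supervised training in general: the model predicts, the prediction is compared with the known label, and the parameters are adjusted when the two disagree. After training, `predict` reproduces AND on all four inputs.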

Deep learning

  Deep learning methods are a modern update to artificial neural networks that exploit abundant, cheap computation. They aim to build much larger and more complex neural networks. As mentioned earlier, many of these methods address semi-supervised learning problems, in which large datasets contain very little labeled data.

Restricted Boltzmann machine (RBM)

Deep belief network (DBN)

Convolutional neural network

Stacked autoencoder (SAE)

Dimensionality reduction method

  Like clustering methods, dimensionality reduction methods seek and exploit the inherent structure of the data, in this case in an unsupervised manner, in order to summarize or describe the data using less information. This is useful for visualizing high-dimensional data or for simplifying data before it is used in supervised learning.

Principal component analysis (PCA)

Partial least squares regression (PLS)

Sammon mapping

Multidimensional scaling (MDS)

Projection pursuit
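For two-dimensional data, the first principal component can be found with a closed-form angle, which makes for a compact sketch of PCA. The data points are made up, and the closed form only applies to the 2-D case; general implementations use an eigendecomposition or SVD.

```python
import math


def first_principal_axis(points):
    """Unit vector along the direction of greatest variance in 2-D data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix (the common 1/n factor cancels out).
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    syy = sum((p[1] - mean_y) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.cos(angle), math.sin(angle)


def project(points, axis):
    """Reduce each 2-D point to a single coordinate along `axis`."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    return [(p[0] - mean_x) * axis[0] + (p[1] - mean_y) * axis[1] for p in points]


# Made-up points lying on the diagonal y = x.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
axis = first_principal_axis(points)
coords = project(points, axis)
```

Because these points all lie on the diagonal, the principal axis points along it and a single coordinate per point describes the data without loss. With real data some information is discarded, and the principal axis is the direction that discards the least variance.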

Ensemble methods

  An ensemble method is a model composed of several weaker models that are trained independently and whose predictions are combined in some way to produce the overall prediction. Much effort goes into choosing which types of weak learners to use as sub-models and how to combine their results. This is a very powerful class of techniques, and therefore very popular.

Boosting

Bootstrap aggregation (bagging)

AdaBoost

Stacked generalization (blending)

Gradient boosting machines (GBM)

Random forest
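The simplest way to combine sub-models is a majority vote. In the sketch below the three weak classifiers are hand-written rules rather than trained models, purely so the example stays short; each one is wrong on a different part of the input range.

```python
from collections import Counter


def majority_vote(models, x):
    """Combine the models' predictions for x by taking the most common one."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Three hypothetical weak classifiers for the question "is x at least 10?".
models = [
    lambda x: x >= 8,               # too eager: wrong for 8 and 9
    lambda x: x >= 12,              # too strict: wrong for 10 and 11
    lambda x: x >= 10 and x != 15,  # wrong only at 15
]

predictions = {x: majority_vote(models, x) for x in range(0, 21)}
```

Each rule makes mistakes, but never on the same input as both of the others, so the vote is correct for every value from 0 to 20. Bagging, boosting, and random forests differ mainly in how the sub-models are made to disagree in useful ways and in how their outputs are weighted.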

  
