Machine Learning Basic Knowledge

Last Update: 2016-05-22 Source: Internet

Author: User

Common data mining and machine learning knowledge points

Basics:

MSE (mean squared error), LMS (least mean squares), LSM (least squares method), MLE (maximum likelihood estimation), QP (quadratic programming), CP (conditional probability), JP (joint probability), MP (marginal probability), Bayes' formula, L1/L2 regularization (and others, such as the currently popular L2.5 regularization), GD (gradient descent), SGD (stochastic gradient descent), eigenvalue, eigenvector, QR decomposition, quantile, covariance (covariance matrix).
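As a small illustration of how two of the items above fit together, the sketch below fits a straight line by running batch gradient descent (GD) on the mean squared error (MSE). The function names, the learning rate, the step count, and the toy data are all assumptions chosen for this example; they do not come from the original list.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by batch gradient descent on the MSE.

    lr and steps are illustrative defaults, not tuned values.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical example data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
```

SGD differs only in that each step would use the gradient from one randomly chosen sample (or a small batch) instead of the full sum.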

Common distributions:

Discrete distributions: Bernoulli distribution / binomial distribution, negative binomial distribution, multinomial distribution, geometric distribution, hypergeometric distribution, Poisson distribution.

Continuous distributions: uniform distribution, normal distribution / Gaussian distribution, exponential distribution, log-normal distribution, gamma distribution, beta distribution, Dirichlet distribution, Rayleigh distribution, Cauchy distribution, Weibull distribution.

Three sampling distributions: chi-square distribution, t-distribution, F-distribution.

Data preprocessing:

Missing value imputation, discretization, mapping, normalization (normalization/standardization).

Sampling:

Simple random sampling, offline sampling (offline equal-probability sampling of K items), online sampling (online equal-probability sampling of K items), ratio-based sampling (proportional random sampling), acceptance-rejection sampling, importance sampling, MCMC (Markov chain Monte Carlo sampling algorithms: Metropolis-Hastings and Gibbs).
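The online sampling entry above asks for K items drawn with equal probability from a stream whose length is not known in advance. One common way to do this is reservoir sampling. The list itself does not name an algorithm, so treating it as reservoir sampling is an assumption, and the function name and parameters below are hypothetical.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return k items chosen uniformly at random from an iterable.

    The stream is read once and only k items are kept in memory.
    If the stream has fewer than k items, all of them are returned.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a kept item with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1000), 5, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when testing.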

Clustering:

K-means, K-medoids, bisecting K-means, FK-means, canopy, spectral K-means (spectral clustering), GMM-EM (Gaussian mixture model fitted with the expectation-maximization algorithm), K-prototypes, CLARANS (partition-based), BIRCH (hierarchy-based), CURE (hierarchy-based), DBSCAN (density-based), CLIQUE (density-based and grid-based), the density clustering algorithm published in Science in 2014, etc.

Clustering effectiveness evaluation:

Purity, RI (Rand index), ARI (adjusted Rand index), NMI (normalized mutual information), F-measure, and so on.
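To make one of these measures concrete, here is a minimal sketch of the Rand index: the fraction of item pairs on which two labelings agree, counting a pair as an agreement when both labelings put it in the same cluster or both put it in different clusters. The function name and the example labels are assumptions made for illustration.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which two clusterings agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agreements = sum(
        # True when both labelings group the pair the same way.
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


same_partition = rand_index([0, 0, 1, 1], [1, 1, 0, 0])   # cluster names differ, grouping is identical
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The adjusted Rand index additionally corrects this value for the agreement expected by chance; that correction is not shown here.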

Classification & regression:

LR (linear regression), LR (logistic regression), SR (softmax regression, i.e. multi-class logistic regression), GLM (generalized linear model), RR (ridge regression, i.e. L2-regularized least squares regression), LASSO (least absolute shrinkage and selection operator, i.e. L1-regularized least squares regression), RF (random forest), DT (decision tree), GBDT (gradient boosting decision tree), CART (classification and regression tree), KNN (k-nearest neighbors), SVM (support vector machine, including SVC for classification and SVR for regression), KF (kernel function: polynomial kernel function, Gaussian kernel function / radial basis function (RBF), string kernel function), NB (naive Bayes), BN (Bayesian network / Bayesian belief network / belief network), LDA (linear discriminant analysis / Fisher linear discriminant), EL (ensemble learning: boosting, bagging, stacking), AdaBoost (adaptive boosting), MEM (maximum entropy model).

Classification effectiveness evaluation:

Confusion matrix, precision, recall, accuracy, F-score, ROC curve, AUC (area under the ROC curve), lift curve, KS curve.
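Precision and accuracy are easy to confuse, so the following sketch shows how precision, recall, accuracy and the F-score (here F1) can all be derived from the four cells of a binary confusion matrix. The function names, the choice of 1 as the positive class, and the example labels are assumptions made for this illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, accuracy and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    # Precision: of everything predicted positive, how much was right.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything actually positive, how much was found.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Accuracy: share of all predictions that were right.
    accuracy = (tp + tn) / total if total else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Returning 0.0 when a denominator is zero is a convention picked for this sketch; other conventions exist.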

PGM (probabilistic graphical models):

BN (Bayesian network / Bayesian belief network / belief network), MC (Markov chain), HMM (hidden Markov model), MEMM (maximum entropy Markov model), CRF (conditional random field), MRF (Markov random field).

NN (neural networks):

ANN (artificial neural network), BP (error backpropagation), HN (Hopfield network), RNN (recurrent neural network), SRN (simple recurrent network), ESN (echo state network), LSTM (long short-term memory), CW-RNN (clockwork recurrent neural network, ICML 2014), etc.

Deep learning:

Auto-encoder, SAE (stacked auto-encoders: sparse auto-encoders, denoising auto-encoders, contractive auto-encoders), RBM (restricted Boltzmann machine), DBN (deep belief network), CNN (convolutional neural network), word2vec (word vector learning model).

Dimensionality reduction:

LDA (linear discriminant analysis / Fisher linear discriminant), PCA (principal component analysis), ICA (independent component analysis), SVD (singular value decomposition), FA (factor analysis).

Text mining:

VSM (vector space model), word2vec (word vector learning model), TF (term frequency), TF-IDF (term frequency-inverse document frequency), MI (mutual information), ECE (expected cross entropy), QEMI (quadratic information entropy), IG (information gain), IGR (information gain ratio), Gini (Gini coefficient), χ² statistic (chi-square statistic), TEW (text evidence weight), OR (odds ratio), N-gram model, LSA (latent semantic analysis), pLSA (probabilistic latent semantic analysis), LDA (latent Dirichlet allocation), SLM (statistical language model), NPLM (neural probabilistic language model), CBOW (continuous bag-of-words model), skip-gram (skip-gram model), etc.
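As a concrete example of TF and TF-IDF from the list above, the sketch below weights each term by its relative frequency in a document multiplied by the logarithm of the inverse document frequency. Several TF-IDF variants exist (with smoothing, different log bases, and so on); the unsmoothed form used here, the function name, and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights.

    docs is a list of token lists; the result is one {term: weight}
    dict per document, using tf = count / doc_length and
    idf = log(n_docs / document_frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of two tokenized documents.
docs = [["a", "b", "a"], ["b", "c"]]
weights = tf_idf(docs)
```

A term that appears in every document, such as "b" here, gets weight 0 because its inverse document frequency is log(1).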

Association mining:

Apriori, FP-growth (frequent pattern tree growth algorithm), AprioriAll, SPADE.

Recommendation engines:

DBR (demographic-based recommendation), CBR (content-based recommendation), CF (collaborative filtering), UCF (user-based collaborative filtering recommendation), ICF (item-based collaborative filtering recommendation).

Similarity measures & distance measures:

Euclidean distance, Manhattan distance, Chebyshev distance, Minkowski distance, standardized Euclidean distance, Mahalanobis distance, cosine similarity (cos), Hamming distance / edit distance, Jaccard distance, correlation coefficient distance, information entropy, KL (Kullback-Leibler divergence / relative entropy).
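Several of these measures are closely related: Manhattan and Euclidean distance are the Minkowski distance with p = 1 and p = 2, and Chebyshev distance is its limit as p grows. The sketch below writes a few of them in plain Python. The function names and example vectors are assumptions for illustration, and inputs are assumed to be equal-length sequences (or sets, for the Jaccard distance).

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean(x, y):
    return minkowski(x, y, 2)


def manhattan(x, y):
    return minkowski(x, y, 1)


def chebyshev(x, y):
    """Largest coordinate-wise difference (Minkowski limit as p grows)."""
    return max(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))


def jaccard_distance(a, b):
    """One minus the ratio of intersection size to union size."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return 1.0 - len(a & b) / len(union)
```

For example, `euclidean([0, 0], [3, 4])` follows the 3-4-5 right triangle, and `hamming("karolin", "kathrin")` counts the three differing letters.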

Optimization:

Unconstrained optimization: cyclic variable methods (variable alternation), pattern search methods, variable simplex methods, gradient descent methods, Newton methods, quasi-Newton methods, conjugate gradient methods.

Constrained optimization: approximation programming methods, feasible direction methods, penalty function methods, multiplier methods.

Heuristic algorithms: SA (simulated annealing), GA (genetic algorithm).

Feature selection:

Mutual information, document frequency, information gain, chi-squared test, Gini (Gini coefficient).

Outlier detection (anomaly detection):

Statistics-based, distance-based, density-based, clustering-based.

Learning to rank:

Pointwise: McRank;

Pairwise: Ranking SVM, RankNet, FRank, RankBoost;

Listwise: AdaRank, SoftRank, LambdaMART;

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
