Data mining tasks fall into four categories: prediction, classification, clustering, and association. The algorithm is chosen according to the purpose of the mining task. The following is a summary of the data mining packages commonly used in R:
Prediction of continuous dependent variables:
stats package, lm function: multiple linear regression
stats package, glm function: generalized linear regression
stats package, nls function: nonlinear least squares regression
rpart package, rpart function: classification and regression tree model based on the CART algorithm
RWeka package, M5P function: model tree algorithm that combines the advantages of linear regression and CART
adabag package, bagging function: ensemble algorithm built on rpart trees (note: adabag fits classification trees)
adabag package, boosting function: ensemble algorithm built on rpart trees (note: adabag fits classification trees)
randomForest package, randomForest function: ensemble algorithm based on CART-style decision trees
e1071 package, svm function: support vector machine algorithm
kernlab package, ksvm function: kernel-based support vector machine
nnet package, nnet function: neural network with a single hidden layer
neuralnet package, neuralnet function: neural network with multiple hidden layers and multiple nodes per layer
RSNNS package, mlp function: multilayer perceptron neural network
RSNNS package, rbf function: neural network based on radial basis functions
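As a rough illustration of the prediction functions listed above, the following sketch fits a few of them to the built-in mtcars data. The choice of dataset, the predictors (wt, hp), the exponential form passed to nls, its starting values, and the rmse helper are all assumptions made for this example, not recommendations from the list itself.

```r
library(rpart)  # ships with standard R installations

data(mtcars)

# Multiple linear regression: fuel consumption from weight and horsepower.
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# Generalized linear model; a Gaussian family with identity link
# gives the same coefficients as lm().
fit_glm <- glm(mpg ~ wt + hp, data = mtcars, family = gaussian())

# Nonlinear least squares: an assumed exponential decay of mpg with weight.
# The starting values are rough guesses needed by the optimizer.
fit_nls <- nls(mpg ~ a * exp(b * wt), data = mtcars,
               start = list(a = 40, b = -0.3))

# Regression tree (method = "anova" selects a continuous response).
fit_tree <- rpart(mpg ~ wt + hp, data = mtcars, method = "anova")

# Hypothetical helper: in-sample root mean squared error of a fitted model.
rmse <- function(fit) {
  sqrt(mean((mtcars$mpg - predict(fit, newdata = mtcars))^2))
}

errors <- sapply(list(lm = fit_lm, glm = fit_glm,
                      nls = fit_nls, rpart = fit_tree), rmse)
print(errors)
```

The errors here are computed on the training data, so they only show that the calls work; comparing models fairly would need a held-out set or cross-validation.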
Classification of discrete dependent variables:
stats package, glm function: logistic regression, using the logit link function
class package, knn function: k-nearest neighbor algorithm
kknn package, kknn function: weighted k-nearest neighbor algorithm
rpart package, rpart function: classification and regression tree model based on the CART algorithm
adabag package, bagging function: ensemble algorithm built on rpart trees
adabag package, boosting function: ensemble algorithm built on rpart trees
randomForest package, randomForest function: ensemble algorithm based on CART-style decision trees
party package, ctree function: conditional inference tree algorithm
RWeka package, OneR function: one-rule (single-attribute) rule learning algorithm
RWeka package, JRip function: multi-attribute rule learning algorithm (RIPPER)
RWeka package, J48 function: decision tree based on the C4.5 algorithm
C50 package, C5.0 function: decision tree based on the C5.0 algorithm
e1071 package, svm function: support vector machine algorithm
kernlab package, ksvm function: kernel-based support vector machine
e1071 package, naiveBayes function: naive Bayes classifier
klaR package, NaiveBayes function: naive Bayes classifier
MASS package, lda function: linear discriminant analysis
MASS package, qda function: quadratic discriminant analysis
nnet package, nnet function: neural network with a single hidden layer
RSNNS package, mlp function: multilayer perceptron neural network
RSNNS package, rbf function: neural network based on radial basis functions
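A minimal sketch of several of the classifiers listed above, using the built-in iris data. The 100/50 train/test split, the seed, k = 5, the two sepal predictors, and the 0.5 probability cutoff for the logistic model are assumptions chosen for illustration only.

```r
library(class)  # knn()
library(MASS)   # lda()
library(rpart)  # rpart()

set.seed(42)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# k-nearest neighbors on the four numeric measurements.
pred_knn <- knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 5)

# Linear discriminant analysis.
fit_lda  <- lda(Species ~ ., data = train)
pred_lda <- predict(fit_lda, newdata = test)$class

# Classification tree (method = "class" selects a discrete response).
fit_tree  <- rpart(Species ~ ., data = train, method = "class")
pred_tree <- predict(fit_tree, newdata = test, type = "class")

# Share of test rows classified correctly by each multi-class model.
acc <- sapply(list(knn = pred_knn, lda = pred_lda, rpart = pred_tree),
              function(p) mean(p == test$Species))
print(acc)

# Logistic regression with glm() needs a two-class response, so this part
# predicts "virginica or not" from the sepal measurements.
bin_train <- transform(train, is_virginica = as.integer(Species == "virginica"))
fit_logit <- glm(is_virginica ~ Sepal.Length + Sepal.Width,
                 data = bin_train, family = binomial(link = "logit"))
prob_logit <- predict(fit_logit, newdata = test, type = "response")
acc_logit  <- mean((prob_logit > 0.5) == (test$Species == "virginica"))
print(acc_logit)
```

Functions from packages outside a standard R installation (for example C5.0 or ksvm) follow the same fit-then-predict pattern, but their argument names differ, so check each package's help page before adapting this sketch.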
Clustering:
NbClust package, NbClust function: determines an appropriate number of clusters
stats package, kmeans function: k-means clustering algorithm
cluster package, pam function: k-medoids clustering algorithm (partitioning around medoids)
stats package, hclust function: hierarchical clustering algorithm
fpc package, dbscan function: density-based clustering algorithm
fpc package, kmeansruns function: more stable than the kmeans function, and can also estimate the number of clusters
fpc package, pamk function: compared with the pam function, can also suggest the number of clusters
mclust package, Mclust function: model-based clustering using the expectation-maximization (EM) algorithm
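The clustering functions above can be sketched on the numeric columns of iris as follows. Fixing the number of clusters at 3, scaling the data first, the seed, nstart = 25, and Ward linkage for hclust are assumptions made for this example.

```r
library(cluster)  # pam(); ships with standard R installations

# Standardize the four measurements so no variable dominates the distances.
x <- scale(iris[, 1:4])

# k-means with several random restarts to reduce sensitivity to the start.
set.seed(1)
km <- kmeans(x, centers = 3, nstart = 25)

# k-medoids (partitioning around medoids).
pm <- pam(x, k = 3)

# Agglomerative hierarchical clustering, then cut the tree into 3 groups.
hc <- hclust(dist(x), method = "ward.D2")
hc_labels <- cutree(hc, k = 3)

# Cross-tabulate each clustering against the known species for inspection.
print(table(kmeans = km$cluster, species = iris$Species))
print(table(pam = pm$clustering, species = iris$Species))
print(table(hclust = hc_labels, species = iris$Species))
```

In practice the number of clusters is usually unknown, which is where NbClust, kmeansruns, or pamk from the list above come in.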
Association rules:
arules package, apriori function: Apriori association rule algorithm
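A small sketch of association rule mining with apriori follows. The baskets are invented toy data, and the support, confidence, and minlen thresholds are arbitrary assumptions; the arules package must be installed separately.

```r
library(arules)

# Hypothetical shopping baskets, made up purely for illustration.
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter"),
  c("bread", "butter", "milk")
)

# Convert the list of item vectors into a transactions object.
trans <- as(baskets, "transactions")

# Mine rules with at least two items that meet the assumed thresholds.
rules <- apriori(trans,
                 parameter = list(support = 0.4, confidence = 0.7, minlen = 2))

# Show the rules, strongest lift first.
inspect(sort(rules, by = "lift"))
```

Lower thresholds yield more rules but also more noise, so on real data the thresholds are normally tuned to the size of the transaction set.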