If you are already familiar with how Python and R load modules and packages, the tables below should be easy to use as a lookup. Python entries are written as module paths. Some of them are not built-in modules; install those with
pip install *
Likewise, to make lookup easier, R entries use :: to name both a function and the package that contains it, in the form package::function. An entry without :: belongs to one of R's default packages. For anything else, install the package with
install.packages("*")
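Before looking up an entry, it can help to check whether a Python module from the tables is already installed. The following is a minimal sketch using only the standard library; the helper name `is_installed` is a hypothetical name chosen for illustration and is not part of any package listed here.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when no top-level module of that name is importable.
    return importlib.util.find_spec(module_name) is not None


# json and csv ship with Python; third-party modules may need `pip install`.
for name in ["json", "csv"]:
    status = "installed" if is_installed(name) else "missing, try pip install"
    print(name, status)
```

Note that the name used with pip is not always the import name, so a missing module may need a different package name at install time.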
Connectors and IO
Databases
Category | Python | R |
MySQL | mysql-connector-python (official) | RMySQL |
Oracle | cx_Oracle | ROracle |
Redis | redis | rredis |
MongoDB | pymongo | RMongo, rmongodb |
Neo4j | py2neo | RNeo4j |
Cassandra | cassandra-driver | RJDBC |
ODBC | pyodbc | RODBC |
JDBC | Unknown [Jython only] | RJDBC |
IO
Category | Python | R |
Excel | XlsxWriter, pandas.read_excel, pandas.DataFrame.to_excel, openpyxl | openxlsx::read.xlsx(2), xlsx::read.xlsx(2) |
CSV | csv.writer | read.csv(2), read.table |
JSON | json | jsonlite |
Image | PIL | jpeg, png, tiff, bmp |
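As a rough illustration of the CSV and JSON rows, the sketch below round-trips a small table through `csv.writer` and the `json` module. The sample rows are made up, and an in-memory buffer stands in for a real file; `csv.reader` is used for reading, which the table does not list but which ships with the same standard-library module.

```python
import csv
import io
import json

rows = [["name", "score"], ["alice", "90"], ["bob", "85"]]

# Write the rows with csv.writer into an in-memory buffer
# (an opened file object would work the same way).
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)

# Read them back with csv.reader, roughly the role of read.csv in R.
buffer.seek(0)
parsed = list(csv.reader(buffer))

# Serialize to a JSON string and back with the json module (jsonlite in R).
restored = json.loads(json.dumps(parsed))
print(restored == rows)
```

Every value comes back as a string, because CSV carries no type information; numeric columns need an explicit conversion.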
Statistics
Descriptive statistics
Category | Python | R |
Summary of descriptive statistics | scipy.stats.describe | summary |
Mean | scipy.stats.gmean (geometric mean), scipy.stats.hmean (harmonic mean), numpy.mean, numpy.nanmean, pandas.Series.mean | mean |
Median | numpy.median, numpy.nanmedian, pandas.Series.median | median |
Mode | scipy.stats.mode, pandas.Series.mode | Unknown |
Quantile | numpy.percentile, numpy.nanpercentile, pandas.Series.quantile | quantile |
Empirical cumulative distribution function (ECDF) | statsmodels.distributions.empirical_distribution.ECDF | ecdf |
Standard deviation | scipy.stats.std, scipy.stats.nanstd (both only in old SciPy releases), numpy.std, pandas.Series.std | sd |
Variance | numpy.var, pandas.Series.var | var |
Coefficient of variation | scipy.stats.variation | Unknown |
Covariance | numpy.cov, pandas.Series.cov | cov |
(Pearson) correlation coefficient | scipy.stats.pearsonr, numpy.corrcoef, pandas.Series.corr | cor |
Kurtosis | scipy.stats.kurtosis, pandas.Series.kurt | e1071::kurtosis |
Skewness | scipy.stats.skew, pandas.Series.skew | e1071::skewness |
Histogram | numpy.histogram, numpy.histogram2d, numpy.histogramdd | Unknown |
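Entries in the descriptive-statistics table do not always share defaults across the two languages. The sketch below, on made-up data, shows a few of the NumPy calls next to their R counterparts. The one difference worth remembering is that `numpy.std` and `numpy.var` divide by n unless `ddof=1` is passed, while R's `sd` and `var` divide by n - 1.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)

mean = np.mean(data)               # R: mean(x)
median = np.median(data)           # R: median(x)
q75 = np.percentile(data, 75)      # R: quantile(x, 0.75)

pop_std = np.std(data)             # divides by n (population SD)
sample_std = np.std(data, ddof=1)  # divides by n - 1, like R's sd(x)
sample_var = np.var(data, ddof=1)  # divides by n - 1, like R's var(x)

print(mean, median, q75, pop_std, sample_std, sample_var)
```

pandas goes the other way: `pandas.Series.std` and `pandas.Series.var` use `ddof=1` by default, so they agree with R without extra arguments.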
Regression (including statistics and machine learning)
Category | Python | R |
Ordinary least squares regression (OLS) | statsmodels.api.OLS, sklearn.linear_model.LinearRegression | lm |
Generalized least squares regression (GLS) | statsmodels.api.GLS | nlme::gls, MASS::lm.gls |
Quantile regression | statsmodels.api.QuantReg | quantreg::rq |
Ridge regression | sklearn.linear_model.Ridge | MASS::lm.ridge, ridge::linearRidge |
LASSO | sklearn.linear_model.Lasso | lars::lars |
Least-angle regression | sklearn.linear_model.LassoLars | lars::lars |
Robust regression | statsmodels.api.RLM | MASS::rlm |
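To make the OLS row concrete without depending on statsmodels or scikit-learn, the sketch below fits a straight line with `numpy.linalg.lstsq`. This is a stand-in that solves the same least-squares problem, not the API of either listed library, and the data are made up.

```python
import numpy as np

# Toy data lying exactly on y = 1 + 2x (made up for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# Design matrix with an explicit intercept column;
# R's lm(y ~ x) adds the intercept implicitly.
X = np.column_stack([np.ones_like(x), x])

# Solve min ||X b - y||^2 for the coefficient vector b.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef
print(intercept, slope)
```

The explicit intercept column matters in Python generally: `statsmodels.api.OLS` also expects it to be added by the caller, whereas `sklearn.linear_model.LinearRegression` fits an intercept by default.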
Hypothesis testing
Category | Python | R |
t-test | statsmodels.stats.weightstats.ttest_ind, statsmodels.stats.weightstats.ttost_ind, statsmodels.stats.weightstats.ttost_paired; scipy.stats.ttest_1samp, scipy.stats.ttest_ind, scipy.stats.ttest_ind_from_stats, scipy.stats.ttest_rel | t.test |
KS test (tests a distribution) | scipy.stats.kstest, scipy.stats.ks_2samp | ks.test |
Wilcoxon (non-parametric test of differences) | scipy.stats.wilcoxon, scipy.stats.mannwhitneyu | wilcox.test |
Shapiro-Wilk normality test | scipy.stats.shapiro | shapiro.test |
Pearson correlation coefficient test | scipy.stats.pearsonr | cor.test |
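The same test name can hide different defaults in the two languages. As a hedged sketch on made-up samples, the code below runs a two-sample t-test and a two-sample KS test with SciPy. `scipy.stats.ttest_ind` pools the variances by default, while R's `t.test` defaults to the Welch test, so `equal_var=False` is needed to get comparable results.

```python
from scipy import stats

# Two made-up samples whose ranges do not overlap.
a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
b = [6.2, 6.8, 7.1, 6.5, 6.9, 7.3]

student = stats.ttest_ind(a, b)                 # pooled variance; R: t.test(a, b, var.equal = TRUE)
welch = stats.ttest_ind(a, b, equal_var=False)  # Welch test; R: t.test(a, b)
ks = stats.ks_2samp(a, b)                       # R: ks.test(a, b)

print(student.statistic, student.pvalue)
print(welch.statistic, welch.pvalue)
print(ks.statistic, ks.pvalue)
```

Each result exposes `statistic` and `pvalue` attributes, which play the role of the `statistic` and `p.value` fields of the object R returns.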
Time series
Category | Python | R |
AR | statsmodels.tsa.ar_model.AR | ar |
ARIMA | statsmodels.tsa.arima_model.ARIMA | arima |
VAR | statsmodels.tsa.vector_ar.var_model.VAR | Unknown |
For Python, PyFlux is another option.
Survival analysis
Category | Python | R |
PH regression | statsmodels.formula.api.phreg | Unknown |
Modules dedicated to survival analysis:
Python: lifelines
Machine learning
Regression
See the Statistics section.
Classifiers
LDA, QDA
Category | Python | R |
LDA | sklearn.discriminant_analysis.LinearDiscriminantAnalysis | MASS::lda |
QDA | sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis | MASS::qda |
SVM (support vector machine)
Category | Python | R |
Support vector classifier (SVC) | sklearn.svm.SVC | e1071::svm |
Nu-support vector classifier (NuSVC) | sklearn.svm.NuSVC | Unknown |
Linear support vector classifier (linear SVC) | sklearn.svm.LinearSVC | Unknown |
Nearest-neighbor based
Category | Python | R |
K-nearest neighbors classifier | sklearn.neighbors.KNeighborsClassifier | Unknown |
Radius neighbors classifier | sklearn.neighbors.RadiusNeighborsClassifier | Unknown |
Nearest centroid classifier | sklearn.neighbors.NearestCentroid | Unknown |
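Most scikit-learn classifiers in these tables follow the same construct, `fit`, `predict` pattern, so one example carries over to the others. The sketch below uses `KNeighborsClassifier` on made-up one-dimensional data; the choice of three neighbors is arbitrary.

```python
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated groups on a line (made up for illustration).
X = [[0.0], [0.2], [0.4], [5.0], [5.2], [5.4]]
y = [0, 0, 0, 1, 1, 1]

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)

# Each query point takes the majority label of its three nearest neighbors.
pred = clf.predict([[0.1], [5.1]])
print(pred)
```

Swapping in `RadiusNeighborsClassifier` or `NearestCentroid` changes only the constructor line; the `fit` and `predict` calls stay the same.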
Bayesian
Category | Python | R |
Naive Bayes | sklearn.naive_bayes.GaussianNB | e1071::naiveBayes |
Multinomial naive Bayes | sklearn.naive_bayes.MultinomialNB | Unknown |
Bernoulli naive Bayes | sklearn.naive_bayes.BernoulliNB | Unknown |
Decision trees
Category | Python | R |
Decision tree classifier | sklearn.tree.DecisionTreeClassifier | tree::tree, party::ctree |
Decision tree regressor | sklearn.tree.DecisionTreeRegressor | tree::tree, party::ctree |
Ensemble methods
Category | Sub-category | Python | R |
Bagging | Random forest classifier | sklearn.ensemble.RandomForestClassifier | randomForest::randomForest, party::cforest |
Bagging | Random forest regressor | sklearn.ensemble.RandomForestRegressor | randomForest::randomForest, party::cforest |
Boosting | Gradient boosting | xgboost module | xgboost package |
Boosting | AdaBoost | sklearn.ensemble.AdaBoostClassifier | adabag, fastAdaboost, ada |
Stacking | Unknown | Unknown | Unknown |
Clustering
Category | Python | R |
K-means | scipy.cluster.vq.kmeans | (stats::)kmeans |
Hierarchical clustering | scipy.cluster.hierarchy.fcluster | (stats::)hclust |
Bagged clustering | Unknown | e1071::bclust |
DBSCAN | sklearn.cluster.DBSCAN | dbscan::dbscan |
BIRCH | sklearn.cluster.Birch | Unknown |
K-medoids clustering | pyclust.KMedoids (reliability unknown) | cluster::pam |
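In SciPy, hierarchical clustering is a two-step process: `linkage` builds the tree and `fcluster` cuts it into flat clusters, much as `hclust` followed by `cutree` does in R. The sketch below uses made-up points, and the average-linkage method is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Two tight groups of 2-D points (made up for illustration).
points = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
    [5.0, 5.0], [5.1, 5.2], [5.2, 5.1],
])

# Build the merge tree; R: hclust(dist(x), method = "average")
tree = linkage(points, method="average")

# Cut the tree into at most two flat clusters; R: cutree(tree, k = 2)
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

The labels are arbitrary integers starting at 1, so only the grouping they define is meaningful, not the label values themselves.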
Association rules
Category | Python | R |
Apriori algorithm | apriori (reliability unknown, Python 3 not supported), PyFIM (reliability unknown, cannot be installed with pip) | arules::apriori |
FP-growth algorithm | fp-growth (reliability unknown, Python 3 not supported), PyFIM (reliability unknown, cannot be installed with pip) | Unknown |
Neural networks
Category | Python | R |
Neural network | neurolab.net, keras.* | nnet::nnet, neuralnet::neuralnet |
Deep learning | keras.* | Mostly unreliable packages; unknown |
The theano module is also worth mentioning, but it is not designed primarily for neural networks, so it is not listed in this category.
Probabilistic graphical models
Python: PyMC3
Text and NLP
Basic operations
Category | Python | R |
Tokenize | nltk.tokenize (English), jieba.tokenize (Chinese) | tau::tokenize |
Stem | nltk.stem | RTextTools::wordStem, SnowballC::wordStem |
Stopwords | stop_words.get_stop_words | tm::stopwords, qdap::stopwords |
Chinese word segmentation | jieba.cut, smallseg, Yaha, finalseg, genius | jiebaR |
TF-IDF | gensim.models.TfidfModel | Unknown |
Topic models
Category | Python | R |
LDA | lda.LDA, gensim.models.ldamodel.LdaModel | topicmodels::LDA |
LSI | gensim.models.lsimodel.LsiModel | Unknown |
RP | gensim.models.rpmodel.RpModel | Unknown |
HDP | gensim.models.hdpmodel.HdpModel | Unknown |
Python's newer third-party module spaCy is also worth noting.
Interaction with other analysis/visualization/mining/reporting tools
Category | Python | R |
Weka | python-weka-wrapper | RWeka |
Tableau | tableausdk | Rserve (actually a service package for R) |
Reproduced from: 1190000005041649
Python and R data analysis/mining tools Mutual Search