Common R Packages for Data Mining

Source: Internet
Author: User

Today I found a very good blog (http://www.RDataMining.com). Its author is dedicated to studying how the R language is applied to data mining. I have recently wanted to learn R and the whole data mining process systematically, and after reading this blog I was too excited to calm down for a long time. I have decided that, starting today, whenever I can finish washing the dishes before 11 p.m., I will spend one hour studying the blog, write down what I learn along the way, and, while I am at it, narrow the gap to English level four as much as I can.

The collection of R packages and functions available for data mining is listed below. Some of them are not specifically developed for data mining, but these packages can help us a lot in the process of data mining, so they are included.

1. Clustering

    • Commonly used packages: fpc, cluster, pvclust, mclust

    • Partitioning-based approach: kmeans, pam, pamk, clara

    • Hierarchy-based approach: hclust, pvclust, agnes, diana

    • Model-based approach: Mclust

    • Density-based approach: dbscan

    • Plotting: plotcluster, plot.hclust

    • Validation: cluster.stats
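
To make the partitioning-based and hierarchy-based entries above more concrete, here is a minimal sketch that uses only base R (the stats package) and the built-in iris data. The choice of three clusters, the seed, and the "average" linkage are assumptions made for illustration, not recommendations from the blog.

```r
# Clustering sketch on the built-in iris data (base R 'stats' only).
set.seed(42)                      # kmeans uses random starting centers
x <- scale(iris[, 1:4])           # standardize the four numeric columns

# Partitioning-based: k-means with k = 3 (k is an assumption here)
km <- kmeans(x, centers = 3, nstart = 10)
print(table(cluster = km$cluster, species = iris$Species))

# Hierarchy-based: agglomerative clustering, then cut the tree into 3 groups
hc <- hclust(dist(x), method = "average")
groups <- cutree(hc, k = 3)
print(table(groups))
```

Calling plot(hc) afterwards draws the dendrogram through plot.hclust, which is the plotting entry listed above.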

2. Classification

    • Commonly used packages: rpart, party, randomForest, rpartOrdinal, tree, marginTree, maptree, survival

    • Decision tree: rpart, ctree

    • Random forest: cforest, randomForest

    • Regression, logistic regression, Poisson regression: glm, predict, residuals

    • Survival analysis: survfit, survdiff, coxph
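
As a rough illustration of the decision tree and logistic regression entries above, the sketch below fits both on built-in data sets. It assumes the rpart package is installed (it ships with standard R distributions); the data sets and formulas are my own choices for the example.

```r
library(rpart)   # recommended package, normally installed with R

# Decision tree: predict the species from the four measurements
fit <- rpart(Species ~ ., data = iris, method = "class")
pred <- predict(fit, iris, type = "class")
train_acc <- mean(pred == iris$Species)   # training accuracy, optimistic
print(train_acc)

# Logistic regression with glm: is a car manual (am = 1) given its weight?
logit <- glm(am ~ wt, data = mtcars, family = binomial)
prob <- predict(logit, type = "response") # fitted probabilities
print(head(prob))
print(head(residuals(logit)))             # deviance residuals by default
```

The accuracy here is measured on the training data, so it overstates how well the tree would do on new samples; a held-out test set would give a fairer number.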

3. Association rules and frequent itemsets

    • Commonly used packages:

      arules: Supports mining frequent itemsets, maximal frequent itemsets, closed frequent itemsets, and association rules

      drm: Regression and association models for repeated categorical data

    • Apriori algorithm (breadth-first search): apriori, drm

    • Eclat algorithm (uses equivalence classes, depth-first search, and set intersection): eclat
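
The two measures that association rule mining reports, support and confidence, can be computed by hand on a toy example. The sketch below is base R only and is not the Apriori or Eclat algorithm itself; the baskets and the helper names support and confidence are made up for illustration.

```r
# Toy transactions: each element is one basket (made-up data)
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("butter", "milk"),
  c("bread", "butter"),
  c("bread", "butter", "milk")
)

# Support: fraction of baskets that contain every item in 'items'
support <- function(items, transactions) {
  mean(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

# Confidence of the rule lhs => rhs: support(lhs and rhs) / support(lhs)
confidence <- function(lhs, rhs, transactions) {
  support(c(lhs, rhs), transactions) / support(lhs, transactions)
}

supp_bread_milk <- support(c("bread", "milk"), baskets)
conf_bread_milk <- confidence("bread", "milk", baskets)
print(c(support = supp_bread_milk, confidence = conf_bread_milk))
```

On real data, the apriori and eclat functions in arules search the space of itemsets efficiently instead of checking candidates one at a time like this.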

4. Sequential patterns

    • Commonly used packages: arulesSequences

    • SPADE algorithm: cspade

5. Time series

    • Commonly used packages: timsac

    • Time series construction: ts

    • Component decomposition: decomp, decompose, stl, tsr
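
A short sketch of ts, decompose, and stl from base R follows, using the built-in AirPassengers data. The multiplicative decomposition and the log transform before stl are assumptions that suit this particular series; other series may call for different settings.

```r
# Build a monthly series from a plain numeric vector with ts()
values <- as.numeric(AirPassengers)          # built-in data, 1949 to 1960
air <- ts(values, start = c(1949, 1), frequency = 12)

# Classical decomposition into trend, seasonal, and random components
dec <- decompose(air, type = "multiplicative")

# Loess-based decomposition; take logs first so the seasonal effect is additive
fit <- stl(log(air), s.window = "periodic")
print(head(fit$time.series))
```

Both results can be passed to plot(), which shows the original series and each component in stacked panels.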

6. Statistics

    • Commonly used packages: base R, nlme

    • Analysis of variance: aov, anova

    • Density estimation: density

    • Hypothesis tests: t.test, prop.test, anova, aov

    • Linear mixed-effects model: lme

    • Principal component analysis and factor analysis: princomp
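
Here is a small sketch of three of the functions above (t.test, aov, princomp), all from base R. The built-in mtcars and iris data and the specific comparisons are my own choices for illustration.

```r
# Two-sample t-test: do automatic and manual cars differ in mpg?
tt <- t.test(mpg ~ am, data = mtcars)
print(tt$p.value)

# One-way analysis of variance: mpg across cylinder counts
fit <- aov(mpg ~ factor(cyl), data = mtcars)
print(summary(fit))

# Principal component analysis on the iris measurements,
# using the correlation matrix so every variable has equal weight
pca <- princomp(iris[, 1:4], cor = TRUE)
print(summary(pca))
```

The summary of the princomp result lists the proportion of variance explained by each component, which helps decide how many components to keep.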

7. Charts

    • Bar chart: barplot

    • Pie chart: pie

    • Dot chart (Cleveland dot plot): dotchart

    • Histogram: hist

    • Density chart: densityplot

    • Candlestick chart, box plot: boxplot

    • QQ (quantile-quantile) plot: qqnorm, qqplot, qqline

    • Bivariate plot: coplot

    • Tree: rpart

    • Parallel coordinates: parallel, paracoor, parcoord

    • Heat map, contour: contour, filled.contour

    • Other diagrams: stripplot, sunflowerplot, interaction.plot, matplot, fourfoldplot,
      assocplot, mosaicplot

    • Saved chart formats: pdf, postscript, win.metafile, jpeg, bmp, png
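
The sketch below draws four of the charts listed above and saves them to a PDF file. It uses base R graphics only; the output path in the temporary directory and the 2-by-2 layout are assumptions for the example.

```r
# Open a PDF device; everything drawn until dev.off() goes into this file
out <- file.path(tempdir(), "mpg_plots.pdf")
pdf(out, width = 7, height = 7)

par(mfrow = c(2, 2))                       # 2 x 2 grid of panels
hist(mtcars$mpg, main = "Histogram of mpg", xlab = "mpg")
boxplot(mpg ~ cyl, data = mtcars, main = "mpg by cylinders")
barplot(table(mtcars$gear), main = "Cars per gear count")
qqnorm(mtcars$mpg)
qqline(mtcars$mpg)                         # reference line for the QQ plot

dev.off()                                  # close the device and write the file
```

Replacing pdf() with png() or jpeg() writes the same plots to a bitmap file instead; win.metafile is available only on Windows.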

8. Data manipulation

    • Missing values: na.omit

    • Variable standardization: scale

    • Transpose: t

    • Sampling: sample

    • Stacking: stack, unstack

    • Others: aggregate, merge, reshape
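
A few of these functions chained together might look like the sketch below. The data frame, its column names, and the lookup table are made up for illustration.

```r
# Small made-up data set with one missing value
df <- data.frame(
  id    = 1:6,
  group = c("a", "a", "b", "b", "c", "c"),
  value = c(10, NA, 30, 50, 20, 40)
)

clean <- na.omit(df)                         # drop rows that contain NA
clean$z <- as.numeric(scale(clean$value))    # center to mean 0, scale to sd 1

# Mean value per group
means <- aggregate(value ~ group, data = clean, FUN = mean)

# Join the group means with a lookup table on the shared 'group' column
labels <- data.frame(group = c("a", "b", "c"),
                     label = c("alpha", "beta", "gamma"))
merged <- merge(means, labels, by = "group")
print(merged)
```

Note that na.omit removes the whole row, not just the missing cell, so group "a" is left with a single observation here.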

9. Interface to the Weka data mining software

    • RWeka: Through this interface, all Weka algorithms can be used from R.
