A First Look at the R Language

Source: Internet
Author: User

In 2011 big data took off and the big data era formally arrived. I had just entered university and had no idea I would end up on the programmer's road of no return. Six years later, this programmer has started to learn big data, and I am recording it here. First up: the R language.

What is the R language?

R is a programming language and software environment for statistical analysis, graphics, and reporting, and is one of the most popular data analysis and visualization platforms today. It first appeared in 1993 (two years younger than me) and was originally designed and developed by Ross Ihaka and Robert Gentleman at the Department of Statistics of the University of Auckland in New Zealand. It became popular around 2011 with the rise of big data.

At its core, R is an interpreted language that supports branching and looping as well as modular programming with functions. R can also integrate with procedures written in C, C++, .NET, Python, or FORTRAN to improve efficiency.
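As a minimal sketch of the branching, looping, and functions mentioned above, the snippet below defines a small function and calls it in a loop. The function name `classify_number` is hypothetical and chosen only for this illustration.

```r
# A user-defined function that uses branching (if/else).
# The name classify_number is hypothetical, chosen for this example.
classify_number <- function(x) {
  if (x %% 2 == 0) {
    "even"
  } else {
    "odd"
  }
}

# Looping: apply the function to each number from 1 to 5.
labels <- character(0)
for (i in 1:5) {
  labels <- c(labels, classify_number(i))
}
print(labels)
```

In everyday R code the loop would often be replaced by a vectorized call such as `sapply(1:5, classify_number)`; the explicit `for` is used here only to show the loop syntax.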

Of course, besides R there are other data analysis tools, such as Excel, SPSS, and SAS.

What are the characteristics of the R language?   

As mentioned earlier, R is a programming language and software environment for statistical analysis, graphics, and reporting. Some of its features are listed below:

    • The R language is a well-developed, simple and effective programming language that includes conditions, loops, user-defined recursive functions, and input and output tools.
    • The R language has effective data handling and storage facilities.
    • The R language provides a set of operators for calculations on arrays, lists, vectors, and matrices.
    • The R language provides a large, consistent, and integrated collection of data analysis tools.
    • The R language provides graphical tools for data analysis and direct display on a computer or in a document.
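A few of the features in this list can be sketched in base R as follows. The values and the name `factorial_rec` are made up for illustration.

```r
# Vectors: arithmetic operators work element by element.
v <- c(1, 2, 3)
doubled <- v * 2

# Matrices: %*% is matrix multiplication, t() is the transpose.
m <- matrix(1:4, nrow = 2)   # filled column by column
product <- m %*% t(m)

# Lists can hold values of different types.
record <- list(name = "R", first_release = 1993)

# A user-defined recursive function (hypothetical example name).
factorial_rec <- function(n) {
  if (n <= 1) 1 else n * factorial_rec(n - 1)
}

print(doubled)
print(product)
print(record$first_release)
print(factorial_rec(5))
```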

Why Choose R?

R is free and open source, runs on Windows, macOS, and Linux, has many powerful packages, and is used by a growing number of large companies (Twitter, Ford, the New York Times, Microsoft, Google). With it you can complete almost every step of a data analysis: data acquisition, data cleansing, data analysis, reporting results, and publishing results.

Of these five steps, data analysis, reporting results, and publishing results are the more important ones. Let's start with a simple overview:

Data analysis

    • Exploratory data analysis

A necessary step in data analysis. Plotting the data helps you understand it, and R has built-in plotting capabilities.
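As a rough sketch of what exploratory analysis can look like, the snippet below summarizes and plots `mtcars`, a small data set that ships with R. The choice of data set and variables is an assumption made for this example.

```r
# mtcars is a small data set that ships with R.
data(mtcars)

# Numeric summary of fuel economy (miles per gallon).
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Distribution of a single variable.
hist(mtcars$mpg, main = "Fuel economy", xlab = "Miles per gallon")

# Relationship between two variables.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
```

When run in an interactive session the plots appear in the graphics window; when run as a script, R writes them to a file instead.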

    • Statistical inference

The process of drawing a formal conclusion from data, while accounting for the uncertainty of that conclusion (for example, sampling bias in data collection).

For example: of two people, A and B, who is more beautiful? In reality tastes differ, so any answer carries uncertainty. By common convention, a conclusion is accepted as formal as long as the error rate is below 5%.

R can be used to carry out this critical step.
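One common way to apply a 5% threshold is a significance test. The sketch below runs a two-sample t-test on simulated data; reading the 5% error rate as a 0.05 significance level is an assumption, and the group values are made up.

```r
set.seed(1)  # make the simulated data reproducible

# Simulated scores for two groups; the values are made up.
group_a <- rnorm(30, mean = 10, sd = 2)
group_b <- rnorm(30, mean = 11, sd = 2)

# Welch two-sample t-test: is the difference in means plausibly zero?
result <- t.test(group_a, group_b)
print(result$p.value)

# Apply the conventional 5% threshold.
significant <- result$p.value < 0.05
print(significant)
```

A p-value below 0.05 is usually taken as evidence that the two group means differ; a larger value means the data do not settle the question.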

    • Regression analysis

Linear regression analysis: a linear model is fitted to the data, whose variables are divided into predictor variables and a result variable.

For example, when analyzing house prices, the predictor variables might include the lot, room size, policies, and so on.

The result variable is then estimated from the predictor variables.
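A minimal sketch of a linear regression with base R's `lm()` might look like this. The housing data is simulated, and the column names `size` and `price` are hypothetical.

```r
set.seed(1)

# Hypothetical data: room size in square meters and a price that
# grows by about 3 units per square meter, plus random noise.
size  <- runif(50, min = 40, max = 150)
price <- 50 + 3 * size + rnorm(50, sd = 10)
houses <- data.frame(size = size, price = price)

# Fit a linear model: price is the result variable, size the predictor.
fit <- lm(price ~ size, data = houses)
print(coef(fit))

# Predict the price of a 100 square meter home from the fitted model.
predicted <- predict(fit, newdata = data.frame(size = 100))
print(predicted)
```

More predictors can be added on the right-hand side of the formula, for example `price ~ size + lot`, provided those columns exist in the data.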

Nonlinear regression analysis

    • Machine Learning-Classification issues

For example: a cat, a pike, a sofa.

The machine is asked to classify items like these, which requires a good deal of algorithmic knowledge.
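Classifying images is beyond a short example, but the idea can be sketched on tabular data. The snippet below uses logistic regression, one of many possible classifiers, on the `iris` data set that ships with R. The choice of data set, features, and the 0.5 cutoff are assumptions for this illustration.

```r
# Keep two of the three iris species so the problem is binary.
flowers <- subset(iris, Species != "setosa")
flowers$is_virginica <- as.integer(flowers$Species == "virginica")

# Logistic regression on two petal measurements.
model <- glm(is_virginica ~ Petal.Length + Petal.Width,
             data = flowers, family = binomial)

# Classify each flower: predicted probability above 0.5 means virginica.
prob <- predict(model, type = "response")
predicted <- ifelse(prob > 0.5, "virginica", "versicolor")

# Share of flowers classified correctly (measured on the training data).
accuracy <- mean(predicted == as.character(flowers$Species))
print(accuracy)
```

Because the accuracy is measured on the same data the model was fitted to, it is optimistic; a real evaluation would hold out a separate test set.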

    • Developing data Products

For example: with the googleVis package, R generates HTML that calls Google Charts to draw the graphics.

Use manipulate (in RStudio) or rCharts to make interactive graphics from R; rCharts renders them with JavaScript.

Use shiny to create interactive R programs that are embedded in web pages. Create and publish R-based result reports with slidify. http://www.shinyapps.io/

Results report: Summarize the information found in the data through plots and other means. (Related: big data analytics competition platform.)

Publishing results: The following two platforms can be used to publish results: GitHub and RPubs.

Install R and RStudio:

Download and install the version for your platform.

Install R: https://cran.r-project.org/

Install RStudio: https://www.rstudio.com/
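Once both are installed, a quick check along the following lines, typed into the R console, should confirm that R is working. This is only a sketch; the package named in the final comment is just an example.

```r
# Print the version of R that is running.
print(R.version.string)

# List a few of the packages that are already installed.
installed <- rownames(installed.packages())
print(head(installed))

# To add a package from CRAN, run for example:
# install.packages("shiny")
```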
