This article lays out a step-by-step path for learning the R language.
People often lack a systematic method when they learn R. Learners do not know where to start, how to proceed, or which resources to choose. There are many good free learning resources on the Internet, but their sheer number can be overwhelming.
To build this learning path, we have selected a comprehensive set of resources from Analytics Vidhya and DataCamp to help you learn R from scratch. The path is aimed at beginners in data science or R; if you are an experienced R user, it will also show you some of the latest developments in the language.
The R language learning approach will help you learn the R language quickly and efficiently.
Objective
The first question to answer before starting to learn is: why use R? Or why is the R language so useful?
R is fast-growing open-source software that competes with commercial packages such as SAS, Stata and SPSS. Demand for R skills in the job market is rising rapidly, and companies such as Microsoft are committed to making R a common language for data science.
Take a look at the 90-second video (https://www.youtube.com/watch?v=VlJnNSeO1uQ) produced by Revolution Analytics to see how useful R is. By the way, Microsoft has just acquired Revolution Analytics.
Step One: Configure the computer environment
The easiest way to set up an R learning environment is to download R to your local computer from the Comprehensive R Archive Network (CRAN) (https://cran.r-project.org/). Binaries are available for Linux, Mac, and Windows.
You could work in the console that comes with R, but we recommend installing an integrated development environment (IDE). RStudio (https://www.rstudio.com/) is the best-known IDE. It makes coding in R easier and faster: you can enter multiple lines of code, work with plots, install and maintain packages, and navigate your programming environment efficiently. An alternative to RStudio is the Eclipse-based Architect (http://www.openanalytics.eu/architect). If you want a graphical user interface (GUI), choose either R Commander (http://www.rcommander.com/) or Deducer (http://www.deducer.org/pmwiki/index.php?n=main.windowsinstallation).
Homework after class
- Install R and RStudio.
- Install the Rcmdr, rattle, and Deducer packages, along with their recommended and dependent packages, including the GUI.
- Use the library command to load the installed packages and open the GUI.
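The homework above can be sketched in R roughly as follows. This is only an illustration: the helper name missing_packages is hypothetical, the example covers just Rcmdr and rattle, and it assumes a working internet connection and CRAN mirror when run interactively.

```r
# Hypothetical helper: return the packages from `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("Rcmdr", "rattle")   # package names are case-sensitive
to_install <- missing_packages(wanted)

# Installation needs network access, so it is only attempted in an
# interactive session; dependencies = TRUE also pulls in suggested packages.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}

# Loading R Commander with library() opens its GUI window:
# library(Rcmdr)
```

Checking what is missing first avoids reinstalling packages that are already present.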
Step two: Learn the basics of R
You should first understand the fundamentals of the language, its libraries, and its data structures.
If you prefer to learn R syntax interactively online, DataCamp (https://www.datacamp.com/courses/free-introduction-to-r) offers a free online R tutorial that is a good resource. You can follow it with Intermediate R Programming (https://www.datacamp.com/courses/intermediate-r). Another option is the online version of swirl (https://www.datacamp.com/swirl-r-tutorial), which lets you learn R in an RStudio-like environment.
If you prefer a MOOC, you can take courses on Coursera (https://www.coursera.org/specializations/jhu-data-science) or edX (https://www.edx.org/course/introduction-r-programming-microsoft-dat204x-0).
In addition to the above online resources, you can also consider the following excellent resources:
- The free introduction to R on CRAN (https://cran.r-project.org/doc/manuals/R-intro.pdf).
- Jared Lander's R for Everyone (http://www.jaredlander.com/r-for-everyone/)
- Quick-R (http://statmethods.net/)
Topics to focus on: reading data, data frames, tables, summaries, descriptive statistics, loading and installing packages, and visualizing data with the plot command.
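As a rough illustration of several of these topics, the following sketch uses only base R. The data values and the variable names (scores, group_counts, group_means) are made up for the example.

```r
# A small data frame built by hand (values are made up for illustration).
scores <- data.frame(
  student = c("Ann", "Ben", "Cara", "Dev"),
  group   = c("A", "B", "A", "B"),
  score   = c(82, 75, 91, 68),
  stringsAsFactors = FALSE
)

str(scores)                          # structure: column names and types
print(summary(scores$score))         # min, quartiles, mean, max
group_counts <- table(scores$group)  # frequency table of a column
print(group_counts)

# Mean score per group, using base R only.
group_means <- tapply(scores$score, scores$group, mean)
print(group_means)

# In an interactive session, a quick plot of the scores:
# plot(scores$score, type = "b", xlab = "Student index", ylab = "Score")
```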
Homework after class
- Use the free DataCamp online R tutorial to familiarize yourself with basic R syntax.
- Create a GitHub (http://github.com/) account.
- Use Google to troubleshoot any problems with the installation process.
- Install the swirl package and learn R programming (see above).
Step three: Get to know the R community
A strong community is the main reason for the rapid growth and success of R. At the core of that community is the R "package" ecosystem. R packages can be downloaded from CRAN, Bioconductor, GitHub, and Bitbucket. On Rdocumentation (http://www.rdocumentation.org/), you can easily search the packages from CRAN, GitHub, and Bioconductor for ones that meet your current needs. Just as important as the package ecosystem is how easily you can get help and feedback on your R work. First, R has a built-in help system that you can access from the command line. You can also ask questions on Analytics Vidhya Discussions or Stack Overflow, where R is among the fastest growing languages. R-bloggers (http://www.r-bloggers.com/) brings together blog posts written by many R enthusiasts.
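The built-in help system mentioned above can be reached with a handful of commands. The sketch below keeps the page-opening calls commented out, since they are meant for interactive use; the variable name readers is just for illustration.

```r
# Open the help page for a function (two equivalent forms):
# help("mean")
# ?mean

# Search the installed documentation by keyword:
# help.search("linear model")

# Run the examples from a help page:
# example("mean")

# List objects on the search path whose names match a pattern.
readers <- apropos("^read\\.")
print(head(readers))
```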
Homework after class
- Visit the CRAN Task Views to learn about the R ecosystem.
- Sign up for the daily newsletter at http://r-bloggers.com.
Step four: Data import and operation
Importing and manipulating data is an important step in the data science workflow. R lets you import different data formats using dedicated packages that make the work easier:
- readr: imports flat files.
- readxl: reads Excel files into R.
- haven: imports SAS, Stata, and SPSS data files into R.
- Databases: connect through packages such as RMySQL and RPostgreSQL, and access and manipulate the data using DBI.
- rvest: web scraping.
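As a minimal sketch of a flat-file import, the example below writes a tiny CSV file with made-up data and reads it back with base R. The readr call is shown only as the package counterpart and runs only if readr happens to be installed.

```r
# Write a tiny CSV file to a temporary location (made-up sample data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("city,population", "Oslo,700000", "Bergen,290000"), csv_path)

# Base R import; stringsAsFactors = FALSE keeps text columns as character.
cities <- read.csv(csv_path, stringsAsFactors = FALSE)
str(cities)

# The readr counterpart, run only if readr is installed.
if (requireNamespace("readr", quietly = TRUE)) {
  cities_readr <- readr::read_csv(csv_path)
  print(cities_readr)
}
```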
Once the data is available in your working environment, you can manipulate it with the following packages:
- tidyr: tidies the data.
- stringr: handles string manipulation.
- dplyr: for working with data-frame-like objects; learn its ins and outs (https://www.datacamp.com/courses/dplyr-data-manipulation).
- Need to perform heavy data wrangling tasks? Try the data.table package.
- Performing time series analysis? Try packages such as zoo, xts, and quantmod.
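To give a feel for this kind of manipulation, here is a small sketch on the built-in mtcars data: filter rows, then summarise by group. The base R version always runs; the dplyr version is an equivalent formulation that runs only if dplyr is installed.

```r
# Built-in mtcars data: keep cars with at least 20 mpg,
# then compute the mean horsepower per cylinder count.
efficient <- mtcars[mtcars$mpg >= 20, ]
hp_by_cyl <- aggregate(hp ~ cyl, data = efficient, FUN = mean)
print(hp_by_cyl)

# The same steps with dplyr verbs, run only if dplyr is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  hp_by_cyl_dplyr <- dplyr::summarise(
    dplyr::group_by(dplyr::filter(mtcars, mpg >= 20), cyl),
    hp = mean(hp)
  )
  print(hp_by_cyl_dplyr)
}
```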
Homework after class
- Take the course on importing data into R, or read articles 1, 2, 3, and 4, to master the data import packages.
- Watch the Data Wrangling with R webinar from RStudio. (https://www.rstudio.com/resources/webinars/data-wrangling-with-r-and-rstudio/)
- Read about and practice using the dplyr, tidyr, and data.table packages.
Step five: Effective data visualization
Creating your own data visualizations is something to be proud of. However, data visualization is both a skill and an art. Many experts recommend reading Edward Tufte's principles for the visual display of quantitative data and Stephen Few's pitfalls of dashboard design. You can also read Nathan Yau's blog posts on FlowingData for inspiration when creating visualizations in R.
1. Plots everywhere
R provides a variety of ways to create graphics, and the built-in base graphics are the standard approach. However, there are some good tools (packages) that let you create and view graphics more easily.
- Start by learning the grammar of graphics, a practical approach to data visualization in R.
- ggplot2 is the most important and most popular data visualization package in R, with many learning resources, such as online ggplot2 tutorials, a cheat sheet, and a book written by Hadley Wickham.
- The ggvis package lets you create interactive web graphics using the grammar of graphics (see Tutorial).
- Do you know Hans Rosling's TED talk? Learn how to recreate his chart with googleVis (an interface to Google Charts).
- If you are having problems plotting your data, this article will help you. You can find more visualization resources in this CRAN Task View, or consult the R data visualization guide.
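As a starting point before moving on to the packages above, the following is a minimal base graphics sketch. It uses the built-in mtcars data and draws to a temporary PDF file (the name plot_path is illustrative) so that it also runs without a display.

```r
# Draw to a temporary PDF so the sketch also runs without a display.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)

# Scatter plot of weight against fuel economy from the built-in mtcars data.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy vs. weight")

# Add a fitted straight line on top of the points.
abline(lm(mpg ~ wt, data = mtcars), col = "blue")

dev.off()  # close the device so the file is written
```

In an interactive session you can drop the pdf() and dev.off() calls and the plot appears in the graphics window instead.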
2. Maps Everywhere
Are you interested in visualizing spatial data? Work through the tutorial that introduces spatial data in R, and you will soon be using it with ease.
- ggmap: lets you visualize spatial data and models on top of static maps from Google Maps and OpenStreetMap.
- Ari Lamstein's choroplethr package.
- The tmap package.
3. HTML widgets
HTML widgets are a very promising approach to visualization in R. They let you create interactive web visualizations in a simple way (see the RStudio tutorial), and mastering them will be an essential skill as you learn R. The results will impress your friends and colleagues.
- leaflet: creates interactive maps.
- dygraphs: generates time series charts.
- DT (DataTables): interactive tables.
- DiagrammeR: creates diagrams and flowcharts.
- metricsgraphics: creates scatter plots, line charts, and histograms.
Homework after class
- Understand the principles of the grammar of graphics.
- Work through the ggplot2 tutorials.
- Learn HTML widgets using the RStudio environment.
Step Six: Data mining, machine learning
If you are new to statistics, we recommend the following resources:
- Andrew Conway's course: Introduction to Statistics with R.
- Duke University's Data Analysis and Statistical Inference.
- Practical Data Science with R.
- Johns Hopkins University's data science courses.
- A how-to guide to data science with R.
If you want to improve your machine learning ability, consider starting from the following tutorials:
- Essentials of machine learning algorithms.
- Bike Sharing Contest: a complete solution in R.
- Machine learning courses on Kaggle.
- Master machine learning.
- Introduction to machine learning.
Be sure to check the machine learning resources available for R in the relevant CRAN Task View.
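To make the basic workflow concrete, here is a minimal sketch of fitting and evaluating a model with base R only. The choice of linear regression, the mtcars data, and the odd/even row split are assumptions made for illustration; they are not taken from the tutorials listed above.

```r
# Deterministic split of mtcars: odd rows for training, even rows held out.
idx     <- seq_len(nrow(mtcars))
train   <- mtcars[idx %% 2 == 1, ]
holdout <- mtcars[idx %% 2 == 0, ]

# Fit a linear regression of fuel economy on weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = train)
print(coef(fit))

# Predict on the held-out rows and measure root mean squared error.
pred <- predict(fit, newdata = holdout)
rmse <- sqrt(mean((holdout$mpg - pred)^2))
print(rmse)
```

Evaluating on rows the model has not seen gives a more honest error estimate than measuring it on the training data.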
Homework after class
- Get started with the statistics course.
- Take a free machine learning course on Kaggle.
- Look through some of the books on data mining with R and rattle.
- You can learn about time series from this booklet: A Little Book of R for Time Series.
Step Seven: Report results
Sharing your insights with other data science enthusiasts is important. Fortunately, R has some very useful tools for this purpose.
The first tool is R Markdown, which uses knitr and pandoc to generate reproducible reports of your data analysis results. With R Markdown, R generates the final document, replacing the R code with its output. Documents can be produced as HTML, Word, PDF, ioslides, and so on. You can learn more from this tutorial and use the cheat sheet as a reference.
The second tool is ReporteRs, an R package for creating Microsoft (Word docx and PowerPoint pptx) and HTML documents, which runs on Windows, Linux, Unix, and Mac OS. Like R Markdown, it generates reports from R automatically; click here to see how it operates.
The third tool is shiny, the most exciting tool in R at the moment. It makes it easy to build interactive web applications with R. You can turn your analysis reports into interactive web applications without needing to know HTML, CSS, or JavaScript. If you want to learn shiny, visit the RStudio learning portal.
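As a rough sketch of the R Markdown workflow, the code below writes a minimal .Rmd file to a temporary location and renders it only if the rmarkdown package and pandoc are both available. The document content and variable names are illustrative assumptions.

```r
# Build the three-backtick fence marker without typing it literally.
fence <- strrep("`", 3)

# Lines of a minimal R Markdown document (content is illustrative).
rmd_lines <- c(
  "---",
  "title: \"My first report\"",
  "output: html_document",
  "---",
  "",
  "The mean fuel economy in mtcars is shown below.",
  "",
  paste0(fence, "{r}"),
  "mean(mtcars$mpg)",
  fence
)

rmd_path <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_path)

# Render only if rmarkdown and pandoc are both available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out_file <- rmarkdown::render(rmd_path, quiet = TRUE)
  print(out_file)
}
```

In everyday use you would write the .Rmd file in RStudio and press the Knit button, which performs the same rendering step.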
Homework after class
- Create your first interactive report using R Markdown or ReporteRs.
- Try to build a shiny app.
Practice
Only with a great deal of practice can you become a good R programmer, so work on data science problems regularly. Our advice is to start by engaging with the data scientists on Kaggle.
Test your R skills by solving practice problems.
Step Eight: Time series analysis
R has a dedicated task view for time series. If you want to do time series analysis in R, this is the place to start. You will soon discover how powerful the tools are.
It is not easy to master time series analysis from online resources. A good entry point is the booklet "A Little Book of R for Time Series" or the book "Forecasting: Principles and Practice". In terms of packages, you need to be familiar with zoo and xts. zoo provides a common format for storing time series objects, while xts provides tools for manipulating time series datasets.
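Before turning to zoo and xts, it can help to see the time series class that ships with base R. The sketch below uses the built-in AirPassengers dataset; the zoo conversion at the end runs only if zoo is installed.

```r
# AirPassengers ships with R: monthly airline passenger counts, 1949 to 1960.
ap <- AirPassengers
print(frequency(ap))   # observations per year
print(start(ap))
print(end(ap))

# Extract a single year with window().
ap_1960 <- window(ap, start = c(1960, 1), end = c(1960, 12))

# Collapse the monthly series to yearly totals.
yearly <- aggregate(ap, nfrequency = 1, FUN = sum)
print(yearly)

# If zoo is installed, convert to a zoo object.
if (requireNamespace("zoo", quietly = TRUE)) {
  ap_zoo <- zoo::as.zoo(ap)
  print(head(ap_zoo))
}
```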
Additional resource: a comprehensive tutorial on time series.
Homework after class
- Start your analysis by selecting the time series tutorials listed above.
- Use the quantmod or Quandl package to download financial data and start your time series analysis.
- Visualize your time series data and analysis using packages such as dygraphs.
A key tool for text mining
To learn text mining, you can take this edX course. Although the course has ended, you can still access the materials.
Practice
- Text Mining Contest: a complete solution in R.
Step nine: Become a master of R
Now that you have mastered most of data analysis in R, here are some advanced resources. You probably already know some of this material, but these tutorials are still worth a look.
- Hadley Wickham's Advanced R tutorials.
- Use R with Hadoop, MongoDB, or NoSQL.
- Microsoft's RevoScaleR package
Original link:
https://www.analyticsvidhya.com/learning-paths-data-science-business-analytics-business-intelligence-big-data/learning-path-r-data-science/