The latest version of this article is available at http://thinkinside.tk/2013/05/03/r_notes_1_what.html
While learning about quantitative investing, I came across R (www.r-project.org). What is R? Before answering, let's take a look at the magic of R.
1. R Overview
Select a mirror from CRAN (the Comprehensive R Archive Network, cran.r-project.org/mirrors.html), and then download the appropriate installation package (R supports Linux, Mac OS X, and Windows).
After installing and running R, you can see the R console (my operating system is Mac OS):
Enter the following commands in the R console:
> install.packages('quantmod') # install the quantmod package
> require(quantmod) # load the quantmod package
> getSymbols("GOOG", src = "yahoo", from = "2013-01-01", to = '2017-04-24') # fetch Google stock data from Yahoo Finance
> chartSeries(GOOG, up.col = 'red', dn.col = 'green') # display the candlestick (K-line) chart
> addMACD() # add a MACD chart
You can see the following results:
Finally, exit R:
> q() # terminate the R session
2. What is R?
Amazing, isn't it? I was completely hooked at the time.
So what is R? Or rather, what is R used for? R is described differently depending on the perspective.
- From the perspective of usage, R is software with statistical analysis and powerful plotting capabilities. It is released free of charge under the GNU General Public License.
- From the programming perspective, the R language is an object-oriented statistical programming language derived from the S language, which was developed at AT&T Bell Laboratories.
- From a computing perspective, R is a language and environment designed for statistical computing and graphics.
- From the development perspective, R is an open-source, integrated suite of tools for data manipulation, computation, and graphical display that can be programmed and called in various ways.
- From the perspective of architecture, R is a system designed for statistical computing and graphics. It includes a programming language, high-level graphics functions, interfaces to other languages, and debugging tools.
If you had to name software similar to R, it would be the commercial software Matlab. R and Matlab are both programming-based tools for data analysis. Matlab is applicable to a wider range of fields, while R is better at statistical analysis.
Compared with Matlab, R is more open:
- R is free software and Matlab is commercial software;
- R can be easily extended through packages. The core of R has only 25 packages, but thousands of external packages are available, and you can of course develop your own;
- The R language is more powerful than the Matlab language;
- R has good interfaces with other programming languages/databases. Other languages can also conveniently call R APIs and result objects.
R is often used in finance and statistics. Most people use R for its statistical functionality. R implements many classic and modern statistical techniques.
3. Core concepts of R
3.1 Objects
The R language is an object-oriented language. All objects have two intrinsic attributes: element type (mode) and length.
The element type is the basic type of the elements in an object: numeric, character, complex, logical, or function. You can use the mode() function to view the type of an object.
Length is the number of elements in an object. You can view the object length by using the length() function.
In addition to element types, the object itself has different "types", indicating different data structures. The object types in R mainly include:
Vector: a vector consists of a series of ordered elements of the same type.
Factor: a vector object that classifies (groups) the elements of another vector of the same length. R provides both ordered and unordered factors.
Array: a collection of elements of the same type indexed by multiple subscripts.
Matrix: a matrix is just a two-dimensional array. R provides many functions specifically for processing two-dimensional arrays (matrices).
Data frame: a structure similar to a matrix. In a data frame, columns can be of different types.
Time series: contains some additional attributes, such as frequency and time.
List: a general form of vector that does not require all elements to be of the same type. The elements are often themselves vectors or lists. Lists provide a convenient way to return the results of a statistical computation.
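As a rough illustration of the attributes and object types described above, the following sketch builds one small object of each of the main kinds and inspects it with mode() and length(). The variable names (v, grp, m, df, lst) are made up for this example.

```r
# A numeric vector: every element has the same mode
v <- c(1.5, 2, 3.25)
mode(v)      # "numeric"
length(v)    # 3

# A factor groups the elements of another vector of the same length
grp <- factor(c("a", "b", "a"))
levels(grp)  # "a" "b"

# A matrix is a two-dimensional array
m <- matrix(1:6, nrow = 2)
dim(m)       # 2 3

# A data frame may hold columns of different types
df <- data.frame(id = 1:3, name = c("x", "y", "z"))

# A list may mix element types, including other vectors
lst <- list(n = 42, s = "text", v = v)
mode(lst)    # "list"
length(lst)  # 3
```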
3.2 Constants
Some constants are also defined in R, such as:
NA: not available (a missing value)
Inf: positive infinity
-Inf: negative infinity
TRUE: logical true
FALSE: logical false
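A short sketch of how these constants tend to show up in practice. The vector x is made up for illustration; is.na(), is.finite(), and the na.rm argument of mean() are standard base R.

```r
x <- c(1, NA, 3)
is.na(x)                    # FALSE TRUE FALSE
mean(x)                     # NA, because one value is unavailable
mean(x, na.rm = TRUE)       # 2, after dropping the missing value

1 / 0                       # Inf
-1 / 0                      # -Inf
is.finite(c(1, Inf, -Inf))  # TRUE FALSE FALSE

TRUE & FALSE                # FALSE
TRUE | FALSE                # TRUE
```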
4. Basic use of R
4.1 Commands
R is a simple expression language. The user interacts with R through commands.
A basic command is either an expression or an assignment. If a command is an expression, it is evaluated, the result is displayed on the screen, and the value is then discarded. An assignment also evaluates the expression and passes the value to a variable, but the result is not automatically displayed on the screen.
Using commands, you can work with R interactively or in the form of batch processing/script files.
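The difference between an expression and an assignment can be seen in a few lines. This is only a sketch; the names result and total are chosen for illustration.

```r
# An expression: evaluated and displayed, but the value is not kept
1 + sqrt(16) / 2      # displays [1] 3

# An assignment: evaluated and stored, nothing is displayed
result <- 1 + sqrt(16) / 2

# Typing the variable name is itself an expression, so the value is displayed
result                # displays [1] 3

# Wrapping an assignment in parentheses both stores and displays the value
(total <- sum(1:10))  # displays [1] 55
```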
4.2 interactive use of R
An interactive shell is a convenient environment in which you can experiment and adjust your work at any time. Like Python and Ruby, R provides a shell environment. The example at the beginning of this article uses R in interactive mode. When you open the R console, the R command prompt ">" is displayed, and you can enter commands.
The following are examples of interactive R usage:
Example 1:
> help.start() # start the online help; a browser is opened
> x <- rnorm(50); y <- rnorm(x) # generate two random vectors x and y
> plot(x, y) # draw a two-dimensional scatter plot of x and y; a graphics window is opened
> ls() # list the R objects in the current workspace
> rm(x, y) # remove the objects x and y
Example 2:
x <- 1:20 # equivalent to x = (1, 2, ..., 20)
w <- 1 + sqrt(x)/2 # a 'weight' vector of standard deviations
dummy <- data.frame(x = x, y = x + rnorm(x) * w) # create a two-column data frame from x and y
dummy # view the data in the dummy object
fm <- lm(y ~ x, data = dummy) # fit a simple linear regression of y on x
summary(fm) # view the analysis result
fm1 <- lm(y ~ x, data = dummy, weights = 1/w^2) # weighted regression
summary(fm1) # view the analysis result
attach(dummy) # make the columns of the data frame visible as ordinary variables
lrf <- lowess(x, y) # perform a non-parametric local regression
plot(x, y) # standard scatter plot
lines(x, lrf$y) # add the local regression curve
abline(0, 1, lty = 3) # the true regression line (intercept 0, slope 1)
abline(coef(fm)) # unweighted regression line
abline(coef(fm1), col = "red") # weighted regression line
detach() # remove the data frame from the search path
plot(fitted(fm), resid(fm), xlab = "Fitted values", ylab = "Residuals", main = "Residuals vs Fitted") # a standard regression diagnostic plot for checking heteroscedasticity
qqnorm(resid(fm), main = "Residuals Rankit Plot") # a normal scores plot for checking skewness, kurtosis, and outliers
rm(fm, fm1, lrf, x, dummy) # clean up again
Example 3: the classic experiment in which Michelson and Morley measured the speed of light
filepath <- system.file("data", "morley.tab", package = "datasets") # get the path of the file holding the morley experiment data
filepath # view the file path
file.show(filepath) # view the file content
mm <- read.table(filepath) # read the data as a data frame
mm # view the data
mm$Expt <- factor(mm$Expt)
mm$Run <- factor(mm$Run) # change Expt and Run into factors
attach(mm) # make the data frame visible at position 2 (the default), so its columns can be accessed directly
plot(Expt, Speed, main = "Speed of Light Data", xlab = "Experiment No.") # compare the five experiments with simple boxplots
fm <- aov(Speed ~ Run + Expt, data = mm) # analyze as a randomized block, with 'runs' and 'experiments' as factors
summary(fm)
fm0 <- update(fm, . ~ . - Run)
anova(fm0, fm) # fit the sub-model omitting 'runs' and compare the two models with an analysis of variance
detach()
rm(fm, fm0) # clean up before moving on

# The following is an example of contour and image plots
x <- seq(-pi, pi, len = 50) # x is a vector of 50 equally spaced values in [-pi, pi]
y <- x
f <- outer(x, y, function(x, y) cos(y)/(1 + x^2)) # f is a matrix whose rows and columns are indexed by x and y respectively, with values cos(y)/(1 + x^2)
oldpar <- par(no.readonly = TRUE)
par(pty = "s") # save the graphics parameters and set the plotting region to "square"
contour(x, y, f)
contour(x, y, f, nlevels = 15, add = TRUE) # draw a contour map of f, then add more lines for more detail
fa <- (f - t(f))/2 # fa is the "asymmetric part" of f (t() is the transpose function)
contour(x, y, fa, nlevels = 15) # draw a contour plot
par(oldpar) # restore the original graphics parameters
image(x, y, f)
image(x, y, fa) # draw some high-density image plots
objects(); rm(x, y, f, fa) # clean up before moving on

th <- seq(-pi, pi, len = 100)
z <- exp(1i * th) # 1i denotes the complex number i
par(pty = "s")
plot(z, type = "l") # plotting complex arguments means plotting the imaginary part against the real part; this should be a circle
w <- rnorm(100) + rnorm(100) * 1i # suppose we want to sample points within the unit circle; one method is to take complex numbers with standard normal real and imaginary parts...
w <- ifelse(Mod(w) > 1, 1/w, w) # ...and map any point outside the circle onto its reciprocal
plot(w, xlim = c(-1, 1), ylim = c(-1, 1), pch = "+", xlab = "x", ylab = "y")
lines(z) # all points are inside the circle, but the distribution is not uniform
# The second method uses the uniform distribution; the points should now look more evenly spaced over the disc
w <- sqrt(runif(100)) * exp(2 * pi * runif(100) * 1i)
plot(w, xlim = c(-1, 1), ylim = c(-1, 1), pch = "+", xlab = "x", ylab = "y")
lines(z)
rm(th, w, z) # clean up again
q() # exit R
4.3 Workspace
The R shell can store a complete environment, which is called a workspace. In the preceding example, when you run the q() command to exit R, you are asked whether you want to save the workspace:
A workspace stores environment information. Each R session can start from a completely new environment or continue from a previous one, and the session's state is saved in the workspace.
If you start R from the command line on a UNIX system, the current directory is the workspace of the session:
$ mkdir r_test
$ cd r_test/
$ R
Let's see what R can save for the Workspace:
> x <- rnorm(50); y <- rnorm(x) # generate two random vectors x and y
> q()
Save workspace image? [y/n/c]: y
$ ls -Al
R stores two hidden files: .RData and .Rhistory. Here, .RData stores the values of the session's variables in binary form, and .Rhistory saves all commands of the session as a text file.
If you start R in an existing workspace, the following message is displayed:
[Previously saved workspace restored]
In this case, you can use the ls() and history() functions to view the previously stored data and commands.
You can use rm()/remove() to delete variables in a workspace.
In the R console, you can also use the getwd() and setwd() functions to get/set the workspace (working) directory, and list.files() to view the files in the current directory.
If you run the R console in GUI mode, you can use the menu to load or save the workspace.
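The saving and restoring described above can be tried without quitting R. The sketch below uses save() and load(), the base R functions for writing and reading this binary format, on a scratch directory under tempdir(). The directory name r_test and the helper variables ws_dir, ws_file, and x_before are assumptions made for this example, not part of the original session.

```r
# Hypothetical scratch directory, so the sketch does not touch real files
ws_dir <- file.path(tempdir(), "r_test")
dir.create(ws_dir, showWarnings = FALSE)
ws_file <- file.path(ws_dir, ".RData")

x <- rnorm(50)
y <- rnorm(x)
x_before <- x                 # keep a copy to compare against later

save(x, y, file = ws_file)    # write the two objects in binary form
rm(x, y)                      # remove them from the current session
exists("x", inherits = FALSE) # FALSE

load(ws_file)                 # restore the saved objects
identical(x, x_before)        # TRUE

list.files(ws_dir, all.files = TRUE)  # the hidden .RData file is listed
```

Calling save.image() instead of save() would write every object in the workspace, which is closer to what answering "y" at the q() prompt does.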
4.4 script/batch processing
As mentioned above, R can save the command history of a session in the workspace directory. Note that .Rhistory only repopulates the command history when R starts; the commands in it are not re-executed.
We can also write our own scripts and have R execute a batch of commands. By convention, a script uses ".R" as the extension. The simplest example is test.R:
x <- rnorm(50); y <- rnorm(x) # generate two random vectors x and y
plot(x, y) # draw a two-dimensional scatter plot of x and y; a graphics window is opened
Save it in the workspace directory. Then, on the R console, run the following command:
> source('test.R')
You can execute this script.
Using source('test.R', echo = TRUE) instead echoes each command as it runs, so more detailed information is output during script execution.
When writing scripts that perform tasks automatically, the sink() function is also useful:
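A self-contained way to try this is to write the script from within R and then source it. In this sketch the script is a variant of test.R that computes a summary instead of opening a plot window, and it is written to tempdir(). Both choices are assumptions made so that the example runs anywhere.

```r
# Write a small script to a temporary location (hypothetical variant of test.R)
script <- file.path(tempdir(), "test.R")
writeLines(c(
  "x <- rnorm(50); y <- rnorm(x)  # two random vectors",
  "s <- summary(x + y)            # a numeric summary instead of a plot"
), script)

source(script)               # runs the script silently
source(script, echo = TRUE)  # also prints each command as it runs

length(x)                    # 50; objects created by the script remain available
```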
> sink("record.lis")
All subsequent output is redirected from the console to the external file record.lis, so the console no longer shows command output. Run the following command:
> sink()
This redirects the output stream back to the console.
5. Help System
GNU software usually has a very good help system, which can be of great help to beginners and skilled practitioners. R is no exception. R provides the following types of help:
5.1 documentation and search
The help.start() command opens a browser and displays the help documentation. It includes some getting-started documents and search functions (link: Search Engine & Keywords).
5.2 demo
demo() lists all available demos grouped by package:
You can start the demo by name, for example:
demo(is.things)
5.3 function help
If you already know the name of a function (such as solve) and want to know its package, purpose, usage, parameter description, return value, references, related functions, and examples, you can use the command
help(solve) or ?solve
This command will pop up a window:
5.4 function example
For a function, you can also run its documented examples with example(), for example:
example(solve)
5.5 keywords and operators
This is similar to function help, but the name must be enclosed in quotation marks, for example:
> ?'[[' # equivalent to help('[[')
> ?'+' # equivalent to help('+')
> ?'if' # equivalent to help('if')
5.6 search
If you do not know the function name, you can search for it, for example:
> ??'analytic' # equivalent to help.search('analytic')
5.7 official search
The help described above is limited to the packages already installed in the local environment. If you want to search all R resources (packages, functions, and mathematical methods), you need to search on R's official website:
www.r-project.org/search.html
6. Learning Materials
www.r-project.org: the official R website
cran.r-project.org/manuals.html: manuals and documents on the official website (required reading)
staff.washington.edu-Rcourse: an R tutorial at the University of Washington