R Language Learning Notes

Source: Internet
Author: User
Tags: scalar

Date: 2014.10.29

R Learning: Chapter One, overview of key points

  1. R is case sensitive.
  2. The data types supported by R include vectors, matrices, data frames, and lists.
  3. All data objects in an interactive session are held in memory. (my# Could a large amount of data put heavy pressure on memory? Special methods for handling that come later.) Here my# marks my own comments.
  4. R uses <- as the assignment symbol, which differs from most other languages. For example, x <- rnorm(5) calls a function to create a vector x of five elements.
  5. The c() function combines the arguments inside the parentheses into a vector or list for the next function to work on. (my# This resembles Python's list, or rather the specialized list types introduced by Python's scientific computing packages.)
  6. mean() computes the average of the numeric vector passed to it; summary() gives summary statistics for the values.
  7. sd() computes the standard deviation of the numeric vector passed to it.
  8. cor(a, b) gives the correlation between a and b.
  9. plot(a, b) draws a plot of a against b (by default a scatter plot in this case); hist(x) generates a histogram.
  10. Run demo() to get the full list of demonstrations; run functions such as demo(image) to view some graphical examples.
  11. Enter help.start() to open the local help documentation in HTML form (it is all in English). help("func") shows the help for func; help.search("func") searches the help system for func; example("func") runs the examples for func (for graphics functions these are plotted examples, the same ones listed at the bottom of the help page).
  12. Workspace: use getwd() to show the current working directory and setwd() to change it. ls() lists the objects in the current workspace (my# the function name is lowercase); rm("a", "b") deletes one or more objects.
  13. options() displays or sets the current options. For example, options(digits = 3) displays numbers with three significant digits. (my# I expected three digits after the decimal point, but a value was shown with four decimal places. The answer is that three significant digits are kept, counted from the first non-zero digit on the left.)
  14. history(#) shows the most recent # commands (the default is 25). savehistory("myfile") saves the command history to the file myfile (the default is .Rhistory). loadhistory("myfile") loads a command history file (the default is .Rhistory). save.image("myfile") saves the workspace to myfile (the default is .RData), and load("myfile") reads a workspace back into R. save(objectlist, file = "myfile") saves the given objects to myfile.
  15. R does not create directories automatically; use dir.create() when one is needed. dir.create("Niuqike") creates a folder named Niuqike in the current working directory (the one returned by getwd()). Single quotation marks around the argument also work. Use the forward slash '/' in paths.
  16. source("filename.R") runs a script. The R statements can be written in an editor such as Notepad and then executed from the command line with source(). Remember that the file name must include its extension, otherwise the call fails. Renaming the file to end in .R is fine, but that particular extension is not mandatory.
  17. sink("filename") redirects text output to a file. It has two notable parameters. With append = TRUE, output is appended to the file; the default is to overwrite the original file. With split = TRUE, output goes to both the screen and the file (note that Boolean values must be written in uppercase). Calling sink() with no arguments sends output to the screen only.

(Later test: once output is redirected with sink() or pdf(), subsequent results and plots are saved automatically, but the image only seems to become viewable after the session is closed.)
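
The sink() behaviour described in item 17 can be tried with a throwaway file. This is only a minimal sketch: the temporary file and the names out_file and captured are made up for illustration and do not come from the notes.

```r
# Minimal sketch: redirect printed output to a file with sink().
out_file <- tempfile(fileext = ".txt")  # hypothetical throwaway file

sink(out_file)                 # start redirecting; overwrites by default
print(summary(c(1, 2, 3, 4)))  # goes to the file, not to the screen
sink()                         # no arguments: output returns to the screen

sink(out_file, append = TRUE, split = TRUE)  # append, and echo to the screen
print("second block")
sink()

captured <- readLines(out_file)  # read back what was written
print(captured)
```

Reading the file back with readLines() is just a convenient way to confirm that both blocks ended up in the same file.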

    1. install.packages("packagename") downloads a package into the library, and update.packages() updates installed packages. Called with no arguments, install.packages() displays a list of packages to download and update.packages() updates all packages. installed.packages() shows information about the installed packages. (The "vcd" package was downloaded successfully.)
    2. library("packagename") loads a package; help(package = "packagename") shows the help for the package.
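
A short session ties several of the Chapter One points together. It is a sketch with made-up values; the names a, b, avg, spread, r, shown and small are assumptions chosen for illustration.

```r
# Minimal sketch of the Chapter One basics (all values are invented).
set.seed(1)            # only so that the sketch is repeatable
x <- rnorm(5)          # a vector of five random values
a <- c(1, 2, 3, 4, 5)  # c() combines its arguments into a vector
b <- c(2, 4, 6, 8, 10)

avg <- mean(a)         # arithmetic mean
spread <- sd(a)        # sample standard deviation
r <- cor(a, b)         # correlation between a and b

old <- options(digits = 3)  # three significant digits; remember the old value
shown <- format(pi * 100)   # "314"
small <- format(0.0012345)  # "0.00123": counted from the first non-zero digit
options(old)                # restore the previous setting

print(c(avg, spread, r))
print(c(shown, small))
```

The second format() call reproduces the observation in item 13: a small number still shows several decimal places because only significant digits are counted.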

Date: 2014.10.30

R Language Learning: Chapter II

  1. Vectors: a vector is a one-dimensional array that stores numeric, character, or logical data. All data in a single vector must have the same type (mode); different modes cannot be mixed in one vector. (my# Test: when numeric, character, and logical values are placed in one vector, they are all displayed as character data.)
  2. It is best not to use c as a variable name, because R has a built-in function c() with the same name.
  3. A scalar is a vector that contains only one element and is used to hold a constant. R has no separate scalar type.
  4. Elements of a vector are still accessed with [], but unlike many other programming languages the subscripts start at 1. Slice-like ranges also work: a[2:6] returns the 2nd through 6th elements. Note that the end of the range is included, so the 6th element of a is returned, which differs from some other languages. A further method is a[c(1, 2, 3)] or a[c(1:3)], which passes an index vector built with c() into []. Note that writing a[1,2,3] directly does not work.
  5. A matrix is a two-dimensional array in which every element has the same mode (numeric, logical, or character). It is created with the matrix() function: mymatrix <- matrix(vector, nrow = 1, ncol = 1, byrow = TRUE, dimnames = list(char_vector_rownames, char_vector_colnames)). Here vector holds the elements of the matrix, byrow defaults to FALSE, and dimnames = list() supplies the row and column names. The fill order should match the order in which the vector was written: if the vector lists the data row by row, set byrow = TRUE. As long as the numbers of rows and columns are right, there will be no problem.
  6. Matrix elements are selected with subscripts inside []. a[num_row, ] returns an entire row and a[, num_col] an entire column; note the comma, and that indices again start at 1. a[num_row, num_col] returns a single value. Several rows or columns can be selected at once, again using vectors: a[1:3, 1:3] works, as do y[c(1, 3, 5), 1:3] and y[c(1, 2, 4), c(1, 2, 3)].
  7. A matrix can only be two-dimensional. For multidimensional data of a single mode use an array, and for data of several modes try a data frame.
  8. An array is a natural generalization of the matrix. It is created with array(vector, dimensions, dimnames); there is no byrow parameter, and elements are selected the same way as in a matrix.
  9. A data frame is created with data.frame(col1, col2), where the column names are the variable names of the original column vectors. Note that data[1] returns the first column, whereas data[1, 2] returns the value in the first row, second column. Columns can also be selected with a[c("colname", "colname")], or with the notation a$colname. The results differ in type, though: the $ notation returns a vector, which must be kept in mind.
  10. Note: when indexing arrays, matrices, and data frames, remember the commas. For a matrix or array, the form a[i] with no comma indexes the underlying vector that the object was created from. For example, for a 2*2 matrix a created from 1:4, a[4] is 4.
  11. table() produces a contingency table; it is easy to compute from the vectors extracted from a data frame with the $ symbol, as above.
  12. To make a$colname more concise, use attach(variable name): the data frame name no longer has to be written each time, and the column names can be used directly in calculations. This is similar to import in other languages, but remember to undo it with detach(variable name).
  13. When arguments are passed in R, quoted text is usually a file name or a string. No quotation marks are needed when the function takes a variable (an object). You can also use with(), for example with(variable name, { calculation statements }). If there is only one statement, the curly braces can be omitted. The limitation is that objects assigned inside are valid only within the parentheses; they are local variables. To create a global variable, use the special assignment symbol <<-.
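
The scoping rule for with() is easy to check. The sketch below assumes a small made-up data frame; the names local_mean and global_mean are hypothetical.

```r
# Minimal sketch: assignments inside with() are local unless <<- is used.
mydata <- data.frame(age = c(12, 13, 14, 15),
                     weight = c(45, 0, 45, 78))

with(mydata, {
  local_mean <- mean(weight)    # exists only inside the braces
  global_mean <<- mean(weight)  # <<- saves the value outside with()
})

print(exists("local_mean"))  # FALSE in a fresh session
print(global_mean)
```

Inside the braces the columns age and weight can be used without the mydata$ prefix, which is the same convenience attach() gives but without having to remember detach().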

    1. Case identifiers: data.frame(name, age, gender, row.names = name) uses the values in the name column to label the rows in printed output and graphs.
    2. Nominal and ordinal variables are called factors in R; they determine how the data is analyzed and how it is presented visually. Create them with the factor() function.
    3. For both nominal and ordinal variables, a factor is a kind of mapping from values to codes. By default the levels are arranged in alphabetical order, which often does not match the intended order. For an ordinal variable use factor(variable name, ordered = TRUE, levels = c("", "", "")), filling in the levels inside the quotation marks in the desired order.
    4. Converting to a factor gives a logical meaning to a field whose nominal or ordinal values seem impossible to compare: the comparison works through an internal mapping to a vector of integers. This is similar to how some non-numeric values are encoded in data mining. Assigning levels is itself a data-processing step.
    5. Once converted to factors, such variables can be handled appropriately when statistics are computed.
    6. An interesting behaviour of lists: list[2] versus list[[2]]. The former returns a list that contains the second component; the latter returns the component itself, as it was when the list was built. When printed, the former shows one extra line with the component's name. Later testing confirmed it: a[[1]] matches the value that was put into the list, while a[1] is still a list.
    7. Test with a vector: a[1] has the same class as a itself, so the situation above does not arise.

> a <- c(1, 2, 3)
> a
[1] 1 2 3
> a[1]
[1] 1
> class(a[1])
[1] "numeric"
> class(a)
[1] "numeric"
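
The points about mixed modes, inclusive ranges, and the difference between [ ] and [[ ]] on a list can be checked together. This is a sketch with invented values; mixed, v and lst are hypothetical names.

```r
# Minimal sketch: coercion in vectors, indexing, and list extraction.
mixed <- c(1, "two", TRUE)  # mixing modes coerces everything to character
print(class(mixed))

v <- c(10, 20, 30, 40, 50, 60)
print(v[2:6])               # elements 2 through 6, end included
print(v[c(1, 3)])           # select positions with an index vector

lst <- list(title = "notes", ages = c(12, 13, 14))
print(class(lst[2]))        # still a list holding one component
print(class(lst[[2]]))      # the component itself
print(lst[2])               # prints the extra $ages line
print(lst[[2]])
```

Printing lst[2] shows the component name on its own line, which is the extra line mentioned in item 6.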

Testing matrices: when a result no longer has a two-dimensional structure, it is returned as a vector.

> b <- matrix(1:4, nrow=2, ncol=2)
> b
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> b[1]
[1] 1
> class(b[1])
[1] "integer"
> class(b[4])
[1] "integer"
> b[1,2]
[1] 3
> class(b[2])
[1] "integer"
> b[1,]
[1] 1 3
> class(b[1,])
[1] "integer"
> class(b[,2])
[1] "integer"
> b[,2]
[1] 3 4
> b[[2]]
[1] 2
> class(b[1:2,2])
[1] "integer"
> b[1:2,2]
[1] 3 4
> b[1:2,1:2]
     [,1] [,2]
[1,]    1    3
[2,]    2    4
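
The effect of byrow and dimnames when building a matrix can be seen side by side. The sketch below uses made-up cell values; cells, rnames, cnames, by_row and by_col are hypothetical names.

```r
# Minimal sketch: filling a matrix by row versus by column.
cells <- c(1, 26, 24, 68)
rnames <- c("R1", "R2")
cnames <- c("C1", "C2")

by_row <- matrix(cells, nrow = 2, ncol = 2, byrow = TRUE,
                 dimnames = list(rnames, cnames))
by_col <- matrix(cells, nrow = 2, ncol = 2)  # byrow defaults to FALSE

print(by_row)
print(by_col)
print(by_row[1, ])  # a whole row comes back as a named vector
print(by_col[4])    # a single index walks down the columns
```

The same four values land in different cells depending on byrow, which is why the fill order has to match the order in which the vector was written.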

Testing arrays:

> a <- array(1:12, c(2,3,2))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9   11
[2,]    8   10   12

> a[1]
[1] 1
> class(a[1])
[1] "integer"
> a[[1]]
[1] 1
> a[1,]
Error in a[1, ] : incorrect number of dimensions
> class(a[1,1,2])
[1] "integer"
> class(a[,,1])
[1] "matrix"
> a[,,1]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> class(a[,1,])
[1] "matrix"
> a[,1,]
     [,1] [,2]
[1,]    1    7
[2,]    2    8
> a[1,,]
     [,1] [,2]
[1,]    1    7
[2,]    3    9
[3,]    5   11
> a[1,,2]
[1]  7  9 11
> a[1,,1:2]
     [,1] [,2]
[1,]    1    7
[2,]    3    9
[3,]    5   11

Arrays drop dimensions automatically: the result is returned in the lowest-dimensional form that fits the selection.
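
If the automatic dropping is not wanted, the drop argument of [ can switch it off. This goes slightly beyond the notes and is offered only as a sketch; dropped and kept are hypothetical names.

```r
# Minimal sketch: keeping array dimensions with drop = FALSE.
a <- array(1:12, c(2, 3, 2))

dropped <- a[1, , 2]             # plain vector of length 3
kept <- a[1, , 2, drop = FALSE]  # still a three-dimensional array

print(dropped)
print(dim(dropped))  # NULL: no dimensions left
print(dim(kept))     # 1 3 1
```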

Data frames: mydata[1] returns a data frame (the first column), mydata$age returns a vector, and mydata[1,2] returns a single element as a vector (here a factor).

> mydata
  age gender weight var4
1  12     we     45    7
2  13     we      0    7
3  14     RE     45    7
4  15     DF     78    7
> class(mydata)
[1] "data.frame"
> mydata[1]
  age
1  12
2  13
3  14
4  15
> class(mydata[1])
[1] "data.frame"
> class(mydata$age)
[1] "numeric"
> mydata[1,s]
Error in `[.data.frame`(mydata, 1, s) : object 's' not found
> mydata[1,2]
[1] we
Levels: we RE DF
> mydata[1,2]
[1] we
Levels: we RE DF
> mydata[1,4]
[1] 7
> class(mydata[1,4])
[1] "numeric"
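
The data frame behaviour and the ordered factors from items 2 to 4 can be reproduced with a small example. The data frame mirrors the transcript values; the status vector and its level names are invented for illustration.

```r
# Minimal sketch: data frame selection and an ordered factor.
mydata <- data.frame(
  age = c(12, 13, 14, 15),
  gender = c("we", "we", "RE", "DF"),
  weight = c(45, 0, 45, 78),
  stringsAsFactors = TRUE  # make gender a factor, as in the transcript
)

print(class(mydata[1]))     # a one-column data frame
print(class(mydata$age))    # a plain numeric vector
print(class(mydata[1, 2]))  # a single factor value
print(mydata[c("age", "weight")])
print(table(mydata$gender)) # counts per level

# Ordered factor with an explicit level order instead of alphabetical order.
status <- factor(c("poor", "excellent", "improved"),
                 ordered = TRUE,
                 levels = c("poor", "improved", "excellent"))
print(status[1] < status[2])  # comparison uses the level order
print(as.integer(status))     # the internal integer codes
```

Without the levels argument the codes would follow alphabetical order, so "excellent" would sort below "poor".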

    1. Keyboard input: first create an empty data structure, for example mydata <- data.frame(age = numeric(0), gender = character(0), weight = numeric(0)). Then mydata <- edit(mydata) opens an editor window in which data can be entered and column names changed.
    2. Importing from a delimited text file: mydata <- read.table(file, header = TRUE/FALSE, sep = "", row.names = ""), where sep specifies the delimiter. Note that the file name given to read.table() must include its extension.
    3. To import from Excel, export the sheet to CSV format and read it as above. Alternatively, download the RODBC package and import with that.
    4. RODBC method:

library(RODBC)
channel <- odbcConnectExcel("")
mydataframe <- sqlFetch(channel, "mysheet")
odbcClose(channel)

    • This uses the database driver to open a connection to the file, then uses sqlFetch() to read the contents of the named worksheet, and finally closes the connection to release the resource when done.
    1. In addition, files in the newer xlsx format can be read with another package, xlsx.

library(xlsx)
mydata <- read.xlsx("file path", 1)

At first there was an error that Vim.dll could not be found, which was related to an environment variable.

    1. XML and web-scraped content are not covered in detail.
    2. Importing SPSS data: use read.spss() in the foreign package to import SPSS data into R. The spss.get() function in Hmisc can also be used; it wraps the former and lets you set many parameters.
    3. library(Hmisc); mydata <- spss.get("file name", use.value.labels = TRUE)
    4. SAS support: use read.ssd() in the foreign package or sas.get() in Hmisc. Compatibility is not guaranteed, so test that it works.
    5. Importing Stata data is much the same as before: load the foreign package, then read.dta("filename.dta") returns a data frame.
    6. Importing netCDF and HDF5 data.
    7. Connecting to databases.
    8. Useful functions: length() shows the number of elements/components in an object
    9. dim() shows the dimensions of an object
    10. str() shows the structure of an object
    11. class() shows the class (type) of an object
    12. mode() shows the mode of an object
    13. names() shows the names of the components in an object
    14. cbind() combines objects by column
    15. rbind() combines objects by row
    16. head() lists the first part of an object
    17. tail() lists the last part of an object
    18. ls() lists the current objects
    19. rm(list = ls()) deletes all objects
    20. fix() edits an object directly in a pop-up editor
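
The text-file import and several of the inspection functions listed above can be tried without any external data by writing a small file first. This is a sketch only: the temporary file, the id column and the names original, extra and wider are made up for illustration.

```r
# Minimal sketch: write a delimited file, read it back, inspect the result.
csv_file <- tempfile(fileext = ".csv")  # hypothetical throwaway file
original <- data.frame(id = c("a", "b", "c"),
                       age = c(12, 13, 14),
                       weight = c(45, 0, 45))
write.table(original, csv_file, sep = ",", row.names = FALSE, quote = FALSE)

# header = TRUE: first line holds column names; row.names names the id column.
mydata <- read.table(csv_file, header = TRUE, sep = ",", row.names = "id")

str(mydata)             # structure of the object
print(dim(mydata))      # rows and columns
print(names(mydata))    # column names
print(head(mydata, 2))  # first two rows

# rbind() adds a row, cbind() adds a column.
extra <- rbind(mydata, data.frame(age = 15, weight = 78, row.names = "d"))
wider <- cbind(mydata, var4 = 7)
print(dim(extra))
print(dim(wider))
```

Because the id column was used for row.names, it no longer appears as an ordinary column after the import.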

Date: 2014.10.31

R Language Learning: Chapter III

    1. abline(lm(weight ~ age)) adds the line of best fit; weight is on the vertical axis.
    2. title() adds a title.
    3. Output to a file: place the plotting statements between the two calls below.

pdf("") --- dev.off()

    1. Multiple graphics windows: use dev.new() to open each new one, but do not call dev.off() at the end.
    2. plot() is a generic function in R; its output varies with the type of object being plotted. In the example plot(x, y, type = "b"), x is the horizontal axis, y is the vertical axis, and "b" means that both points and lines are drawn (in this example only).
    3. Modifying graphical parameters: fonts, colors, axes, and titles are customized by changing options called graphical parameters.

One way is the par() function. Called with no arguments, par() prints the current parameter list; adding no.readonly = TRUE lists only the parameters that can be modified. Note: it is a good idea to copy the current settings first so that the original state can be restored.

The lty parameter sets the line type, and the pch parameter sets the symbol used for plotted points.
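
The Chapter III pieces fit together roughly as follows. This is a sketch, not a definitive recipe: the age and weight values, the plot title, and the names plot_file and opar are assumptions made for illustration.

```r
# Minimal sketch: plot to a PDF file while saving and restoring par().
age <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

plot_file <- tempfile(fileext = ".pdf")  # hypothetical throwaway file
pdf(plot_file)                           # open a PDF graphics device

opar <- par(no.readonly = TRUE)  # copy of the modifiable settings
par(lty = 2, pch = 17)           # dashed lines, solid triangles
plot(age, weight, type = "b")    # points joined by lines
abline(lm(weight ~ age))         # add the least-squares line
title("Weight against age")
par(opar)                        # restore the saved settings

dev.off()                        # close the device so the file is written
```

Nothing appears on screen while the PDF device is open; the file is complete only after dev.off(), which matches the earlier observation about redirected output.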
