R Study Notes (2): Data Types and Data Structures in R

Source: Internet
Author: User
The latest version of this article has been updated to: http://thinkinside.tk/2013/05/09/r_notes_2_data_structure.html

Although R is described as object-oriented, I personally think that the so-called objects in R are really just structures (structs): you still operate on them with functions.

The data structures in R are mainly oriented toward concepts from linear algebra, such as vectors and matrices. It is worth noting that R has no true scalar types (numeric, logical, character): a single value is automatically treated as a vector of length 1. For example:

> b = 5
> length(b)
[1] 1
> typeof(b)
[1] "double"
> mode(b)
[1] "numeric"
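
As a small additional illustration (not part of the original notes), the sketch below checks that single values behave like any other vector. The variable names `flag` and `word` are made up for this example.

```r
# A "scalar" in R is simply a vector of length 1.
b <- 5
is.vector(b)          # TRUE
length(b)             # 1
b[1]                  # 5: indexing works as on any other vector
identical(b, c(5))    # TRUE: c(5) builds the same length-1 vector

# The same holds for logical and character values.
flag <- TRUE
word <- "hello"
length(flag)          # 1
length(word)          # 1 (the number of strings, not of characters)
```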

The most important data structures in R are the vector and the matrix.

A vector is an ordered series of elements of the same type. A matrix is a special case of an array, namely an array with two dimensions, and an array is in turn a vector with a dimension attribute added.

In addition, lists and data frames generalize vectors and matrices. A list can hold elements of different types, and can even hold other objects as elements; a data frame allows each column to have a different type. The elements of lists and data frames are usually called components.
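
A minimal sketch of the difference, using made-up data (the names `person` and `df` are assumptions for illustration only):

```r
# A list may mix element types, including whole vectors.
person <- list(name = "Fred", ages = c(4, 7, 9), married = TRUE)
sapply(person, typeof)   # "character", "double", "logical"

# A data frame is a list of equal-length columns; each column keeps its own type.
df <- data.frame(id = 1:3,
                 label = c("a", "b", "c"),
                 ok = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)
typeof(df)               # "list"
sapply(df, typeof)       # "integer", "character", "logical"

# Components are accessed with $ or [[ ]].
df$label
person[["ages"]]
```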

Object type and length

All objects in R have a type and a length. You can query them with the typeof() and length() functions, and length() can also be used to set the length. Example:


> x = c(1,2,3,4)
> x
[1] 1 2 3 4
> typeof(x)
[1] "double"
> length(x)
[1] 4
> dim(x) = c(2,2)
> x
     [,1] [,2]
[1,]    1    3
[2,]    2    4
> typeof(x)
[1] "double"
> length(x)
[1] 4
> Lst <- list(name="Fred", wife="Mary", no.children=3,
+             child.ages=c(4,7,9))
> Lst
$name
[1] "Fred"

$wife
[1] "Mary"

$no.children
[1] 3

$child.ages
[1] 4 7 9

> typeof(Lst)
[1] "list"
> length(Lst)
[1] 4

The typeof() function may return the following values (defined in TypeTable in src/main/util.c of the R source code):

# Data objects
logical      vector containing logical values
integer      vector containing integer values
double       vector containing real values
complex      vector containing complex values
character    vector containing character values
raw          vector containing bytes

# Other objects
list         list
NULL         NULL
closure      function
special      built-in function that does not evaluate its arguments
builtin      built-in function that evaluates its arguments
environment  environment

# Generally used inside R
symbol       variable name
pairlist     pairlist object
promise      object used to implement lazy evaluation
language     R language construct
...          the special variable-length argument
any          special type that matches all types
expression   expression object
externalptr  external pointer object
weakref      weak reference object
char         character string (internal)
bytecode     byte code
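
Several of these values can be observed directly from the console. The following sketch is an added illustration; the objects passed to typeof() were chosen for this example and do not come from the original notes.

```r
# Data objects
typeof(TRUE)               # "logical"
typeof(1L)                 # "integer"
typeof(1)                  # "double"
typeof(1i)                 # "complex"
typeof("a")                # "character"
typeof(as.raw(1))          # "raw"

# Other objects
typeof(list(1, "a"))       # "list"
typeof(NULL)               # "NULL"
typeof(mean)               # "closure"
typeof(`if`)               # "special"
typeof(sum)                # "builtin"
typeof(globalenv())        # "environment"

# Types mostly seen when working with the language itself
typeof(quote(x))           # "symbol"
typeof(quote(x + 1))       # "language"
typeof(expression(x + 1))  # "expression"
typeof(pairlist(a = 1))    # "pairlist"
```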

The type of an object is not fixed and can be converted at any time. Continuing the example above:

> typeof(x)
[1] "double"
> y = as.logical(x)
> typeof(y)
[1] "logical"

The following table lists the conversion rules:

|                | to numeric                               | to logical                                            | to character                   |
|----------------|------------------------------------------|-------------------------------------------------------|--------------------------------|
| from numeric   | -                                        | 0 → FALSE; other numbers → TRUE                       | 1, 2, ... → "1", "2", ...      |
| from logical   | FALSE → 0; TRUE → 1                      | -                                                     | TRUE → "TRUE"; FALSE → "FALSE" |
| from character | "1", "2", ... → 1, 2, ...; "", ... → NA  | "FALSE", "F" → FALSE; "TRUE", "T" → TRUE; others → NA | -                              |
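
The conversion rules can be tried out with the as.*() functions. This is a minimal added sketch; the variable names and the sample strings "abc" and "yes" are assumptions chosen for illustration.

```r
# numeric -> logical / character
num_to_lgl <- as.logical(c(0, 1, 2.5))        # FALSE TRUE TRUE
num_to_chr <- as.character(c(1, 2))           # "1" "2"

# logical -> numeric / character
lgl_to_num <- as.numeric(c(FALSE, TRUE))      # 0 1
lgl_to_chr <- as.character(c(TRUE, FALSE))    # "TRUE" "FALSE"

# character -> numeric: strings that are not numbers become NA (with a warning)
chr_to_num <- suppressWarnings(as.numeric(c("1", "2", "abc")))   # 1 2 NA

# character -> logical
chr_to_lgl <- as.logical(c("TRUE", "T", "FALSE", "F", "yes"))    # TRUE TRUE FALSE FALSE NA
```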

The length of an object can also be changed at any time. Common situations include:

> # extend the index range
> x = c(1,2,3)
> x
[1] 1 2 3
> x[5] = 12
> x
[1]  1  2  3 NA 12
> length(x)
[1] 5
> # directly set the length attribute
> length(x) = 2
> x
[1] 1 2
> # reassign (omitted)

Object class and attributes

typeof() deals with the type of the elements in an object, whereas class() deals with the class of the object itself, for example:

> x = 1:6
> x
[1] 1 2 3 4 5 6
> typeof(x)
[1] "integer"
> class(x)
[1] "integer"
> dim(x) = c(3,2)
> x
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> typeof(x)
[1] "integer"
> class(x)
[1] "matrix"

(In R 4.0.0 and later, class(x) for a matrix returns "matrix" "array".)
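
Because the exact value returned by class() for a matrix depends on the R version, a check that works across versions may be preferable. The following is an added sketch, not part of the original notes:

```r
x <- 1:6
dim(x) <- c(3, 2)

typeof(x)               # "integer": the element type is unchanged by dim
is.matrix(x)            # TRUE
inherits(x, "matrix")   # TRUE, whether class(x) is "matrix" or c("matrix", "array")

dim(x) <- NULL          # removing dim turns x back into a plain vector
is.matrix(x)            # FALSE
class(x)                # "integer"
```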

You can also change the class of an object by assigning to class(), for example:

> x = 1L
> class(x)
[1] "integer"
> class(x) = "matrix"
Error in class(x) = "matrix" : 
  invalid to set the class to matrix unless the dimension attribute is of length 2 (was 0)
> class(x) = "logical"
> x
[1] TRUE

Besides type and length, objects may carry other attributes, which can be read and set with the attributes() and attr() functions, for example:

> x = 1:6
> attributes(x)
NULL
> dim(x) = c(3,2)
> attributes(x)
$dim
[1] 3 2

> x
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
> attr(x,"dim") = c(2,3)
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

The example shows that attributes are stored as a list, and every element of that list has a name.
The example also shows that in an R array the first subscript varies fastest and the last subscript varies slowest. This is known as column-major order, as in FORTRAN.
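
Both observations can be checked with a short sketch. This is an added illustration; the `byrow = TRUE` variant and the linear-index formula are not in the original notes.

```r
x <- 1:6
dim(x) <- c(2, 3)

# Attributes come back as a named list.
is.list(attributes(x))     # TRUE
names(attributes(x))       # "dim"

# Column-major storage: the first subscript moves fastest.
x[2, 1]                    # 2, the second element in storage
x[1, 2]                    # 3, the third element in storage
as.vector(x)               # 1 2 3 4 5 6

# matrix() can fill row by row instead, which changes the storage order.
m <- matrix(1:6, nrow = 2, byrow = TRUE)
as.vector(m)               # 1 4 2 5 3 6

# Linear position of element [i, j] in a matrix with nrow(x) rows.
i <- 2; j <- 3
x[(j - 1) * nrow(x) + i] == x[i, j]   # TRUE
```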

Some common attributes are as follows:
names, which attaches a label to each element of a vector or list.

> x = 1:6
> x
[1] 1 2 3 4 5 6
> attributes(x)
NULL
> attr(x,'names') = c('a','b','c')
> x
   a    b    c <NA> <NA> <NA> 
   1    2    3    4    5    6 
> attributes(x)
$names
[1] "a" "b" "c" NA  NA  NA 
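
As an added sketch, names are more commonly set with the names() replacement function or supplied when the vector is created. The variables `y` and `z` are made up for this example.

```r
x <- 1:3
names(x) <- c("a", "b", "c")   # has the same effect as attr(x, "names") <- ...
x[["b"]]                       # 2

y <- c(a = 1, b = 2)           # names supplied at creation time
names(y)                       # "a" "b"

z <- unname(x)                 # drop the names again
attributes(z)                  # NULL
```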

dim, which records the dimensions of an object. Apart from plain vectors, all array-based objects have a dim attribute, an integer vector giving the length of each dimension of the array. Just as elements can be named, the dimensions can be labeled too, using the dimnames attribute:

> x = array(1:6,2:3)
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> attributes(x)
$dim
[1] 2 3

> names = list(c('x','y'),c('a','b','c'))
> dimnames(x) = names
> x
  a b c
x 1 3 5
y 2 4 6
> attributes(x)
$dim
[1] 2 3

$dimnames
$dimnames[[1]]
[1] "x" "y"

$dimnames[[2]]
[1] "a" "b" "c"

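
Once dimnames are set, they can be used for indexing. A minimal added sketch; the labels `row` and `col` in the last step are assumptions for illustration.

```r
x <- array(1:6, 2:3)
dimnames(x) <- list(c("x", "y"), c("a", "b", "c"))

x["y", "b"]       # 4
x["x", ]          # the first row, with names a, b, c
rownames(x)       # "x" "y"
colnames(x)       # "a" "b" "c"

# The dimensions themselves can also carry names.
dimnames(x) <- list(row = c("x", "y"), col = c("a", "b", "c"))
names(dimnames(x))   # "row" "col"
```
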

Accessing elements of an object

Since an object is a collection of elements, it is natural to access those elements by subscript:

> x = array(6:1,2:3)
> x
     [,1] [,2] [,3]
[1,]    6    4    2
[2,]    5    3    1
> x[1]    # access a single element in storage order
[1] 6
> x[2]    # access a single element in storage order
[1] 5
> x[3]    # access a single element in storage order
[1] 4
> x[1,2]  # access a single element through multiple subscripts
[1] 4
> x[1,]   # returns a row
[1] 6 4 2
> x[,1]   # returns a column
[1] 6 5

If the object has a names attribute, you can also index it by name:

> x = array(6:1,2:3)
> names(x) = c('a','b','c')
> x
     [,1] [,2] [,3]
[1,]    6    4    2
[2,]    5    3    1
attr(,"names")
[1] "a" "b" "c" NA  NA  NA 
> x['b']    # equivalent to x[2]
b 
5 

Both of the preceding examples return a single element of the object. R can also return several elements at once; in that case the index is not a single number or string but a vector. Continuing the example above:

> x[1:3]
a b c 
6 5 4 
> x[c(3,4)]
   c <NA> 
   4    3 
> x[c(1,2),c(1,2)]
     [,1] [,2]
[1,]    6    4
[2,]    5    3
> x[c('a','b')]
a b 
6 5 
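
Index vectors are not limited to positive numbers and names. The sketch below is an added illustration with made-up data (`v` and `m`); negative and logical indices and `drop = FALSE` do not appear in the original notes.

```r
v <- c(a = 6, b = 5, c = 4, d = 3)

v[c(1, 3)]        # elements a and c
v[c("a", "d")]    # the same idea, selecting by name
v[-1]             # a negative index drops an element: b, c, d remain
v[v > 4]          # a logical vector keeps the TRUE positions: a and b

m <- array(6:1, 2:3)
m[1, c(1, 3)]          # 6 2
m[, 2, drop = FALSE]   # stays a 2 x 1 matrix instead of collapsing to a vector
```
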

Filling objects with sequences

In the previous examples you may have noticed some syntax similar to Python's, such as the sequence:
a:b
R provides several ways of creating sequences that make it easy to fill objects, covering both regular sequences and random sequences.

Regular sequences follow a fixed rule:
a:b is the simplest method;
if you need more control, you can use the seq(from, to, by, length, along) function;
you can use the rep() function to generate repeated elements.
For example:

> 1:3
[1] 1 2 3
> 2*1:3
[1] 2 4 6
> 3:1
[1] 3 2 1
> seq(1,2,0.2)
[1] 1.0 1.2 1.4 1.6 1.8 2.0
> seq(1,2,0.3)
[1] 1.0 1.3 1.6 1.9
> seq(to=2,by=.2)
[1] 1.0 1.2 1.4 1.6 1.8 2.0
> seq(to=2,by=.2,length=3)
[1] 1.6 1.8 2.0
> rep(1:3,2)
[1] 1 2 3 1 2 3
> rep(1:3,each=2)
[1] 1 1 2 2 3 3
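
The same functions can be called with their full argument names (`length.out` and `along.with`; the shorter `length` and `along` work through partial matching). The sketch below is an added illustration, and `seq_len()`, `seq_along()` and the variable names are not from the original notes.

```r
by_step  <- seq(from = 0, to = 1, by = 0.25)        # 0.00 0.25 0.50 0.75 1.00
by_count <- seq(from = 0, to = 1, length.out = 5)   # the same five values

seq_len(4)                        # 1 2 3 4
seq_along(c("x", "y", "z"))       # 1 2 3

rep(c("a", "b"), times = c(2, 3)) # "a" "a" "b" "b" "b"
rep(1:2, each = 2, times = 2)     # 1 1 2 2 1 1 2 2

# Pitfall: a:b counts downwards when a > b, so 1:0 is c(1, 0), not empty.
1:0                               # 1 0
seq_len(0)                        # integer(0)
```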

Random sequences generate data that follows a given distribution. R has a large number of functions for generating random sequences; only the names of some of them are listed here:
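
For illustration, here are a few of the standard generators from the stats and base packages. This selection is an assumption made for this sketch and is not the list from the original notes; the variable names are also made up.

```r
set.seed(42)   # fix the seed so the draws are reproducible

normal_draws  <- rnorm(3, mean = 0, sd = 1)        # normal distribution
uniform_draws <- runif(3, min = 0, max = 1)        # uniform distribution on [0, 1]
binom_draws   <- rbinom(3, size = 10, prob = 0.5)  # binomial distribution
picked        <- sample(1:10, 4)                   # 4 distinct values drawn from 1..10

normal_draws
uniform_draws
binom_draws
picked
```

The generators follow a common naming pattern: an `r` prefix followed by an abbreviation of the distribution name.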

Data Editor

Of course, we can edit the elements of an object through subscript operations. However, R also provides a visual tool that is more convenient: the data editor.
Use the data.entry() function to open the data editor:

> x = array(6:1,2:3)
> data.entry(x)
