Looping on the Command Line
Writing for and while loops is useful when programming but not particularly easy when working interactively on the command line. There are some functions which implement looping to make life easier.
lapply: Loop over a list and evaluate a function on each element
sapply: Same as lapply but try to simplify the result
apply: Apply a function over the margins of an array
tapply: Apply a function over subsets of a vector
mapply: Multivariate version of lapply
An auxiliary function split is also useful, particularly in conjunction with lapply.
lapply
lapply takes three arguments: (1) a list X; (2) a function (or the name of a function) FUN; (3) other arguments via its ... argument. If X is not a list, it will be coerced to a list using as.list.
## function (X, FUN, ...)
## {
##     FUN <- match.fun(FUN)
##     if (!is.vector(X) || is.object(X))
##         X <- as.list(X)
##     .Internal(lapply(X, FUN))
## }
## <bytecode: 0x7ff7a1951c00>
## <environment: namespace:base>
The actual looping is done internally in C code.
lapply always returns a list, regardless of the class of the input.
x <- list(a = 1:5, b = rnorm(10))
lapply(x, mean)
x <- list(a = 1:4, b = rnorm(10), c = rnorm(1), d = rnorm(5))
lapply(x, mean)
> x <- 1:4
> lapply(x, runif)
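Because lapply forwards its ... argument to the function, extra arguments can be supplied directly in the call. A minimal sketch, assuming illustrative bounds of 0 and 10 (these values are not from the text above):

```r
set.seed(1)                      # make the random draws reproducible
x <- 1:4
# min and max are passed through `...` to runif for every element of x
res <- lapply(x, runif, min = 0, max = 10)
lengths(res)                     # one vector per element, of lengths 1, 2, 3, 4
```

The result is still a list, one element per input value.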
lapply and friends make heavy use of anonymous functions.
> x <- list(a = matrix(1:4, 2, 2), b = matrix(1:6, 3, 2))
> x
$a
     [,1] [,2]
[1,]    1    3
[2,]    2    4
$b
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
An anonymous function for extracting the first column of each matrix.
> lapply(x, function(elt) elt[,1])
$a
[1] 1 2
$b
[1] 1 2 3
sapply
> x <- list(a = 1:4, b = rnorm(10), c = rnorm(1), d = rnorm(100, 5))
> lapply(x, mean)
> sapply(x, mean)
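The difference between lapply and sapply is easiest to see side by side. As a sketch: when every element of the result has length 1, sapply returns a vector rather than a list; when every element has the same length greater than 1, it returns a matrix; otherwise it falls back to a list.

```r
set.seed(1)
x <- list(a = 1:4, b = rnorm(10), c = rnorm(1), d = rnorm(100, 5))
as_list <- lapply(x, mean)    # a named list of four means
as_vec  <- sapply(x, mean)    # a named numeric vector of the same four means
as_mat  <- sapply(x, range)   # each result has length 2, so a 2 x 4 matrix
```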
apply
apply is used to evaluate a function (often an anonymous one) over the margins of an array.
It is most often used to apply a function to the rows or columns of a matrix.
It can be used with general arrays, e.g. taking the average of an array of matrices.
It is not really faster than writing a loop, but it works in one line!
> str(apply)
function (X, MARGIN, FUN, ...)
X is an array
MARGIN is an integer vector indicating which margins should be "retained"
FUN is a function to be applied
... is for other arguments to be passed to FUN
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 2, mean)
 [1]  0.04868268  0.35743615 -0.09104379
 [4] -0.05381370 -0.16552070 -0.18192493
 [7]  0.10285727  0.36519270  0.14898850
[10]  0.26767260
Col/Row Sums and Means
For sums and means of the matrix dimensions, we have some shortcuts.
rowSums = apply(x, 1, sum)
rowMeans = apply(x, 1, mean)
colSums = apply(x, 2, sum)
colMeans = apply(x, 2, mean)
The shortcut functions are much faster, but you won't notice unless you're using a large matrix.
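A quick way to convince yourself that the shortcuts compute the same values as the apply calls is to compare them directly. This is a minimal sketch on a small random matrix; it checks equality only and makes no timing claim.

```r
set.seed(1)
x <- matrix(rnorm(200), 20, 10)
# all.equal compares with a small numerical tolerance
same_rows <- all.equal(rowSums(x), apply(x, 1, sum))
same_cols <- all.equal(colMeans(x), apply(x, 2, mean))
```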
Other Ways to Apply
Quantiles of the rows of a matrix.
> x <- matrix(rnorm(200), 20, 10)
> apply(x, 1, quantile, probs = c(0.25, 0.75))
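When the applied function returns more than one value, apply stacks the results as columns. For a general array, MARGIN can also name several dimensions to keep. The sketch below shows both cases; the array a is an illustrative example, not data from the text.

```r
set.seed(1)
x <- matrix(rnorm(200), 20, 10)
# quantile returns two values per row, so the result is 2 x 20
q <- apply(x, 1, quantile, probs = c(0.25, 0.75))

a <- array(rnorm(2 * 2 * 10), c(2, 2, 10))   # ten 2 x 2 matrices stacked together
avg <- apply(a, c(1, 2), mean)               # keep rows and columns, average over the third margin
avg2 <- rowMeans(a, dims = 2)                # shortcut giving the same 2 x 2 result
```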
mapply
mapply is a multivariate apply of sorts which applies a function in parallel over a set of arguments.
> str(mapply)
function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
FUN is a function to apply
... contains arguments to apply over
MoreArgs is a list of other arguments to FUN
SIMPLIFY indicates whether the result should be simplified
The following is tedious to type
list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))
Instead we can do
> mapply(rep, 1:4, 4:1)
Vectorizing a Function
> noise <- function(n, mean, sd) {
+ rnorm(n, mean, sd)
+ }
> noise(5, 1, 2)
[1]  2.4831198  2.4790100  0.4855190 -1.2117759
[5] -0.2743532
> noise(1:5, 1:5, 2)
[1] -4.2128648 -0.3989266  4.2507057  1.1572738
[5]  3.7413584
Instant vectorization
> mapply(noise, 1:5, 1:5, 2)
Which is the same as
list(noise(1, 1, 2), noise(2, 2, 2), noise(3, 3, 2), noise(4, 4, 2), noise(5, 5, 2))
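The equivalences in this section can be checked with a short sketch. A constant argument such as the standard deviation can either be recycled, as in the call above, or passed once through MoreArgs; the seed is only there so that both calls draw the same numbers.

```r
noise <- function(n, mean, sd) {
  rnorm(n, mean, sd)
}

# mapply(rep, 1:4, 4:1) produces the same values as typing out each rep() call
# (as integers rather than doubles, because 1:4 is an integer vector)
by_hand <- list(rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1))
by_mapply <- mapply(rep, 1:4, 4:1)

set.seed(42)
v1 <- mapply(noise, 1:5, 1:5, 2)                        # sd = 2 recycled for each call
set.seed(42)
v2 <- mapply(noise, 1:5, 1:5, MoreArgs = list(sd = 2))  # sd = 2 passed as a fixed argument
```

Because the five results have different lengths, mapply cannot simplify them and returns a list.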
tapply
tapply is used to apply a function over subsets of a vector. I don't know why it's called tapply.
> str(tapply)
function (X, INDEX, FUN = NULL, ..., simplify = TRUE)
X is a vector
INDEX is a factor or a list of factors (or else they are coerced to factors)
FUN is a function to be applied
... contains other arguments to be passed to FUN
simplify: should we simplify the result?
Take group means.
> x <- c(rnorm(10), runif(10), rnorm(10, 1))
> f <- gl(3, 10)
> f
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3
[24] 3 3 3 3 3 3 3
Levels: 1 2 3
> tapply(x, f, mean)
        1         2         3
0.1144464 0.5163468 1.2463678
Take group means without simplification.
> tapply(x, f, mean, simplify = FALSE)
$`1`
[1] 0.1144464
$`2`
[1] 0.5163468
$`3`
[1] 1.246368
Find group ranges.
> tapply(x, f, range)
$`1`
[1] -1.097309  2.694970
$`2`
[1] 0.09479023 0.79107293
$`3`
[1] 0.4717443 2.5887025
split
split takes a vector or other object and splits it into groups determined by a factor or list of factors.
> str(split)
function (x, f, drop = FALSE, ...)
x is a vector (or list) or data frame
f is a factor (or coerced to one) or a list of factors
drop indicates whether empty factor levels should be dropped
A common idiom is split followed by an lapply.
> lapply(split(x, f), mean)
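For a single grouping factor, the split-then-lapply idiom gives the same group means as tapply, only packaged differently. A minimal sketch using the same construction of x and f as in the tapply example (the seed is added here for reproducibility):

```r
set.seed(1)
x <- c(rnorm(10), runif(10), rnorm(10, 1))
f <- gl(3, 10)                          # factor with 3 levels, 10 observations each
by_split  <- lapply(split(x, f), mean)  # list of group means
by_sapply <- sapply(split(x, f), mean)  # simplified to a named vector
by_tapply <- tapply(x, f, mean)         # 1-d array with the same names and values
```

The advantage of split is that the pieces can be arbitrary objects, such as data frames, which tapply cannot handle.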
Splitting a Data Frame
> library(datasets)
> head(airquality)
> s <- split(airquality, airquality$Month)
> lapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")]))
> sapply(s, function(x) colMeans(x[, c("Ozone", "Solar.R", "Wind")], na.rm = TRUE))
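The three calls above differ only in how the result is packaged and how missing values are handled. A runnable sketch of the same steps; the helper variable cols is introduced here only to shorten the lines.

```r
library(datasets)
s <- split(airquality, airquality$Month)   # one data frame per month (5 to 9)
cols <- c("Ozone", "Solar.R", "Wind")
# Without na.rm = TRUE, any column containing a missing value gets a mean of NA
with_na <- sapply(s, function(x) colMeans(x[, cols]))
# With na.rm = TRUE, missing values are dropped before averaging
no_na <- sapply(s, function(x) colMeans(x[, cols], na.rm = TRUE))
```

Each call returns a 3 x 5 matrix with one column per month, because every colMeans result has the same length.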
Splitting on More than One Level
> x <- rnorm(10)
> f1 <- gl(2, 5)
> f2 <- gl(5, 2)
Interactions can create empty levels.
> str(split(x, list(f1, f2)))
Empty levels can be dropped.
> str(split(x, list(f1, f2), drop = TRUE))
List of 6
 $ 1.1: num [1:2] -0.378 0.445
 $ 1.2: num [1:2] 1.4066 0.0166
 $ 1.3: num -0.355
 $ 2.3: num 0.315
 $ 2.4: num [1:2] -0.907 0.723
 $ 2.5: num [1:2] 0.732 0.360
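To see where the empty levels come from, it helps to count the groups with and without drop = TRUE. A small sketch with the same factors as above: two levels crossed with five levels gives ten possible combinations, but only six of them occur in the data.

```r
set.seed(1)
x <- rnorm(10)
f1 <- gl(2, 5)
f2 <- gl(5, 2)
all_levels  <- split(x, list(f1, f2))               # 2 * 5 = 10 groups, some empty
used_levels <- split(x, list(f1, f2), drop = TRUE)  # only the combinations that occur
names(used_levels)
```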
R Programming Week 3: Loop Functions