I recently took part in a small competition where tapply and sapply (or lapply) let me get the result I needed quickly and with far less code.
Function | Description |
tapply(X, F, G) | X is a vector, F is a factor (the grouping column), and G is the function applied to each group. The by function performs the same kind of operation on a data frame. |
sapply(LIST, G) | G is the function applied to each element. sapply returns a vector when the results can be simplified, whereas lapply always returns a list. Often used in combination with split. |
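As a minimal sketch of the calls described in the table, the snippet below uses a made-up vector `x` and grouping factor `f` (both are illustrative and not taken from the bus data). It shows `tapply` summing within groups, `sapply` simplifying to a named vector where `lapply` keeps a list, and `by` doing the grouped operation on a data frame.

```r
# Toy data (hypothetical): six values in three groups
x <- c(1, 2, 3, 4, 5, 6)
f <- factor(c("a", "b", "a", "c", "b", "c"))

# tapply(X, F, G): apply sum to the values of x within each level of f
group_sums <- tapply(x, f, sum)
print(group_sums)

# split() cuts x into a list with one element per level of f
pieces <- split(x, f)

# sapply simplifies the per-group results to a named numeric vector
s <- sapply(pieces, sum)

# lapply returns the same values, but always as a list
l <- lapply(pieces, sum)

# by() is the data-frame counterpart of tapply
df <- data.frame(x = x, f = f)
b <- by(df, df$f, function(d) sum(d$x))
print(b)
```

Here `split` followed by `sapply` gives the same group totals as the single `tapply` call; the two-step form is mainly useful when each group is a whole data frame rather than a plain vector.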
Example
The data record the number of passengers boarding at each stop on bus line 980:
Line | License plate number | Arrival stop | Passengers boarding | Boarding start time | Boarding end time |
980 | Yue BM8475 | 14 | 11 | 2014-06-09 07:08 | 2014-06-09 07:13 |
980 | Yue BM8475 | 13 | 3 | 2014-06-09 07:14 | 2014-06-09 07:15 |
980 | Yue BM8475 | 12 | 10 | 2014-06-09 07:17 | 2014-06-09 07:17 |
980 | Yue BM8475 | 10 | 5 | 2014-06-09 07:20 | 2014-06-09 07:20 |
980 | Yue BM8475 | 8 | 1 | 2014-06-09 07:22 | 2014-06-09 07:22 |
Count the passengers boarding at each stop (column 4 is the boarding count, column 3 is the arrival stop):
(freq <- tapply(data[, 4], data[, 3], sum))
1 2 3 4 5 6 7 8 9
257 325 164 174 186 205 80 118 267 259 130
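To make the call concrete, here is a sketch that rebuilds only the five sample rows listed in the table and runs the same `tapply`. The column names (`line`, `plate`, `stop`, `boarded`) are my own English stand-ins and the two time columns are left out for brevity. The totals are much smaller than the output above, which presumably came from the full data set.

```r
# Five sample rows from the table; column names are assumed stand-ins
data <- data.frame(
  line    = rep(980, 5),
  plate   = rep("Yue BM8475", 5),
  stop    = c(14, 13, 12, 10, 8),
  boarded = c(11, 3, 10, 5, 1),
  stringsAsFactors = FALSE
)

# Sum column 4 (passengers boarding) within each value of column 3 (stop)
freq <- tapply(data[, 4], data[, 3], sum)
print(freq)
#  8 10 12 13 14
#  1  5 10  3 11
```

The names of the result are the stop numbers, sorted numerically because `tapply` converts the grouping column to a factor before summing.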
Group the data by license plate number, then count the passengers boarding at each stop separately for each vehicle:
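by_plate <- split(data, data[, 2])  # group the rows by license plate number (column 2)
freq <- sapply(by_plate, function(d) {
  tapply(d[, 4], d[, 3], sum)  # sum passengers boarding (column 4) per arrival stop (column 3)
})  # boarding totals per stop, computed separately for each license plate
freq[1]  # view the result for the first license plate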
$Yue BM8475
1 2 3 4 5 7 8 9 10 6 15 2 4 7 3 1
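One detail is worth seeing in isolation: `sapply` can only simplify its result when every group returns a value of the same length. When vehicles serve different sets of stops the result stays a list, which is why `freq[1]` prints with a `$` name. The sketch below uses a hypothetical second plate (`"Yue B00000"`, invented purely for illustration), invented counts, and assumed column names. Passing two grouping factors to `tapply` is an alternative that returns a plate-by-stop matrix with `NA` for combinations that never occur.

```r
# Hypothetical data: the second plate and all counts are invented
data <- data.frame(
  plate   = c(rep("Yue BM8475", 3), rep("Yue B00000", 3)),
  stop    = c(14, 13, 14, 13, 12, 10),
  boarded = c(11, 3, 2, 4, 6, 5),
  stringsAsFactors = FALSE
)

# One data frame per plate
by_plate <- split(data, data$plate)

# The plates cover 2 and 3 distinct stops, so sapply cannot simplify
# and returns a list with one named element per plate
freq <- sapply(by_plate, function(d) tapply(d$boarded, d$stop, sum))
print(freq["Yue BM8475"])

# Alternative: group by plate and stop in a single tapply call
tab <- tapply(data$boarded, list(data$plate, data$stop), sum)
print(tab)
```

The list form keeps only the stops each vehicle actually served, while the matrix form makes vehicles directly comparable stop by stop; which one is more convenient depends on what is done with the counts afterwards.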
Summary
Becoming proficient with two data types, the factor and the list, gives you a much deeper appreciation of the power of R.