R language: randomly splitting a data set into k folds for cross-validation

Source: Internet
Author: User

Today, while reading Professor Wu Xizhi's "Complex data statistics method", I came across the problem of splitting a data set into subsets by a factor and then randomly dividing each subset into n parts. Professor Wu's method is easy to follow, but I still found it a bit cumbersome, so I wrote a function for it. After that, you only need to call the function.

The example uses the iris data set that ships with R:

> str(iris)
'data.frame': 150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

  

The structure of iris is shown above. Species is a factor with three levels, so the data can be split into three subsets by Species. For 5-fold cross-validation, each subset then has to be divided into five parts. The R code is as follows:

fivedivide <- function(col, data, n = 5) {
  # col:  name (string) of a factor column; each level of it forms one group
  # data: a data.frame
  # n:    number of parts each group is divided into (default 5)
  # Returns a list with one element per factor level; each element is
  # itself a list of n data.frames.
  # sample(x) returns the numbers 1..x in random order; the shuffled rows
  # are then cut into n consecutive parts.
  lvls <- levels(data[[col]])
  group_num <- length(lvls)
  lst1 <- list()  # the original data split by factor level (group_num subsets)
  lst2 <- list()  # one group split into n parts
  lst3 <- list()  # the result
  # First split the original data set into subsets by factor level
  for (i in 1:group_num) {
    lst1[[i]] <- data[data[[col]] == lvls[i], ]
  }
  # Then divide every subset randomly into n parts of (nearly) equal size
  for (k in 1:group_num) {
    od <- sample(nrow(lst1[[k]]))
    newdata <- lst1[[k]][od, ]
    len <- length(od)
    cutpoint <- floor(len / n)
    for (j in 1:n) {
      if (j < n) {
        lst2[[j]] <- newdata[(cutpoint * (j - 1) + 1):(cutpoint * j), ]
      } else {
        # the last part takes all remaining rows, so none are dropped
        lst2[[j]] <- newdata[(cutpoint * (j - 1) + 1):len, ]
      }
    }
    lst3[[k]] <- lst2
  }
  return(lst3)
}
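The same idea (split by factor level, shuffle, cut into n parts) can also be sketched more compactly with base R's split() and sample(). This is only an illustrative alternative, not the function from the book or the one above: the name stratified_folds is made up here, and unlike the loop version it spreads leftover rows over the parts instead of putting them all into the last one.

```r
# Hypothetical helper (name invented for this sketch): stratified random split.
stratified_folds <- function(data, col, n = 5) {
  # one data.frame per factor level that actually occurs in the data
  groups <- split(data, data[[col]], drop = TRUE)
  lapply(groups, function(g) {
    # labels 1..n recycled to the group size and shuffled, so that
    # part sizes differ by at most one row
    fold_id <- sample(rep(seq_len(n), length.out = nrow(g)))
    unname(split(g, factor(fold_id, levels = seq_len(n))))
  })
}

set.seed(42)  # only to make the random split reproducible
folds <- stratified_folds(iris, "Species", 5)

# rows per part: one column per species, one row per part
sapply(folds, function(g) sapply(g, nrow))
```

For iris every entry of that table is 10, since each species has 50 rows.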

Applying it to iris:

> rep=fivedivide("Species", iris, 5)
> str(rep)
List of 3
 $ :List of 5
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 4.8 5.2 4.8 4.7 5.5 5.1 4.8 4.4 4.8 4.9
  .. ..$ Sepal.Width : num [1:10] 3 3.5 3.4 3.2 3.5 3.7 3.1 3 3.4 3
  .. ..$ Petal.Length: num [1:10] 1.4 1.5 1.6 1.6 1.3 1.5 1.6 1.3 1.9 1.4
  .. ..$ Petal.Width : num [1:10] 0.3 0.2 0.2 0.2 0.2 0.4 0.2 0.2 0.2 0.2
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 5 4.7 4.8 5.2 5.1 5.1 4.9 5.4 5 5.5
  .. ..$ Sepal.Width : num [1:10] 3.5 3.2 3 3.4 3.5 3.8 3.1 3.4 3.5 4.2
  .. ..$ Petal.Length: num [1:10] 1.3 1.3 1.4 1.4 1.4 1.5 1.5 1.7 1.6 1.4
  .. ..$ Petal.Width : num [1:10] 0.3 0.2 0.1 0.2 0.2 0.3 0.1 0.2 0.6 0.2
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 5.4 4.3 4.9 5.4 4.4 4.6 5.1 5 5.1 5.1
  .. ..$ Sepal.Width : num [1:10] 3.9 3 3.6 3.9 3.2 3.6 3.4 3.4 3.8 3.8
  .. ..$ Petal.Length: num [1:10] 1.3 1.1 1.4 1.7 1.3 1 1.5 1.6 1.9 1.6
  .. ..$ Petal.Width : num [1:10] 0.4 0.1 0.1 0.4 0.2 0.2 0.2 0.4 0.4 0.2
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 4.4 4.5 5.3 5 5 5.1 5.4 5.2 5.1 5.4
  .. ..$ Sepal.Width : num [1:10] 2.9 2.3 3.7 3.3 3.4 3.3 3.7 4.1 3.5 3.4
  .. ..$ Petal.Length: num [1:10] 1.4 1.3 1.5 1.4 1.5 1.7 1.5 1.5 1.4 1.5
  .. ..$ Petal.Width : num [1:10] 0.2 0.3 0.2 0.2 0.2 0.5 0.2 0.1 0.3 0.4
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 4.6 5.8 5 5 5 4.6 5.7 4.9 5.7 4.6
  .. ..$ Sepal.Width : num [1:10] 3.4 4 3.6 3.2 3 3.2 4.4 3.1 3.8 3.1
  .. ..$ Petal.Length: num [1:10] 1.4 1.2 1.4 1.2 1.6 1.4 1.5 1.5 1.7 1.5
  .. ..$ Petal.Width : num [1:10] 0.3 0.2 0.2 0.2 0.2 0.2 0.4 0.2 0.3 0.2
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
 $ :List of 5
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 6.2 6 5.8 6.3 5.5 5.8 5.8 6.1 6.2 5.6
  .. ..$ Sepal.Width : num [1:10] 2.9 3.4 2.7 3.3 2.6 2.6 2.7 3 2.2 3
  .. ..$ Petal.Length: num [1:10] 4.3 4.5 3.9 4.7 4.4 4 4.1 4.6 4.5 4.1
  .. ..$ Petal.Width : num [1:10] 1.3 1.6 1.2 1.6 1.2 1.2 1 1.4 1.5 1.3
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 6.4 5.6 5.7 6.6 6 6.4 5.9 6.9 6.7 5.5
  .. ..$ Sepal.Width : num [1:10] 3.2 2.5 2.8 3 2.2 2.9 3 3.1 3.1 2.5
  .. ..$ Petal.Length: num [1:10] 4.5 3.9 4.5 4.4 4 4.3 4.2 4.9 4.4 4
  .. ..$ Petal.Width : num [1:10] 1.5 1.1 1.3 1.4 1 1.3 1.5 1.5 1.4 1.3
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 6.5 5.2 6.8 6 5.7 5 6.3 5.7 5.5 5.6
  .. ..$ Sepal.Width : num [1:10] 2.8 2.7 2.8 2.9 2.9 2.3 2.5 2.8 2.3 3
  .. ..$ Petal.Length: num [1:10] 4.6 3.9 4.8 4.5 4.2 3.3 4.9 4.1 4 4.5
  .. ..$ Petal.Width : num [1:10] 1.5 1.4 1.4 1.5 1.3 1 1.5 1.3 1.3 1.5
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 6.6 6.7 5 6.7 5.9 6.1 5.7 5.4 6 5.1
  .. ..$ Sepal.Width : num [1:10] 2.9 3 2 3.1 3.2 2.8 2.6 3 2.7 2.5
  .. ..$ Petal.Length: num [1:10] 4.6 5 3.5 4.7 4.8 4 3.5 4.5 5.1 3
  .. ..$ Petal.Width : num [1:10] 1.3 1.7 1 1.5 1.8 1.3 1 1.5 1.6 1.1
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 5.6 6.1 6.3 7 4.9 5.7 5.5 5.5 6.1 5.6
  .. ..$ Sepal.Width : num [1:10] 2.7 2.9 2.3 3.2 2.4 3 2.4 2.4 2.8 2.9
  .. ..$ Petal.Length: num [1:10] 4.2 4.7 4.4 4.7 3.3 4.2 3.8 3.7 4.7 3.6
  .. ..$ Petal.Width : num [1:10] 1.3 1.4 1.3 1.4 1 1.2 1.1 1 1.2 1.3
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2
 $ :List of 5
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 6.9 6.7 6.1 6.4 6.4 6.7 5.7 6.5 6.4 6.3
  .. ..$ Sepal.Width : num [1:10] 3.2 2.5 2.6 2.8 3.1 3.3 2.5 3 2.7 2.9
  .. ..$ Petal.Length: num [1:10] 5.7 5.8 5.6 5.6 5.5 5.7 5 5.5 5.3 5.6
  .. ..$ Petal.Width : num [1:10] 2.3 1.8 1.4 2.1 1.8 2.1 2 1.8 1.9 1.8
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 3 3 3 3 3 3 3 3 3 3
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 5.8 7.7 6.5 6.4 7.4 6.3 6.8 6 6.7 6.8
  .. ..$ Sepal.Width : num [1:10] 2.8 2.8 3.2 3.2 2.8 3.3 3 2.2 3.3 3.2
  .. ..$ Petal.Length: num [1:10] 5.1 6.7 5.1 5.3 6.1 6 5.5 5 5.7 5.9
  .. ..$ Petal.Width : num [1:10] 2.4 2 2 2.3 1.9 2.5 2.1 1.5 2.5 2.3
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 3 3 3 3 3 3 3 3 3 3
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 5.8 6.2 6 6.1 7.7 5.6 6.3 7.3 7.2 6.9
  .. ..$ Sepal.Width : num [1:10] 2.7 2.8 3 3 2.6 2.8 2.8 2.9 3 3.1
  .. ..$ Petal.Length: num [1:10] 5.1 4.8 4.8 4.9 6.9 4.9 5.1 6.3 5.8 5.4
  .. ..$ Petal.Width : num [1:10] 1.9 1.8 1.8 1.8 2.3 2 1.5 1.8 1.6 2.1
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 3 3 3 3 3 3 3 3 3 3
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 6.7 7.2 7.2 6.3 6.3 6.5 6.3 7.7 7.9 6.5
  .. ..$ Sepal.Width : num [1:10] 3 3.2 3.6 2.7 2.5 3 3.4 3.8 3.8 3
  .. ..$ Petal.Length: num [1:10] 5.2 6 6.1 4.9 5 5.8 5.6 6.7 6.4 5.2
  .. ..$ Petal.Width : num [1:10] 2.3 1.8 2.5 1.8 1.9 2.2 2.4 2.2 2 2
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 3 3 3 3 3 3 3 3 3 3
  ..$ :'data.frame': 10 obs. of  5 variables:
  .. ..$ Sepal.Length: num [1:10] 7.7 6.4 6.2 6.9 6.7 7.1 5.8 4.9 5.9 7.6
  .. ..$ Sepal.Width : num [1:10] 3 2.8 3.4 3.1 3.1 3 2.7 2.5 3 3
  .. ..$ Petal.Length: num [1:10] 6.1 5.6 5.4 5.1 5.6 5.9 5.1 4.5 5.1 6.6
  .. ..$ Petal.Width : num [1:10] 2.3 2.2 2.3 2.3 2.4 2.1 1.9 1.7 1.8 2.1
  .. ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 3 3 3 3 3 3 3 3 3 3

  

After the even split, the data look as follows:

> rep
[[1]]
[[1]][[1]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
46          4.8         3.0          1.4         0.3  setosa
28          5.2         3.5          1.5         0.2  setosa
12          4.8         3.4          1.6         0.2  setosa
30          4.7         3.2          1.6         0.2  setosa
37          5.5         3.5          1.3         0.2  setosa
22          5.1         3.7          1.5         0.4  setosa
31          4.8         3.1          1.6         0.2  setosa
39          4.4         3.0          1.3         0.2  setosa
25          4.8         3.4          1.9         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa

[[1]][[2]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
41          5.0         3.5          1.3         0.3  setosa
3           4.7         3.2          1.3         0.2  setosa
13          4.8         3.0          1.4         0.1  setosa
29          5.2         3.4          1.4         0.2  setosa
1           5.1         3.5          1.4         0.2  setosa
20          5.1         3.8          1.5         0.3  setosa
10          4.9         3.1          1.5         0.1  setosa
21          5.4         3.4          1.7         0.2  setosa
44          5.0         3.5          1.6         0.6  setosa
34          5.5         4.2          1.4         0.2  setosa

[[1]][[3]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
17          5.4         3.9          1.3         0.4  setosa
14          4.3         3.0          1.1         0.1  setosa
38          4.9         3.6          1.4         0.1  setosa
6           5.4         3.9          1.7         0.4  setosa
43          4.4         3.2          1.3         0.2  setosa
23          4.6         3.6          1.0         0.2  setosa
40          5.1         3.4          1.5         0.2  setosa
27          5.0         3.4          1.6         0.4  setosa
45          5.1         3.8          1.9         0.4  setosa
47          5.1         3.8          1.6         0.2  setosa

[[1]][[4]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
9           4.4         2.9          1.4         0.2  setosa
42          4.5         2.3          1.3         0.3  setosa
49          5.3         3.7          1.5         0.2  setosa
50          5.0         3.3          1.4         0.2  setosa
8           5.0         3.4          1.5         0.2  setosa
24          5.1         3.3          1.7         0.5  setosa
11          5.4         3.7          1.5         0.2  setosa
33          5.2         4.1          1.5         0.1  setosa
18          5.1         3.5          1.4         0.3  setosa
32          5.4         3.4          1.5         0.4  setosa

[[1]][[5]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
7           4.6         3.4          1.4         0.3  setosa
15          5.8         4.0          1.2         0.2  setosa
5           5.0         3.6          1.4         0.2  setosa
36          5.0         3.2          1.2         0.2  setosa
26          5.0         3.0          1.6         0.2  setosa
48          4.6         3.2          1.4         0.2  setosa
16          5.7         4.4          1.5         0.4  setosa
35          4.9         3.1          1.5         0.2  setosa
19          5.7         3.8          1.7         0.3  setosa
4           4.6         3.1          1.5         0.2  setosa


[[2]]
[[2]][[1]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
98          6.2         2.9          4.3         1.3 versicolor
86          6.0         3.4          4.5         1.6 versicolor
83          5.8         2.7          3.9         1.2 versicolor
57          6.3         3.3          4.7         1.6 versicolor
91          5.5         2.6          4.4         1.2 versicolor
93          5.8         2.6          4.0         1.2 versicolor
68          5.8         2.7          4.1         1.0 versicolor
92          6.1         3.0          4.6         1.4 versicolor
69          6.2         2.2          4.5         1.5 versicolor
89          5.6         3.0          4.1         1.3 versicolor

[[2]][[2]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
52          6.4         3.2          4.5         1.5 versicolor
70          5.6         2.5          3.9         1.1 versicolor
56          5.7         2.8          4.5         1.3 versicolor
76          6.6         3.0          4.4         1.4 versicolor
63          6.0         2.2          4.0         1.0 versicolor
75          6.4         2.9          4.3         1.3 versicolor
62          5.9         3.0          4.2         1.5 versicolor
53          6.9         3.1          4.9         1.5 versicolor
66          6.7         3.1          4.4         1.4 versicolor
90          5.5         2.5          4.0         1.3 versicolor

[[2]][[3]]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
55           6.5         2.8          4.6         1.5 versicolor
60           5.2         2.7          3.9         1.4 versicolor
77           6.8         2.8          4.8         1.4 versicolor
79           6.0         2.9          4.5         1.5 versicolor
97           5.7         2.9          4.2         1.3 versicolor
94           5.0         2.3          3.3         1.0 versicolor
73           6.3         2.5          4.9         1.5 versicolor
100          5.7         2.8          4.1         1.3 versicolor
54           5.5         2.3          4.0         1.3 versicolor
67           5.6         3.0          4.5         1.5 versicolor

[[2]][[4]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
59          6.6         2.9          4.6         1.3 versicolor
78          6.7         3.0          5.0         1.7 versicolor
61          5.0         2.0          3.5         1.0 versicolor
87          6.7         3.1          4.7         1.5 versicolor
71          5.9         3.2          4.8         1.8 versicolor
72          6.1         2.8          4.0         1.3 versicolor
80          5.7         2.6          3.5         1.0 versicolor
85          5.4         3.0          4.5         1.5 versicolor
84          6.0         2.7          5.1         1.6 versicolor
99          5.1         2.5          3.0         1.1 versicolor

[[2]][[5]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
95          5.6         2.7          4.2         1.3 versicolor
64          6.1         2.9          4.7         1.4 versicolor
88          6.3         2.3          4.4         1.3 versicolor
51          7.0         3.2          4.7         1.4 versicolor
58          4.9         2.4          3.3         1.0 versicolor
96          5.7         3.0          4.2         1.2 versicolor
81          5.5         2.4          3.8         1.1 versicolor
82          5.5         2.4          3.7         1.0 versicolor
74          6.1         2.8          4.7         1.2 versicolor
65          5.6         2.9          3.6         1.3 versicolor


[[3]]
[[3]][[1]]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
121          6.9         3.2          5.7         2.3 virginica
109          6.7         2.5          5.8         1.8 virginica
135          6.1         2.6          5.6         1.4 virginica
129          6.4         2.8          5.6         2.1 virginica
138          6.4         3.1          5.5         1.8 virginica
125          6.7         3.3          5.7         2.1 virginica
114          5.7         2.5          5.0         2.0 virginica
117          6.5         3.0          5.5         1.8 virginica
112          6.4         2.7          5.3         1.9 virginica
104          6.3         2.9          5.6         1.8 virginica

[[3]][[2]]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
115          5.8         2.8          5.1         2.4 virginica
123          7.7         2.8          6.7         2.0 virginica
111          6.5         3.2          5.1         2.0 virginica
116          6.4         3.2          5.3         2.3 virginica
131          7.4         2.8          6.1         1.9 virginica
101          6.3         3.3          6.0         2.5 virginica
113          6.8         3.0          5.5         2.1 virginica
120          6.0         2.2          5.0         1.5 virginica
145          6.7         3.3          5.7         2.5 virginica
144          6.8         3.2          5.9         2.3 virginica

[[3]][[3]]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
143          5.8         2.7          5.1         1.9 virginica
127          6.2         2.8          4.8         1.8 virginica
139          6.0         3.0          4.8         1.8 virginica
128          6.1         3.0          4.9         1.8 virginica
119          7.7         2.6          6.9         2.3 virginica
122          5.6         2.8          4.9         2.0 virginica
134          6.3         2.8          5.1         1.5 virginica
108          7.3         2.9          6.3         1.8 virginica
130          7.2         3.0          5.8         1.6 virginica
140          6.9         3.1          5.4         2.1 virginica

[[3]][[4]]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
146          6.7         3.0          5.2         2.3 virginica
126          7.2         3.2          6.0         1.8 virginica
110          7.2         3.6          6.1         2.5 virginica
124          6.3         2.7          4.9         1.8 virginica
147          6.3         2.5          5.0         1.9 virginica
105          6.5         3.0          5.8         2.2 virginica
137          6.3         3.4          5.6         2.4 virginica
118          7.7         3.8          6.7         2.2 virginica
132          7.9         3.8          6.4         2.0 virginica
148          6.5         3.0          5.2         2.0 virginica

[[3]][[5]]
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
136          7.7         3.0          6.1         2.3 virginica
133          6.4         2.8          5.6         2.2 virginica
149          6.2         3.4          5.4         2.3 virginica
142          6.9         3.1          5.1         2.3 virginica
141          6.7         3.1          5.6         2.4 virginica
103          7.1         3.0          5.9         2.1 virginica
102          5.8         2.7          5.1         1.9 virginica
107          4.9         2.5          4.5         1.7 virginica
150          5.9         3.0          5.1         1.8 virginica
106          7.6         3.0          6.6         2.1 virginica
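The post stops at the split itself. As a rough sketch of how such a nested list might then be used for 5-fold cross-validation, part j of every group can be stacked into the test set while the remaining parts form the training set. Everything below is an assumption added for illustration: folds is a stand-in built inline with the same shape as the list shown above (groups, then n parts each), and the nearest-centroid classifier is just a toy model chosen because it needs only base R.

```r
set.seed(1)  # reproducibility only
n <- 5

# Stand-in for the nested list shown above: one element per species,
# each holding n randomly chosen parts of that species' rows.
folds <- unname(lapply(split(iris, iris$Species), function(g) {
  unname(split(g[sample(nrow(g)), ], rep(seq_len(n), length.out = nrow(g))))
}))

accuracy <- numeric(n)
for (j in seq_len(n)) {
  # part j of every group is the test set, all other parts the training set
  test  <- do.call(rbind, lapply(folds, function(g) g[[j]]))
  train <- do.call(rbind, lapply(folds, function(g) do.call(rbind, g[-j])))

  # toy model: assign each test row to the species with the closest mean
  centroids <- aggregate(. ~ Species, data = train, FUN = mean)
  x  <- as.matrix(test[, 1:4])
  mu <- as.matrix(centroids[, -1])
  # squared Euclidean distance of every test row to every centroid
  d  <- sapply(seq_len(nrow(mu)), function(k) rowSums(sweep(x, 2, mu[k, ])^2))
  pred <- centroids$Species[max.col(-d)]

  accuracy[j] <- mean(as.character(pred) == as.character(test$Species))
}

accuracy        # accuracy of each round
mean(accuracy)  # cross-validated estimate
```

Because every part is drawn within one species, each test set contains the same number of rows from each species (10 each for iris), which is the point of splitting by factor first.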

  
