Chapter 12: Resampling and the Bootstrap
In this chapter we explore two widely used statistical approaches built on randomization: permutation tests and the bootstrap.
12.1 Permutation Tests
Permutation tests are also known as randomization tests or re-randomization tests.
Suppose ten subjects have been randomly assigned to one of two treatment conditions (A or B) and an outcome variable (score) has been recorded for each subject. The results of the experiment are as follows:

Treatment A: 40 57 45 55 58
Treatment B: 57 64 55 62 65
If the two treatments are truly equivalent, then the labels (A or B) assigned to the observed scores are arbitrary. To test the difference between the two treatments, we can follow these steps (a base R sketch of the full procedure appears after the list):
(1) As in the parametric approach, calculate the t statistic for the observed data; call it t0.
(2) Pool all 10 scores into a single group.
(3) Randomly assign five scores to treatment A and five scores to treatment B.
(4) Calculate and record the t statistic for this new arrangement.
(5) Repeat steps (3)-(4) for every possible random assignment; there are 252 possible assignments.
(6) Arrange the 252 t statistics in ascending order. This is the empirical distribution, based on (conditioned on) the sample data.
(7) If t0 falls outside the middle 95% of the empirical distribution, reject the null hypothesis that the two treatment groups have equal population means at the 0.05 significance level.
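The following is a minimal base R sketch of steps (1) through (7), written for this note rather than taken from the book. It uses the ten hypothetical scores above and enumerates all 252 possible assignments with combn(); it is not how the coin package (introduced below) implements permutation tests.

# Minimal base R sketch of steps (1)-(7); not the coin implementation.
score <- c(40, 57, 45, 55, 58, 57, 64, 55, 62, 65)
treatment <- factor(c(rep("A", 5), rep("B", 5)))

# Step (1): t statistic for the observed labeling
t0 <- t.test(score ~ treatment, var.equal = TRUE)$statistic

# Steps (2)-(5): pool the scores and compute t for every possible relabeling;
# choose(10, 5) = 252 ways of picking which five scores are labeled "A"
combos <- combn(10, 5)
perm_t <- apply(combos, 2, function(idx) {
  relabeled <- factor(ifelse(seq_along(score) %in% idx, "A", "B"))
  t.test(score ~ relabeled, var.equal = TRUE)$statistic
})

# Steps (6)-(7): compare t0 with the empirical (permutation) distribution
perm_t <- sort(perm_t)
p_value <- mean(abs(perm_t) >= abs(t0))   # two-sided permutation p-value
p_value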
12.2 Permutation Tests with the coin Package
For independence problems, the coin package provides a general framework for permutation testing. With this package you can answer
questions such as the following:
- Is the response independent of the group assignment?
- Are two numeric variables independent?
- Are two categorical variables independent?
The coin functions that provide permutation-test alternatives to traditional tests are listed in table 12-2:

Table 12-2  coin functions for permutation-test alternatives to traditional tests

Test                                                      coin function
Two- and K-sample permutation test                        oneway_test(y ~ A)
Two- and K-sample permutation test with one
  stratification (blocking) factor                        oneway_test(y ~ A | C)
Wilcoxon-Mann-Whitney rank-sum test                       wilcox_test(y ~ A)
Kruskal-Wallis test                                       kruskal_test(y ~ A)
Pearson's chi-square test                                 chisq_test(A ~ B)
Cochran-Mantel-Haenszel test                              cmh_test(A ~ B | C)
Linear-by-linear association test                         lbl_test(D ~ E)
Spearman test                                             spearman_test(y ~ x)
Friedman test                                             friedman_test(y ~ A | C)
Wilcoxon signed-rank test                                 wilcoxsign_test(y1 ~ y2)

In these functions, y and x are numeric variables, A and B are categorical factors, C is a categorical blocking (stratification) variable, D and E are ordered factors, and y1 and y2 are matched numeric variables.
The general form of these functions is

    function_name(formula, data, distribution=)

where
- formula describes the relationship between the variables to be tested (examples appear in table 12-2);
- data is a data frame;
- distribution specifies how the empirical distribution under the null hypothesis should be derived; possible values are "exact", "asymptotic", and "approximate". If distribution = "exact", the null distribution is computed exactly, that is, from all possible permutations. The distribution can also be approximated from its asymptotic distribution (distribution = "asymptotic") or via Monte Carlo resampling (distribution = approximate(B = #), where # is the number of replications). distribution = "exact" is currently available only for two-sample problems. A small illustration of the three settings follows.
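The snippet below was written for this note to make the three settings concrete; the data frame and variable names (dat, outcome, grp) are invented, and B= follows the text above, although newer versions of coin may report that B is deprecated in favour of nresample.

library(coin)
# Invented two-group data, four observations per group (choose(8, 4) = 70 permutations)
dat <- data.frame(
  outcome = c(12, 15, 11, 19, 22, 25, 21, 28),
  grp     = factor(rep(c("g1", "g2"), each = 4))
)
oneway_test(outcome ~ grp, data = dat, distribution = "exact")                # all 70 permutations
oneway_test(outcome ~ grp, data = dat, distribution = "asymptotic")           # large-sample approximation
oneway_test(outcome ~ grp, data = dat, distribution = approximate(B = 9999))  # Monte Carlo resampling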
12.2.1 Independent Two-Sample and K-Sample Tests
A t-test versus a one-way permutation test on the hypothetical data:

> library(coin)
> score <- c(40, 57, 45, 55, 58, 57, 64, 55, 62, 65)
> treatment <- factor(c(rep("A", 5), rep("B", 5)))
> mydata <- data.frame(treatment, score)
> t.test(score ~ treatment, data = mydata, var.equal = TRUE)

        Two Sample t-test

data:  score by treatment
t = -2.345, df = 8, p-value = 0.04705
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -19.0405455  -0.1594545
sample estimates:
mean in group A mean in group B
           51.0            60.6

> oneway_test(score ~ treatment, data = mydata, distribute = "exact")

        Asymptotic 2-Sample Permutation Test

data:  score by treatment (A, B)
Z = -1.9147, p-value = 0.05553
alternative hypothesis: true mu is not equal to 0

Note that the argument name in the call above is misspelled: it should be distribution = "exact", not distribute = "exact". Because the misspelled argument is not matched to distribution, coin falls back to its default asymptotic null distribution, which is why the output is labeled "Asymptotic" rather than "Exact". The same misspelling affects the wilcox_test() call below.

Wilcoxon-Mann-Whitney U test:

> library(MASS)
> UScrime <- transform(UScrime, So = factor(So))
> wilcox_test(Prob ~ So, data = UScrime, distribute = "exact")

        Asymptotic Wilcoxon Mann-Whitney Rank Sum Test

data:  Prob by So (0, 1)
Z = -3.7493, p-value = 0.0001774
alternative hypothesis: true mu is not equal to 0
An approximate K-sample permutation test:

> library(multcomp)
> set.seed(1234)
> oneway_test(response ~ trt, data = cholesterol,
+             distribution = approximate(B = 9999))

        Approximative K-Sample Permutation Test

data:  response by trt (1time, 2times, 4times, drugD, drugE)
maxT = 4.7623, p-value < 2.2e-16
12.2.2 Independence in Contingency Tables
The chisq_test() or cmh_test() function can be used to test the independence of two categorical variables with a permutation test.
The latter is required when the data are stratified by a third categorical variable. If both variables are ordinal, the
lbl_test() function can be used to test for a linear trend.
> library(coin)
> library(vcd)
Loading required package: grid
> Arthritis <- transform(Arthritis,
+     Improved = as.factor(as.numeric(Improved)))
> set.seed(1234)
> chisq_test(Treatment ~ Improved, data = Arthritis,
+     distribution = approximate(B = 9999))

        Approximative Pearson's Chi-Squared Test

data:  Treatment by Improved (1, 2, 3)
chi-squared = 13.055, p-value = 0.0018
The variable Improved had to be converted from an ordered factor to an unordered (categorical) factor because, with an ordered factor, coin would generate a linear-by-linear association test rather than a chi-square test. An illustration of that behavior is sketched below.
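The following sketch was written for this note (it is not from the book) to show the behavior just described: rerunning the test on the original Arthritis data, where Improved is still an ordered factor, should yield a linear-by-linear association statistic instead of Pearson's chi-square.

library(coin)
library(vcd)
data(Arthritis)   # reload the original data; Improved is an ordered factor here
set.seed(1234)
chisq_test(Treatment ~ Improved, data = Arthritis,
           distribution = approximate(B = 9999))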
12.2.3 Independence Between Numeric Variables
The spearman_test() function provides a permutation test of the independence of two numeric variables.

> states <- as.data.frame(state.x77)
> set.seed(1234)
> spearman_test(Illiteracy ~ Murder, data = states,
+     distribution = approximate(B = 9999))

        Approximative Spearman Correlation Test

data:  Illiteracy by Murder
Z = 4.7065, p-value < 2.2e-16
alternative hypothesis: true mu is not equal to 0

The assumption of independence is rejected: illiteracy and murder rates are not independent.
12.2.4 Dependent Two-Sample and K-Sample Tests
Dependent-sample tests are useful when observations in different groups have been matched or when repeated measures are used.
For a permutation test with two paired groups, use the wilcoxsign_test() function; with more than two groups, use the
friedman_test() function (a sketch of the latter follows the example below).
> library(coin)
> library(MASS)
> wilcoxsign_test(U1 ~ U2, data = UScrime, distribution = "exact")

        Exact Wilcoxon-Signed-Rank Test

data:  y by x (neg, pos)
         stratified by block
Z = 5.9691, p-value = 1.421e-14
alternative hypothesis: true mu is not equal to 0

The result suggests that the two unemployment rates (U1 and U2) differ.
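friedman_test() is not demonstrated in the text. The sketch below was written for this note, using made-up repeated-measures data and the y ~ A | C form from table 12-2, to show how a call might look.

library(coin)
# Invented data: four subjects, each measured once under three conditions
# (a complete, blocked repeated-measures layout)
reaction <- data.frame(
  time    = c(1.2, 1.4, 1.1, 1.0, 1.5, 1.7, 1.6, 1.3, 1.9, 2.0, 1.8, 1.7),
  dose    = factor(rep(c("low", "medium", "high"), times = 4)),
  subject = factor(rep(1:4, each = 3))
)
friedman_test(time ~ dose | subject, data = reaction)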