R in Action reading notes (16): Chapter 12, Resampling and the Bootstrap (Permutation Tests)

Source: Internet
Author: User

Chapter 12: Resampling and the Bootstrap

In this chapter, we explore two widely used statistical approaches based on randomization: permutation tests and the bootstrap.

12.1 Permutation tests

Permutation tests are also known as randomization tests or re-randomization tests.

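Consider an experiment with two treatment conditions, in which 10 subjects were randomly assigned to one of the conditions (A or B) and an outcome variable (score) was recorded. The experimental results are as follows (these are the scores used in the code of Section 12.2.1):

Treatment A: 40, 57, 45, 55, 58

Treatment B: 57, 64, 55, 62, 65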

If the two treatments are truly equivalent, then the label (treatment A or B) attached to an observed score is arbitrary. To test for a difference between the two treatments, we can follow these steps:

(1) As in the parametric approach, calculate the t statistic for the observed data; call it t0;

(2) Place all 10 scores in a single group;

(3) Randomly assign five scores to treatment A and five scores to treatment B;

(4) Calculate and record the new t statistic;

(5) Repeat steps (3) and (4) for every possible way of assigning five scores to treatment A and five to treatment B; there are 252 such assignments;

(6) Arrange the 252 t statistics in ascending order; this is the empirical distribution, based on (or conditioned on) the sample data;

(7) If t0 falls outside the middle 95% of the empirical distribution, reject the null hypothesis that the population means of the two treatment groups are equal, at the 0.05 significance level.
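The seven steps can be sketched in base R. This is a minimal illustration rather than the book's code: it assumes the ten scores that appear in the code of Section 12.2.1, it enumerates the 252 splits with combn(), and the helper name t_stat is hypothetical.

```r
# Scores and labels as used in Section 12.2.1 (first five are A, last five are B)
score <- c(40, 57, 45, 55, 58, 57, 64, 55, 62, 65)
treatment <- factor(rep(c("A", "B"), each = 5))

# Hypothetical helper: pooled-variance t statistic for a given A/B split
t_stat <- function(in_a) {
  unname(t.test(score[in_a], score[!in_a], var.equal = TRUE)$statistic)
}

# Step 1: t statistic of the observed data
t0 <- t_stat(treatment == "A")

# Steps 2-5: every way of choosing which 5 of the 10 scores go to treatment A
splits <- combn(length(score), 5)   # one column per split
perm_t <- apply(splits, 2, function(idx) {
  t_stat(seq_along(score) %in% idx)
})

# Step 6: empirical distribution in ascending order
perm_t <- sort(perm_t)

# Step 7: two-sided p-value, the share of splits at least as extreme as t0
# (a small tolerance guards against floating-point ties)
p_value <- mean(abs(perm_t) >= abs(t0) - 1e-8)

ncol(splits)
t0
p_value
```

Rejecting at the 0.05 level in step (7) corresponds to p_value being below 0.05.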

12.2 Permutation tests with the coin package

For problems of independence, the coin package provides a general framework for permutation tests. With this package you can answer the following questions:

• Is the response independent of group assignment?

• Are two numeric variables independent?

• Are two categorical variables independent?

The coin functions that provide permutation alternatives to traditional tests are listed in Table 12-2:

Table 12-2. coin functions providing permutation alternatives to traditional tests

Test                                                        coin function
Two- and k-sample permutation test                          oneway_test(y ~ A)
Two- and k-sample permutation test with a blocking factor   oneway_test(y ~ A | C)
Wilcoxon-Mann-Whitney rank-sum test                         wilcox_test(y ~ A)
Kruskal-Wallis test                                         kruskal_test(y ~ A)
Pearson chi-square test                                     chisq_test(A ~ B)
Cochran-Mantel-Haenszel test                                cmh_test(A ~ B | C)
Linear-by-linear association test                           lbl_test(D ~ E)
Spearman's test                                             spearman_test(y ~ x)
Friedman test                                               friedman_test(y ~ A | C)
Wilcoxon signed-rank test                                   wilcoxsign_test(y1 ~ y2)

In the coin functions, y and x are numeric variables, A and B are categorical factors, C is a categorical blocking variable, D and E are ordered factors, and y1 and y2 are matched numeric variables.

Each function has the form: function_name(formula, data, distribution=)

where

• formula describes the relationship among the variables to be tested. Examples are given in Table 12-2;

• data is a data frame;

• distribution specifies how the empirical distribution under the null hypothesis is derived. Possible values are exact, asymptotic, and approximate.

If distribution = "exact", the distribution under the null hypothesis is computed exactly (that is, from all possible permutations). The distribution can also be approximated by its asymptotic distribution (distribution = "asymptotic") or by Monte Carlo resampling (distribution = approximate(B = #)), where # is the number of replications. In recent coin releases this argument is named nresample and B is deprecated. distribution = "exact" is currently available only for two-sample problems.
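To make the idea behind the approximate option concrete, here is a minimal base R sketch of Monte Carlo resampling. It does not call coin and is not coin's implementation: it shuffles the group labels a fixed number of times instead of enumerating every permutation. The scores are those from Section 12.2.1, and the names mean_diff and n_resample are hypothetical.

```r
set.seed(1234)  # make the random shuffles reproducible

score <- c(40, 57, 45, 55, 58, 57, 64, 55, 62, 65)
treatment <- rep(c("A", "B"), each = 5)

# Hypothetical helper: difference in group means for a given labelling
mean_diff <- function(labels) {
  mean(score[labels == "A"]) - mean(score[labels == "B"])
}

observed <- mean_diff(treatment)

# Monte Carlo step: shuffle the labels many times and recompute the statistic
n_resample <- 9999
resampled <- replicate(n_resample, mean_diff(sample(treatment)))

# Approximate two-sided p-value: share of shuffles at least as extreme as observed
# (a small tolerance guards against floating-point ties)
p_approx <- mean(abs(resampled) >= abs(observed) - 1e-9)

observed
p_approx
```

Because the shuffles are random, p_approx varies slightly with the seed and with n_resample, whereas an exact test always returns the same value.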

12.2.1 Independent two-sample and k-sample tests

A t test and a one-way permutation test on the hypothetical data:

> library(coin)
> score <- c(40, 57, 45, 55, 58, 57, 64, 55, 62, 65)
> treatment <- factor(c(rep("A", 5), rep("B", 5)))
> mydata <- data.frame(treatment, score)
> t.test(score ~ treatment, data = mydata, var.equal = TRUE)

        Two Sample t-test

data:  score by treatment
t = -2.345, df = 8, p-value = 0.04705
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -19.0405455  -0.1594545
sample estimates:
mean in group A mean in group B
           51.0            60.6

> oneway_test(score ~ treatment, data = mydata, distribute = "exact")

        Asymptotic 2-Sample Permutation Test

data:  score by treatment (A, B)
Z = -1.9147, p-value = 0.05553
alternative hypothesis: true mu is not equal to 0

# Note: the argument is written distribute= rather than distribution=, which is presumably why the output reports an asymptotic test rather than an exact one.

Wilcoxon-Mann-Whitney U test

> library(MASS)
> UScrime <- transform(UScrime, So = factor(So))
> wilcox_test(Prob ~ So, data = UScrime, distribute = "exact")

        Asymptotic Wilcoxon Mann-Whitney Rank Sum Test

data:  Prob by So (0, 1)
Z = -3.7493, p-value = 0.0001774
alternative hypothesis: true mu is not equal to 0

# Note: the same misspelling (distribute=) appears here, and the output is again labelled asymptotic.
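The calls above spell the argument as distribute=, and their output is labelled asymptotic. As a sketch of how the exact test would be requested, the following repeats the one-way test with the argument spelled distribution=. It assumes the coin package is installed (the code skips the test otherwise), and the variable names exact_fit and exact_p are hypothetical.

```r
score <- c(40, 57, 45, 55, 58, 57, 64, 55, 62, 65)
treatment <- factor(rep(c("A", "B"), each = 5))
mydata <- data.frame(treatment, score)

exact_p <- NA_real_  # stays NA if coin is not available

if (requireNamespace("coin", quietly = TRUE)) {
  # Same model as above, with the argument spelled distribution=
  exact_fit <- coin::oneway_test(score ~ treatment, data = mydata,
                                 distribution = "exact")
  print(exact_fit)

  # pvalue() extracts the p-value from a coin test object
  exact_p <- as.numeric(coin::pvalue(exact_fit))
}

exact_p
```

With this spelling the printed header should read as an exact rather than an asymptotic test.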

Approximate k-sample permutation test

> library(multcomp)
> set.seed(1234)
> oneway_test(response ~ trt, data = cholesterol,
+             distribution = approximate(B = 9999))

        Approximative K-Sample Permutation Test

data:  response by trt (1time, 2times, 4times, drugD, drugE)
maxT = 4.7623, p-value < 2.2e-16
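For readers who want to see the mechanics of a k-sample permutation test without extra packages, here is a minimal base R sketch. It departs from the transcript above in two labelled ways: it uses the built-in PlantGrowth data (three groups) instead of the cholesterol data from multcomp, and it uses the one-way ANOVA F statistic instead of coin's maxT statistic. The helper name f_stat is hypothetical.

```r
set.seed(1234)  # make the random shuffles reproducible

# Hypothetical helper: one-way ANOVA F statistic for response y and groups g
f_stat <- function(y, g) {
  anova(lm(y ~ g))[["F value"]][1]
}

obs_f <- f_stat(PlantGrowth$weight, PlantGrowth$group)

# Shuffle the response across groups and recompute F each time
n_resample <- 999
perm_f <- replicate(n_resample,
                    f_stat(sample(PlantGrowth$weight), PlantGrowth$group))

# One-sided p-value (large F is evidence against equal means);
# adding 1 to numerator and denominator counts the observed data as one permutation
p_perm <- (sum(perm_f >= obs_f) + 1) / (n_resample + 1)

obs_f
p_perm
```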

12.2.2 Independence in contingency tables

With the chisq_test() or cmh_test() function, we can use a permutation test to assess the independence of two categorical variables. The latter function is needed when the data are stratified on a third categorical variable. If both variables are ordinal, the lbl_test() function can be used to test for a linear trend.

> library(coin)

> library(vcd)

Loading required package: grid

> Arthritis <- transform(Arthritis,

+ Improved = as.factor(as.numeric(Improved)))

> set.seed(1234)

> chisq_test(Treatment ~ Improved, data = Arthritis, distribution = approximate(B = 9999))

Approximative Pearson's Chi-Squared Test

data: Treatment by Improved (1, 2, 3)

chi-squared = 13.055, p-value = 0.0018

The variable Improved had to be converted from an ordered factor to an unordered (categorical) factor because, had the ordered factor been left in place, coin would have produced a linear-by-linear trend test instead of a chi-square test.
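A comparable Monte Carlo test of independence can be sketched with base R's chisq.test(), which simulates tables with the observed margins when simulate.p.value = TRUE. This is an alternative to coin's chisq_test(), not the same function. The cell counts below are an assumption: they are typed in as the Treatment by Improved cross-tabulation of the Arthritis data, so that vcd is not needed, and the name arthritis_counts is hypothetical.

```r
# Assumed cross-tabulation of Treatment by Improved from the Arthritis data
arthritis_counts <- matrix(
  c(29, 7,  7,
    13, 7, 21),
  nrow = 2, byrow = TRUE,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
)

set.seed(1234)  # make the simulated tables reproducible

# Monte Carlo p-value based on 9999 simulated tables with fixed margins
mc_test <- chisq.test(arthritis_counts, simulate.p.value = TRUE, B = 9999)

mc_test$statistic
mc_test$p.value
```

The chi-squared statistic does not depend on the simulation; only the p-value does.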

12.2.3 Independence between numeric variables

The spearman_test() function provides a permutation test of the independence of two numeric variables.

> states <- as.data.frame(state.x77)
> set.seed(1234)
> spearman_test(Illiteracy ~ Murder, data = states,
+               distribution = approximate(B = 9999))

        Approximative Spearman Correlation Test

data:  Illiteracy by Murder
Z = 4.7065, p-value < 2.2e-16
alternative hypothesis: true mu is not equal to 0

# The hypothesis of independence is rejected.
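The logic of a permutation test for two numeric variables can be sketched in base R: shuffle one variable, recompute the Spearman correlation, and see how often the shuffled value is as extreme as the observed one. This is an illustration of the idea and not coin's implementation; the names obs_rho, perm_rho and p_perm are hypothetical.

```r
states <- as.data.frame(state.x77)
set.seed(1234)  # make the random shuffles reproducible

# Observed Spearman rank correlation
obs_rho <- cor(states$Illiteracy, states$Murder, method = "spearman")

# Shuffling Murder breaks any association with Illiteracy
n_resample <- 9999
perm_rho <- replicate(
  n_resample,
  cor(states$Illiteracy, sample(states$Murder), method = "spearman")
)

# Two-sided Monte Carlo p-value; the +1 counts the observed data as one permutation
p_perm <- (sum(abs(perm_rho) >= abs(obs_rho)) + 1) / (n_resample + 1)

obs_rho
p_perm
```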

  

12.2.4 Dependent two-sample and k-sample tests

Dependent-sample tests are used when observations in different groups have been matched or when repeated measures are used. For a permutation test with two paired groups, use the wilcoxsign_test() function; for more than two groups, use the friedman_test() function.

> library(coin)
> library(MASS)
> wilcoxsign_test(U1 ~ U2, data = UScrime, distribution = "exact")

        Exact Wilcoxon-Signed-Rank Test

data:  y by x (neg, pos)
         stratified by block
Z = 5.9691, p-value = 1.421e-14
alternative hypothesis: true mu is not equal to 0

# The results indicate that the two unemployment rates (U1: urban males aged 14-24, U2: urban males aged 35-39) differ.
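For paired data, the permutation idea amounts to flipping the sign of each within-pair difference, since under the null hypothesis either member of a pair could have had the larger value. The base R sketch below illustrates this on the built-in sleep data (10 subjects measured under two drugs) rather than on UScrime, and it uses the mean of the raw differences rather than the signed ranks that wilcoxsign_test() uses. The names diffs, signs and perm_means are hypothetical.

```r
# Paired differences: drug 2 minus drug 1 for each of the 10 subjects
diffs <- sleep$extra[sleep$group == "2"] - sleep$extra[sleep$group == "1"]
obs_mean <- mean(diffs)

# All 2^10 = 1024 ways of assigning a sign (+1 or -1) to each difference
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), length(diffs))))

# Mean difference under every sign assignment
perm_means <- as.vector(signs %*% diffs) / length(diffs)

# Exact two-sided p-value (a small tolerance guards against floating-point ties)
p_value <- mean(abs(perm_means) >= abs(obs_mean) - 1e-12)

obs_mean
p_value
```

Because all 1024 sign assignments are enumerated, no random seed is needed and the result is exact for this statistic.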

  
