R language: Common statistical test methods

Source: Internet
Author: User

Transferred from http://blog.sciencenet.cn/home.php?mod=space&uid=255662&do=blog&id=240107

Hypothesis tests: t-tests for the mean of a normal population
One population
Example One
The lifetime X (in hours) of a component follows a normal distribution N(mu, sigma^2), where mu and sigma^2 are unknown. The lifetimes of 16 components are listed in the command below. Is there reason to believe that the mean lifetime of the component is greater than 225 hours?
Command:
x <- c(159, 280, 101, 212, 224, 379, 179, 264,
       222, 362, 168, 250, 149, 260, 485, 170)
t.test(x, alternative = "greater", mu = 225)
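
As a cross-check on Example One, the t statistic and the one-sided p-value can be computed by hand and compared with what t.test() reports. This is a minimal sketch; the names mu0, t_stat and p_value are chosen here for illustration and are not part of the original example.

```r
x <- c(159, 280, 101, 212, 224, 379, 179, 264,
       222, 362, 168, 250, 149, 260, 485, 170)
mu0 <- 225                                    # hypothesized mean (illustrative name)
n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n)) # one-sample t statistic
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)  # P(T > t_stat)
res <- t.test(x, alternative = "greater", mu = mu0)
c(by_hand = t_stat, t.test = unname(res$statistic))
c(by_hand = p_value, t.test = res$p.value)
```

Both pairs of numbers should agree, which shows that alternative = "greater" uses the upper tail of the t distribution with n - 1 degrees of freedom.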


Two populations
Example Two
X holds the yield rates obtained with the old steelmaking method and Y those obtained with the new method. Does the new operating method increase the yield rate?
Command:
x <- c(78.1, 72.4, 76.2, 74.3, 77.4, 78.4, 76.0, 75.5, 76.7, 77.3)
y <- c(79.1, 81.0, 77.3, 79.1, 80.0, 79.1, 79.1, 77.3, 80.2, 82.1)

t.test(x, y, var.equal = TRUE, alternative = "less")
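
With var.equal = TRUE the test pools the two sample variances. The sketch below recomputes the pooled statistic by hand for the furnace data and compares it with t.test(); the variable names (sp2, t_stat, p_value) are illustrative.

```r
x <- c(78.1, 72.4, 76.2, 74.3, 77.4, 78.4, 76.0, 75.5, 76.7, 77.3)
y <- c(79.1, 81.0, 77.3, 79.1, 80.0, 79.1, 79.1, 77.3, 80.2, 82.1)
nx <- length(x); ny <- length(y)
# pooled variance estimate under the equal-variance assumption
sp2 <- ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
p_value <- pt(t_stat, df = nx + ny - 2)   # lower tail, matching alternative = "less"
res <- t.test(x, y, var.equal = TRUE, alternative = "less")
c(by_hand = t_stat, t.test = unname(res$statistic))
c(by_hand = p_value, t.test = res$p.value)
```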


Paired-data t-test
Example Three
Treat the furnace data as paired observations and apply the paired t-test
Command:
x <- c(78.1, 72.4, 76.2, 74.3, 77.4, 78.4, 76.0, 75.5, 76.7, 77.3)
y <- c(79.1, 81.0, 77.3, 79.1, 80.0, 79.1, 79.1, 77.3, 80.2, 82.1)
t.test(x - y, alternative = "less")
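
A paired t-test is a one-sample t-test on the pairwise differences. The short sketch below checks this on the furnace data by running both forms; res_paired and res_diff are illustrative names.

```r
x <- c(78.1, 72.4, 76.2, 74.3, 77.4, 78.4, 76.0, 75.5, 76.7, 77.3)
y <- c(79.1, 81.0, 77.3, 79.1, 80.0, 79.1, 79.1, 77.3, 80.2, 82.1)
res_paired <- t.test(x, y, paired = TRUE, alternative = "less")
res_diff <- t.test(x - y, alternative = "less")   # one-sample test on the differences
c(paired = unname(res_paired$statistic), differences = unname(res_diff$statistic))
c(paired = res_paired$p.value, differences = res_diff$p.value)
```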


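Hypothesis tests for the variance of a normal population
Example Four
The heights (cm) of 20 male fifth-grade primary school students were measured; the data are listed in the command below.
At the 0.05 significance level, test whether
(1) the mean is equal to 149;
(2) sigma^2 is equal to 75.
Command:
x <- scan(text = "
136 144 143 157 137 159 135 158 147 165
158 142 159 150 156 152 140 149 148 155
")
t.test(x, mu = 149)                     # (1) test of the mean
# (2) var.test() compares two samples, so sigma^2 = 75 is tested by hand
chi2 <- (length(x) - 1) * var(x) / 75   # chi-square statistic with n - 1 degrees of freedom
2 * min(pchisq(chi2, length(x) - 1), 1 - pchisq(chi2, length(x) - 1))   # two-sided p-value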


Example Five
Test whether the two sets of steelmaking furnace data have equal variances
Command:
x <- c(78.1, 72.4, 76.2, 74.3, 77.4, 78.4, 76.0, 75.5, 76.7, 77.3)
y <- c(79.1, 81.0, 77.3, 79.1, 80.0, 79.1, 79.1, 77.3, 80.2, 82.1)
var.test(x, y)
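
var.test() is an F test on the ratio of the two sample variances. As a sketch (variable names are illustrative), the statistic and two-sided p-value can be reproduced as follows.

```r
x <- c(78.1, 72.4, 76.2, 74.3, 77.4, 78.4, 76.0, 75.5, 76.7, 77.3)
y <- c(79.1, 81.0, 77.3, 79.1, 80.0, 79.1, 79.1, 77.3, 80.2, 82.1)
f_stat <- var(x) / var(y)                 # ratio of sample variances
df1 <- length(x) - 1; df2 <- length(y) - 1
p_low <- pf(f_stat, df1, df2)             # lower-tail probability
p_value <- 2 * min(p_low, 1 - p_low)      # two-sided p-value
res <- var.test(x, y)
c(by_hand = f_stat, var.test = unname(res$statistic))
c(by_hand = p_value, var.test = res$p.value)
```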


Tests for a binomial population

Example Six
A certain vegetable seed has an average germination rate of p = 0.85. 500 seeds were randomly selected and soaked in a seed-dressing agent, and 445 of them germinated. Does the seed-dressing agent have an effect?
Command:
binom.test(445, 500, p = 0.85)


Example Seven
According to previous experience, the rate of chromosomal abnormalities in newborns is generally 1%. A hospital observed 400 local newborns and found one case of chromosomal abnormality. Is the rate of chromosomal abnormalities among newborns in this region lower than the general level?
Command:
binom.test(1, 400, p = 0.01, alternative = "less")
Non-parametric tests
# Pearson chi-square goodness-of-fit test (chisq.test), e.g. whether data follow a normal distribution
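
For the one-sided alternative "less", the exact binomial p-value is the probability of observing at most the recorded number of cases under the hypothesized rate. A minimal sketch for the newborn example (names are illustrative):

```r
res <- binom.test(1, 400, p = 0.01, alternative = "less")
p_by_hand <- pbinom(1, size = 400, prob = 0.01)          # P(X <= 1) under H0
p_by_sum <- sum(dbinom(0:1, size = 400, prob = 0.01))    # same value, summed term by term
c(binom.test = res$p.value, pbinom = p_by_hand, dbinom_sum = p_by_sum)
```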


Example Eight
The numbers of beer enthusiasts who prefer each of 5 brands are as follows
A 210
B 312
C 170
D 85
E 223
Is there a difference in the number of beer enthusiasts among the brands?
Command:
x <- c(210, 312, 170, 85, 223)
chisq.test(x)
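
Called with a single vector, chisq.test() tests the counts against equal probabilities. The sketch below recomputes the Pearson statistic by hand for the beer data; expected, stat and p_value are illustrative names.

```r
x <- c(210, 312, 170, 85, 223)
expected <- rep(sum(x) / length(x), length(x))   # equal preference: 200 per brand
stat <- sum((x - expected)^2 / expected)         # Pearson chi-square statistic
p_value <- pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
res <- chisq.test(x)
c(by_hand = stat, chisq.test = unname(res$statistic))   # both 136.49
c(by_hand = p_value, chisq.test = res$p.value)
```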


Example Nine
Test whether the students' grades follow a normal distribution
Command:
x <- scan(text = "
25 45 50 54 55 61 64 68 72 75 75
78 79 81 83 84 84 84 85 86 86 86
87 89 89 89 90 91 91 92 100
")
a <- table(cut(x, breaks = c(0, 69, 79, 89, 100)))
p <- pnorm(c(70, 80, 90, 100), mean(x), sd(x))
p <- c(p[1], p[2] - p[1], p[3] - p[2], 1 - p[3])
chisq.test(a, p = p)
# cut() divides the range of the variable into several intervals
# table() counts how many values fall into each interval
# A p-value above the significance level means no significant departure from the normal distribution
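
One caveat, offered as an assumption rather than something the grades example itself covers: chisq.test() always uses k - 1 degrees of freedom, whereas here the mean and standard deviation were estimated from the same data. A common textbook adjustment removes one further degree of freedom per estimated parameter. The sketch below shows how the p-value changes under that adjustment; df_adj and p_adj are illustrative names.

```r
x <- c(25, 45, 50, 54, 55, 61, 64, 68, 72, 75, 75,
       78, 79, 81, 83, 84, 84, 84, 85, 86, 86, 86,
       87, 89, 89, 89, 90, 91, 91, 92, 100)
counts <- table(cut(x, breaks = c(0, 69, 79, 89, 100)))
cum <- pnorm(c(70, 80, 90), mean(x), sd(x))     # fitted normal CDF at the cut points
p <- c(cum[1], diff(cum), 1 - cum[3])           # cell probabilities, summing to 1
res <- suppressWarnings(chisq.test(counts, p = p))  # may warn about small expected counts
df_adj <- length(counts) - 1 - 2                # assumed adjustment: 2 estimated parameters
p_adj <- pchisq(unname(res$statistic), df = df_adj, lower.tail = FALSE)
c(df_default = unname(res$parameter), p_default = res$p.value,
  df_adjusted = df_adj, p_adjusted = p_adj)
```

The adjusted p-value is never larger than the default one, so a borderline result deserves a second look.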
Example Ten
In theory, the awn types of the hybrid progeny of barley occur in the ratio awnless : long-awned : short-awned = 9:3:4, while the observed counts are 335:125:160. Do the observed counts agree with the theoretical hypothesis?
Command:
chisq.test(c(335, 125, 160), p = c(9, 3, 4)/16)


Example Eleven
There are 42 observations of the number of calls a telephone switchboard received within a certain time period.
# Number of calls received  0  1  2  3  4  5  6
# Frequency                 7 10 12  8  3  2  0
# Question: is the number of calls received within the time period consistent with a Poisson distribution?
Command:
x <- 0:6
y <- c(7, 10, 12, 8, 3, 2, 0)
lambda <- mean(rep(x, y))     # estimate of the Poisson mean
q <- ppois(x, lambda)
n <- length(y)
p <- numeric(n)
p[1] <- q[1]
p[n] <- 1 - q[n - 1]
for (i in 2:(n - 1))
  p[i] <- q[i] - q[i - 1]
chisq.test(y, p = p)          # R may warn that some expected counts are small
# merge the last three cells (3 + 2 + 0 = 5) and test again
z <- c(7, 10, 12, 8, 5)
n <- length(z); p <- p[1:(n - 1)]; p[n] <- 1 - q[n - 1]
chisq.test(z, p = p)
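
The cell probabilities in the telephone switchboard example can also be built without a loop. The sketch below is an alternative formulation, not the original author's code: it uses dpois() for the individual cells and the upper tail of ppois() for the final open-ended cell, both before and after merging the sparse cells. The names lambda, p_full, z and p_merged are illustrative.

```r
x <- 0:6
y <- c(7, 10, 12, 8, 3, 2, 0)
lambda <- mean(rep(x, y))                          # 80 calls / 42 periods
# seven cells: 0, 1, ..., 5 and "6 or more"
p_full <- c(dpois(0:5, lambda), ppois(5, lambda, lower.tail = FALSE))
# five cells after merging: 0, 1, 2, 3 and "4 or more"
z <- c(y[1:4], sum(y[5:7]))
p_merged <- c(dpois(0:3, lambda), ppois(3, lambda, lower.tail = FALSE))
res <- chisq.test(z, p = p_merged)
res
```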


Content from

Xue Yi and Chen Liping, "Statistical Modeling and R Software", Tsinghua University Press, 2006

-----------------------------------



# When theoretical distributions depend on a number of unknown parameters
# Kolmogorov-Smirnov test
# ks.test()
Example One
In a life test of a piece of equipment, 10 failure-free operating times were recorded and are listed in ascending order in the command below.
# Use the KS test to check whether the operating time of this equipment follows an exponential distribution with lambda = 1/1500
Command
x <- c(420, 500, 920, 1380, 1510, 1650, 1760, 2100, 2300, 2350)
ks.test(x, "pexp", 1/1500)
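
The KS statistic is the largest vertical gap between the empirical distribution function and the hypothesized one. As a sketch (names are illustrative), it can be computed directly for the equipment data and compared with ks.test().

```r
x <- c(420, 500, 920, 1380, 1510, 1650, 1760, 2100, 2300, 2350)
n <- length(x)
Fx <- pexp(sort(x), rate = 1/1500)        # hypothesized CDF at the ordered data
d_plus <- max((1:n) / n - Fx)             # empirical CDF above the hypothesized CDF
d_minus <- max(Fx - (0:(n - 1)) / n)      # empirical CDF below the hypothesized CDF
D <- max(d_plus, d_minus)
res <- ks.test(x, "pexp", 1/1500)
c(by_hand = D, ks.test = unname(res$statistic))
```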

Example Two
Suppose 25 and 20 observations are randomly sampled from populations with distribution functions F(x) and G(x) respectively. Test whether F(x) and G(x) are the same.
# command
x <- scan(text = "
0.61 0.29 0.06 0.59 -1.73 -0.74 0.51 -0.56 0.39
1.64 0.05 -0.06 0.64 -0.82 0.37 1.77 1.09 -1.28
2.36 1.31 1.05 -0.32 -0.40 1.06 -2.47
")
y <- scan(text = "
2.20 1.66 1.38 0.20 0.36 0.00 0.96 1.56 0.44
1.50 -0.30 0.66 2.31 3.29 -0.27 -0.37 0.38 0.70
0.52 -0.71
")
ks.test(x, y)
# Limitations of the KS test: it applies only when the theoretical distribution is a one-dimensional continuous distribution that is fully specified. When the KS test is applicable, its power is generally higher than that of the Pearson chi-square test

# Independence tests for contingency tables
# Pearson chi-square test of independence
Example Three
To study whether smoking is related to lung cancer, the numbers of smokers among 63 lung cancer patients and 43 people without lung cancer were surveyed, giving the following 2x2 contingency table
#               Lung cancer  Healthy  Total
# Smoking            60         32      92
# Non-smoking         3         11      14
# Total              63         43     106
# command
x <- c(60, 3, 32, 11)
dim(x) <- c(2, 2)
chisq.test(x, correct = FALSE)   # without continuity correction
chisq.test(x)                    # with continuity correction
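
Under independence, each expected count is the row total times the column total divided by the grand total. The sketch below recomputes the uncorrected statistic for the smoking table on that basis; the dimnames and variable names are added here for illustration.

```r
x <- matrix(c(60, 3, 32, 11), nrow = 2,
            dimnames = list(c("Smoking", "Non-smoking"),
                            c("Lung cancer", "Healthy")))
expected <- outer(rowSums(x), colSums(x)) / sum(x)   # expected counts under independence
stat <- sum((x - expected)^2 / expected)             # Pearson statistic, no correction
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)  # (2 - 1) * (2 - 1) = 1 degree of freedom
res <- chisq.test(x, correct = FALSE)
c(by_hand = stat, chisq.test = unname(res$statistic))
c(by_hand = p_value, chisq.test = res$p.value)
```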

Example Four
In a social survey, 901 people were asked by questionnaire about their annual income and their job satisfaction. Annual income A was divided into four levels: less than 6000 yuan, 6000 to 15000 yuan, 15000 to 25000 yuan, and more than 25000 yuan. Job satisfaction B was divided into four levels: very dissatisfied, somewhat dissatisfied, basically satisfied, and very satisfied. The results are as follows
#              Very dissatisfied  Somewhat dissatisfied  Basically satisfied  Very satisfied  Total
# < 6000              20                  24                    80                 82          206
# 6000~15000          22                  38                   104                125          289
# 15000~25000         13                  28                    81                113          235
# > 25000              7                  18                    54                 92          171
# Total               62                 108                   319                412          901
# commands are as follows
x <- scan(text = "
20 24 80 82 22 38 104 125
13 28 81 113 7 18 54 92
")
dim(x) <- c(4, 4)   # filled column by column, so x is the transpose of the table; the test result is unchanged
chisq.test(x)

# Fisher's exact test of independence
# Use it when the sample is small (expected cell counts below 4)
Example Five
A physician studied the effect of hepatitis B immunoglobulin in preventing intrauterine HBV infection. 33 HBsAg-positive pregnant women were randomly divided into a preventive injection group and a control group, with the results in the following table. Is there a difference in the overall neonatal HBV infection rate between the two groups?
# Group                       Positive  Negative  Total  Infection rate (%)
# Preventive injection group      4        18       22         18.2
# Control group                   5         6       11         45.5
# commands are as follows
x <- c(4, 5, 18, 6); dim(x) <- c(2, 2)
fisher.test(x)
# Fisher's exact test for the lung cancer data mentioned earlier
x <- c(60, 3, 32, 11)
dim(x) <- c(2, 2)
fisher.test(x)
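
Fisher's exact test conditions on the table margins, so the count in the top-left cell follows a hypergeometric distribution under the null hypothesis. The sketch below reproduces the two-sided p-value for the HBV table by summing the probabilities of all tables no more likely than the observed one. The small relative tolerance guards against floating-point ties and is an assumption about how such ties are best handled; all variable names are illustrative.

```r
x <- matrix(c(4, 5, 18, 6), nrow = 2)      # rows: injection, control; columns: positive, negative
m <- sum(x[, 1])                            # total positives (9)
n <- sum(x[, 2])                            # total negatives (24)
k <- sum(x[1, ])                            # size of the injection group (22)
support <- max(0, k - n):min(k, m)          # possible values of the top-left cell
d <- dhyper(support, m, n, k)               # null probabilities of each possible table
d_obs <- dhyper(x[1, 1], m, n, k)           # probability of the observed table
p_value <- sum(d[d <= d_obs * (1 + 1e-7)])  # two-sided p-value
res <- fisher.test(x)
c(by_hand = p_value, fisher.test = res$p.value)
```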

# McNemar test
# The McNemar test is not a test of independence, but it is also a test on contingency tables
Example Six
Results of two methods for detecting bacteria
#               Method B
# Method A     +      -    Total
#    +        49     25      74
#    -        21    107     128
# Total       70    132     202
# command
x <- c(49, 21, 25, 107)
dim(x) <- c(2, 2)
mcnemar.test(x, correct = FALSE)
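
The McNemar statistic uses only the discordant cells, that is, the samples on which the two methods disagree. A minimal sketch for the bacteria table without continuity correction (variable names are illustrative):

```r
x <- matrix(c(49, 21, 25, 107), nrow = 2)   # rows: method A (+, -); columns: method B (+, -)
n_ab <- x[1, 2]                             # A positive, B negative: 25
n_ba <- x[2, 1]                             # A negative, B positive: 21
stat <- (n_ab - n_ba)^2 / (n_ab + n_ba)     # McNemar statistic, 16/46
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)
res <- mcnemar.test(x, correct = FALSE)
c(by_hand = stat, mcnemar.test = unname(res$statistic))
c(by_hand = p_value, mcnemar.test = res$p.value)
```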

# Sign test
# 1. Testing whether a sample comes from a given population
Example Seven
The cost-of-living indices for United Nations personnel in 66 major cities worldwide (New York in December 1996 = 100) are listed below in ascending order; Beijing's index is 99. Assume the sample was drawn at random from the world's major cities. Use the sign test to analyze whether Beijing lies above or below the median.
x <- scan(text = "
66 75 78 80 81 81 82 83 83 83 83
84 85 85 86 86 86 86 87 87 88 88
88 88 88 89 89 89 89 90 90 91 91
91 91 92 93 93 96 96 96 97 99 100
101 102 103 103 104 104 104 105 106 109 109
110 110 110 111 113 115 116 117 118 155 192
")
binom.test(sum(x > 99), length(x), alternative = "less")

# 2. Using paired samples to test whether two populations differ
Example Eight
Pigs were fed two kinds of feed, and their weight gains are as follows. Analyze whether the two kinds of feed differ.
# command
x <- scan(text = "
25 30 28 23 27 35 30 28 32 29 30 30 31 16
")
y <- scan(text = "
19 32 21 19 25 31 31 26 30 25 28 31 25 25
")
binom.test(sum(x < y), length(x))

Example Nine
To investigate customers' taste in beverages, a beverage shop randomly surveyed 13 customers on one day. A customer who prefers milk tea to coffee is recorded as -, one who prefers coffee to milk tea as +, and one who likes both equally as 0. The results are as follows. Analyze whether customers prefer coffee or milk tea.
# Of the 13 customers: 9 prefer coffee (+), 3 prefer milk tea (-), 1 has no preference (0)
binom.test(3, 12, p = 1/2, alternative = "less", conf.level = 0.90)

# Rank statistics
# Spearman rank correlation test
Example Ten
Six contestants performed in a competition and were rated by two judges, with the results shown in the table. Use the Spearman rank correlation test to check whether the two judges' ratings are correlated.
# Contestant number  1 2 3 4 5 6
# Judge A's rating   4 2 2 4 5 6
# Judge B's rating   5 3 4 3 2 5
x <- c(4, 2, 2, 4, 5, 6); y <- c(5, 3, 4, 3, 2, 5)
cor.test(x, y, method = "spearman")
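
Spearman's rho is the ordinary Pearson correlation computed on the ranks, with tied ratings receiving average ranks. The sketch below checks this on the judges' ratings; because the ratings contain ties, cor.test() cannot compute an exact p-value and warns about it, which is suppressed here.

```r
x <- c(4, 2, 2, 4, 5, 6)
y <- c(5, 3, 4, 3, 2, 5)
rho_by_hand <- cor(rank(x), rank(y))    # Pearson correlation of the (average) ranks
res <- suppressWarnings(cor.test(x, y, method = "spearman"))
c(by_hand = rho_by_hand, cor.test = unname(res$estimate))
```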

# Kendall correlation test
Example 11
A kindergarten tested the intelligence of 9 pairs of twins, scored on a 100-point scale. Use the Kendall correlation test to check whether the intelligence scores within twin pairs are correlated.
# 1 2 3 4 5 6 7 8 9
# 86 77 68 91 70 71 85 87 63
# 88 76 64 96 65 80 81 72 60
x <- c(86, 77, 68, 91, 70, 71, 85, 87, 63)
y <- c(88, 76, 64, 96, 65, 80, 81, 72, 60)
cor.test(x, y, method = "kendall")
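
Kendall's tau compares every pair of observations: a pair is concordant when x and y move in the same direction and discordant otherwise. With no ties, as in the twin data, tau is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs. A sketch with illustrative names:

```r
x <- c(86, 77, 68, 91, 70, 71, 85, 87, 63)
y <- c(88, 76, 64, 96, 65, 80, 81, 72, 60)
n <- length(x)
s <- 0                                   # concordant minus discordant pairs
for (i in 1:(n - 1)) {
  for (j in (i + 1):n) {
    s <- s + sign(x[i] - x[j]) * sign(y[i] - y[j])
  }
}
tau_by_hand <- s / choose(n, 2)
res <- cor.test(x, y, method = "kendall")
c(by_hand = tau_by_hand, cor.test = unname(res$estimate))
```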

# Wilcoxon signed-rank test: takes into account the differences between the sample observations and the population median
# 1. Testing whether a sample comes from a given population
Example 12
A battery factory claims that the batteries it produces have a median life of 140 hours. 20 batteries were randomly drawn from its newly produced stock and their lives tested; the results are listed in the command below.
# Use the Wilcoxon signed-rank test to analyze whether the batteries produced by this factory meet the standard
x <- scan(text = "
137.0 140.0 138.3 139.0 144.3 139.1 141.7 137.3 133.5 138.2
141.1 139.2 136.5 136.5 135.6 138.0 140.9 140.6 136.3 134.1
")
wilcox.test(x, mu = 140, alternative = "less",
            exact = FALSE, correct = FALSE, conf.int = TRUE)

# The same method can also be used to test paired samples
Example 13
To test a new compound fertilizer, a wheat field was divided into 10 plots and each plot was split into two halves, one given the ordinary fertilizer and the other the new fertilizer. Use the Wilcoxon signed-rank test to check whether the new compound fertilizer significantly improves wheat yield.
# Plot                  1   2   3   4   5   6   7   8   9  10
# New fertilizer      459 367 303 392 310 342 421 446 430 412
# Ordinary fertilizer 414 306 321 443 281 301 353 391 405 390
x <- c(459, 367, 303, 392, 310, 342, 421, 446, 430, 412)
y <- c(414, 306, 321, 443, 281, 301, 353, 391, 405, 390)
wilcox.test(x, y, alternative = "greater", paired = TRUE)
wilcox.test(x - y, alternative = "greater")                   # equivalent one-sample form
binom.test(sum(x > y), length(x), alternative = "greater")    # sign test, for comparison
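
The signed-rank statistic V reported by wilcox.test() is the sum of the ranks of the absolute differences, taken over the pairs with a positive difference. A sketch on the fertilizer data (names are illustrative):

```r
x <- c(459, 367, 303, 392, 310, 342, 421, 446, 430, 412)
y <- c(414, 306, 321, 443, 281, 301, 353, 391, 405, 390)
d <- x - y                          # paired differences (no zeros, no tied magnitudes here)
r <- rank(abs(d))                   # ranks of the absolute differences
V <- sum(r[d > 0])                  # sum of ranks belonging to positive differences
res <- wilcox.test(x, y, alternative = "greater", paired = TRUE)
c(by_hand = V, wilcox.test = unname(res$statistic))   # both 47
```

Only two of the ten differences are negative, with ranks 1 and 7, so V = 55 - 8 = 47.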

# Rank-sum test for unpaired samples
# Wilcoxon-Mann-Whitney U statistic
Example 14
Blood lead levels were measured for 10 workers in a non-lead operations group and 7 workers in a lead operations group. Analyze whether the two groups differ.
# Non-lead operations group  24 26 29 34 43 58 63 72 87 101
# Lead operations group      82 87 97 121 164 208 213
x <- c(24, 26, 29, 34, 43, 58, 63, 72, 87, 101)
y <- c(82, 87, 97, 121, 164, 208, 213)
wilcox.test(x, y, alternative = "less", exact = FALSE, correct = FALSE)   # without continuity correction
wilcox.test(x, y, alternative = "less", exact = FALSE)                    # with continuity correction
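
For two independent samples, the statistic W reported by wilcox.test() is the rank sum of the first sample in the pooled data minus its smallest possible value. The sketch below recomputes it for the blood lead data; the value 87 occurs in both groups, so those two observations share the average rank 10.5.

```r
x <- c(24, 26, 29, 34, 43, 58, 63, 72, 87, 101)
y <- c(82, 87, 97, 121, 164, 208, 213)
nx <- length(x)
r <- rank(c(x, y))                        # ranks in the pooled sample, ties averaged
W <- sum(r[1:nx]) - nx * (nx + 1) / 2     # rank sum of x minus its minimum, 59.5 - 55
res <- wilcox.test(x, y, alternative = "less", exact = FALSE, correct = FALSE)
c(by_hand = W, wilcox.test = unname(res$statistic))   # both 4.5
```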

Example 15
Rankings of students' mathematical ability
New method 3 5 7 9 10
Original method 1 2 4 6 8
Is there a difference between the new and the original method?
x <- c(3, 5, 7, 9, 10)
y <- c(1, 2, 4, 6, 8)
wilcox.test(x, y, alternative = "greater")

Example 16
To examine the efficacy of a drug for chronic bronchitis, 216 treated cases were collected. Analyze whether the drug's efficacy differs between the two types of chronic bronchitis.
#                Controlled  Effective  Improved  Ineffective
# Simple type        62         41         14         11
# Wheezing type      20         37         16         15
x <- rep(1:4, c(62, 41, 14, 11))
y <- rep(1:4, c(20, 37, 16, 15))
wilcox.test(x, y, exact = FALSE)
