International - English

Cart Console

Topic Center

Contact Sales

Home > Others

The addition of p-value and significance markers in the R language Visual learning notes

Last Update:2017-11-20 Source: Internet

Author: User

Tags ggplot

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

The addition of p-value and significance markers in the R language Visual learning notes

Http://www.jianshu.com/p/b7274afff14f?from=timeline

In the previous article, I mentioned how to add the GGPUBR package to the ggplot diagram p-value and the significance of the markup, this article will be described in detail. Demo with Data set Toothgrowth

#先加载包library(ggpubr)#加载数据集ToothGrowthdata("ToothGrowth")head(ToothGrowth)

##    len  supp  dose## 1  4.2   VC   0.5## 2  11.5  VC   0.5## 3  7.3   VC   0.5## 4  5.8   VC   0.5## 5  6.4   VC   0.5## 6  10.0  VC   0.5

Comparison method

The main methods of comparison used in R are the following:

Method	R function	Description
T-test	T.test ()	Compare two groups (parameters)
Wilcoxon test	Wilcox.test ()	Compare two groups (non-parametric)
ANOVA	AoV () or ANOVA ()	Compare multiple groups (parameters)
Kruskal-wallis	Kruskal.test ()	Compare multiple groups (non-parametric)

A variety of comparison methods followed by time one by one to explain.

Add to p-value

The main use of the GGPUBR package of two functions:

compare_means(): One or more groups can be compared
stat_compare_mean(): Auto Add p-value , significant mark to Ggplot chart

Compare_means () function

The function is mainly used as follows:

compare_means(formula, data, method = "wilcox.test", paired = FALSE,  group.by = NULL, ref.group = NULL, ...)

Comments:

formula: Shape x~group , where x is a numeric variable, group is a factor, can be one or more
data: Data Set
method: The method of comparison, the default is "wilcox.test" , the other optional methods are: "t.test" , "anova" ,"kruskal.test"
paired: Do you want to paired test ( TRUE or FALSE )
group_by: Whether to group when comparing
ref.group: Whether you need to specify a reference group

Stat_compare_means () function

Main usage:

stat_compare_means(mapping = NULL, comparisons = NULL hide.ns = FALSE,                   label = NULL,  label.x = NULL, label.y = NULL,  ...)

Comments:

mapping: aes() a set of aesthetic mappings created by
comparisons: Specify groups that need to be compared and added p-value , marked with significance
hide.ns: Do you want to display the significance mark NS
label: The type of the significance tag, Optional: p.signif (significant mark), p.format (display p-value )
label.x, label.y : Significant label adjustment
...: Other parameters

Compare two separate sets of

compare_means(len~supp, data=ToothGrowth)

Results explained:

.y: The y variable used in the test
p:p-value
p.adj: Adjusted p-value . Default isp.adjust.method="holm"
p.format: After rounding thep-value
p.signif: Level of significance
method: A method for statistical testing to draw a box line diagram
```
p <- ggboxplot(ToothGrowth, x="supp", y="len", color = "supp", palette = "jco", add = "jitter")#添加p-valuep+stat_compare_means()
```
```
#使用其他统计检验方法p+stat_compare_means(method = "t.test")
```
The above significance marks can be label.x adjusted by, label.y , hjust and vjust
The significance tag can be aes() changed by mapping:
- aes(label=..p.format..)Or aes(lebel=paste0("p=",..p.format..)) : Show only p-value , do not show statistical test method
- aes(label=..p.signif..): Show only significant levels
- aes(label=paste0(..method..,"\n", "p=",..p.format..)): p-value displayed with a significant level of the branch

Give me a chestnut:

p+stat_compare_means(aes(label=..p.signif..), label.x = 1.5, label.y = 40)

You can also specify the label as a character vector, do not map, just p.signif both ends of the. Remove and then

p+stat_compare_means(label = "p.signif", label.x = 1.5, label.y = 40)

Comparison of two paired sample

compare_means(len~supp, data=ToothGrowth, paired = TRUE)

Visualize with ggpaired ()

ggpaired(ToothGrowth, x="supp", y="len", color = "supp", line.color = "gray", line.size = 0.4, palette = "jco")+ stat_compare_means(paired = TRUE)

Multi-Group comparison

Global Test

compare_means(len~dose, data=ToothGrowth, method = "anova")

Visualization of

ggboxplot(ToothGrowth, x="dose", y="len", color = "dose", palette = "jco")+stat_compare_means()

#使用其他的方法ggboxplot(ToothGrowth, x="dose", y="len", color = "dose", palette = "jco")+ stat_compare_means(method = "anova")

Pairwise comparisons: If the grouping variable contains more than two levels, the pairwise test is automatically performed, and the default method is "Wilcox.test"

compare_means(len~dose, data=ToothGrowth)

#可以指定比较哪些组my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))ggboxplot(ToothGrowth, x="dose", y="len", color = "dose",palette = "jco")+stat_compare_means(comparisons=my_comparisons)+ # Add pairwise comparisons p-value stat_compare_means(label.y = 50) # Add global p-value

You can change the position of the label by modifying the parameter Label.y

ggboxplot(ToothGrowth, x="dose", y="len", color = "dose",palette = "jco")+stat_compare_means(comparisons=my_comparisons, label.y = c(29, 35, 40))+ # Add pairwise comparisons p-value stat_compare_means(label.y = 45) # Add global p-value

As for the two groups that are connected by adding lines to the comparison, this feature has been implemented by the package ggsignif

##设定参考组compare_means(len~dose, data=ToothGrowth, ref.group = "0.5",  #以dose=0.5组为参考组 method = "t.test" )

#可视化ggboxplot(ToothGrowth, x="dose", y="len", color = "dose", palette = "jco")+ stat_compare_means(method = "anova", label.y = 40)+ # Add global p-valuestat_compare_means(label = "p.signif", method = "t.test", ref.group = "0.5") # Pairwise comparison against reference

The reference group can also be set to. All. That is, all averages

compare_means(len~dose, data=ToothGrowth, ref.group = ".all.", method = "t.test")

#可视化ggboxplot(ToothGrowth, x="dose", y="len", color = "dose", palette = "jco")+stat_compare_means(method = "anova", label.y = 40)+# Add global p-valuestat_compare_means(label = "p.signif", method = "t.test", ref.group = ".all.")#Pairwise comparison against all

Next, use the dataset myeloma in the Survminer package to explain why sometimes we need to set the Ref.group to. All.

library(survminer)#没安装的先安装再加载data("myeloma")head(myeloma)

We will draw the expression profile of the DEPDC1 gene according to the patient's group, to see if there are significant differences between the groups, we can compare between the 7 groups, but there are so many combinations of group comparisons, so we can compare each group in the 7 group with the full average. See if the DEPDC1 gene is expressed or low in different groups.

compare_means(DEPDC1~molecular_group, data = myeloma, ref.group = ".all.", method = "t.test")

#可视化DEPDC1基因表达谱ggboxplot(myeloma, x="molecular_group", y="DEPDC1", color = "molecular_group", add = "jitter", legend="none")+ rotate_x_text(angle = 45)+ geom_hline(yintercept = mean(myeloma$DEPDC1), linetype=2)+# Add horizontal line at base mean stat_compare_means(method = "anova", label.y = 1600)+ # Add global annova p-value stat_compare_means(label = "p.signif", method = "t.test", ref.group = ".all.")# Pairwise comparison against all

It can be seen that the DEPDC1 gene is significantly expressed in the proliferation group, while in hyperdiploid and low bone disease significantly lower expression

We can also remove the non-significant mark NS by simply hide.ns=true the parameters

ggboxplot(myeloma, x="molecular_group", y="DEPDC1", color = "molecular_group", add = "jitter", legend="none")+rotate_x_text(angle = 45)+ geom_hline(yintercept = mean(myeloma$DEPDC1), linetype=2)+# Add horizontal line at base mean stat_compare_means(method = "anova", label.y = 1600)+ # Add global annova p-value stat_compare_means(label = "p.signif", method = "t.test", ref.group = ".all.", hide.ns = TRUE)# Pairwise comparison against all

Multiple grouping variables

A statistical test is performed after grouping by another variable, such as grouping by variable dose:

compare_means(len~supp, data=ToothGrowth, group.by = "dose")

#可视化p <- ggboxplot(ToothGrowth, x="supp", y="len", color = "supp", palette = "jco", add = "jitter", facet.by = "dose", short.panel.labs = FALSE)#按dose进行分面#label只绘制p-valuep+stat_compare_means(label = "p.format")

#label绘制显著性水平p+stat_compare_means(label = "p.signif", label.x = 1.5)

#将所有箱线图绘制在一个panel中p <- ggboxplot(ToothGrowth, x="dose", y="len", color = "supp", palette = "jco", add = "jitter")p+stat_compare_means(aes(group=supp))

#只显示p-valuep+stat_compare_means(aes(group=supp), label = "p.format")

#显示显著性水平p+stat_compare_means(aes(group=supp), label = "p.signif")

进行paired sample检验compare_means(len~supp, data=ToothGrowth, group.by = "dose", paired = TRUE)

#可视化p <- ggpaired(ToothGrowth, x="supp", y="len", color = "supp", palette = "jco", line.color="gray", line.size=0.4, facet.by = "dose", short.panel.labs = FALSE)#按dose分面#只显示p-valuep+stat_compare_means(label = "p.format", paired = TRUE)

Other graphical bar and line graphs (a grouping variable)

#有误差棒的条形图，实际上我以前的文章里有纯粹用ggplot2实现ggbarplot(ToothGrowth, x="dose", y="len", add = "mean_se")+ stat_compare_means()+ stat_compare_means(ref.group = "0.5", label = "p.signif", label.y = c(22, 29))

#有误差棒的线图ggline(ToothGrowth, x="dose", y="len", add = "mean_se")+stat_compare_means()+ stat_compare_means(ref.group = "0.5", label = "p.signif", label.y = c(22, 29))

Bar and Line charts (two grouping variables)

ggbarplot(ToothGrowth, x="dose", y="len", add = "mean_se", color = "supp", palette = "jco", position = position_dodge(0.8))+ stat_compare_means(aes(group=supp), label = "p.signif", label.y = 29)

ggline(ToothGrowth, x="dose", y="len", add = "mean_se", color = "supp", palette = "jco")+ stat_compare_means(aes(group=supp), label = "p.signif", label.y = c(16, 25, 29))

Sessioninfo

Sessioninfo () # # R version 3.4.0 (2017-04-21) # # platform:x86_64-w64-mingw32/x64 (64-bit) # # Running Under:windows 8.1 x64 (Build 9600) # # # Matrix products:default## # locale:## [1] lc_collate=chinese (Simplified) _china.936 # # [2] Lc_ctype=ch Inese (Simplified) _china.936 # # [3] Lc_monetary=chinese (Simplified) _china.936## [4] lc_numeric=c # # [5] LC_TIME=Chinese (Simplified) _china.936 # # # # attached base packages:## [1] stats graphics grdevices utils datasets Methods Base # # # # Othe R attached packages:## [1] survminer_0.4.0 ggpubr_0.1.3 magrittr_1.5 ggplot2_2.2.1 # # # Loaded via a namespace (and not a ttached): # # [1] rcpp_0.12.11 compiler_3.4.0 plyr_1.8.4## [4] tools_3.4.0 digest_0.6.12 evaluate_0.10 # # [7] tibble_1.3.3 G  table_0.2.0 nlme_3.1-131 # [16] lattice_0.20-35 rlang_0.1.1 matrix_1.2-10 # [psych_1.7.5] ggsci_2.4 DBI_0.6-1 Cmprsk_2.2-7 yaml_2.1.14 parallel_3.4.0 # # [[] gridextra_2.2.1 dplyr_0.5.0 stringr_1.2.0 # # [] knitr_1.16 survMisc_0.5 .4 rprojroot_1.2 # # [25]grid_3.4.0 data.table_1.10.4 kmsurv_0.1-5 # # [[] r6_2.2.1 km.ci_0.5-2 survival_2.41-3 # # [] foreign_0.8-68 rmarkdown_1 .5 reshape2_1.4.2 # # [tidyr_0.6.3] purrr_0.2.2.2 splines_3.4.0 # # [PNS] backports_1.1.0 scales_0.4.1 htmltools_0.3.6 # # [assertthat_0.2.0] mnormt_1.5-5 xtable_1.8-2 # # [] colorspace_1.3-2 ggsignif_0.2.0 labeling_0.3 # # [+] stringi_1.1. 5 lazyeval_0.2.0 munsell_0.4.3 # # [broom_0.4.2] zoo_1.8-0

R Language Visual Learning notes add p-value and significance tags

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

Related Keywords:

get the permalink addition of two arrays in c express associative property of addition in words abstract painting elements of visual language addition of two arrays result of addition html language notes

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

What's Trending

Top 10 Tags

datastax versions naming convention zookeeper client class definition md5 microsoft sql server 2005 data structures exception handling error handling

Top 10 Keywords

microsoft download center down wordpress address url site address url wordpress address url windows installer 4 0 download 302 not found web address url definition site address url wordpress db2 integer mac os installation step by step pdf abbreviation for return

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

The addition of p-value and significance markers in the R language Visual learning notes

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support