As usual, we check for some packages

rm(list = ls())
# Check for Packages
if (!require(ggsignif)) install.packages('ggsignif')
if (!require(ggplot2)) install.packages('ggplot2')

I re-use the data from the previous post and create a data.frame (same code as before). There are, however, 4 additional lines of script. Using the function combn, I determine all possible combinations of 2 factor levels. Note that I have converted levels to numerics (e.g. "T1" is 1, "T2" is 2).

# data
T1<-c(20,17,18,22,20,22,24,25)
T2<-c(14,12,12,13,16,14,18,11,17,14)
T3<-c(19,20,18,21,21,17,22)
T4<-c(27,24,22,30,26,29,26,28)
# data.frame
values<-c(T1,T2,T3,T4)
exp<-as.factor(c(rep('T1',length(T1)),rep('T2',length(T2)),rep('T3',length(T3)),rep('T4',length(T4))))
all<-data.frame(values,exp) # just create a data.frame with values and factor

#  Combinations
all$factor_num<-as.numeric(all$exp) # I need to convert those levels to numeric
v_factors<- 1:tail(all$factor_num,n=1) all_pairwise<-combn(v_factors,2) all_pairwise # this calculates all combinations ## [,1] [,2] [,3] [,4] [,5] [,6] ## [1,] 1 1 1 2 2 3 ## [2,] 2 3 4 3 4 4 There are indeed 6 possibilities (the first one is level T1 with level T2, the second one is level T1 with level T3). I need now to perform 6 t-tests. This is basically what the function main does: for z ranging from 1 to n_pairwise, it will extract (from the data.frame) correct values (e.g. values corresponding to T1 and T2 for the first t-test) and perform the test (I use an extra argument var.equal to perform a t-test and not a Welch's t-test). Note finally that I am using the function sapply to perform all those operations (instead of using a for loop). n_pairwise<-length(all_pairwise[1,]) # will return 6 here # The function below calculate all t.test (for all combinations) main<-sapply(1:n_pairwise, function (z) { data1<-all_pairwise[1,z] data2<-all_pairwise[2,z] t.test(all[all[,3]==data1,1],all[all[,3]==data2,1],var.equal = TRUE) } ) main[1,] Let us have a look at main. First we look at main[1,], which outputs all t-tests statistics. ## [[1]] ## t ## 5.792583 ## ## [[2]] ## t ## 1.045278 ## ## [[3]] ## t ## -4.075272 ## ## [[4]] ## t ## -5.417693 ## ## [[5]] ## t ## -10.73389 ## ## [[6]] ## t ## -5.756966 main[,1] gives all output parameters for a given combination. main[,1] ##$statistic
##        t
## 5.792583
##
## $parameter ## df ## 16 ## ##$p.value
## [1] 2.749152e-05
##
## $conf.int ## [1] 4.374814 9.425186 ## attr(,"conf.level") ## [1] 0.95 ## ##$estimate
## mean of x mean of y
##      21.0      14.1
##
## $null.value ## difference in means ## 0 ## ##$stderr
## [1] 1.191179
##
## $alternative ## [1] "two.sided" ## ##$method
## [1] " Two Sample t-test"
##
## \$data.name
## [1] "all[all[, 3] == data1, 1] and all[all[, 3] == data2, 1]"

Notice the [[ ]], which indicates that main is actually a list (a structure that may contain elements of different types). FInally, we create a new data.frame (ALL), which will contain all results (in a readable format).

# We create a new data.frame where we put the results of mainALL<-as.data.frame(t(as.matrix(all_pairwise)))
ALL[,3]<-unlist(main[1,])
ALL[,4]<-unlist(main[3,])
ALL[,5]<-p.adjust(ALL[,4], method = "fdr", n = length(ALL[,4]))
colnames(ALL)<-c("Exp_1", "Exp_2","statistics","Naive p value","FDR")
ALL # Let's have a look
##   Exp_1 Exp_2 statistics Naive p value          FDR
## 1     1     2   5.792583  2.749152e-05 8.247457e-05
## 2     1     3   1.045278  3.149419e-01 3.149419e-01
## 3     1     4  -4.075272  1.135702e-03 1.362842e-03
## 4     2     3  -5.417693  7.126807e-05 1.069021e-04
## 5     2     4 -10.733894  1.016466e-08 6.098795e-08
## 6     3     4  -5.756966  6.632425e-05 1.069021e-04

Look at the fifth column where we have calculated new p-values using a FDR correction (see also my previous post). You might notice that those values are slightly different from that calculated here. The reason is simple: pairwise.t.test has default parameters. Using:

# first let us re-do a pairwise calculation
pairwise.t.test(all[, 1],all[, 2], p.adjust.method = "fdr", pool.sd=FALSE, var.equal = TRUE)
##
##  Pairwise comparisons using t tests with non-pooled SD
##
## data:  all[, 1] and all[, 2]
##
##    T1      T2      T3
## T2 8.2e-05 -       -
## T3 0.31494 0.00011 -
## T4 0.00136 6.1e-08 0.00011
##
## P value adjustment method: fdr

We get identical values. Last but not least, we use the geom_signif function (with appropriate arguments) to plot corrected p-values (ALL[,5]) for all combinations (asplit(all_pairwise,2)).

#ggplot
p<-ggplot(all, aes(x=exp, y=values,)) +
geom_boxplot( col= c("#00AFBB", "#E7B800", "#FC4E07","#080808"),size=1.) +
geom_jitter()+
geom_signif(annotations=signif(ALL[,5],digits=1), comparisons=asplit(all_pairwise,2),step_increase = 0.1)+
labs(x="Treatment", y="PSA Concentration (ng/ml)",title="Pairwise comparisons using FDR correction for p-values")+
theme_classic()
p # display

BTW, I also can save the plot in a svg format, using:

ggsave(file="test.svg", plot=p, width=10) # this saves automatically. You need install the svglite package
## Saving 10 x 7.61 in image