As usual, we check for some packages.

rm(list = ls()) 
if (!require(gridExtra)) install.packages('gridExtra')
## Loading required package: gridExtra
# Packages
if (!require(ggplot2)) install.packages('ggplot2')
## Loading required package: ggplot2

When comparing two means (two-sample t-test), the null and alternative hypothesis are: H 0 : μ 1 = μ 2 H 1 : μ 1 μ 2 Under the null hypothesis, we assume the two samples (1 and 2) originate from two populations that have equal means (μ1 and μ2). Under the alternative hypothesis, we assume the two samples (1 and 2) originate from two populations that have different means (and we do not specify this difference).

We first generate some data and use an histogram to represent the distribution. Note that we are using par(mfrow=c(1,2)) to plot both histograms on the same panel (2 columns, 1 row).

set.seed(12) # set random

# data
d1<-rnorm(40,23,6) # create 40 random values that follow a normal distribution with mean 23 and sd 6
d2<-rnorm(50,20,4)  # create 50 random values that follow a normal distribution with mean 20 and sd 4
value<-0.84 # CI level

# display

To calculate confidence intervals for both datasets, I use the function t.test. Note that I use 84% (instead of 95%) confidence intervals to visually identify possible statistical differences (see here).


# caculate CI at 0.84 for the barplot
d1_CI<-t.test(d1,conf.level=value) # CI for d1
d2_CI<-t.test(d2,conf.level=value) # CI for d2

d1_lCI<-d1_CI$[1] # lower 0.84 CI
d1_uCI<-d1_CI$[2] # upper 0.84 CI
d1_mCI<-d1_CI$estimate  # mean value

d2_lCI<-d2_CI$[1] # lower 0.84 CI
d2_uCI<-d2_CI$[2] # upper 0.84 CI
d2_mCI<-d2_CI$estimate  # mean value

I then store all those calculated values (lower and upper limits for the CI and means) in a data.frame.

# create data frame
##   name     mean      lCI      uCI
## 1   T1 22.29939 21.08856 23.51022
## 2   T2 20.15461 19.44643 20.86279

Next, I perform a t-test on those 2 datasets using var.equal = FALSE (default argument). Because the data have unequal variances and unequal sample sizes, a Welch's t-test is more robust than a conventional t-test [Note that a t-test assumes that the data are normally distributed. This is the case here as I sampled the data using rnorm)]. The results are stored in a data.frame (df_test).

# Now perform a real (on 2 datasets) t-test
all_test<-t.test(d1,d2,var.equal = TRUE)  # t.test
# create data frame
df_test <- data.frame(x = 0, 
                      y = (all_test$[2] + all_test$[1]) / 2,
                      ymin = all_test$[1], 
                      ymax = all_test$[2])
Let's make graphs ! The first plot (p1) is a barplot thats shows 0.84 Confidence Intervals for the means. The second plot (p2) represents a 0.95 Confidence Interval for the difference in means. I also draw a dotted line (difference equals 0). There is now a tendency to show those values (REF) as it gives much more information than a single p-value. In particular, this allows to emphasize the sample's sizes (which, in turns, directly affect Statistical Power). I will come back to this in a new blog post.

# plot 1. A barplot with CI @ 0.84
p1<-ggplot(all) +
  geom_bar( aes(x=name, y=mean), stat="identity", color = c("#00AFBB", "#E7B800"), fill= "white", width = 0.4 ,size=1) +
  geom_errorbar( aes(x=name, ymin=lCI, ymax=uCI), width=0.1,  size=1, color = c("#00AFBB", "#E7B800"))+
  annotate("text",  x=Inf, y = Inf, label = paste0("p-value: ",signif(all_test$p.value,2)), vjust=1, hjust=1)+
  labs(x="Treatment", y="PSA Concentration (ng/ml)",xlab=c("T1","T2"))+

# plot 2. This is the CI for the difference in means @ 0.95

  geom_errorbar(aes(x, ymin = ymin, ymax = ymax), width = .05, size =1) +
  geom_point(aes(x, y), size = 4)+
  geom_hline(yintercept = 0, linetype = 3, size = 2) +
  xlim(-.1, .1) +
  labs(y="Difference in estimated means (ng/ml). 0.95 CI")+
    panel.grid.minor.x = element_line(colour = "grey95"),
    panel.grid.minor.y = element_blank(),

To have both plots on a single graph, I am using grid.arrange from the gridExtra library.

# Both plots on the same panel
p<-grid.arrange(p1,p2, ncol=1, nrow=2,  heights=c(2,1))

Here, the p value is smaller than 0.05 and so we null hypothesis is rejected (type-I error (α) assumed at 0.05).