
[Figure: violin plots generated from the data frame]

The numbers in each column represent the localisation of one signal relative to another inside cell nuclei. There are 3 treatment conditions, each with 7 time points, plus 2 controls, giving a total of 23 columns (see the violin plots). I would like to run a t-test or a Wilcoxon test comparing every column with every other column. I think I have done this before with pairwise.t.test(Chr). However, the function requires a grouping argument, and I would like to group my data by column. I've imported my data:

library(readr)
Chr <- read_csv("Chromocenters-intensity.csv", na = "NA")

Parsed with column specification: cols( .default = col_double() )

Imported dataset into R

And then tried:

 pairwise.t.test(Chr, cols())

Error in order(y) : unimplemented type 'list' in 'orderVector1'

pairwise.wilcox.test(Chr,g=cols(Chr))

Error: Some col_types are not S3 collector objects: 1

I do not understand what the errors mean.

A plain t.test on two columns works fine:

t.test(Chr$S0,Chr$S1)

	Welch Two Sample t-test

data:  Chr$S0 and Chr$S1
t = 0.85955, df = 154.12, p-value = 0.3914
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.920629  4.879370
sample estimates:
mean of x mean of y
100.41579  98.93642

But how do I scale this up to test every column against every other column?

Thank you

2 Answers

0
votes

You could use expand.grid and apply.

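# simulate 23 columns of 470 random values each
data <- as.data.frame(sapply(1:23, function(i) runif(470, 1, 200)))
names(data) <- c(paste0("S", 0:7), paste0("N", 1:7), paste0("P", 1:7), "TKO")
# every ordered pair of column names
pairs <- expand.grid(names(data), names(data))
# run a t-test per row of `pairs`; [[ ]] extracts each column as a numeric vector
result <- data.frame(pairs, p.val = apply(pairs, 1, function(x) t.test(data[[x[1]]], data[[x[2]]])$p.value))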
result
    Var1 Var2          p.val
1     S0   S0 1.000000000000
2     S1   S0 0.573722556263
3     S2   S0 0.874552764274
4     S3   S0 0.467670724537
5     S4   S0 0.700539636188
6     S5   S0 0.736422364244
7     S6   S0 0.599066387580
8     S7   S0 0.940641228509
9     N1   S0 0.727290760056
10    N2   S0 0.057120608982
11    N3   S0 0.523554180769
12    N4   S0 0.485633891380

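Note that expand.grid returns all 23 × 23 = 529 ordered pairs, so each column is also tested against itself (p = 1) and every pair appears twice. Don't forget to correct for multiple testing afterwards.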
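As a rough sketch of that last step (not part of the code above, just one way it could be done): keep each unordered pair once, drop the self-comparisons, and pass the raw p-values to p.adjust. The seed, the Holm method, and the p.adj column name are assumptions made for illustration.

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
data <- as.data.frame(sapply(1:23, function(i) runif(470, 1, 200)))
names(data) <- c(paste0("S", 0:7), paste0("N", 1:7), paste0("P", 1:7), "TKO")

pairs <- expand.grid(Var1 = names(data), Var2 = names(data),
                     stringsAsFactors = FALSE)

# keep each unordered pair once and drop column-vs-itself rows
keep <- match(pairs$Var1, names(data)) < match(pairs$Var2, names(data))
pairs <- pairs[keep, ]

# Welch t-test for every remaining pair
pairs$p.val <- mapply(function(a, b) t.test(data[[a]], data[[b]])$p.value,
                      pairs$Var1, pairs$Var2, USE.NAMES = FALSE)

# Holm correction across all tests (hypothetical choice; see ?p.adjust)
pairs$p.adj <- p.adjust(pairs$p.val, method = "holm")

nrow(pairs)  # choose(23, 2) pairs
```

With 23 columns this leaves choose(23, 2) = 253 tests instead of 529, which also makes the correction less severe.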

0
votes

Hi, welcome to Stack Overflow!

First, make sure you correct the p-values for the multiple tests you are running; ?p.adjust might be helpful.
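To give a feel for what p.adjust does, here is a small sketch on a made-up vector of p-values (the numbers are hypothetical, not from your data). It compares three of the methods listed in ?p.adjust side by side.

```r
# hypothetical raw p-values, for illustration only
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.3)

adj <- data.frame(
  raw        = p_raw,
  bonferroni = p.adjust(p_raw, method = "bonferroni"),
  holm       = p.adjust(p_raw, method = "holm"),
  BH         = p.adjust(p_raw, method = "BH")
)
adj
```

Bonferroni and Holm control the family-wise error rate, while BH controls the false discovery rate, so which one fits depends on how strict you need to be.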

Second, your problem requires all possible pairings of the columns in your data set, which is a job for ?combn:

mtcars_numbers <- dplyr::select_if(mtcars, is.numeric) # simulate some data like you describe

?combn # open the help page
all_pairs <- combn(names(mtcars_numbers), 2, simplify = FALSE)
all_pairs # I wanted this to be a list for lapply, but you can get a matrix with simplify = TRUE

Then you iterate over each pair. I'm using ?lapply here, but you could use a for loop or another function:

lapply(all_pairs,
       function(x) {
         t.test(mtcars_numbers[[x[1]]], mtcars_numbers[[x[2]]])
       })
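As for the errors in the question: pairwise.t.test and pairwise.wilcox.test expect a numeric vector of values plus a grouping vector of the same length, not a whole data frame, and readr's cols() builds a column specification for importing rather than a grouping. That is most likely why both calls fail. A minimal sketch of one way around it, assuming you reshape to long format with stack(); the three mtcars columns are only a stand-in for your 23 columns, and the Holm method is an arbitrary choice:

```r
# hypothetical stand-in for the 23-column data frame
wide <- mtcars[c("mpg", "qsec", "drat")]

# stack() reshapes to long format:
# `values` holds the numbers, `ind` the name of the source column
long <- stack(wide)

# Welch t-tests for every pair of columns (pool.sd = FALSE), Holm-adjusted
tt <- pairwise.t.test(long$values, long$ind,
                      p.adjust.method = "holm", pool.sd = FALSE)
tt$p.value

# Wilcoxon rank-sum tests; exact = FALSE is passed on to wilcox.test
ww <- pairwise.wilcox.test(long$values, long$ind,
                           p.adjust.method = "holm", exact = FALSE)
ww$p.value
```

The p.value element is a lower-triangular matrix of adjusted p-values, one entry per pair of columns, so the correction and the pairing are handled in one call.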