
I am learning to write loops and functions in R, and I still haven't really understood how to do that. Currently, I need to write a loop or a function (I'm not sure which would be better) that produces bootstrap results for several pairs of variables in the same data frame.

The sample dataset looks like this:

df <- read.table(text = "ID A_d B_d C_d D_d E_D f_D chkgp
M1  10  20  60  30  54  33  Treatment
M1  20  50  40  33  31  44  Placebo
M2  40  80  40  23  15  66  Placebo
M2  30  90  40  67  67  66  Treatment
M3  30  10  20  22  89  77  Treatment
M3  40  50  30  44  50  88  Placebo
M4  40  30  40  42  34  99  Treatment
M4  30  40  50  33  60  80  Placebo",header = TRUE, stringsAsFactors = FALSE)

I first computed the Spearman correlation between two of the variables:

k=cor(df$A_d,df$E_D,method="spearman")
k

The result is -0.325407.

Now I have to bootstrap this correlation 5000 times by resampling the data in both variables.

So I used the following code:

fc <- function(d, i){
    d2 <- d[i,]
    return(cor(df$A_d,df$E_D,method="spearman"))
}

With the function fc defined, we can use the boot command, providing our dataset name, our function, and the number of bootstrap samples to be drawn.

I then calculated the bootstrap confidence interval. The goal is 5000 bootstrap replicates, although the call below uses R=500.

library(boot)

# remove set.seed() if you want the results to vary
set.seed(626)
bootcorr <- boot(df, fc, R=500)
bootcorr

I then obtained the confidence interval from the bootstrap replicates:

boot.ci(boot.out = bootcorr, type =c( "perc"))

Result:

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 500 bootstrap replicates

CALL : 
boot.ci(boot.out = bootcorr, type = c("perc"))

Intervals : 
Level     Percentile     
95%   (-0.3254, -0.3254 )  
Calculations and Intervals on Original Scale

I need to write a loop that produces results like the following:

Variable1 Variable2 confidence interval
A_d       E_D        (-0.3254, -0.3254 )  
A_d       f_D
B_d       E_D
B_d       f_D
C_d       E_D
C_d       f_D
D_d       E_D
D_d       f_D

My real dataset contains more than 100 variables, so it is difficult to do this by hand for each pair. I need a way to automate it.

This is probably a dupe. I found these: stackoverflow.com/questions/13112238/… and stackoverflow.com/questions/13486000/pairwise-correlation-table but they are not exactly like your question, so I did not flag it for closing. P.S. The first link reminded me of the psych package. - M--

1 Answer


We can create a vectorized function and use outer():

corpij <- function(i,j,df) {cor(df[,i],df[,j],method="spearman")}
corp <- Vectorize(corpij, vectorize.args=list("i","j"))

outer(2:(ncol(df1)-1),2:(ncol(df1)-1),corp,df1)

#>            [,1]         [,2]         [,3]       [,4]        [,5]
#> [1,]  1.0000000  0.289588955 -0.480042672 0.22663483 -0.32540701
#> [2,]  0.2895890  1.000000000 -0.006379918 0.53614458 -0.35928788
#> [3,] -0.4800427 -0.006379918  1.000000000 0.01913975 -0.13952023
#> [4,]  0.2266348  0.536144578  0.019139754 1.00000000  0.02395253
#> [5,] -0.3254070 -0.359287879 -0.139520230 0.02395253  1.00000000
#> [6,]  0.7680403 -0.120481928 -0.421074589 0.33734940  0.07185758
#>             [,6]
#> [1,]  0.76804027
#> [2,] -0.12048193
#> [3,] -0.42107459
#> [4,]  0.33734940
#> [5,]  0.07185758
#> [6,]  1.00000000
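
If only the pairs listed in the question are needed (A_d to D_d against E_D and f_D), base R's cor() also accepts two sets of columns and returns the cross-correlation matrix, labelled by column name. This is a minimal sketch rather than part of the approach above; the names left_vars, right_vars and cross_cor are placeholders introduced here, and df1 is repeated from this answer's data so the block runs on its own.

```r
# Same data as in this answer, repeated so the example is self-contained.
df1 <- read.table(text = "ID A_d B_d C_d D_d E_D f_D chkgp
M1  10  20  60  30  54  33  Treatment
M1  20  50  40  33  31  44  Placebo
M2  40  80  40  23  15  66  Placebo
M2  30  90  40  67  67  66  Treatment
M3  30  10  20  22  89  77  Treatment
M3  40  50  30  44  50  88  Placebo
M4  40  30  40  42  34  99  Treatment
M4  30  40  50  33  60  80  Placebo",
header = TRUE, stringsAsFactors = FALSE)

# Hypothetical grouping of the columns, following the table in the question.
left_vars  <- c("A_d", "B_d", "C_d", "D_d")
right_vars <- c("E_D", "f_D")

# cor(x, y) correlates every column of x with every column of y.
cross_cor <- cor(df1[, left_vars], df1[, right_vars], method = "spearman")
cross_cor
```

The result has one row per variable in left_vars and one column per variable in right_vars, so cross_cor["A_d", "E_D"] is the same value as entry [1,5] in the outer() output.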

Another approach would be using psych::corr.test():

library(psych)

corr.test(df1[,-c(1,ncol(df1))], method = "spearman")$r
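
Both approaches above give point estimates only. For the bootstrap intervals the question asks for, one possible sketch follows. Note that fc in the question computes the correlation on the global df rather than on the resampled rows d2, so every replicate returns the same value, which is why the reported interval collapses to (-0.3254, -0.3254). The names spearman_stat, xvar, yvar, var_pairs, n_rep and ci_table are placeholders introduced here, not part of the boot package. The percentile limits are taken with quantile() on the replicates as a simplification; boot.ci(b, type = "perc") could be used instead and may differ slightly because it interpolates differently.

```r
library(boot)

df1 <- read.table(text = "ID A_d B_d C_d D_d E_D f_D chkgp
M1  10  20  60  30  54  33  Treatment
M1  20  50  40  33  31  44  Placebo
M2  40  80  40  23  15  66  Placebo
M2  30  90  40  67  67  66  Treatment
M3  30  10  20  22  89  77  Treatment
M3  40  50  30  44  50  88  Placebo
M4  40  30  40  42  34  99  Treatment
M4  30  40  50  33  60  80  Placebo",
header = TRUE, stringsAsFactors = FALSE)

# Statistic for boot(): Spearman correlation on the resampled rows d[i, ].
# xvar and yvar are hypothetical argument names forwarded through boot()'s `...`.
spearman_stat <- function(d, i, xvar, yvar) {
  d2 <- d[i, ]
  # A resample in which one column is constant yields NA (with a warning);
  # the warning is silenced here and the NA is dropped when taking quantiles.
  suppressWarnings(cor(d2[[xvar]], d2[[yvar]], method = "spearman"))
}

# All wanted pairs, in the order of the table in the question.
var_pairs <- expand.grid(Variable2 = c("E_D", "f_D"),
                         Variable1 = c("A_d", "B_d", "C_d", "D_d"),
                         stringsAsFactors = FALSE)

n_rep <- 5000   # number of bootstrap replicates per pair
set.seed(626)   # remove for results that vary between runs

ci_table <- do.call(rbind, lapply(seq_len(nrow(var_pairs)), function(k) {
  b <- boot(df1, spearman_stat, R = n_rep,
            xvar = var_pairs$Variable1[k], yvar = var_pairs$Variable2[k])
  # 95% percentile interval taken directly from the replicates.
  ci <- quantile(b$t, probs = c(0.025, 0.975), na.rm = TRUE)
  data.frame(Variable1 = var_pairs$Variable1[k],
             Variable2 = var_pairs$Variable2[k],
             estimate  = b$t0,
             lower     = ci[[1]],
             upper     = ci[[2]])
}))
ci_table
```

With only eight rows the intervals will be wide. For a dataset with 100+ variables, the two character vectors passed to expand.grid() can be built from names(df1) instead of being typed out.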

Data:

df1 <- read.table(text="ID A_d B_d C_d D_d E_D f_D chkgp
                        M1  10  20  60  30  54  33  Treatment
                        M1  20  50  40  33  31  44  Placebo
                        M2  40  80  40  23  15  66  Placebo
                        M2  30  90  40  67  67  66  Treatment
                        M3  30  10  20  22  89  77  Treatment
                        M3  40  50  30  44  50  88  Placebo
                        M4  40  30  40  42  34  99  Treatment
                        M4  30  40  50  33  60  80  Placebo",
header = TRUE,stringsAsFactors = FALSE)