5
votes

Some background: First, I wanted to generate multiple samples (each of size n) from a uniform(0, 1) distribution in R. I know that the command for generating from a uniform distribution is runif(n=x) for some sample size x; e.g., for a sample size of 20 the command would be

runif(n=20)

Next, I used the command

replicate( 100, runif(n=20))

This generated a numeric (double) matrix of values with 20 rows and 100 columns, which I could then convert into a dataset.

Is it possible for me to generate a dataset consisting of the sample means of all the column vectors (the sample means of the 100 sets taken from the uniform distribution)?

Thank you for your help.

4
minor point: in R, they're functions, not commands! - Spacedman

4 Answers

10
votes

You can use colMeans.

data <- replicate(100, runif(n=20))
means <- colMeans(data)
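
Since the question asks for a dataset of the sample means, here is a minimal sketch of wrapping the colMeans() result in a data frame. The names means_df, sample and sample_mean are my own for illustration, and the seed is arbitrary (only there to make the sketch reproducible).

```r
set.seed(1)  # arbitrary seed, assumed only for reproducibility

# 20 x 100 matrix: each column is one sample of size 20 from U(0, 1)
data <- replicate(100, runif(n = 20))

# One row per sample, holding the sample index and that sample's mean
means_df <- data.frame(sample      = seq_len(ncol(data)),
                       sample_mean = colMeans(data))

head(means_df)
```

The result should have 100 rows, one per column of the original matrix.
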
4
votes

Generate data:

data <- replicate(100, runif(n=20))

Means of columns, rows:

col_mean <- apply(data, 2, mean)
row_mean <- apply(data, 1, mean)

Standard deviations of columns, rows:

col_sd   <- apply(data, 2, sd)
row_sd   <- apply(data, 1, sd)
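
As a quick sanity check, the apply() means above should agree with the dedicated colMeans() and rowMeans() functions up to floating-point tolerance. This is only a sketch: the seed is arbitrary, and same_cols and same_rows are names I made up for the comparison.

```r
set.seed(42)  # arbitrary seed, assumed only for reproducibility
data <- replicate(100, runif(n = 20))

# apply() over margin 2 (columns) vs. colMeans()
same_cols <- isTRUE(all.equal(apply(data, 2, mean), colMeans(data)))

# apply() over margin 1 (rows) vs. rowMeans()
same_rows <- isTRUE(all.equal(apply(data, 1, mean), rowMeans(data)))

same_cols
same_rows
```

apply() stays useful for statistics such as sd, which has no built-in column-wise counterpart in base R.
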
2
votes

If I understand correctly: apply(replicate(100,runif(n=20)),2,mean)

2
votes

Building off of Nico's answer, you could instead make a single call to runif(), reshape the result into a matrix, and then take the colMeans of that. In the benchmark below it is about 5.6 times faster than the replicate() version, and it gives results equivalent to the other answers.

library(rbenchmark)
# reasonably fast: 100 separate calls to runif()
f1 <- function() colMeans(replicate(100, runif(20)))
# faster yet: one call to runif(), reshaped into 100 columns
f2 <- function() colMeans(matrix(runif(20 * 100), ncol = 100))

benchmark(f1(), f2(),
          order = "elapsed",
          columns = c("test", "elapsed", "relative"),
          replications = 10000)

#Test results
  test elapsed relative
2 f2()    0.91 1.000000
1 f1()    5.10 5.604396
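
To illustrate the equivalence claim, here is a small sketch that resets the seed before each approach and compares the results. I would expect them to match because both approaches consume the random number stream in the same order and matrix() fills column by column, but treat that as my assumption rather than a documented guarantee. The seed and the names m1, m2, same_draws and same_means are hypothetical.

```r
set.seed(123)  # arbitrary seed
m1 <- replicate(100, runif(20))             # 100 calls of 20 draws each

set.seed(123)  # same seed again, so the stream restarts
m2 <- matrix(runif(20 * 100), ncol = 100)   # one call of 2000 draws

# Compare the raw draws and the resulting column means
same_draws <- isTRUE(all.equal(m1, m2))
same_means <- isTRUE(all.equal(colMeans(m1), colMeans(m2)))

same_draws
same_means
```
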