I am trying to generate my own stimuli for an experiment using R. Below is the code that creates my (x, y) coordinates using rnorm() with different means and SDs. I also create another variable to represent the size of the circles, which is determined by runif().

dt <- data.frame(x = NA,
         y = NA,
         size = NA,
         M = NA, 
         sd = NA,
         col = NA,
         iter = NA)
sa<-0


mySD<-c(5, 15)
myMeans<-c(35, 45)
colors<-c("Blues", "Reds") 

for(i in 1:10){
  for(s in mySD){
    for(m in myMeans){

      x = abs(rnorm(n=1, mean=m, sd=s))
      y = abs(rnorm(n=1, mean=m, sd=s))

      size = runif(1, 1, 25) # select a random circle size between 1 and 25

      sa<-sa+1
      dt[sa,] <- NA
      dt$x[sa]<-x
      dt$y[sa]<-y
      dt$M[sa]<-m
      dt$sd[sa]<-s
      dt$size[sa]<-size
      dt$iter[sa]<-i
    }
  }
}

Next, I want to use ggplot(dt, aes(x, y, size=size)) to plot. I want to randomly select 4 (x, y) values to plot for one graph, then 8 for another, then 16 for another, etc. Basically, I want to plot different graphs with different numbers of data points. For example, some graphs would have 4 data points that vary in size and color, and others would have 32 data points that vary in size and color. I'm not sure how to select a set of unique data points from the data frame that I created. Any help would be great. I'm pretty new to R.

Why is n = 100 in x = abs(rnorm(n=100, mean=m, sd=s)) if you are assigning only one value at a time in dt$x[sa]<-x? - Rui Barradas
Thanks for your reply. I had thought rnorm() selected one random value from the specified sample size. I guess that is only runif(). I'll make that change now, but that still doesn't get at what I'm trying to ultimately do. - sara connor
Do you want each group to not contain points from any other group? - Paul
Thanks for the question, Paul. Yes, I want them to be unique for each group (trial). Would that mean that I should use: x = abs(unique(rnorm(n=1, mean=m, sd=s))) - sara connor

2 Answers

1
votes

Here are two ways, depending on whether you want each group to contain only points that appear in no other group.

I'll use a dummy data frame that has only the columns x, y, and size.

library(tidyverse)

dt <- tibble(x = runif(100), y = runif(100), size = runif(100))

Allowing groups to share the same points

Create a vector for the size of each group.

sample_sizes <- 2^(seq_len(4) + 1)
sample_sizes
#> [1]  4  8 16 32

Randomly sample the data frame and add a group column.

sampled <- map_dfr(
  sample_sizes,
  ~sample_n(dt, .),
  .id = "group"
)

Plot using facets.

ggplot(sampled, aes(x, y, size = size)) +
  geom_point() +
  facet_wrap(~group)
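
For readers who would rather not depend on purrr, the sampling step above can be sketched in base R. This is only an illustration under assumptions: the seed and the names sampled_list and sampled_base are not part of the original answer.

```r
set.seed(1)  # assumed seed, only so this sketch is reproducible

dt <- data.frame(x = runif(100), y = runif(100), size = runif(100))
sample_sizes <- 2^(seq_len(4) + 1)  # 4, 8, 16, 32

# Draw each group independently: rows are unique within a group,
# but the same row may appear in more than one group.
sampled_list <- lapply(seq_along(sample_sizes), function(g) {
  rows <- sample(nrow(dt), sample_sizes[g])
  cbind(dt[rows, ], group = g)
})
sampled_base <- do.call(rbind, sampled_list)

# Number of points per group
table(sampled_base$group)
```

The resulting data frame has the same x, y, size, and group columns, so it should work with the same ggplot call and facet_wrap(~group).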


Requiring groups to have different points

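First, we need a way to generate four 1s, eight 2s, sixteen 3s, and so on. Taking floor(log2(i + 3)) - 1 over the row indices i does this, because the value increases by one each time i + 3 reaches the next power of two.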

groups <- floor(log2(seq_len(nrow(dt)) + 3)) - 1
groups
#>  [1] 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4
#> [36] 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5
#> [71] 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
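
If the log2 expression feels opaque, the same labels can be built more explicitly with rep(). This is a sketch, assuming 100 rows as in the dummy data frame; the names n, sizes, groups_rep, and groups_log are introduced here for illustration only.

```r
n <- 100                # number of rows, matching the dummy data frame
sizes <- 2^(2:6)        # 4, 8, 16, 32, 64; the last group is only partly filled

# Repeat label k sizes[k] times, then keep the first n labels
groups_rep <- rep(seq_along(sizes), times = sizes)[seq_len(n)]

# Compare with the log2 construction
groups_log <- floor(log2(seq_len(n) + 3)) - 1
all(groups_rep == groups_log)
#> [1] TRUE

# Points per group; group 5 holds the 40 rows left over after 4 + 8 + 16 + 32
table(groups_rep)
```

Either vector can be shuffled with sample() in the next step. Note that with 100 rows the fifth group is a leftover of 40 points, not a full group of 64.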

Shuffle this vector and add it as a column.

dt$group <- sample(groups)

Facet using this new column to generate the desired plots.

ggplot(dt, aes(x, y, size = size)) +
  geom_point() +
  facet_wrap(~group)
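
Because the dummy data frame has 100 rows, the shuffled labels include a fifth group made of the 40 leftover points. If only the groups of 4, 8, 16, and 32 are wanted, one option is to drop that group before plotting. A minimal sketch, with an assumed seed and the illustrative name dt_kept:

```r
set.seed(1)  # assumed seed, only so this sketch is reproducible

dt <- data.frame(x = runif(100), y = runif(100), size = runif(100))
groups <- floor(log2(seq_len(nrow(dt)) + 3)) - 1
dt$group <- sample(groups)

# Keep only the four complete groups; every row still belongs to exactly one group
dt_kept <- dt[dt$group <= 4, ]
table(dt_kept$group)
```

Passing dt_kept instead of dt to the ggplot call above should then give four facets.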


1
votes

First of all, the question's data creation code can be greatly simplified and rewritten with no loops at all. R is a vectorized language, and the following creates a data frame with the same structure.

Don't forget to set the RNG seed, in order to make the results reproducible.

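library(ggplot2)

set.seed(2020)    # make the results reproducible

# design values from the question
mySD <- c(5, 15)
myMeans <- c(35, 45)

# one row per (iteration, sd, mean) combination, in the same order as the loops
sd <- rep(rep(mySD, each = 2), 10)
M <- rep(myMeans, 2*10)
x <- abs(rnorm(n = 40, mean = M, sd = sd))    # rnorm is vectorized over mean and sd
y <- abs(rnorm(n = 40, mean = M, sd = sd))
size <- runif(40, 1, 25)
iter <- rep(1:10, each = 4)    # iteration index, as set by the question's outer loop
dt2 <- data.frame(x, y, size, M, sd, iter)
dt2$col <- c("blue", "red")    # recycled: blue where M = 35, red where M = 45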
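
To see why the rep() calls reproduce the loop, the design columns can be compared directly. This is a small check rather than part of the original answer; the names loop_M, loop_sd, vec_M, and vec_sd are introduced here for illustration.

```r
mySD <- c(5, 15)
myMeans <- c(35, 45)

# Design values in the order the question's nested loops visit them
loop_M <- c()
loop_sd <- c()
for (i in 1:10) {
  for (s in mySD) {
    for (m in myMeans) {
      loop_M <- c(loop_M, m)
      loop_sd <- c(loop_sd, s)
    }
  }
}

# Vectorized equivalents
vec_sd <- rep(rep(mySD, each = 2), 10)
vec_M <- rep(myMeans, 2 * 10)

identical(loop_M, vec_M)
#> [1] TRUE
identical(loop_sd, vec_sd)
#> [1] TRUE
```

Since rnorm() accepts vectors for mean and sd, a single call with n = 40 then draws one value per row using that row's mean and standard deviation.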

Now the plots. The following function accepts a data frame X as its first argument and the number of points to draw, n, as its second. It then plots n points chosen at random, colored by col and sized by the continuous variable size.

plot_fun <- function(X, n){
  Colors <- unique(X[["col"]])
  Colors <- setNames(Colors, Colors)
  i <- sample(nrow(X), n)
  g <- ggplot(X[i,], aes(x, y, size = size, color = col)) +
    geom_point() +
    scale_color_manual(values = Colors) +
    theme_bw()
  g
}

plot_fun(dt2, 8)
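
One thing to keep in mind: sample() draws without replacement by default, so asking for more points than the data frame has rows (for example 64 points from the 40 rows of dt2) fails. A hedged sketch of a guard follows; safe_sample_rows is a hypothetical helper, not part of the original answer, and demo is a stand-in data frame.

```r
# Hypothetical helper: sample n distinct rows, with a clearer error message
safe_sample_rows <- function(X, n) {
  if (n > nrow(X)) {
    stop("requested ", n, " points but the data frame has only ", nrow(X), " rows")
  }
  X[sample(nrow(X), n), , drop = FALSE]
}

demo <- data.frame(x = 1:40, y = 40:1)   # stand-in with 40 rows, like dt2

nrow(safe_sample_rows(demo, 32))
#> [1] 32

# Asking for too many points is caught and reported
res <- tryCatch(safe_sample_rows(demo, 64), error = function(e) conditionMessage(e))
res
```

The same check could be placed at the top of plot_fun if larger values of n are planned.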

To plot several values for n, produce the plots with lapply then use grid.arrange from package gridExtra.

plot_list <- lapply(c(4,8,16,32), function(n) plot_fun(dt2, n))
gridExtra::grid.arrange(grobs = plot_list)


Individual plots are still possible with

plot_list[[1]]
plot_list[[2]]

and so on.


Another way is to use faceting. Write another function, plot_fun_facets, that stores the number of points in a new column, n, of each sampled data frame and uses that column as the faceting variable.

plot_fun_facets <- function(X, n){
  Colors <- unique(X[["col"]])
  Colors <- setNames(Colors, Colors)
  X_list <- lapply(n, function(.n){
    i <- sample(nrow(X), .n)
    Y <- X[i,]
    Y$n <- .n
    Y
  })
  X <- do.call(rbind, X_list)
  g <- ggplot(X, aes(x, y, size = size, color = col)) +
    geom_point() +
    scale_color_manual(values = Colors) +
    facet_wrap(~ n) +
    theme_bw()
  g
}

plot_fun_facets(dt2, c(4,8,16,32))
