What is the best way to vectorize an R function with arguments that accept vectors, static values, and NULLs? I'm running into a problem when I Map() a function whose arguments are sometimes supplied as NULL. I get the following error message (replicated using the code below):

Error in mapply(FUN = f, ..., SIMPLIFY = FALSE) : zero-length inputs cannot be mixed with those of non-zero length

To replicate this problem, I've written a function that returns n simulated values using parameters estimated from data, with optional min and max values that truncate the result.

#' foo (example function with some args defaulting to NULL)
#'
#' Returns simulated normal values using population parameters from data
#' 
#' @param data Numeric vector used to calculate population parameters
#' @param n Number of simulated data points to return
#' @param min Optional. Creates a truncation effect. Simulated values
#'   below min will be replaced with min.
#' @param max Optional. Creates a truncation effect. Simulated values
#'   above max will be replaced with max.
#' @return Numeric vector of simulated values.
foo <- function(data, n, min = NULL, max = NULL) {
  x <- rnorm(n, mean(data), sd(data))
  if (!is.null(min)) {
    x[x < min] <- min
  }
  if (!is.null(max)) {
    x[x > max] <- max
  }
  x
}

I'm working with lists and would like the function to return lists. So, here, the data vector is a list of numeric vectors.

## data vector
data <- replicate(5, rnorm(3), simplify = FALSE)

Other arguments can accept static (length(x) == 1) or dynamic values (length(x) == length(data)). When non-NULL values are supplied, it works whether args are given one or multiple values.

## static args (this works)
n <- 10
min <- -1.96
max <- 1.96
Map(foo, data, n, min, max)

## vector args (this works)
n <- sample(2:100, 5)
min <- runif(5, -4, -1)
max <- runif(5, 1, 4)
Map(foo, data, n, min, max)

But when any of these arguments is NULL, it breaks.

## null args (this doesn't work)
n <- sample(2:100, 5)
min <- NULL
max <- NULL
Map(foo, data, n, min, max)

## it fails whether n is a vector or a single value
n <- 10
min <- NULL
max <- NULL
Map(foo, data, n, min, max)


Error in mapply(FUN = f, ..., SIMPLIFY = FALSE) : 
  zero-length inputs cannot be mixed with those of non-zero length
1
Do you want to use the mean of each vector in your input list? Or the mean of all vectors (aggregate) in your input list? - CPak
This is not the actual function I'm trying to fix, but the equivalent would be to assume I want to use each vector independently. - mkearney
Since in your function, you declare min=NULL and max=NULL, you can get away with not passing min and max in your call. Using Map(foo,data,10) works for me. - CPak
This is to be used as a function so sometimes it will be given NULLs sometimes not. I realize I could create a different Map() call for each permutation that doesn't include a potentially NULL value, but the actual function in question has several args defaulting to NULL, so I'd rather have it done in one shot. - mkearney
Pass Map(foo,data,10,NA,NA) instead? min <- NA and max <- NA... - CPak

1 Answer

I think the code you're looking for is

n <- sample(2:100, 5)
min <- list(NULL)
max <- list(NULL)
Map(foo, data, n, min, max)

The Map function expects each argument after the function to be a vector or list of arguments, which will be recycled to the length of the longest one. A bare NULL has length 0, so it cannot be recycled alongside inputs of length 5, whereas list(NULL) has length 1. So in this case, we have length(data) and length(n) equal to 5 and length(min) and length(max) equal to 1, so the single NULL in the min and max lists is recycled 5 times and passed to the function each time.
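
A quick check of the lengths involved may make the recycling easier to see. This is only a sketch in base R; `f` is a throwaway function invented for the illustration and is not part of your code.

```r
# A bare NULL has length 0; a list holding one NULL has length 1
len_null    <- length(NULL)
len_wrapped <- length(list(NULL))

# Throwaway function for illustration: reports whether its second
# argument arrived as NULL
f <- function(x, y) is.null(y)

# The one-element list is recycled, so each of the three calls
# receives NULL as its second argument
res <- Map(f, 1:3, list(NULL))
```

Here `res` is a list with one element per value of `1:3`, each recording that `y` was NULL inside the call.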

Alternatively, if you want to do an "apply"-like operation where some arguments are vectors and others are scalars (i.e. single values to be passed to every call of the function), use mapply, passing the vector args directly and the scalar args inside MoreArgs. Since you want a list back, also set SIMPLIFY = FALSE; by default mapply tries to simplify the result to a vector or matrix when it can:

n <- sample(2:100, 5)
min <- NULL
max <- NULL
mapply(foo, data, n, MoreArgs=list(min, max), SIMPLIFY=FALSE)

(Also, I haven't done so here for consistency with your code, but you should almost always pass arguments to apply-type functions with names, e.g. MoreArgs=list(min=min, max=max).)
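
If the arguments are sometimes NULL and sometimes not, one possible approach is to wrap them just before the Map call. The sketch below assumes a small helper, `wrap_null`, which is a hypothetical name of my own and not a base R function; `foo` is copied from the question so the block runs on its own.

```r
# foo() as defined in the question
foo <- function(data, n, min = NULL, max = NULL) {
  x <- rnorm(n, mean(data), sd(data))
  if (!is.null(min)) x[x < min] <- min
  if (!is.null(max)) x[x > max] <- max
  x
}

# Hypothetical helper (not in base R): turn a bare NULL into list(NULL)
# so it has length 1 and can be recycled; leave anything else untouched
wrap_null <- function(x) if (is.null(x)) list(NULL) else x

data <- replicate(5, rnorm(3), simplify = FALSE)
n    <- sample(2:100, 5)
min  <- NULL   # may or may not be NULL at run time
max  <- 1.96

# Named arguments, with any NULL wrapped before Map sees it
out <- Map(foo, data = data, n = n,
           min = wrap_null(min), max = wrap_null(max))
```

The same call should then work whether min and max are NULL, single values, or vectors as long as data, without writing a separate Map() call for each combination.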