I'm working on a project that involves many different tibbles, all of which have a period variable in the format YYYYMM. Below is an example of what my tibbles look like:

tibble_1 <- tibble::tibble(
    period = c(201901, 201912, 201902, 201903),
    var_1 = rnorm(4),
    var_2 = rnorm(4)
)

But for some operations (e.g. time series plots) it's easier to work with an actual Date variable. So I'm using mutate to transform the period variable into a date as follows:

tibble_1 %>% 
  dplyr::mutate(
    date = lubridate::ymd(stringr::str_c(period, "01"))
  )

Since I will be doing this a lot, and the date transformation is not the only mutation I am going to be doing when calling mutate, I'd like to have a user-defined function that I can call from within the mutate call. Here's my function:

period_to_date <- function() {
  lubridate::ymd(stringr::str_c(period, "01"))
}

Which I would later call like this:

tibble_1 %>% 
  dplyr::mutate(
    date = period_to_date()
  )

Problem is, R can't find the period object (which is not really an object in and of itself, but a column of the tibble).

Error in stri_c(..., sep = sep, collapse = collapse, ignore_null = TRUE) : object 'period' not found

I'm pretty sure I need to define a data mask so that the environment in which period_to_date is executed can look for the object in its parent environment (which should always be the caller environment, since the tibble containing the period variable is not always the same), but I can't seem to figure out how to do it.

2 Answers

4
votes

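The function does not know which column you mean. mutate's data mask only applies to the expressions written inside the mutate() call itself, not to the bodies of functions called from there, so period is not visible inside period_to_date(). Pass period to the function as an argument and use it like this: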

period_to_date <- function(period) {
  lubridate::ymd(stringr::str_c(period, "01"))
  #Can also use
  #as.Date(paste0(period,"01"), "%Y%m%d")
}

tibble_1 %>% 
  dplyr::mutate(date = period_to_date(period))

#  period   var_1  var_2 date      
#   <dbl>   <dbl>  <dbl> <date>    
#1 201901 -0.476  -0.456 2019-01-01
#2 201912 -0.645   1.45  2019-12-01
#3 201902 -0.0939 -0.982 2019-02-01
#4 201903  0.410   0.954 2019-03-01
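
If the goal is to avoid repeating the mutate() call across many tibbles, one option might be to wrap the whole step in a function that takes the data frame and the column. The following is only a sketch: add_date and period_col are hypothetical names that do not appear in the question, and it assumes a dplyr/rlang version that supports the {{ }} operator (rlang 0.4.0 or later).

```r
library(dplyr)

# Plain vector function: YYYYMM values in, Date (first of the month) out
period_to_date <- function(period) {
  as.Date(paste0(period, "01"), format = "%Y%m%d")
}

# Hypothetical wrapper: {{ }} forwards the unquoted column name into
# mutate(), where it is looked up in the data mask of `data`
add_date <- function(data, period_col) {
  data %>%
    mutate(date = period_to_date({{ period_col }}))
}

tibble_1 <- tibble::tibble(
  period = c(201901, 201912, 201902, 201903),
  var_1 = rnorm(4),
  var_2 = rnorm(4)
)

result <- add_date(tibble_1, period)
```

The wrapper works with any tibble and any column name, because the column is resolved against whatever data frame is passed in rather than against the function's own environment.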
1
votes

Consider passing the column itself as an argument to your function:

library(dplyr)


period_to_date <- function(x) {
  lubridate::ymd(stringr::str_c(x, "01"))
}

df <- data.frame(x = 1:3, period = c('201903', '202001', '201511'))

df %>% mutate(p2 = period_to_date(period))
#>   x period         p2
#> 1 1 201903 2019-03-01
#> 2 2 202001 2020-01-01
#> 3 3 201511 2015-11-01

Created on 2020-01-10 by the reprex package (v0.3.0)
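
Because period_to_date only operates on a vector, it should also combine with across() when a data frame has more than one period column. A minimal sketch, assuming dplyr 1.0.0 or later (for across()); the data frame and the column names start_period and end_period are made up for illustration:

```r
library(dplyr)

period_to_date <- function(x) {
  as.Date(paste0(x, "01"), format = "%Y%m%d")
}

# Hypothetical data with two YYYYMM columns
df2 <- data.frame(
  start_period = c(201903, 202001),
  end_period   = c(201904, 202002)
)

# Apply the same conversion to every column ending in "_period",
# writing the results to new columns suffixed with "_date"
df2_dates <- df2 %>%
  mutate(across(ends_with("_period"), period_to_date, .names = "{.col}_date"))
```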