3
votes

I'm using R Markdown to generate a document programmatically, using results = 'asis' with purrr::map. Each purrr iteration outputs multiple plots from the chunk. Most of them are the same size, which can be set with the chunk options for figure size. However, one or two need a different size. I can't split the code into separate chunks because of the way the loop/map is set up.

The closest thing I've found is http://michaeljw.com/blog/post/subchunkify/, but when I use it on the plot that needs different sizing, the plots that were output with print() during the first iteration are reused in the positions where the subchunkify() plots should appear.

Is there a different, less hacky way to do this? Or is there something obvious in the subchunkify code that would be fixable?

Here is subchunkify():

subchunkify <- function(g, fig_height=7, fig_width=5) {
  g_deparsed <- paste0(deparse(
    function() {g}
  ), collapse = '')

  sub_chunk <- paste0("
  `","``{r sub_chunk_", floor(runif(1) * 10000), ", fig.height=", fig_height, ", fig.width=", fig_width, ", echo=FALSE}",
  "\n(", 
    g_deparsed
    , ")()",
  "\n`","``
  ")

  cat(knitr::knit(text = knitr::knit_expand(text = sub_chunk), quiet = TRUE))
}

3 Answers

3
votes

You can create a list of all of the specs for the plots, then use purrr::pwalk to call subchunkify() once per plot (length-1 entries such as fig_height are recycled):

```{r, echo = FALSE, results = 'asis'}
library(ggplot2)
library(purrr)
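# Three placeholder plots; assumes subchunkify() is already defined in an earlier chunk
plots <- map(1:3, ~ggplot(mtcars, aes(wt, mpg)) + geom_point())
# The unnamed first element is passed as `g`; the length-1 fig_height is recycled
specs <- list(plots, fig_height = 1.5, fig_width = list(2, 3, 4))
pwalk(specs, subchunkify)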
```

1
votes

I haven't found an alternative to subchunkify(), but I did solve the issue of it reusing the same plots on each loop iteration (though I haven't yet dug into why it was happening).

I added an id argument to subchunkify() and included it in the sub-chunk label, which knitr also uses to name the figure files. Within my loop/map, I then built an id value from a combination of variables that is unique to each iteration.
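A minimal sketch of building such an id, assuming the loop runs over a grid of two hypothetical variables (`cyl` and `plot_type` are illustrative and not from the original post). The helper `make_id()` is also hypothetical; it replaces anything other than letters and digits with a dash, which is an assumption about what is safe in a chunk label rather than something the post states.

```r
# Hypothetical iteration variables; replace with whatever your loop varies over.
iterations <- expand.grid(
  cyl = c(4, 6, 8),
  plot_type = c("scatter", "hist"),
  stringsAsFactors = FALSE
)

# Combine the variables into one id per iteration and replace anything that is
# not a letter or digit with a dash, to keep the chunk labels simple.
make_id <- function(...) {
  gsub("[^A-Za-z0-9]+", "-", paste(..., sep = "-"))
}

ids <- make_id(iterations$cyl, iterations$plot_type)

# Fail early if two iterations would end up with the same label.
stopifnot(anyDuplicated(ids) == 0)
```

Inside the loop, each value would then be passed along as `subchunkify(p, id = ids[i])`, where `p` is the plot for iteration `i`.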

subchunkify <- function(g, fig_height=7, fig_width=5, id = NULL) {
  g_deparsed <- paste0(deparse(
    function() {g}
  ), collapse = '')

  sub_chunk <- paste0("
  `","``{r sub_chunk_", id, "_", floor(runif(1) * 10000), ", fig.height=", fig_height, ", fig.width=", fig_width, ", echo=FALSE}",
  "\n(", 
    g_deparsed
    , ")()",
  "\n`","``
  ")

  cat(knitr::knit(text = knitr::knit_expand(text = sub_chunk), quiet = TRUE))
}

I'm not sure why the runif() in subchunkify() failed to produce distinct file names on each iteration. My suspicion is that it has something to do with how knitr caching works. I noticed that if a later iteration of my loop went through the same conditional chain to produce graph A, then graph A would be reused everywhere that conditional chain matched. If an iteration went down a different conditional branch to produce graph B, it would correctly generate a new graph (but graph B would then be reused wherever the same branch was taken).

This still doesn't explain why introducing a unique file name with id works but runif() doesn't, since the file name should be unique in both cases, so this is only a guess.

So if anyone else is having this problem, I have a solution here but not an explanation. Very unsatisfying, but it does the trick!

1
votes

This might be too late, but I want to share how I hacked around the plot-reuse issue with subchunkify().

The main idea of subchunkify() is to embed the plot in a pseudo sub-chunk. Each pseudo sub-chunk needs a unique name so that it is referenced correctly when the final document is knitted. subchunkify() uses a random number generator, runif(), to produce that unique name, which works most of the time but can fail inside loops or complicated markdown blocks.

Based on my observations, the cause of the plot-reuse issue is a locked random number seed. I suspect that the knitting process inadvertently fixes the seed (as set.seed() would) in complicated markdown structures, so runif() returns the same sequence of numbers and the same plot ends up referenced in multiple locations.

Adding an id suffix definitely fixes the problem, because it keeps the sub-chunk names unique. Another hacky approach is to unlock the random number seed every time subchunkify() is called:
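The mechanism suspected here is easy to reproduce outside knitr: whenever the seed is reset to the same value, `runif()` returns the same number, so a label built from it repeats. The sketch below only illustrates that mechanism. It does not show that knitr actually resets the seed in any given document, and `random_label()` is a hypothetical helper that mimics the label construction in subchunkify().

```r
# Build a label the same way subchunkify() does.
random_label <- function() paste0("sub_chunk_", floor(runif(1) * 10000))

set.seed(42)
first <- random_label()

set.seed(42)  # same seed again, for example restored by some other code
second <- random_label()

identical(first, second)  # TRUE: both calls produce the same label
```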

subchunkify <- function(g, fig_height=7, fig_width=5, id = NULL) {
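  # Remove any locked seed so runif() is re-seeded; the guard avoids a warning when no seed exists yet
  if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) rm(.Random.seed, envir = globalenv())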
  g_deparsed <- paste0(deparse(function() {g}), collapse = '')
  sub_chunk <- paste0("
  `","``{r sub_chunk_", id, "_", floor(runif(1) * 100000), ", fig.height=", fig_height, ", fig.width=", fig_width, ", echo=FALSE}",
    "\n(", 
  g_deparsed
  , ")()",
  "\n`","``
  ")

  cat(knitr::knit(text = knitr::knit_expand(text = sub_chunk), quiet = TRUE))
}

The only change is the added rm(.Random.seed, envir=globalenv()) call. For me, this quick fix works like a charm.
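If the goal is only to get distinct sub-chunk names, a counter avoids the random number generator altogether. This is a different technique from the seed reset above, sketched under the assumption that unique labels are all that is needed; `next_label()` is a hypothetical helper and has not been tested against the original document.

```r
# Hypothetical helper: returns "sub_chunk_1", "sub_chunk_2", ... on successive calls.
# The counter lives in the enclosing environment created by local().
next_label <- local({
  counter <- 0L  # integer, so large counts are never formatted as 1e+05
  function(prefix = "sub_chunk") {
    counter <<- counter + 1L
    paste0(prefix, "_", counter)
  }
})

labels <- vapply(1:5, function(i) next_label(), character(1))
labels
```

In subchunkify(), the `floor(runif(1) * 100000)` part of the label could then be swapped for a call to `next_label()`.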

By the way, I would also recommend using more digits in the random number. In a long Rmd file, runif() can produce the same number twice by chance. With floor(runif(1) * 10000) in a report with 50 embedded plots, there is roughly an 11.5% chance of at least one collision (a birthday-problem calculation). Use floor(runif(1) * 1000000) instead of floor(runif(1) * 10000) to reduce the chance of an accidental collision; the code above uses 100000, which already lowers the risk considerably.
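The collision risk can be checked with the standard birthday-problem formula. This sketch assumes each label is drawn independently and uniformly from the possible values, which is what `floor(runif(1) * k)` gives when the seed is not locked; `collision_prob()` is a hypothetical helper name.

```r
# Probability that at least two of n labels drawn uniformly from n_values
# possible values coincide (the birthday problem).
collision_prob <- function(n, n_values) {
  1 - prod((n_values - seq_len(n) + 1) / n_values)
}

p_1e4 <- collision_prob(50, 1e4)  # floor(runif(1) * 10000)
p_1e5 <- collision_prob(50, 1e5)  # floor(runif(1) * 100000)
p_1e6 <- collision_prob(50, 1e6)  # floor(runif(1) * 1000000)

round(c(p_1e4, p_1e5, p_1e6), 4)
```

For 50 plots this comes out at roughly 11.5%, 1.2%, and 0.12% respectively, so each extra digit cuts the risk by about a factor of ten.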