95
votes

I often have a main R Markdown file or knitr LaTeX file where I source some other R file (e.g., for data processing). However, I was thinking that in some instances it would be beneficial to have these sourced files be their own reproducible documents (e.g., an R Markdown file that not only includes commands for data processing but also produces a reproducible document that explains the data processing decisions).

Thus, I would like to have a command like source('myfile.rmd') in my main R Markdown file that would extract and source all the R code inside the R code chunks of myfile.rmd. Of course, this gives rise to an error, because source() expects plain R code.

The following command works:

```{r message=FALSE, results='hide'}
knit('myfile.rmd', tangle=TRUE)
source('myfile.R')
```

Here, results='hide' can be omitted if the output is desired. With tangle=TRUE, knitr writes the R code from myfile.rmd to myfile.R, which is then sourced.

However, it doesn't seem perfect:

  • It creates an extra file.
  • It needs to appear in its own code chunk if control over the display is required.
  • It's not as elegant as a simple source(...).

Thus my question: Is there a more elegant way of sourcing the R code of an R Markdown file?

11
I'm actually having a really hard time understanding your question (I read it several times). You can source other R scripts easily into an Rmd file. But you also want to source other markdown files into a file being knitted? – Maiasaura
I want to source the R code inside R code chunks in R Markdown files (i.e., *.rmd). I've edited the question a little bit to try to make things clearer. – Jeromy Anglim
Something along the lines of include in LaTeX. If markdown supports inclusion of other markdown documents, it should be relatively easy to create such a function. – Paul Hiemstra
@PaulHiemstra I guess that the ability to source the text and R code chunks would also be useful. I'm specifically thinking of sourcing just the code in an R Markdown document. – Jeromy Anglim

11 Answers

36
votes

It seems you are looking for a one-liner. How about putting this in your .Rprofile?

ksource <- function(x, ...) {
  library(knitr)
  source(purl(x, output = tempfile()), ...)
}

However, I do not understand why you want to source() the code in the Rmd file itself. I mean knit() will run all the code in this document, and if you extract the code and run it in a chunk, all the code will be run twice when you knit() this document (you run yourself inside yourself). The two tasks should be separate.

If you really want to run all the code, RStudio has made this fairly easy: Ctrl + Shift + R. It basically calls purl() and source() behind the scenes.

21
votes

Factor the common code out into a separate R file, and then source that R file into each Rmd file you want it in.

So, for example, let's say I have two reports I need to make: Flu Outbreaks and Guns vs Butter Analysis. Naturally, I'd create two Rmd documents and be done with it.

Now suppose the boss comes along and wants to see the variations of Flu Outbreaks versus Butter prices (controlling for 9mm ammo).

  • Copying and pasting the analysis code from the existing reports into the new report is a bad idea for code reuse, etc.
  • I want it to look nice.

My solution was to factor the project into these files:

  • Flu.Rmd
    • flu_data_import.R
  • Guns_N_Butter.Rmd
    • guns_data_import.R
    • butter_data_import.R

Within each Rmd file I'd have something like:

```{r include=FALSE}
source('flu_data_import.R')
```

The problem here is that we lose reproducibility. My solution to that is to create a common child document to include into each Rmd file. So at the end of every Rmd file I create, I add this:

```{r autodoc, child='autodoc.Rmd', eval=TRUE}
``` 

And, of course, autodoc.Rmd:

Source Data & Code
----------------------------
<div id="accordion-start"></div>

```{r sourcedata, echo=FALSE, results='asis', warning=FALSE}

if(!exists("autodoc.skip.df")) {
  autodoc.skip.df <- list()
}

#Generate the following table:
for (i in ls(.GlobalEnv)) {
  if(!i %in% autodoc.skip.df) {
    itm <- tryCatch(get(i), error=function(e) NA )
    if(typeof(itm)=="list") {
      if(is.data.frame(itm)) {
        cat(sprintf("### %s\n", i))
        print(xtable(itm), type="html", include.rownames=FALSE, html.table.attributes=sprintf("class='exportable' id='%s'", i))
      }
    }
  }
}
```
### Source Code
```{r allsource, echo=FALSE, results='asis', warning=FALSE, cache=FALSE}
fns <- unique(c(compact(llply(.data=llply(.data=ls(all.names=TRUE), .fun=function(x) {a<-get(x); c(normalizePath(getSrcDirectory(a)),getSrcFilename(a))}), .fun=function(x) { if(length(x)>0) { x } } )), llply(names(sourced), function(x) c(normalizePath(dirname(x)), basename(x)))))

for (itm in fns) {
  cat(sprintf("#### %s\n", itm[2]))
  cat("\n```{r eval=FALSE}\n")
  cat(paste(tryCatch(readLines(file.path(itm[1], itm[2])), error=function(e) sprintf("Could not read source file named %s", file.path(itm[1], itm[2]))), sep="\n", collapse="\n"))
  cat("\n```\n")
}
```
<div id="accordion-stop"></div>
<script type="text/javascript">
```{r jqueryinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(url("http://code.jquery.com/jquery-1.9.1.min.js")), sep="\n")
```
</script>
<script type="text/javascript">
```{r tablesorterinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(url("http://tablesorter.com/__jquery.tablesorter.js")), sep="\n")
```
</script>
<script type="text/javascript">
```{r jqueryuiinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(url("http://code.jquery.com/ui/1.10.2/jquery-ui.min.js")), sep="\n")
```
</script>
<script type="text/javascript">
```{r table2csvinclude, echo=FALSE, results='asis', warning=FALSE}
cat(readLines(file.path(jspath, "table2csv.js")), sep="\n")
```
</script>
<script type="text/javascript">
  $(document).ready(function() {
  $('tr').has('th').wrap('<thead></thead>');
  $('table').each(function() { $('thead', this).prependTo(this); } );
  $('table').addClass('tablesorter');$('table').tablesorter();});
  //need to put this before the accordion stuff because the panels being hidden makes table2csv return null data
  $('table.exportable').each(function() {$(this).after('<a download="' + $(this).attr('id') + '.csv" href="data:application/csv;charset=utf-8,'+encodeURIComponent($(this).table2CSV({delivery:'value'}))+'">Download '+$(this).attr('id')+'</a>')});
  $('#accordion-start').nextUntil('#accordion-stop').wrapAll("<div id='accordion'></div>");
  $('#accordion > h3').each(function() { $(this).nextUntil('h3').wrapAll("<div>"); });
  $( '#accordion' ).accordion({ heightStyle: "content", collapsible: true, active: false });
</script>

N.B., this is designed for the Rmd -> HTML workflow. It will be an ugly mess if you go with LaTeX or anything else. This Rmd document looks through the global environment for all the source()'ed files and includes their source at the end of your document. It includes jQuery UI and tablesorter, and sets the document up to use an accordion style to show/hide sourced files. It's a work in progress, but feel free to adapt it to your own uses.

Not a one-liner, I know. Hope it gives you some ideas at least :)

6
votes

Try the purl function from knitr:

source(knitr::purl("myfile.rmd", quiet=TRUE))

4
votes

One should probably start thinking differently. My suggestion is the following: write all the code you would normally have put in an .Rmd chunk into an .R file. In the Rmd document that you knit (e.g., to HTML), you are then left with only

```{R Chunkname, Chunkoptions}  
source("file.R")  
```

This way you'll probably create a bunch of .R files, and you lose the advantage of processing all the code "chunk after chunk" using Ctrl+Alt+N (or +C, but normally this does not work). However, I read the book about reproducible research by Mr. Gandrud and realized that he uses knitr and .Rmd files solely for creating HTML files. The main analysis itself is an .R file. I think .Rmd documents rapidly grow too large if you start doing your whole analysis inside them.

3
votes

If you are just after the code I think something along these lines should work:

  1. Read the markdown/R file with readLines
  2. Use grep to find the code chunks by searching for the chunk delimiters (e.g., lines that start with << in a knitr LaTeX file, or with three backticks in R Markdown)
  3. Take subset of the object that contains the original lines to get only the code
  4. Dump this to a temporary file using writeLines
  5. Source this file into your R session

Wrapping this in a function should give you what you need.
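
The steps above can be sketched in base R roughly as follows. This is a minimal sketch, not a definitive implementation: source_rmd_chunks is a hypothetical helper name, it assumes R Markdown chunks that open with a ```{r ...} line and close with a bare ``` line, and it ignores chunk options such as eval=FALSE.

```r
# Hypothetical helper: extract the code inside R chunks and source it.
source_rmd_chunks <- function(path, envir = globalenv()) {
  # Step 1: read the R Markdown file
  lines <- readLines(path, warn = FALSE)

  # Step 2: locate chunk headers (```{r ...}) and bare closing fences (```)
  starts <- grep("^[[:space:]]*```+[[:space:]]*\\{[rR][ ,}]", lines)
  fences <- grep("^[[:space:]]*```+[[:space:]]*$", lines)

  # Step 3: keep only the lines between each header and its closing fence
  code <- character(0)
  for (s in starts) {
    end <- fences[fences > s][1]
    if (is.na(end)) stop("Unclosed chunk starting at line ", s)
    if (end > s + 1) code <- c(code, lines[(s + 1):(end - 1)])
  }

  # Step 4: dump the code to a temporary file (removed on exit)
  tmp <- tempfile(fileext = ".R")
  on.exit(unlink(tmp))
  writeLines(code, tmp)

  # Step 5: source it into the requested environment
  source(tmp, local = envir)
  invisible(code)
}
```

Calling source_rmd_chunks("myfile.rmd") would then run the chunk code in the global environment. Passing a fresh environment via envir keeps the sourced objects separate from the rest of the session.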

3
votes

The following hack worked fine for me:

library(readr)
library(stringr)
source_rmd <- function(file_path) {
  stopifnot(is.character(file_path) && length(file_path) == 1)
  .tmpfile <- tempfile(fileext = ".R")
  .con <- file(.tmpfile) 
  on.exit(close(.con))
  full_rmd <- read_file(file_path)
  codes <- str_match_all(string = full_rmd, pattern = "```(?s)\\{r[^{}]*\\}\\s*\\n(.*?)```")
  stopifnot(length(codes) == 1 && ncol(codes[[1]]) == 2)
  codes <- paste(codes[[1]][, 2], collapse = "\n")
  writeLines(codes, .con)
  flush(.con)
  cat(sprintf("R code extracted to tempfile: %s\nSourcing tempfile...", .tmpfile))
  source(.tmpfile)
}
2
votes

I use the following custom function

source_rmd <- function(rmd_file){
  knitr::knit(rmd_file, output = tempfile())
}

source_rmd("munge_script.Rmd")
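
One thing to watch with a wrapper like this: knitr::knit() evaluates chunks in its envir argument, which defaults to parent.frame(), so when knit() is called from inside a function the objects may end up in that function's frame and not in your workspace. A hedged variant that makes the target environment explicit could look like this (source_rmd_env is a hypothetical name, and the knitr package is assumed to be installed):

```r
# Hypothetical variant: knit the file and keep the created objects
# in an environment chosen by the caller (global environment by default).
source_rmd_env <- function(rmd_file, envir = globalenv()) {
  knitr::knit(rmd_file, output = tempfile(), quiet = TRUE, envir = envir)
  invisible(envir)
}
```

For example, source_rmd_env("munge_script.Rmd") should leave the objects created by the chunks in the global environment.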
1
votes

I would recommend keeping the main analysis and calculation code in an .R file and importing the chunks as needed in the .Rmd file. I have explained the process here.

1
votes

sys.source("./your_script_file_name.R", envir = knitr::knit_global())

Put this command before calling the functions contained in your_script_file_name.R.

The "./" in front of your_script_file_name.R indicates that the file is located in the current directory, which is the project directory if you have already created a Project.

You can see this link for more detail: https://bookdown.org/yihui/rmarkdown-cookbook/source-script.html
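
To illustrate what the envir argument does, here is a small self-contained sketch using only base R. The temporary script and the add_one function are made up for illustration. Inside a knitted document you would pass knitr::knit_global() as above, not a fresh environment.

```r
# Create a throwaway script that defines one function (illustration only)
script <- tempfile(fileext = ".R")
writeLines("add_one <- function(x) x + 1", script)

# Evaluate the script in a dedicated environment instead of the workspace
helpers <- new.env()
sys.source(script, envir = helpers)

# The function is available through that environment
result <- helpers$add_one(1)
```

Everything the script defines lives in helpers, so it cannot silently overwrite objects that the main document has already created.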

1
votes

I use this one-liner:

```{r optional_chunklabel_for_yourfile_rmd, child = 'yourfile.Rmd'}
```

See: My .Rmd file becomes very lengthy. Is that possible split it and source() it's smaller portions from main .Rmd?

0
votes

This worked for me:

source("myfile.r", echo = TRUE, keep.source = TRUE)