1
votes

When working on an R Markdown (.Rmd) document, can I prevent knitr from downloading a file each time the document is knitted?

My code chunk is:

system.time(
  download.file(url = paste('https://d396qusza40orc.cloudfront.net/',
                            'repdata/data/StormData.csv.bz2',
                            sep = ''),
                destfile = './storm.csv.bz2',
                method = 'curl'))

The system time of the chunk isn't that significant in and of itself:

user    system   elapsed 
0.893   1.139    28.825 

But perhaps there's a way to cache the download so I can review the rendered HTML more quickly.

2
You could use knitr caching, for starters. If you want it to be more permanent even when the cache folder is deleted (or for any other reason don't want to use caching), you could put the download in an if statement, like if (!file.exists('./storm.csv.bz2')) { - David Robinson
Very useful reference, thanks David. - RDJ
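
The caching suggestion in the comment above could look roughly like the following. This is a minimal sketch, assuming the code runs in a setup chunk near the top of the .Rmd file; enabling the cache globally is a choice made here for illustration, not something the comment specifies.

```r
# Minimal sketch: enable knitr's cache for all chunks from a setup chunk.
# To cache only the download, put cache=TRUE in that chunk's header instead.
library(knitr)
opts_chunk$set(cache = TRUE)
```

With the cache on, knitr skips a chunk whose code and options have not changed since the last knit, so the download is not repeated. Deleting the cache folder triggers the download again, which is the limitation the comment points out.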

2 Answers

5
votes

You need to check if the file exists before attempting to download.

    destfile <- './storm.csv.bz2'
    if (!file.exists(destfile)) {
      # your download code goes here, e.g. the download.file() call from the question
    }
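
The existence check in this answer can be wrapped in a small reusable function. The sketch below is one possible shape: `download_once` is a hypothetical helper name (not part of base R or the original answer), and the `mode = "wb"` argument is an added assumption to keep the compressed file intact on Windows.

```r
# Hypothetical helper (not from the original answer): download only when the
# destination file is missing. Returns TRUE if a download ran, FALSE if skipped.
download_once <- function(url, destfile, ...) {
  if (file.exists(destfile)) {
    return(invisible(FALSE))
  }
  # mode = "wb" writes the file in binary mode; extra arguments such as
  # method = "curl" are passed through to download.file()
  download.file(url = url, destfile = destfile, mode = "wb", ...)
  invisible(TRUE)
}

# Demonstration without network access: the file already exists,
# so download.file() is never called and the placeholder URL is not requested.
existing <- tempfile(fileext = ".csv.bz2")
file.create(existing)
downloaded <- download_once("https://example.invalid/never-requested", existing)
```

In the Rmd chunk from the question, the call would be download_once(url, './storm.csv.bz2', method = 'curl'), with url built by the same paste() expression.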
3
votes

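Use GET and write_disk from httr: if destfile already exists, write_disk raises an error before GET performs the download, and try() absorbs that error, so the call acts like a mini-cache. GET also uses libcurl under the covers (through RCurl in older httr releases, through the curl package since httr 1.0).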

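library(httr)

url <- 'https://d396qusza40orc.cloudfront.net/repdata/data/StormData.csv.bz2'
destfile <- './storm.csv.bz2'
# write_disk() errors if destfile exists, so the download is skipped; try() absorbs the error
try(GET(url, write_disk(destfile)))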
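
The mini-cache behaviour comes from write_disk's default of overwrite = FALSE. The sketch below tries to show that in isolation, without any network request; the temporary file and the variable names are illustrative assumptions, and it relies on write_disk checking for the file at call time.

```r
library(httr)

# Create an empty file so that the destination already exists.
existing <- tempfile()
file.create(existing)

# With the default overwrite = FALSE, write_disk() stops with an error
# because the path exists; try() captures that error as a "try-error" object.
result <- try(write_disk(existing), silent = TRUE)
refused <- inherits(result, "try-error")
```

One design caveat: try() hides every error, including genuine network failures, so an explicit file.exists() check may be easier to debug than relying on the write_disk error.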