439
votes

I often find myself writing R scripts that generate a lot of output. I find it cleaner to put this output into its own directory(s). What I've written below checks whether a directory exists and moves into it, or creates the directory and then moves into it. Is there a better way to approach this?

mainDir <- "c:/path/to/main/dir"
subDir <- "outputDirectory"

if (file.exists(subDir)){
    setwd(file.path(mainDir, subDir))
} else {
    dir.create(file.path(mainDir, subDir))
    setwd(file.path(mainDir, subDir))

}
9
I'm sure I've seen an R function that creates a temporary directory with a randomly generated name and returns the name. I think there's a similar one that creates a temp file. I can't find them offhand, but the DatABEL package (cran.r-project.org/web/packages/DatABEL/index.html) has a function get_temporary_file_name. - PaulHurleyuk
You should never use setwd() in R code - it basically defeats the idea of using a working directory because you can no longer easily move your code between computers. - hadley
@hadley interesting topic to ponder, I'd appreciate your thoughts on other methods to the same end. At work, all computers are sync'd to the same network so file paths are consistent. If they aren't, we have bigger issues to deal with than portability of a script. In this particular example, I was writing a script that would be loaded on a machine that will be carried around our national parks for 2 years. This script will grab data from a local SQL instance, do some processing, and spit out a .csv. The end product will be a .bat file that the end user will never have to modify. - Chase
@Marek - ahh, I see. So you're saying I should replace my calls to setwd() with something like write.table(file = "path/to/output/directory", ...)? - Chase
Yep. Or parametrize out_dir <- "path/to/output/directory" and then use write.table(file = file.path(out_dir,"table_1.csv"), ...). Or even out_file <- function(fnm) file.path("path/to/output/directory", fnm) and then write.table(file = out_file("table_1.csv"), ...) (similar method I use when working with network drives). - Marek

9 Answers

440
votes

Use showWarnings = FALSE:

dir.create(file.path(mainDir, subDir), showWarnings = FALSE)
setwd(file.path(mainDir, subDir))

dir.create() does not crash if the directory already exists; it just prints a warning. So if you can live with seeing warnings, there is no problem with just doing this:

dir.create(file.path(mainDir, subDir))
setwd(file.path(mainDir, subDir))
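
As a rough illustration of the behaviour described above, the sketch below calls dir.create() twice on the same path and inspects the (invisible) logical return value. The path under tempdir() and the name out_dir are stand-ins chosen for this example, not part of the original answer.

```r
# Sketch: dir.create() reports success through its invisible return value.
# out_dir is a hypothetical stand-in for file.path(mainDir, subDir).
out_dir <- file.path(tempdir(), "outputDirectory_demo")
unlink(out_dir, recursive = TRUE)  # start from a clean state

first  <- dir.create(out_dir, showWarnings = FALSE)  # directory gets created
second <- dir.create(out_dir, showWarnings = FALSE)  # already exists, warning suppressed

print(first)
print(second)
print(dir.exists(out_dir))
```

Because FALSE is also what comes back for real failures such as missing permissions, checking dir.exists() afterwards is one way to confirm that the directory is actually there.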
174
votes

As of April 16, 2015, with the release of R 3.2.0 there's a new function called dir.exists(). To use this function and create the directory if it doesn't exist, you can use:

ifelse(!dir.exists(file.path(mainDir, subDir)), dir.create(file.path(mainDir, subDir)), FALSE)

This will return FALSE if the directory already exists or cannot be created, and TRUE if it didn't exist but was successfully created.

Note that to simply check if the directory exists you can use

dir.exists(file.path(mainDir, subDir))
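
Since ifelse() is vectorised and mainly intended for element-wise selection, the same logic can also be written with a plain if. The helper name ensure_dir and the demo path below are hypothetical, introduced only for this sketch; it assumes R 3.2.0 or later for dir.exists().

```r
# Sketch: returns TRUE if the directory was created by this call,
# FALSE if it already existed or could not be created.
ensure_dir <- function(path) {
  if (dir.exists(path)) {
    return(FALSE)  # nothing to do
  }
  created <- dir.create(path)
  created
}

demo_dir <- file.path(tempdir(), "ensure_dir_demo")
unlink(demo_dir, recursive = TRUE)  # start from a clean state

first_call  <- ensure_dir(demo_dir)  # creates the directory
second_call <- ensure_dir(demo_dir)  # finds it already there
```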
21
votes

Here's a simple check that creates the directory if it doesn't exist:

## main_dir and sub_dir are assumed to be defined already;
## sub_dir is the directory you want to create under main_dir:
output_dir <- file.path(main_dir, sub_dir)

if (!dir.exists(output_dir)) {
    dir.create(output_dir)
} else {
    print("Dir already exists!")
}
17
votes

In terms of general architecture I would recommend the following structure with regard to directory creation. This will cover most potential issues and any other issues with directory creation will be detected by the dir.create call.

mainDir <- "~"
subDir <- "outputDirectory"
outDir <- file.path(mainDir, subDir)

# dir.exists() requires R >= 3.2.0 and avoids relying on a trailing slash,
# which file.exists() does not handle consistently across platforms.
if (dir.exists(outDir)) {
    cat("subDir exists in mainDir and is a directory\n")
} else if (file.exists(outDir)) {
    cat("subDir exists in mainDir but is a file\n")
    # you will probably want to handle this separately
} else {
    cat("subDir does not exist in mainDir - creating\n")
    dir.create(outDir)
}

if (dir.exists(outDir)) {
    # By this point, the directory either existed or has been successfully created
    setwd(outDir)
} else {
    cat("subDir does not exist\n")
    # Handle this error as appropriate
}

Also be aware that if ~/foo doesn't exist then a call to dir.create('~/foo/bar') will fail unless you specify recursive = TRUE.
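
A small sketch of that last point, using a throwaway path under tempdir() in place of ~/foo/bar (the names nested, ok_plain, and ok_rec are made up for this example):

```r
# Sketch: creating a nested directory whose parent does not exist yet.
parent <- file.path(tempdir(), "foo_demo")
nested <- file.path(parent, "bar")
unlink(parent, recursive = TRUE)  # make sure the parent is missing

# Without recursive = TRUE the call fails (FALSE plus a warning).
ok_plain <- suppressWarnings(dir.create(nested))

# With recursive = TRUE both foo_demo and bar are created.
ok_rec <- dir.create(nested, recursive = TRUE)

print(c(plain = ok_plain, recursive = ok_rec))
```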

15
votes

One-liner:

if (!dir.exists(output_dir)) {dir.create(output_dir)}

Example:

dateDIR <- as.character(Sys.Date())
outputDIR <- file.path(outD, dateDIR)
if (!dir.exists(outputDIR)) {dir.create(outputDIR)}
9
votes

The use of file.exists() to test for the existence of the directory is a problem in the original post. If subDir included the name of an existing file (rather than just a path), file.exists() would return TRUE, but the call to setwd() would fail because you can't set the working directory to point at a file.

I would recommend the use of file_test(op="-d", subDir), which will return TRUE if subDir is an existing directory, but FALSE if subDir is an existing file or a non-existent file or directory. Similarly, checking for a file can be accomplished with op="-f".
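
To make the difference concrete, here is a minimal sketch that compares file.exists() with file_test() on a directory, a regular file, and a path that does not exist. All paths live under tempdir() and the variable names are invented for the illustration.

```r
# Sketch: file.exists() vs. file_test() on three kinds of path.
d      <- file.path(tempdir(), "ft_demo_dir")
f      <- file.path(tempdir(), "ft_demo_file.txt")
absent <- file.path(tempdir(), "ft_demo_absent")

dir.create(d, showWarnings = FALSE)
file.create(f)
unlink(absent, recursive = TRUE)  # make sure this one does not exist

res <- c(
  exists_on_file = file.exists(f),                # TRUE, even though f is not a directory
  dir_is_dir     = utils::file_test("-d", d),
  file_is_dir    = utils::file_test("-d", f),
  file_is_file   = utils::file_test("-f", f),
  absent_is_dir  = utils::file_test("-d", absent)
)
print(res)
```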

Additionally, as described in another comment, the working directory is part of the R environment and should be controlled by the user, not a script. Scripts should, ideally, not change the R environment. To address this problem, I might use options() to store a globally available directory where I wanted all of my output.

So, consider the following solution, where someUniqueTag is just a programmer-defined prefix for the option name, which makes it unlikely that an option with the same name already exists. (For instance, if you were developing a package called "filer", you might use filer.mainDir and filer.subDir).

The following code would be used to set options that are available for use later in other scripts (thus avoiding the use of setwd() in a script), and to create the folder if necessary:

mainDir = "c:/path/to/main/dir"
subDir = "outputDirectory"

# Store the values (not the literal string "subDir") as options
options(someUniqueTag.mainDir = mainDir)
options(someUniqueTag.subDir = subDir)

if (!file_test("-d", file.path(mainDir, subDir))) {
  if (file_test("-f", file.path(mainDir, subDir))) {
    stop("Path can't be created because a file with that name already exists.")
  } else {
    dir.create(file.path(mainDir, subDir))
  }
}

Then, in any subsequent script that needed to manipulate a file in subDir, you might use something like:

# getOption() takes the option name as a character string
mainDir = getOption("someUniqueTag.mainDir")
subDir = getOption("someUniqueTag.subDir")
filename = "fileToBeCreated.txt"
file.create(file.path(mainDir, subDir, filename))

This solution leaves the working directory under the control of the user.
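
One possible way to package this pattern is a small path-building helper, similar in spirit to the out_file idea from the comments on the question. In the sketch below, the option prefix filerDemo, the directory name, and the helper itself are all hypothetical, and tempdir() stands in for the real main directory.

```r
# Sketch: keep the output location in options and never call setwd().
options(filerDemo.mainDir = tempdir(),
        filerDemo.subDir  = "outputDirectory_opts")

# Build a path inside the configured output directory, creating it on demand.
out_file <- function(filename) {
  out_dir <- file.path(getOption("filerDemo.mainDir"),
                       getOption("filerDemo.subDir"))
  if (!dir.exists(out_dir)) {
    dir.create(out_dir, recursive = TRUE)
  }
  file.path(out_dir, filename)
}

wd_before <- getwd()
write.csv(data.frame(x = 1:3), out_file("table_1.csv"), row.names = FALSE)
wd_after <- getwd()  # unchanged: the script never touched the working directory
```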

9
votes

With R 2.15.3, I got a permission error when trying to create a directory tree recursively on a shared network drive.

To get around this oddity, I create the structure manually:

mkdirs <- function(fp) {
    if(!file.exists(fp)) {
        mkdirs(dirname(fp))
        dir.create(fp)
    }
} 

mkdirs("H:/foo/bar")
2
votes

To find out if a path is a valid directory try:

file.info(cacheDir)[1,"isdir"]

file.info does not care about a slash on the end.

On Windows, file.exists fails for a directory path that ends in a slash and succeeds without it, so it cannot be used to determine whether a path is a directory.

file.exists("R:/data/CCAM/CCAMC160b_echam5_A2-ct-uf.-5t05N.190to240E_level1000/cache/")
[1] FALSE

file.exists("R:/data/CCAM/CCAMC160b_echam5_A2-ct-uf.-5t05N.190to240E_level1000/cache")
[1] TRUE

file.info(cacheDir)["isdir"]
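
One detail worth keeping in mind is that file.info() returns NA in the isdir column for a path that does not exist, so a bare if () on it would raise an error. A minimal sketch of a wrapper that treats NA as "not a directory" follows; the helper name is_directory and the temporary paths are assumptions made for the example.

```r
# Sketch: isTRUE() turns the NA returned for a missing path into FALSE.
is_directory <- function(path) {
  isTRUE(file.info(path)$isdir)
}

a_dir   <- tempdir()    # an existing directory
a_file  <- tempfile()   # a fresh path that does not exist yet
file.create(a_file)     # now an existing regular file
no_path <- tempfile()   # a path that is never created

checks <- c(dir = is_directory(a_dir),
            file = is_directory(a_file),
            missing = is_directory(no_path))
print(checks)
```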
0
votes

I know this question was asked a while ago, but in case it's useful, the here package is really helpful for avoiding hard-coded file paths and making code more portable. It locates your project root (for example, the directory that your .Rproj file resides in) and builds paths relative to it, so the following will often suffice without having to spell out the full path to your working directory:

library(here)

if (!dir.exists(here(outputDir))) {dir.create(here(outputDir))}