48
votes

The R documentation suggests that raw data files (neither .RData nor .rda) should be placed in inst/extdata/.

From the first paragraph in: http://cran.r-project.org/doc/manuals/R-exts.html#Data-in-packages

The data subdirectory is for data files, either to be made available via lazy-loading or for loading using data(). (The choice is made by the ‘LazyData’ field in the DESCRIPTION file: the default is not to do so.) It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files.

So, I have moved all of my raw data into this folder, but when I build and reload the package and then try to access the data in a function with (for example):

read.csv(file=paste(path.package("my_package"),"/inst/extdata/my_raw_data.csv",sep="")) 
# .path.package is now path.package in R 3.0+

I get the "cannot open file" error.

However, it does look like there is a folder called /extdata in the package directory with the files in it (post-build and install). What's happening to the /inst folder?

Does everything in the /inst folder get pushed into the / of the package?

All the folders in the /inst folder get their own place in the top directory of the package. Basically, everything in /inst ends up in the top directory, so any folders in there end up as their own folders. But this is just from experience, and I can't find anything in R-exts that explains that... - Dason
I'll just add that I prefer file.path for creating a path to a file. - Dason
You can (and should) use system.file to access files in the inst folder. - andy
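
To make the comments above concrete, here is a minimal sketch of building the path by hand with file.path. It leaves out the inst component, since that directory does not exist after installation. The base package and its DESCRIPTION file are used only so the snippet runs anywhere; my_package and my_raw_data.csv are the names from the question and appear only in a comment.

```r
# Root directory of an installed (and loaded) package.
pkg_root <- path.package("base")

# file.path inserts the separators, so no paste(..., sep = "") is needed.
desc_path <- file.path(pkg_root, "DESCRIPTION")
file.exists(desc_path)

# The same idea for the file in the question; note there is no "inst" component.
# raw_path <- file.path(path.package("my_package"), "extdata", "my_raw_data.csv")
```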

2 Answers

71
votes

More useful than using file.path would be to use system.file. Once your package is installed, you can grab your file like so:

fpath <- system.file("extdata", "my_raw_data.csv", package="my_package")

fpath will now hold the absolute path to the file on your disk.
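
One thing worth knowing is that system.file returns an empty string when the file (or the package) cannot be found, which can surface later as a confusing read error. A small sketch of a wrapper that fails early instead; read_extdata_csv is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper: read a CSV shipped in a package's inst/extdata directory.
read_extdata_csv <- function(file, package, ...) {
  # mustWork = TRUE turns a missing file into an error instead of "".
  path <- system.file("extdata", file, package = package, mustWork = TRUE)
  read.csv(path, ...)
}

# Without mustWork, a missing file silently yields an empty string.
missing_path <- system.file("extdata", "no_such_file.csv", package = "base")
nzchar(missing_path)  # FALSE

# Usage with the names from the question (requires my_package to be installed):
# raw <- read_extdata_csv("my_raw_data.csv", package = "my_package")
```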

38
votes

You were both very close and essentially had this. A formal reference from 'Writing R Extensions' is:

1.1.3 Package subdirectories

[...]

The contents of the inst subdirectory will be copied recursively to the installation directory. Subdirectories of inst should not interfere with those used by R (currently, R, data, demo, exec, libs, man, help, html and Meta, and earlier versions used latex, R-ex). The copying of the inst happens after src is built so its Makefile can create files to be installed. Prior to R 2.12.2, the files were installed on POSIX platforms with the permissions in the package sources, so care should be taken to ensure these are not too restrictive: R CMD build will make suitable adjustments. To exclude files from being installed, one can specify a list of exclude patterns in file .Rinstignore in the top-level source directory. These patterns should be Perl-like regular expressions (see the help for regexp in R for the precise details), one per line, to be matched against the file and directory paths, e.g. doc/.*[.]png$ will exclude all PNG files in inst/doc based on the (lower-case) extension.
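
The effect of that recursive copy can be imitated in a temporary directory. This is only a simulation of the layout change using file.copy, not what R CMD INSTALL actually runs, and the directory names my_package_src and my_package_installed are made up for the illustration:

```r
# Simulated source tree: <src>/inst/extdata/my_raw_data.csv
src <- file.path(tempdir(), "my_package_src")
dir.create(file.path(src, "inst", "extdata"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(x = 1:3),
          file.path(src, "inst", "extdata", "my_raw_data.csv"),
          row.names = FALSE)

# Simulated installation directory.
installed <- file.path(tempdir(), "my_package_installed")
dir.create(installed, showWarnings = FALSE)

# Copy the *contents* of inst/, not inst/ itself.
file.copy(list.files(file.path(src, "inst"), full.names = TRUE),
          installed, recursive = TRUE)

file.exists(file.path(installed, "extdata", "my_raw_data.csv"))          # TRUE
file.exists(file.path(installed, "inst", "extdata", "my_raw_data.csv"))  # FALSE
```

This is why the path in the question fails: after installation the file sits under extdata directly, with no inst level above it.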