4
votes

I'm creating an R package that will use a single function from plyr. According to this roxygen2 vignette:

If you are using just a few functions from another package, the recommended option is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun().

That sounds good. I'm using plyr::ldply() - the full call with :: - so I list plyr in Imports: in my DESCRIPTION file. However, when I use devtools::check() I get this:

* checking dependencies in R code ... NOTE
Namespace in Imports field not imported from: ‘plyr’
  All declared Imports should be used.

Why do I get this note?

I am able to avoid the note by adding @importFrom plyr ldply in the file that is using plyr, but then I end up having ldply in my package namespace, which I do not want and should not need, as I am using plyr::ldply() the single time I use the function.

Any pointers would be appreciated!

(This question might be relevant.)

1
Is roxygen2 overwriting your NAMESPACE file? I believe you need to have @import plyr to make sure it is added automatically. - cdeterman
Thank you for answering. roxygen2 is indeed updating my NAMESPACE file. I do not want to use @import plyr, because that imports the whole namespace of plyr. According to the roxygen2 vignette and the other question I referred to, I shouldn't need to import it either. - L42

1 Answer

12
votes

If ldply() is important for your package's functionality, then you do want it in your package namespace. That is the point of namespace imports. Functions that you need should be imported into the package namespace, because that is where R looks first when resolving a function (the namespace itself, then its imports), before moving on to the base namespace and then the attached packages. It means that no matter what other packages are loaded or unloaded, attached or unattached, your package will always have access to that function. In such cases, use:

@importFrom plyr ldply

And you can just refer to ldply() without the plyr:: prefix just as if it were another function in your package.
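As a minimal sketch of what that looks like in a package source file (the file name, the function name stack_records() and its argument are hypothetical, made up for illustration):

```r
# R/stack_records.R (hypothetical file in your package)

#' Stack a list of records into one data frame
#'
#' @param records A list of named lists, one per row.
#' @return A data frame with one row per record.
#' @importFrom plyr ldply
#' @export
stack_records <- function(records) {
  # No plyr:: prefix: inside the installed package, ldply is found
  # through the importFrom() directive that roxygen2 writes to NAMESPACE.
  ldply(records, as.data.frame)
}

# Directive roxygen2 is expected to generate in NAMESPACE:
#   importFrom(plyr,ldply)
```

plyr still has to be listed under Imports: in DESCRIPTION. Note that the function only resolves ldply this way once it is part of an installed package; if you source the file on its own, you would need plyr attached for the call to work.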

If ldply() is not so important (perhaps it is called only once, in a rarely used function), then Writing R Extensions, section 1.5.1, gives the following advice:

If a package only needs a few objects from another package it can use a fully qualified variable reference in the code instead of a formal import. A fully qualified reference to the function f in package foo is of the form foo::f. This is slightly less efficient than a formal import and also loses the advantage of recording all dependencies in the NAMESPACE file (but they still need to be recorded in the DESCRIPTION file). Evaluating foo::f will cause package foo to be loaded, but not attached, if it was not loaded already—this can be an advantage in delaying the loading of a rarely used package.

(I think this advice is a little outdated because it implies more separation between DESCRIPTION and NAMESPACE than currently exists.) Read literally, it implies you should use @import plyr and refer to the function as plyr::ldply(). In reality, it is suggesting something closer to putting plyr in the Suggests field of DESCRIPTION, which is neither exactly accommodated by roxygen2 markup nor exactly compliant with R CMD check.
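For comparison, here is a rough sketch of the two fully qualified styles discussed above. The function names are hypothetical, and the requireNamespace() guard in the second variant is one common pattern for packages listed under Suggests, not something the quoted passage prescribes:

```r
# Variant 1: plyr is listed under Imports: in DESCRIPTION,
# with no import directive in NAMESPACE.
stack_records_qualified <- function(records) {
  # plyr is loaded (not attached) the first time this line is evaluated
  plyr::ldply(records, as.data.frame)
}

# Variant 2: plyr is listed under Suggests: in DESCRIPTION,
# so it may not be installed on the user's machine.
stack_records_optional <- function(records) {
  if (!requireNamespace("plyr", quietly = TRUE)) {
    stop("Package 'plyr' is needed for stack_records_optional(); ",
         "please install it.", call. = FALSE)
  }
  plyr::ldply(records, as.data.frame)
}
```

In both variants ldply never enters your package's imports; the trade-off is that the dependency is recorded only in DESCRIPTION and the lookup happens at call time.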

In sum, the official line is that Hadley's advice (which you are quoting) is preferred only for rarely used functions from rarely used packages (and/or packages that take a considerable amount of time to load). Otherwise, just use @importFrom, as WRE advises:

Using importFrom selectively rather than import is good practice and recommended notably when importing from packages with more than a dozen exports.