They are different packages with different purposes. One is not a substitute for the other, even though they overlap in a small subset of functionality.
Here is a brief summary of each package, from the packages themselves:
The plyr package is a set of clean and consistent tools that implement the split-apply-combine pattern in R. This is an extremely common pattern in data analysis: you solve a complex problem by breaking it down into small pieces, doing something to each piece and then combining the results back together again.
and
data.table ... offers fast subset, fast grouping, fast update, fast ordered joins and list columns in a short and flexible syntax, for faster development. It is inspired by A[B] syntax in R where A is a matrix and B is a 2-column matrix.
Where they overlap is in the "fast grouping", which plyr also does by splitting data.frames, operating on the pieces, and recombining them into a single data.frame. data.table has many other features which make operations on data.frame-like structures fast; plyr has features which apply the split-apply-combine paradigm to other data structures such as lists and arrays (both as inputs and outputs).
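To make the overlap concrete, here is a minimal sketch of the same grouped summary written with each package. The data frame `df` and its columns `g` and `x` are made up for illustration, and the example assumes both packages are installed.

```r
library(plyr)
library(data.table)

# Toy data (hypothetical): a grouping column and a numeric value.
df <- data.frame(g = c("a", "a", "b", "b", "b"), x = c(1, 2, 3, 4, 5))

# plyr: split df by g, summarise each piece, combine into a data.frame.
by_plyr <- ddply(df, "g", summarise, mean_x = mean(x))

# data.table: the same grouped mean using the DT[i, j, by] form.
dt <- as.data.table(df)
by_dt <- dt[, list(mean_x = mean(x)), by = g]
```

Both calls should give one row per group with the same means; the difference is in syntax and in how each package scales, not in the result.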
So, really, they are two different tools that happen to have a small area of overlap where they address the same problem. Each does much more than that, and if you want or need that additional functionality, then that is the package to use.
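As a rough illustration of the functionality outside the overlap, the sketch below uses one feature from each package that the other does not aim to cover. The objects `pieces` and `dt` are hypothetical, and these are only two examples among many.

```r
library(plyr)
library(data.table)

# plyr beyond data.frames: a list goes in, an array (here a vector) comes out.
pieces <- list(a = 1:3, b = 4:6)
sums <- laply(pieces, sum)

# data.table beyond grouping: add a column by reference, without copying dt.
dt <- data.table(id = c(1L, 2L, 3L), x = c(10, 20, 30))
dt[, y := x * 2]
```

If the work looks like the first half, plyr is the natural fit; if it looks like the second half, data.table is.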
Does plyr do anything for data.frame's better? – eddi
array is much faster than aaply. – Paul Hiemstra