Fans of the Tidyverse regularly give several advantages of using tibbles rather than data frames. Most of them seem designed to protect the user from making mistakes. For example, unlike data frames, tibbles:
- Don't need a `,drop=FALSE` argument to not drop dimensions from your data.
- Will not let the `$` operator do partial matching for column names.
- Only recycle your input vectors if they are of exactly length one.
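
To make the three points concrete, here is a minimal sketch comparing both classes. The column names `abc` and `xyz` and all object names are made up for illustration, and it assumes a reasonably recent version of the tibble package is installed.

```r
library(tibble)

# Illustrative data; the names are hypothetical.
df <- data.frame(abc = 1:3, xyz = letters[1:3])
tb <- as_tibble(df)

# 1. Single-column subsetting: the data frame drops to a bare vector,
#    the tibble stays a one-column tibble.
df_col <- df[, "abc"]
tb_col <- tb[, "abc"]

# 2. Partial matching with `$`: the data frame matches "ab" to "abc";
#    the tibble returns NULL and warns about an unknown column.
df_partial <- df$ab
tb_partial <- suppressWarnings(tb$ab)

# 3. Recycling: data.frame() recycles a length-2 vector to length 4;
#    tibble() only recycles length-one vectors and raises an error here.
df_recycled <- data.frame(a = 1:4, b = 1:2)
tb_recycled <- tryCatch(
  tibble(a = 1:4, b = 1:2),
  error = function(e) "error"
)
```

In each pair, the data frame quietly does something that may or may not have been intended, while the tibble either keeps its shape or refuses.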
I'm steadily becoming convinced to replace all of my data frames with tibbles. What are the primary disadvantages of doing so? More specifically, what can a data frame do that a tibble cannot?
Preemptively, I would like to make it clear that I am not asking about `data.table` or any big-picture objections to the Tidyverse. I am strictly asking about tibbles and data frames.
A tibble is a `data.frame`, just with additional methods. So it's not so much what's different about a data frame, as how tibble modifies data frame behaviour. The differences are captured in the tibbles vignette. Personally I think the modified `print` method is the most useful feature. – neilfws

Although matrix indexing of a `data.frame` (`x[i]` with a logical or a 2-column integer matrix `i`) using `[` "is not recommended" (`?"[.data.frame"`), it can be handy (e.g. here, here). It seems like such indexing can't be used on a tibble. `tb = tibble(x = 1:3, y = 4:6)`; `m = cbind(c(3, 2), c(2, 1))`; `tb[m]`; `df = as.data.frame(tb)`; `df[m]` – Henrik
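
The matrix-indexing example from the comment above can be written out as a short script. This is only a sketch: it assumes a recent tibble version, captures whatever `tb[m]` does rather than asserting a particular error message, and the `*_vals` and `tb_workaround` names are introduced here for illustration.

```r
library(tibble)

tb <- tibble(x = 1:3, y = 4:6)
df <- as.data.frame(tb)

# Each row of m is a (row, column) pair: (3, 2) and (2, 1).
m <- cbind(c(3, 2), c(2, 1))

# Data frame: picks out df[3, 2] and df[2, 1].
df_vals <- df[m]

# Tibble: the same call reportedly fails; capture the error message
# instead of letting it stop the script.
tb_vals <- tryCatch(tb[m], error = function(e) conditionMessage(e))

# Possible workaround: convert to a data frame before matrix indexing.
tb_workaround <- as.data.frame(tb)[m]
```

Here `df_vals` holds the two selected elements as a plain vector, and converting the tibble back with `as.data.frame()` recovers the same behaviour at the cost of an extra step.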