122
votes

I am working with the studentdata dataset from the LearnBayes package. For those who want to see the actual data:

install.packages('LearnBayes')

I am trying to filter rows based on the value in a column. For example, if the column value is "water", then I want that row. If the column value is "milk", then I don't want it. Ultimately, I am trying to keep only the individuals whose Drink column is "water".

3
Try reading ?'[' and then read ?subset. - joran
Thanks for the pointers. Definitely handy advice, and I look forward to using it in the future. - user722224
I suggest you read the very good R manuals: cran.r-project.org/doc/manuals/R-intro.html - Andrie

3 Answers

258
votes

The subset command is not necessary. Just use data frame indexing:

studentdata[studentdata$Drink == 'water',]
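One caveat worth knowing with this form: if the column contains missing values, the comparison yields NA for those rows, and logical indexing with NA returns a row filled with NA. The sketch below uses a small made-up data frame (a hypothetical stand-in for studentdata, not the real data) to show the effect and two common ways around it.

```r
# Hypothetical stand-in for studentdata; the values are invented for illustration.
df <- data.frame(
  Student = 1:5,
  Drink = c("water", "milk", NA, "water", "pop"),
  stringsAsFactors = FALSE
)

# Plain logical indexing: the NA in the condition produces an all-NA row.
with_na <- df[df$Drink == "water", ]

# which() keeps only the positions that are TRUE, dropping NA and FALSE.
matched <- df[which(df$Drink == "water"), ]

# %in% never returns NA, so it can be used directly as a logical index.
matched_in <- df[df$Drink %in% "water", ]
```

If the column has no missing values, all three forms return the same rows.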

Read the warning from ?subset:

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like ‘[’, and in particular the non-standard evaluation of argument ‘subset’ can have unanticipated consequences.
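To illustrate what "for programming" can look like, here is a minimal sketch of a reusable helper built on standard subsetting. The function name keep_rows and the sample data frame are assumptions made for this example, not something from the package or the question.

```r
# Hypothetical helper: keep the rows where `column` equals one of `values`.
# The column is passed as a string, so no non-standard evaluation is involved.
keep_rows <- function(data, column, values) {
  # [[ ]] extracts the column by name; %in% returns FALSE (not NA) for missing values.
  # drop = FALSE guarantees a data frame even if `data` has a single column.
  data[data[[column]] %in% values, , drop = FALSE]
}

# Made-up data for demonstration only.
df <- data.frame(
  Student = 1:4,
  Drink = c("water", "milk", "pop", "water"),
  stringsAsFactors = FALSE
)

water_rows <- keep_rows(df, "Drink", "water")
not_milk <- keep_rows(df, "Drink", c("water", "pop"))
```

Because the column name is an ordinary character argument, the helper behaves the same whether it is called at the console or from inside another function.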

84
votes

Try this:

subset(studentdata, Drink=='water')

That should do it.

47
votes

Thought I'd update this with a dplyr solution:

library(dplyr)
filter(studentdata, Drink == "water")
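A short sketch of how filter() treats missing values, since that differs from plain indexing: rows where the condition evaluates to NA are dropped. The data frame below is made up for illustration and only stands in for studentdata; the example assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for studentdata; the values are invented.
df <- data.frame(
  Student = 1:5,
  Drink = c("water", "milk", NA, "water", "pop"),
  stringsAsFactors = FALSE
)

# Keep the water drinkers; the row with a missing Drink is dropped.
water_only <- dplyr::filter(df, Drink == "water")

# Excluding milk also drops the NA row, because NA != "milk" is NA.
no_milk <- dplyr::filter(df, Drink != "milk")

# To keep rows with a missing Drink, ask for them explicitly.
no_milk_keep_na <- dplyr::filter(df, Drink != "milk" | is.na(Drink))
```

Writing dplyr::filter rather than filter avoids picking up stats::filter by accident when dplyr is not attached.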