2
votes

Consider the following tibble:

## # A tibble: 16 x 4
##    name              genus        order        sleep_total
##    <chr>             <chr>        <chr>              <dbl>
##  1 Cheetah           Acinonyx     Carnivora          12.1 
##  2 Northern fur seal Callorhinus  Carnivora           8.70
##  3 Vesper mouse      Calomys      Rodentia            7.00
##  4 Dog               Canis        Carnivora          10.1 
##  5 Roe deer          Capreolus    Artiodactyla        3.00
##  6 Goat              Capri        Artiodactyla        5.30
##  7 Guinea pig        Cavis        Rodentia            9.40
##  8 Domestic cat      Felis        Carnivora          12.5 
##  9 Gray seal         Haliochoerus Carnivora           6.20
## 10 Tiger             Panthera     Carnivora          15.8 
## 11 Jaguar            Panthera     Carnivora          10.4 
## 12 Lion              Panthera     Carnivora          13.5 
## 13 Caspian seal      Phoca        Carnivora           3.50
## 14 Genet             Genetta      Carnivora           6.30
## 15 Arctic fox        Vulpes       Carnivora          12.5 
## 16 Red fox           Vulpes       Carnivora           9.80

I'd like to select only the columns that contain the value 'Carnivora' in at least one row.

The expected output in this case would be:

## order       
## <chr>       
## Carnivora   
## Carnivora   
## Rodentia    
## Carnivora   
## Artiodactyla
## Artiodactyla
## Rodentia    
## Carnivora   
## Carnivora   
## Carnivora   
## Carnivora   
## Carnivora   
## Carnivora   
## Carnivora   
## Carnivora   
## Carnivora

Someone else has shown how to get the rows that contain a matching value. However, the result still includes columns that don't contain the value.

sleep %>% 
  select(name:order, sleep_total, -vore) %>% 
  filter_all(any_vars(str_detect(., pattern = "Ca")))

2 Answers

4
votes

We can use select_if, which keeps the columns for which the predicate returns TRUE:

library(dplyr)
ggplot2::msleep %>% select_if(~any(. == "Carnivora", na.rm = TRUE))

#          order
#1     Carnivora
#2     Carnivora
#3      Rodentia
#4     Carnivora
#5  Artiodactyla
#6  Artiodactyla
#7      Rodentia
#8     Carnivora
#9     Carnivora
#10    Carnivora
#11    Carnivora
#12    Carnivora
#13    Carnivora
#14    Carnivora
#15    Carnivora
#16    Carnivora
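
In dplyr 1.0.0 and later, select_if is superseded by select() combined with where(). A minimal sketch of the same idea, assuming a recent dplyr; the animals tibble is a hypothetical three-row stand-in for the data in the question:

```r
library(dplyr)

# Hypothetical toy data, modelled on a few rows of the question's tibble
animals <- tibble(
  name        = c("Cheetah", "Vesper mouse", "Roe deer"),
  genus       = c("Acinonyx", "Calomys", "Capreolus"),
  order       = c("Carnivora", "Rodentia", "Artiodactyla"),
  sleep_total = c(12.1, 7.0, 3.0)
)

# where() applies the predicate to each column and keeps those returning TRUE
carnivora_cols <- animals %>%
  select(where(~ any(.x == "Carnivora", na.rm = TRUE)))

names(carnivora_cols)
# "order"
```

The predicate is the same one passed to select_if above, so switching between the two forms should only change the wrapper.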

Or in base R, using colSums to count the matches in each column:

msleep[colSums(msleep == "Carnivora", na.rm = TRUE) > 0]

Or with apply, testing each column for any match:

msleep[apply(msleep == "Carnivora", 2, any, na.rm = TRUE)]
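
To see why the base R versions work, it can help to look at the intermediate steps. This is a sketch on a hypothetical toy data frame (animals is not part of the original data): the comparison produces a logical matrix, and colSums counts the TRUE cells per column.

```r
# Hypothetical toy data with some missing values
animals <- data.frame(
  name        = c("Cheetah", "Vesper mouse", NA),
  order       = c("Carnivora", "Rodentia", "Carnivora"),
  sleep_total = c(12.1, 7.0, NA),
  stringsAsFactors = FALSE
)

# Element-wise comparison gives a logical matrix; NA cells stay NA
hits <- animals == "Carnivora"

# Count the TRUE cells in each column, ignoring NA
hit_counts <- colSums(hits, na.rm = TRUE)

# Keep only the columns with at least one match
result <- animals[hit_counts > 0]
names(result)
# "order"
```

Without na.rm = TRUE, any column containing an NA would get an NA count, and the subsetting step would fail on the undefined column selection.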
0
votes

In base R, we can use %in% with sapply:

library(ggplot2)
msleep[sapply(msleep, function(x) "Carnivora" %in% x)]
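
Note that this version needs no na.rm argument, because %in% always returns TRUE or FALSE, whereas == propagates NA. A small illustration (the vector x is made up for the example):

```r
# A column with a missing value and no "Carnivora" entry
x <- c(NA, "Rodentia")

# == yields NA for the missing element, so any() returns NA
eq_result <- any(x == "Carnivora")
eq_result
# NA

# %in% treats NA as a non-match and returns a plain FALSE
in_result <- "Carnivora" %in% x
in_result
# FALSE
```

An NA in the logical index would make the column subsetting fail, which is why the == based answers pass na.rm = TRUE.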