What would be a good tidyverse approach to this type of problem? I want to filter out the duplicated rows of `group` that have an `NA` in them (keeping the row that has values for both `var1` and `var2`) but keep the rows when there is no duplicated value in `group`. `dat` illustrates the raw example, with `expected_output` showing what I'd hope to have.
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tibble)
dat <- tibble::tribble(
  ~group, ~var1, ~var2,
  "A",    "foo", NA,
  "A",    "foo", "bar",
  "B",    "foo", NA,
  "C",    NA,    "bar",
  "C",    "foo", "bar",
  "D",    NA,    "bar",
  "E",    "foo", "bar",
  "E",    NA,    "bar"
)
expected_output <- tibble::tribble(
  ~group, ~var1, ~var2,
  "A",    "foo", "bar",
  "B",    "foo", NA,
  "C",    "foo", "bar",
  "D",    NA,    "bar",
  "E",    "foo", "bar"
)
expected_output
#> # A tibble: 5 x 3
#>   group var1  var2
#>   <chr> <chr> <chr>
#> 1 A     foo   bar
#> 2 B     foo   <NA>
#> 3 C     foo   bar
#> 4 D     <NA>  bar
#> 5 E     foo   bar
Any suggestions or ideas?