I have a data frame in which several data sources are merged. This creates rows with the same id. Now I want to define which values from which row should be kept.
So far I have been using dplyr with group_by and summarise_all to keep the first value in each group that is not NA.
Here's an example:
library(dplyr)

# function f for summarizing: first non-NA value, or NA if there is none
f <- function(x) {
  x <- na.omit(x)
  if (length(x) > 0) first(x) else NA
}
# test data
test <- data.frame(id = c(1,2,1,2), value1 = c("a",NA,"b","c"), value2 = 0:3)
id value1 value2
1 a 0
2 <NA> 1
1 b 2
2 c 3
Collapsing the rows by id gives the following result:
test <- test %>% group_by(id) %>% summarise_all(f)
id value1 value2
1 a 0
2 c 1
Now the question: dropping NA values (via na.omit) already works, but how can I specify that for numeric columns a value of 0 should also be skipped, so that the first value not equal to 0 is kept instead? The expected result looks like this:
id value1 value2
1 a 2
2 c 1
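For illustration, here is a minimal sketch of one way the summarizing function might be extended. It assumes that 0 should only be skipped in numeric columns, and that a group containing nothing but zeros should still return 0. The helper name first_nonzero is hypothetical, and across() requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical helper: first value that is not NA and, for numeric
# columns, not 0. Falls back to 0 if a group contains only zeros.
first_nonzero <- function(x) {
  x <- x[!is.na(x)]
  if (is.numeric(x)) {
    nonzero <- x[x != 0]
    if (length(nonzero) > 0) x <- nonzero
  }
  x[1]  # indexing an empty vector yields an NA of the matching type
}

test <- data.frame(id = c(1, 2, 1, 2),
                   value1 = c("a", NA, "b", "c"),
                   value2 = 0:3)

result <- test %>%
  group_by(id) %>%
  summarise(across(everything(), first_nonzero))
```

With the test data above, this should pick value2 = 2 for id 1 (skipping the 0) and value2 = 1 for id 2, while value1 is handled as before. Whether all-zero groups should return 0 or NA is a design choice; the sketch keeps the 0.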