
Given two data frames:

data_1 has all the variables I need, but some of them contain missing values (NA).

    ID    Group      Ne    Cars
    1     Control    NA    Yes
    2     Patient    A     NA
    3     Patient    NA    No

data_2 contains only some of the rows (IDs) and some of the variables of data_1, but it holds some of the values that are missing in data_1.

    ID    Ne    Cars
    1     A     Yes
    3     B     NA

I need the result to look like this:

    ID    Group      Ne    Cars
    1     Control    A     Yes
    2     Patient    A     NA
    3     Patient    B     No
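
For anyone who wants to test an answer, the frames above could be rebuilt with a sketch like the one below. The column types (integer ID, character for the other columns) are an assumption, since only the printed values are shown here, and `expected` is just a name chosen for the desired result.

```r
# Assumed types: integer IDs, character for every other column.
data_1 <- data.frame(
  ID    = 1:3,
  Group = c("Control", "Patient", "Patient"),
  Ne    = c(NA, "A", NA),
  Cars  = c("Yes", NA, "No"),
  stringsAsFactors = FALSE
)

# data_2 holds only IDs 1 and 3 and only the Ne and Cars columns.
data_2 <- data.frame(
  ID   = c(1L, 3L),
  Ne   = c("A", "B"),
  Cars = c("Yes", NA),
  stringsAsFactors = FALSE
)

# The desired result: data_1 with its NA cells filled from data_2 where possible.
expected <- data.frame(
  ID    = 1:3,
  Group = c("Control", "Patient", "Patient"),
  Ne    = c("A", "A", "B"),
  Cars  = c("Yes", NA, "No"),
  stringsAsFactors = FALSE
)
```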

I have tried data_3 <- merge(data_1, data_2, by = c("ID", "Group", "Ne", "Cars", ...), all = TRUE), as well as the same call with all.x = TRUE and with all.y = TRUE, but none of them gives the result I need.

How can I merge the two data frames so that the information in data_1 is kept and its missing values are filled in from data_2, without adding or duplicating rows?

Thanks!

1 Answer


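Try stacking both data frames and filling the missing values within each ID (here df1 and df2 stand for data_1 and data_2):

library(tidyverse)

# Stack both frames; columns absent from df2 (Group) become NA in its rows
bind_rows(df1, df2) %>% group_by(ID) %>% 
    # For each column: sort non-missing values first, then fill down within each ID
    arrange(is.na(Group)) %>% fill(Group) %>% 
    arrange(is.na(Ne)) %>% fill(Ne) %>% 
    arrange(is.na(Cars)) %>% fill(Cars) %>% 
    # Rows sharing an ID are now identical (unless the frames disagree), so keep one
    distinct()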

Output:

     ID Group   Ne    Cars 
  <int> <chr>   <chr> <chr>
1     1 Control A     Yes  
2     2 Patient A     NA   
3     3 Patient B     No
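
As an alternative sketch, assuming dplyr 1.0.0 or later is installed, `rows_patch()` overwrites only the NA cells of the first frame with values from the second, matching rows on a key column. The data below are rebuilt from the question with assumed column types (integer ID, character otherwise), and `patched` is just an illustrative name. Note that by default `rows_patch()` expects every ID in the second frame to exist in the first, which holds for this example.

```r
library(dplyr)

# Example data rebuilt from the question (column types are an assumption).
data_1 <- data.frame(
  ID    = 1:3,
  Group = c("Control", "Patient", "Patient"),
  Ne    = c(NA, "A", NA),
  Cars  = c("Yes", NA, "No"),
  stringsAsFactors = FALSE
)
data_2 <- data.frame(
  ID   = c(1L, 3L),
  Ne   = c("A", "B"),
  Cars = c("Yes", NA),
  stringsAsFactors = FALSE
)

# Fill NA cells of data_1 from data_2, matched by ID.
# Non-missing values in data_1 are left untouched and no rows are added.
patched <- rows_patch(data_1, data_2, by = "ID")
print(patched)
```

This keeps the row order and columns of data_1, so no grouping, sorting, or deduplication step is needed afterwards.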