Newbie here with a dataframe/mutate question. I want to update a dataframe (df1) based on data in another dataframe (df2). For one-offs I've used mutate(), so I figure this is the way to go. Additionally, I would like a check column added (TRUE/FALSE) to indicate whether the field in df1 was updated.

For example:

df1
   State
   <chr>
 1 N.Y. 
 2 FL   
 3 AL   
 4 MS   
 5 IL   
 6 WS   
 7 WA   
 8 N.J. 
 9 N.D. 
10 S.D. 
11 CALL 

df2
   State New_State   
   <chr> <chr>       
 1 N.Y.  New York    
 2 FL    Florida     
 3 AL    Alabama     
 4 MS    Mississippi 
 5 IL    Illinois    
 6 WS    Wisconsin   
 7 WA    Washington  
 8 N.J.  New Jersey  
 9 N.D.  North Dakota
10 S.D.  South Dakota
11 CAL   California 

I want the output to look like this:

df3
   New_State       Test
   <chr>           <lgl>
 1 New York        TRUE
 2 Florida         TRUE
 3 Alabama         TRUE
 4 Mississippi     TRUE
 5 Illinois        TRUE
 6 Wisconsin       TRUE
 7 Washington      TRUE
 8 New Jersey      TRUE
 9 North Dakota    TRUE
10 South Dakota    TRUE
11 CALL            FALSE

In essence, I want R to read the data in df1 and, wherever df1$State matches df2$State, replace the abbreviation with the full state name from df2. Lastly, if the value in df1 was updated, mark it as TRUE (N.Y. to New York), and as FALSE if it was not updated (CALL has no match, because df2 contains CAL).

Thanks in advance for any and all help.

1 Answer

This should give you the result you're looking for:

match_vec <- match(df1$State, table = df2$State)

For each row of df1, this vector holds the position of the matching abbreviation in df2$State. Where there's no match, you end up with a missing value (NA).
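To make that concrete, here is a small base R sketch using the abbreviations from the question. The names df1_state and df2_state are placeholders introduced only for this illustration; in the real data these would be df1$State and df2$State.

```r
# Abbreviations copied from the question's df1 and df2
# (df1_state / df2_state are illustrative names, not from the original post)
df1_state <- c("N.Y.", "FL", "AL", "MS", "IL", "WS", "WA",
               "N.J.", "N.D.", "S.D.", "CALL")
df2_state <- c("N.Y.", "FL", "AL", "MS", "IL", "WS", "WA",
               "N.J.", "N.D.", "S.D.", "CAL")

# Position in df2_state of each element of df1_state, NA when absent
match_vec <- match(df1_state, table = df2_state)
match_vec
# positions 1 through 10, then NA for "CALL" (df2 only has "CAL")

# Rows of df1 that have no counterpart in df2
df1_state[is.na(match_vec)]
```

Indexing df2$New_State with this vector then returns the full name for matched rows and NA for unmatched ones, which is what the next step relies on.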

Then the following code using dplyr should produce the df3 you requested.

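library(dplyr)
df3 <- df1 %>% 
  # look up the full name by position; unmatched rows get NA
  mutate(New_State = df2$New_State[match_vec]) %>% 
  # TRUE where a match was found, FALSE otherwise
  mutate(Test = !is.na(match_vec)) %>% 
  # keep the original value when there was no match
  mutate(New_State = ifelse(is.na(New_State), 
                            State, New_State)) %>% 
  select(New_State, Test)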
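For comparison, here is a self-contained sketch of the same result using a join instead of match(). This is an alternative, not the approach above, and it rests on two assumptions that are not stated in the question: df2$State has no duplicate entries (otherwise left_join() would add rows), and df2$New_State contains no missing values (otherwise Test would be FALSE for a row that did match).

```r
library(dplyr)

# Example data reconstructed from the question
df1 <- data.frame(
  State = c("N.Y.", "FL", "AL", "MS", "IL", "WS", "WA",
            "N.J.", "N.D.", "S.D.", "CALL"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  State = c("N.Y.", "FL", "AL", "MS", "IL", "WS", "WA",
            "N.J.", "N.D.", "S.D.", "CAL"),
  New_State = c("New York", "Florida", "Alabama", "Mississippi",
                "Illinois", "Wisconsin", "Washington", "New Jersey",
                "North Dakota", "South Dakota", "California"),
  stringsAsFactors = FALSE
)

df3 <- df1 %>%
  # bring in New_State where State matches; unmatched rows get NA
  left_join(df2, by = "State") %>%
  mutate(
    # TRUE when the lookup found a full name
    Test = !is.na(New_State),
    # fall back to the original value when nothing matched
    New_State = coalesce(New_State, State)
  ) %>%
  select(New_State, Test)

df3
```

With this data the last row keeps "CALL" and gets Test = FALSE, while the other ten rows are replaced and flagged TRUE.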