1
votes

I have a data frame that looks like this (df1):

options(stringsAsFactors = F)

Car <- c('Chevrolet', 'GM', 'Lexus', 'Ford', 'Mitsubishi', 'Audi')
Alternative.Cars <- c('Acura, Honda', NA, 'Nissan', 'Ferrari, Toyota, Tesla', 'Infiniti', 'Cadillac, Benz')
contact <- c('Mickey', 'Minnie', 'Daffy', 'Pluto', 'Donald', 'Goofy')
numCars <- c(0,0,0,0,0,0)
df1 <- data.frame(Car = Car, Alternative.Cars = Alternative.Cars, contact = contact, numCars = numCars)

and a second data frame that looks like this (df2):

CAR <- c('Acura', 'Benz', 'Toyota', 'Nissan')
years <- c('Y2001,Y2003','Y2014', 'Y1999,Y2015,Y2016', 'Y2013')
df2 <- data.frame(CAR = CAR, years = years)

DF1 is a short list of car inventory that are currently out of stock. DF2, is a list of cars with year and make. You'll notice that DF2 has Car names that are listed only in the DF1$Alternative.Cars column. For some reason, there was a huge mix up and it needs to be set straight. I want to create a new column called 'Real.Car' that finds the car name from DF1$CAR within DF2$Alternative.Cars, so it looks like the data frame below (DF3):

Car         Alternative.Cars         Real.Car   contact  numCars years
Chevrolet   Acura, Honda             Acura      Mickey      0    Y2001,Y2003  
GM          <NA>                     <NA>       Minnie      0    <NA>
Lexus       Nissan                   Nissan     Daffy       0    Y2013
Ford        Ferrari, Toyota, Tesla   Toyota     Pluto       0   Y1999,Y2015,Y2016  
Mitsubishi  Infiniti                 <NA>       Donald      0    <NA>
Audi        Cadillac, Benz           Benz       Goofy       0    Y2014

I've tried looking across each row in df2 and finding the match in df1. If there is a match, I want to save the years in a new column called 'years' but it isn't working out right:

 for (i in 1:length(df2$CAR)) {

   singleEntry <- as.data.frame(df2[1,])

   df1$Car <- ifelse(grep(singleEntry, df1$Alternative.Cars), 
   singleEntry$years, df1$Car)

 }

Any help would be greatly appreciated!

1

1 Answers

3
votes

This may not be the most efficient way, but it will do the job.

df1$Real.car = NA
df1$years = NA
for (i in 1:nrow(df2)) {
  match.line = grep(df2$CAR[i], df1$Alternative.Cars)

  df1$Real.car[match.line] = df2$CAR[i]
  df1$years[match.line] = df2$years[i]
}

You can then re-order the columns to match with yours.

> df1
         Car       Alternative.Cars contact numCars Real.car             years
1  Chevrolet           Acura, Honda  Mickey       0    Acura       Y2001,Y2003
2         GM                   <NA>  Minnie       0     <NA>              <NA>
3      Lexus                 Nissan   Daffy       0   Nissan             Y2013
4       Ford Ferrari, Toyota, Tesla   Pluto       0   Toyota Y1999,Y2015,Y2016
5 Mitsubishi               Infiniti  Donald       0     <NA>              <NA>
6       Audi         Cadillac, Benz   Goofy       0     Benz             Y2014