7
votes

Suppose I have two data frames df1 and df2 as follows

Df1

Id Price Profit Month
10  5     2      1
10  5     3      2
10  5     2      3
11  7     3      1
11  7     1      2
12  0     0      1
12  5     1      2

Df2

Id Name
9  Kane
10 Jack
10 Jack
11 Will
12 Matt
13 Lee
14 Han

Now I want to insert a new column in Df1 named Name and get its value from Df2 based on matching Id

So modified Df1 will be

Id Price Profit Month Name
10  5     2      1    Jack
10  5     3      2    Jack
10  5     2      3    Jack
11  7     3      1    Will
11  7     1      2    Will
12  0     0      1    Matt
12  5     1      2    Matt
4
Df2$Name[match(Df1$Id,Df2$Id)] for instance. - nicola

4 Answers

9
votes
df1 <- data.frame(Id=c(10L,10L,10L,11L,11L,12L,12L),Price=c(5L,5L,5L,7L,7L,0L,5L),Profit=c(2L,3L,2L,3L,1L,0L,1L),Month=c(1L,2L,3L,1L,2L,1L,2L),stringsAsFactors=F);
df2 <- data.frame(Id=c(9L,10L,10L,11L,12L,13L,14L),Name=c('Kane','Jack','Jack','Will','Matt','Lee','Han'),stringsAsFactors=F);
df1$Name <- df2$Name[match(df1$Id,df2$Id)];
df1;
##   Id Price Profit Month Name
## 1 10     5      2     1 Jack
## 2 10     5      3     2 Jack
## 3 10     5      2     3 Jack
## 4 11     7      3     1 Will
## 5 11     7      1     2 Will
## 6 12     0      0     1 Matt
## 7 12     5      1     2 Matt
8
votes

use left_join in dplyr

library(dplyr)
left_join(df1, df2, "Id")

eg:

> left_join(df1, df2)
Joining by: "Id"
  Id Price Profit Month Name
1 10     5      2     1 Jack
2 10     5      3     2 Jack
3 10     5      2     3 Jack
4 11     7      3     1 Will
5 11     7      1     2 Will
6 12     0      0     1 Matt
7 12     5      1     2 Matt

Data wrangling cheatsheet by RStudio is a very helpful resource.

3
votes

Here is an option using data.table

library(data.table)
setDT(Df1)[unique(Df2), on = "Id", nomatch=0]
#   Id Price Profit Month Name
#1: 10     5      2     1 Jack
#2: 10     5      3     2 Jack
#3: 10     5      2     3 Jack
#4: 11     7      3     1 Will
#5: 11     7      1     2 Will
#6: 12     0      0     1 Matt
#7: 12     5      1     2 Matt

Or as @Arun mentioned in the comments, we can assign (:=) the "Name" column after joining on "Id" to reflect the changes in the original dataset "Df1".

setDT(Df1)[Df2, Name:= Name, on = "Id"]
Df1
1
votes

A simple base R option could be merge()

merge(Df1,unique(Df2), by="Id")
#  Id Price Profit Month Name
#1 10     5      2     1 Jack
#2 10     5      3     2 Jack
#3 10     5      2     3 Jack
#4 11     7      3     1 Will
#5 11     7      1     2 Will
#6 12     0      0     1 Matt
#7 12     5      1     2 Matt

The function unique() is used here because of the duplicate entry in Df2 concerning "Jack". For the example data described in the OP the option by="Id" can be omitted, but it might be necessary in a more general case.