Good morning all, this is my first time posting on stack overflow. Thank you for any help!
I have 2 dataframes that I am using to analyze stock data. One data frame has dates among other information, we can call it df:
df1 <- tibble(Key = c('a','b','c'), i =11:13, date= ymd(20110101:20110103))
The second dataframe also has dates and other important information.
df2 <-tibble(Answer = c('a','d','e','b','f','c'), j =14:19, date= ymd(20150304:20150309))
Here is what I want to do: For each row in df1, I need to:
-Find the date in df2 where, for when df2$answer is the same as df1$key, it is the closest to that row's date in df1.
-Then extract information for another column in that row in df2, and put it in a new row in df1.
The code i tried:
df1 %>%
group_by(Key, i) %>%
mutate(
`New Column` = df2$j[
which.min(subset(df2$date, df2$Answer== Key) - date)])
This has the result:
Key i date `New Column`
1 a 11 2011-01-01 14
2 b 12 2011-01-02 14
3 c 13 2011-01-03 14
This is correct for the first row, a
. In df2
, the closest date is 2015-03-04
, for which the value of j
is in fact 14
.
However, for the second row, Key=b
, I want df2
to subset to only look at dates for rows where df2$Answer = b
. Therefore, the date should be 2015-03-07
, for which j =17
.
Thank you for your help!
Jesse
left_join
the data on"Answer" = "Key"
,arrange
by the difference in the date columns,group_by
Answer andslice
the top row from each group (smallest date difference, because of the sorting). – Gregor Thomasleft-join
with df2 on left, df1 on right, likeleft_join(df2, df1, by = c("Answer" = "Key"))
? If I do it the other way it is not accepted sinceKey
only has 3 values andAnswer
has 6 – Jesse Kerrdf1 %>% right_join(df2, by = c('Key' = 'Answer')) %>% group_by(Key) %>% arrange(abs(date.x - date.y)) %>% slice(1)
Is Gregor's suggestion I believe (Or switchdf2
anddf1
and changeright_join
toleft_join
) – IceCreamToucanKey= Answer
, for each key there are many matching answers, and if do it by c("Key" = "Answer", "date = "date), the dates are not exactly matching, that is why I need to do which.min, to find the closest date to link them up, not the same date. – Jesse Kerrwhich.min
, but in an easier workflow. – Gregor Thomas