I'm cleaning a large dataset and have a Comments column where I record changes made to the data. I've provided a dummy sample set below as an example of what I'm trying to achieve. I'm using tidyverse packages.
Data:
structure(list(Date = structure(c(17199, 17226, 17263, 17300,
17346, 17504, 17508), class = "Date"), Skipper = c("Agatha",
"Gertrude", "Julio", "Dylis", "Agatha", "Dylis", "Julio"), Success = c("No",
"Yes", "Yes", "Yes", "No", "Yes", "No"), Time = c(60L, 50L, 120L,
30L, 100L, 120L, 40L), Comments = c("Pirates spotted.", "Illegal fishers spotted.",
"Engine troubles.", "Lost fishing line.", NA, "Pirates spotted.",
"Lost fishing line.")), class = "data.frame", row.names = c(NA,
-7L))
I'm looking to append text to the existing strings in Comments for specific Date values, without deleting what is already there. So for 2017-04-07 and 2017-12-04 I would like to add "Iceberg spotted." to the Comments for the respective dates.
        Date  Skipper Success Time                 Comments
1 2017-02-02   Agatha      No   60         Pirates spotted.
2 2017-03-01 Gertrude     Yes   50 Illegal fishers spotted.
3 2017-04-07    Julio     Yes  120         Engine troubles.
4 2017-05-14    Dylis     Yes   30       Lost fishing line.
5 2017-06-29   Agatha      No  100                     <NA>
6 2017-12-04    Dylis     Yes  120         Pirates spotted.
7 2017-12-08    Julio      No   40       Lost fishing line.
Using stringr and str_c:
R_example$Comments %>% str_c("Iceberg spotted")
[1] "Pirates spotted.Iceberg spotted"
How can I choose which dates the code above applies to, so that only the comments for the specified dates change? Do I need to supply a filter or an if_else function?
I have been trying with case_when, but this replaces the existing values. I could also do this by creating another column and then binding the two columns together, but I would rather not.
R_example %>%
  mutate(Comments = case_when(Date == "2017-04-07" & Date == "2017-12-04" ~ "Iceberg spotted.",
                              TRUE ~ as.character(Comments)))
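For what it's worth, here is a minimal sketch of how a date-conditional append might look with if_else and %in%. It is only an illustration: iceberg_dates is a name made up for this example, and the data frame is a cut-down stand-in for the sample data above with just the two columns that matter.

```r
library(dplyr)
library(stringr)

# Cut-down stand-in for the sample data (illustrative, not the full set)
R_example <- data.frame(
  Date = as.Date(c("2017-02-02", "2017-04-07", "2017-12-04")),
  Comments = c("Pirates spotted.", "Engine troubles.", "Pirates spotted."),
  stringsAsFactors = FALSE
)

# Hypothetical helper vector: the dates whose comments get the extra note
iceberg_dates <- as.Date(c("2017-04-07", "2017-12-04"))

R_example <- R_example %>%
  mutate(Comments = if_else(Date %in% iceberg_dates,
                            str_c(Comments, " Iceberg spotted."),  # append on matching dates
                            Comments))                             # leave other rows untouched

R_example$Comments
```

Using %in% against a vector of dates avoids having to chain several == comparisons, and rows on other dates keep their original comment.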
Thank you.
EDIT:
I forgot a little detail in my dataset. If I have multiple rows for a fishing trip, and some of them have <NA> in the Comments column, how can the string replace the <NA> values instead of being appended to them? Like so:
What my data looks like:
        Date Skipper Success Time         Comments
1 2017-02-02  Agatha      No   60 Pirates spotted.
2 2017-02-02  Agatha      No   60             <NA>
What I would like to achieve:
        Date Skipper Success Time                          Comments
1 2017-02-02  Agatha      No   60 Pirates spotted. Iceberg spotted.
2 2017-02-02  Agatha      No   60                  Iceberg spotted.
What I currently get from the code in answers below:
        Date Skipper Success Time                          Comments
1 2017-02-02  Agatha      No   60 Pirates spotted. Iceberg spotted.
2 2017-02-02  Agatha      No   60                NAIceberg spotted.
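To make the desired NA behaviour concrete, here is a rough sketch using case_when with an explicit is.na() branch, so that missing comments on the target dates are replaced rather than pasted onto. The names trips, note and target_dates are made up for this example, and the third row is an extra illustrative row showing that NA comments on other dates are left alone.

```r
library(dplyr)
library(stringr)

# Illustrative data: two rows for the same trip plus one unrelated row
trips <- data.frame(
  Date = as.Date(c("2017-02-02", "2017-02-02", "2017-03-01")),
  Comments = c("Pirates spotted.", NA, NA),
  stringsAsFactors = FALSE
)

note <- "Iceberg spotted."                 # hypothetical text to add
target_dates <- as.Date("2017-02-02")      # hypothetical dates to update

trips <- trips %>%
  mutate(Comments = case_when(
    Date %in% target_dates & is.na(Comments) ~ note,               # NA becomes the note
    Date %in% target_dates ~ str_c(Comments, " ", note),           # otherwise append
    TRUE ~ Comments                                                # other dates unchanged
  ))

trips$Comments
```

The order of the branches matters here: case_when uses the first condition that matches, so the is.na() case has to come before the general append.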
2017-04-07 and 2017-12-04. Making them longer :). That was a typo, sorry I've made an edit now! – JL_sey
case_when example, as a date cannot be 2017-04-07 and 2017-12-04 simultaneously. You need an OR statement (| instead of &). – Freguglia