I have a raw 'data.frame A' containing results from a set of measurements taken in a time course experiment. There are two Treatments (Control and Test), two Animals per Treatment, three Measures per Animal, and three Timepoints (Day 1, 2, and 3).

data.frame A

I have written code to generate a separate 'data.frame B' in which a number of outliers are converted to NA. Each NA is associated with a specific combination of Treatment, Animal, and Measure column values. My goal is to use the list of such combinations from 'data.frame B' to find the matching rows in 'data.frame A' and replace the number in the Value column with NA, across all Timepoints in the data set.

data.frame B

I have looked into indexing, lapply(), and for loops to tackle this problem, but am getting stuck pretty early in each case. Here is an image of the desired 'data.frame C' showing the replacements I am after:

Resultant "data.frame C"

Any guidance on best course of action, or a solution, would be much appreciated!

I don't think you need to use any loops to achieve this. You can specify the conditions under which a value should be replaced and use replace(). To get the condition in your third image you could do something like df$Value = replace(df$Value, df$Measure == 'B2', NA_real_) - svenhalvorson
Thanks svenhalvorson. In getting slightly more specific, df$Measure == "B2" in your code would not address the problem by itself. There are multiple conditions (column values)... something like dfB$Treatment+Animal+Measure (although this is not written correctly) needed to find and replace each dfA$Value to generate dfC. In addition, there are many different combinations of dfB$Treatment+Animal+Measure from dfB, so I would need to somehow go search and replace through the whole list to generate the proper dfC. I hope this is a little more clear. Still advise no Loops? Thanks! - Banksy
Please don't post just pictures of your data! It makes helping you much more difficult - Chuck P
@Banksy It's a little hard to know without seeing all the conditions you want to apply but I suspect you will not need explicit loops for this. You can create more complex conditions with & and | between columns of data.frame A and then use these to index the replacements in data frame B. - svenhalvorson
Chuck P and svenhalvorson, I appreciate the guidance. Indeed the loop was not necessary, and left_join() did the trick. I did not know that it could be employed as a search and replace tool! - Banksy
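
A minimal base R sketch of the loop-free indexing suggested in the comments, for the case where there are many Treatment/Animal/Measure combinations. The toy dfA and dfB below are made-up stand-ins for the data frames in the images, and the "|" separator assumes that character never appears in the key columns.

```r
# Toy data with the same columns as the question's data.frame A (values are made up).
dfA <- data.frame(
  Treatment = rep(c("Control", "Test"), each = 2, times = 2),
  Timepoint = rep(c("Day1", "Day2"), each = 4),
  Animal    = "B",
  Measure   = rep(c("B1", "B2"), times = 4),
  Value     = c(10, 2, 10, 2, 10, 10, 10, 10)
)

# Hypothetical list of combinations whose values should become NA.
dfB <- data.frame(
  Treatment = c("Control", "Test"),
  Animal    = c("B", "B"),
  Measure   = c("B2", "B2")
)

# Collapse the three key columns into one string per row,
# e.g. "Control|B|B2", so that combinations can be compared with %in%.
key_cols <- c("Treatment", "Animal", "Measure")
keyA <- do.call(paste, c(dfA[key_cols], sep = "|"))
keyB <- do.call(paste, c(dfB[key_cols], sep = "|"))

# Replace Value wherever the row's combination appears in dfB.
# Timepoint is not part of the key, so every day is affected.
dfC <- dfA
dfC$Value <- replace(dfC$Value, keyA %in% keyB, NA_real_)
```

For a single combination, the same logical index can be written directly with &, for example dfA$Treatment == "Control" & dfA$Animal == "B" & dfA$Measure == "B2".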

1 Answer

Here's one solution using dplyr. Make sure that your dfb contains only the rows that you want to change to NA. Then a left join flags the matching rows of dfa, and a simple case_when sets their values to NA.

dfa <- data.frame(
  Treatment = rep(c(rep("Control", 6), rep("Test", 6)), 3),
  Timepoint = c(rep("Day1", 12), rep("Day2", 12), rep("Day3", 12)),
  Animal = rep(c(rep("A", 3), rep("B", 3)), 6),
  Measure = rep(c(c("A1", "A2", "A3"), c("B1", "B2", "B3")), 6),
  Value = c(10, 11, 9, 10, 2, 9, 10, 11, 9, 10, 2, 9, rep(10, 24))
)

Note the minor modifications to dfb...

dfb <- data.frame(
  Treatment = c("Test", "Control"),
  Animal = c("B", "B"),
  Measure = c("B2", "B2"),
  ReplaceValue = c(TRUE, TRUE)
)

dfb
#>   Treatment Animal Measure ReplaceValue
#> 1      Test      B      B2         TRUE
#> 2   Control      B      B2         TRUE

library(dplyr)
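
The dfb above was typed by hand. If your data.frame B is instead a full table in which the outliers already appear as NA, a key table of the same shape could be derived from it roughly as follows. This is only a sketch: dfb_full and dfb_keys are hypothetical names, and the layout of dfb_full (same key columns as dfa, with NA in Value) is an assumption, since the original was only posted as an image.

```r
library(dplyr)

# Hypothetical stand-in for the question's data.frame B:
# one row per Treatment/Animal/Measure, with outliers already set to NA.
dfb_full <- data.frame(
  Treatment = rep(c("Control", "Test"), each = 6),
  Animal    = rep(rep(c("A", "B"), each = 3), times = 2),
  Measure   = rep(c("A1", "A2", "A3", "B1", "B2", "B3"), times = 2),
  Value     = c(10, 11, 9, 10, NA, 9, 10, 11, 9, 10, NA, 9)
)

dfb_keys <- dfb_full %>%
  filter(is.na(Value)) %>%                    # keep only the flagged rows
  distinct(Treatment, Animal, Measure) %>%    # one row per combination, key columns only
  mutate(ReplaceValue = TRUE)                 # helper flag used after the join
```

The distinct() step matters because duplicated keys in the lookup table would duplicate rows of dfa in the join.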

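dfc <-
  # Rows of dfa whose key matches a row of dfb get ReplaceValue = TRUE; all others get NA
  left_join(dfa, dfb, by = c("Treatment", "Animal", "Measure")) %>%
  mutate(Value = case_when(
    is.na(ReplaceValue) ~ Value,   # no match in dfb: keep the measurement
    TRUE ~ NA_real_                # matched: replace with NA
  )) %>%
  select(-ReplaceValue)            # drop the helper column

head(dfc, 12)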
#>    Treatment Timepoint Animal Measure Value
#> 1    Control      Day1      A      A1    10
#> 2    Control      Day1      A      A2    11
#> 3    Control      Day1      A      A3     9
#> 4    Control      Day1      B      B1    10
#> 5    Control      Day1      B      B2    NA
#> 6    Control      Day1      B      B3     9
#> 7       Test      Day1      A      A1    10
#> 8       Test      Day1      A      A2    11
#> 9       Test      Day1      A      A3     9
#> 10      Test      Day1      B      B1    10
#> 11      Test      Day1      B      B2    NA
#> 12      Test      Day1      B      B3     9
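
Since head() only shows Day1, it may be worth checking that the replacement really reached every Timepoint and that the join did not add rows. The sketch below rebuilds a compact version of the data so it runs on its own. The constant Value of 10 is made up, and na_summary and n_timepoints are names chosen here for illustration.

```r
library(dplyr)

# Same layout as dfa above (2 Treatments x 3 Timepoints x 6 Measures = 36 rows),
# rebuilt compactly with a made-up constant Value.
dfa <- expand.grid(
  Measure   = c("A1", "A2", "A3", "B1", "B2", "B3"),
  Treatment = c("Control", "Test"),
  Timepoint = c("Day1", "Day2", "Day3"),
  stringsAsFactors = FALSE
) %>%
  mutate(Animal = substr(Measure, 1, 1), Value = 10)

dfb <- data.frame(
  Treatment    = c("Test", "Control"),
  Animal       = c("B", "B"),
  Measure      = c("B2", "B2"),
  ReplaceValue = c(TRUE, TRUE)
)

# Same join-and-replace step as in the answer.
dfc <- left_join(dfa, dfb, by = c("Treatment", "Animal", "Measure")) %>%
  mutate(Value = case_when(
    is.na(ReplaceValue) ~ Value,
    TRUE ~ NA_real_
  )) %>%
  select(-ReplaceValue)

# Count how many Timepoints were set to NA for each combination.
na_summary <- dfc %>%
  filter(is.na(Value)) %>%
  count(Treatment, Animal, Measure, name = "n_timepoints")

# Each combination in dfb should be NA on all three days,
# and the join should not have changed the number of rows.
stopifnot(
  all(na_summary$n_timepoints == n_distinct(dfa$Timepoint)),
  nrow(dfc) == nrow(dfa)
)
```

If the row-count check fails on real data, duplicated key combinations in dfb are the likely cause.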