I have 2 data frames that I'm trying to keep separate, but would like to calculate a new column in df1 based on the means of rows in df2:
df1:
Name <- c("t1", "t2", "t3")
Diam <- c(95, 104, 72)
Rad <- c(47.5, 52, 36)
df1 <- data.frame(Name, Diam, Rad)
df2:
Name <- c("t1A", "t1B", "t2A", "t2B", "t2C", "t3B")
Width <- c(4000, 3200, 2300, 2700, 2100, 3500)
df2 <- data.frame(Name, Width)
I'd like to add a fourth column to df1 that calculates the means of "Width" in df2 when the first 2 characters of "Name" are identical. My intended output looks like:
df3:
Name <- c("t1", "t2", "t3")
Diam <- c(95, 104, 72)
Rad <- c(47.5, 52, 36)
WidthMean <- c(3600, 2366.667, 3500)
df3 <- data.frame(Name, Diam, Rad, WidthMean)
I can do it by splitting the "Name" column in df2, then finding the mean according to the new split name, then adding the mean values from df2 as a new column to df1. However, I have to do this to approximately 75 data.frames and was hoping to find a faster way to process the data (and with fewer lines of code!).
Current code:
df2$"NameID" <- df2$Name
df2$NameID <- gsub(".?$", "", df2$NameID)
df2out <- aggregate(Width ~ NameID, data = df2, FUN = mean)
df1$"WidthMean" <- df2out$Width
df1
df1
or likedf2
in your example? – TARehman