I'm sorry for the basic question. I'm just struggling with something that should be simple. Say I have the the data frame "Test" that originally has three fields: Col1, Col2, Col3.
I want to create new columns based on each of the original columns. The values in each row of the new columns would specify whether the corresponding value in the matching row on the original column is above or below the initial column's median. So, for example, in the image attached, Col4 is based on Col1. Col5 is based on Col2. Col6 based on Col3.
test dataframe example:
It's quite easy to perform this function on a single column and output a single column:
Test <- Test %>% mutate(Col4 = derivedFactor(
"below"= Col1 > median(Test$Col1),
"at"= Col1 == median(Test$Col1),
"above"= Col1 < median(Test$Col1)
.default = NA)
)
But if I'm performing this same operation over 50 columns, writing out/copy-paste and editing the code can be tedious and inefficient. I should mention that I am hoping to add the new columns to the data frame, not create another data frame. Additionally, there are about 200 other fields in the data frame that will not have this function performed on them (so I can't just use a mutate_all). And the columns are not uniformly named (my examples above are just examples, not the actual dataset) so I'm not able to find a pattern for mutate_at. Maybe there is a way to manually pass a list of column names to the mutate command?
There must be an easy and elegant way to do this. If anyone could help, that would be amazing.
mutate_all
– kath