Create a new column by aggregating multiple columns in R

Question

Background

I have a dataset, df, where I would like to aggregate multiple columns and create a new column. I need to multiply Type, Span and Population columns and create a new Output column

ID       Status      Type     Span   State   Population

A        Yes         2        70%    Ga      10000

Desired output

ID        Status     Type      Span   State   Population   Output

A         Yes        2         70%    Ga      10000        14000

dput

structure(list(ID = structure(1L, .Label = "A ", class = "factor"), 
Status = structure(1L, .Label = "Yes", class = "factor"), 
Type = 2L, Span = structure(1L, .Label = "70%", class = "factor"), 
State = structure(1L, .Label = "Ga", class = "factor"), Population = 10000L), class = "data.frame", 
row.names = c(NA, 
-1L))

This is what I have tried

 df %>% 
 mutate(Output = Type * Span * Population)

akrun akrun · Accepted Answer · 2020-05-10T23:48:46

Here, we are creating a new column based on the inputs from different column. We can just use mutate to get the Span percent of Population and multiply by 'Type'. Note that 'Span' is not numeric, as it is having %, so we extract the numeric part with parse_number divide by 100, then multiply with Population along with the 'Type'

library(dplyr)
df %>%
  mutate(Output = Type * Population * readr::parse_number(as.character(Span))/100)
#   ID Status Type Span State Population Output
#1 A     Yes    2  70%    Ga      10000  14000

If the columns 'Type', 'Population' are not numeric, it is better to convert to numeric with as.numeric(as.character(df$Type)) and for 'Population' (assuming they are factor class). Another option is type.convert(df, as.is = TRUE) and then work on that modified class dataset

Create a new column by aggregating multiple columns in R

2 Answers