Dplyr: looping the creation of new columns

Question

EDIT: My data (for reproducible research) looks as follows. The dplyr will summarise the values for each win_name category:

inv_name    inv_province    inv_town    nip win_name    value   start   duration    year
CustomerA   łódzkie TownX   1111111111  CompX   233.50  2015-10-23  24  2017
CustomerA   łódzkie TownX   1111111111  CompX   300.5   2015-10-23  24  2017
CustomerA   łódzkie TownX   1111111111  CompX   200.5   2015-10-23  24  2017
CustomerB   łódzkie TownY   2222222222  CompY   200.5   2015-10-25  12  2017
CustomerB   łódzkie TownY   2222222222  CompY   1200.0  2015-10-25  12  2017
CustomerB   łódzkie TownY   2222222222  CompY   320.00  2015-10-25  12  2017

The dplyr will summarise the values, then the spread will make the summary spread into several columns for each win_name category with numeric values.

I would like to create new columns with formatted text corresponding to existing columns with numbers. Create as many columns as there are numeric columns with numeric data. The number of these columns can change from analysis to analysis. My code so far looks like:

county_marketshare<-df_monthly_val %>% 
   select(win_name,value,inv_province) %>% 
   group_by(win_name,inv_province)%>% 
   summarise(value=round(sum(value),0))%>% 
   spread(key="win_name", value=value, fill=0) %>%  # teraz muszę stworzyc kolumny sformatowane "finansowo"
   mutate(!!as.symbol(paste0(bestSup[1],"_lbl")):= formatC(!!as.symbol(bestSup[1]),digits = 0, big.mark = " ", format = "f",zero.print = ""),
          !!as.symbol(paste0(bestSup[2],"_lbl")):= formatC(!!as.symbol(bestSup[2]),digits = 0, big.mark = " ", format = "f",zero.print = ""),
          !!as.symbol(paste0(bestSup[3],"_lbl")):= formatC(!!as.symbol(bestSup[3]),digits = 0, big.mark = " ", format = "f",zero.print = "")
          )

is there a way to loop the mutate function so that as many columns are created as there are existing numeric columns? The relavant lines with the repetitive code are the last three. Each new formatted text column has the name of existing numeric column with a suffix. !!as.symbol makes it possible to put together a parameter, the name of the source column, with _lbl suffix.

... you are long enough at SOF to know that a reproducible example would be nice. — tjebo

tjebo tjebo · Accepted Answer · 2018-02-28T00:05:45

you could for example use mutate_at with a function and a conditional such as

dat %>% 
mutate_at(.vars = c('num_col1','num_col2'), 
.funs = function(x) if(is.numeric(x)) as.character(x))

This will replace the specified numeric columns with character columns. You can tweak the function to your needs, i.e. specifying how the columns should look like. We could help you a bit more with a better data example.

You can also filter only the numeric columns and then use mutate_all:

dat %>%Filter(is.numeric,.) %>% mutate_all(funs(as.character)) 

# Filter() is not dplyr, but base R, caveat capital 'F' ! 
# You can also use dat %>%.[sapply(.,is.numeric)], with the same result
# or dplyr::select_if

...:)

P.S. Always worth to cite the reference. Have a look at this gorgeous question: Selecting only numeric columns from a data frame

Dplyr: looping the creation of new columns

3 Answers