I have a data frame with several numeric columns of class "comma". The class is needed so that, when the data frame is saved to an Excel file with the openxlsx package, those columns are shown in Excel's comma format.
When I use dplyr to group and summarize the data, the comma class is lost from the numeric columns.
Is there a way to use dplyr and still preserve the original comma class?
Here is the data frame with the comma classes:
library(tidyverse)
library(stringr)
set.seed(10)
df_central_database <- data.frame(
  Category        = as.character(sample(words[1:10], size = 50, replace = TRUE)),
  Summ_Income     = sample(1000:10000, size = 50, replace = TRUE),
  Summ_Securities = sample(1000:10000, size = 50, replace = TRUE),
  Summ_Bonds      = sample(1000:10000, size = 50, replace = TRUE),
  Summ_Options    = sample(1000:10000, size = 50, replace = TRUE)
)
class(df_central_database$Summ_Income) <- "comma"
class(df_central_database$Summ_Securities) <- "comma"
class(df_central_database$Summ_Bonds) <- "comma"
class(df_central_database$Summ_Options) <- "comma"
str(df_central_database)
'data.frame': 50 obs. of 5 variables:
$ Category : Factor w/ 10 levels "a","able","about",..: 6 4 5 7 1 3 3 3 7 5 ...
$ Summ_Income :Class 'comma' int [1:50] 4189 9428 3213 5258 2724 6249 5135 5207 4598 5548 ...
$ Summ_Securities:Class 'comma' int [1:50] 4099 1551 4321 4668 9229 8999 9854 5295 7242 4832 ...
$ Summ_Bonds :Class 'comma' int [1:50] 8916 2774 1625 2416 4001 2620 2318 3615 9425 1922 ...
$ Summ_Options :Class 'comma' int [1:50] 3008 5823 6963 8633 2342 7031 7855 9988 3369 8967 ...
Grouping and summarizing with dplyr resets the columns of the new data frame back to int:
df_rep1 <- df_central_database %>%
  group_by(Category) %>%
  summarise_all(.funs = sum)
str(df_rep1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 10 obs. of 5 variables:
$ Category : Factor w/ 10 levels "a","able","about",..: 1 2 3 4 5 6 7 8 9 10
$ Summ_Income : int 23632 24434 48506 28288 26662 22076 19452 22832 25071 3469
$ Summ_Securities: int 20390 20588 48728 31054 31550 33387 25930 28458 35604 8760
$ Summ_Bonds : int 21531 23576 33218 29206 26030 25966 34724 30306 36029 7113
$ Summ_Options : int 24345 31356 54054 28524 44705 28161 35068 25267 28022 5713
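A likely explanation for the output above is that the class is dropped by sum() itself rather than by dplyr: sum() returns a bare integer, and summarise stores whatever the function returns. The base R sketch below uses a made-up vector (the values are illustrative only) to show the effect without dplyr:

```r
# Made-up values; only the "comma" class matters here.
x <- c(1000L, 2500L, 4000L)
class(x) <- "comma"

inherits(x, "comma")       # TRUE: the input carries the class

total <- sum(x)            # sum() returns a plain integer without attributes
inherits(total, "comma")   # FALSE: the class is gone after aggregation
class(total)               # "integer"
```

If that is the cause, the class has to be re-applied after the aggregation step, whichever tool performs it.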
Is it possible to somehow prevent dplyr from resetting the class?
Thanks Rafael
df_central_database %>% group_by(Category) %>% summarise_all(.funs = sum) %>% mutate_at(vars(contains('Summ')), funs(f1))
, then the class will be comma
– Sotos
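The definition of f1 is not shown in the comment. Presumably it is a small helper that sets the class of its argument to "comma". A minimal sketch of that idea follows, with assumptions stated: as_comma is a hypothetical name standing in for f1, the data frame is a small made-up stand-in for df_central_database, and across() is used in place of mutate_at()/funs() (funs() has been deprecated since dplyr 0.8.0, and across() needs dplyr 1.0.0 or later).

```r
library(dplyr)

# Hypothetical helper standing in for the undefined `f1` in the comment:
# it tags a vector with the "comma" class.
as_comma <- function(x) {
  class(x) <- "comma"
  x
}

# Small made-up stand-in for df_central_database.
df <- data.frame(
  Category    = c("a", "a", "b"),
  Summ_Income = c(1000L, 2500L, 4000L),
  Summ_Bonds  = c(300L, 700L, 900L)
)
class(df$Summ_Income) <- "comma"
class(df$Summ_Bonds)  <- "comma"

df_rep <- df %>%
  group_by(Category) %>%
  summarise(across(starts_with("Summ"), sum)) %>%  # sum() yields plain integers
  mutate(across(starts_with("Summ"), as_comma))    # re-apply the class afterwards

class(df_rep$Summ_Income)
```

The class is restored as the last step because any later aggregation or arithmetic on these columns may drop it again.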