I have a data frame as below:
df <- data.frame(
id = c(1:5),
a = c(3,10,4,0,15),
b = c(2,1,1,0,3),
c = c(12,3,0,3,1),
d = c(9,7,8,0,0),
e = c(1,2,0,2,2)
)
I need to add multiple columns whose names are built by combining a set of prefixes (usa, canada, nz in the example below) with 3:5. The values 3:5 are also used inside the sum function:
library(dplyr)  # provides %>% and mutate()

df %>% mutate(
  usa_3 = sum(1+3),
  usa_4 = sum(1+4),
  usa_5 = sum(1+5),
  canada_3 = sum(1+3),
  canada_4 = sum(1+4),
  canada_5 = sum(1+5),
  nz_3 = sum(1+3),
  nz_4 = sum(1+4),
  nz_5 = sum(1+5)
)
The result, shown below, is simple, but I do not want to write nearly identical code over and over.
  id  a b  c d e usa_3 usa_4 usa_5 canada_3 canada_4 canada_5 nz_3 nz_4 nz_5
1  1  3 2 12 9 1     4     5     6        4        5        6    4    5    6
2  2 10 1  3 7 2     4     5     6        4        5        6    4    5    6
3  3  4 1  0 8 0     4     5     6        4        5        6    4    5    6
4  4  0 0  3 0 2     4     5     6        4        5        6    4    5    6
5  5 15 3  1 0 2     4     5     6        4        5        6    4    5    6
Each column name consists of an alphabetical prefix and an integer postfix taken from a range. The postfix also feeds into the sum function, as 1+postfix.
In this case there are 3 prefixes and 3 postfixes, so the result has 9 additional columns.
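To make the prefix/postfix combination concrete, here is a minimal base R sketch that enumerates the nine name and value pairs. The names `prefixes`, `postfixes`, and `grid` are hypothetical, introduced only for illustration.

```r
# Hypothetical helper vectors, not part of the original code
prefixes  <- c("usa", "canada", "nz")
postfixes <- 3:5

# expand.grid varies its first argument fastest, so the postfix is listed
# first to get the order usa_3, usa_4, usa_5, canada_3, ...
grid <- expand.grid(postfix = postfixes, prefix = prefixes,
                    stringsAsFactors = FALSE)

# Column name is prefix_postfix; value follows the 1 + postfix rule
grid$name  <- paste(grid$prefix, grid$postfix, sep = "_")
grid$value <- 1 + grid$postfix

print(grid[, c("name", "value")])
```

Each row of `grid` corresponds to one of the columns to be added.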
I would prefer not to define a separate function outside this block of code, and I suppose the map function in purrr may help.
Do you know how to make this work? Giving columns dynamic names inside a pipe is the part I find especially difficult.
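One possible approach, sketched under the assumption that dplyr and purrr are installed: build a named list of column values with `map()` and `set_names()`, then splice it into `mutate()` with `!!!`. The names `prefixes`, `postfixes`, `col_names`, `col_k`, and `new_cols` are hypothetical, and `df` is shortened here to keep the example small.

```r
library(dplyr)
library(purrr)

# Shortened stand-in for the data frame in the question
df <- data.frame(id = 1:5, a = c(3, 10, 4, 0, 15))

prefixes  <- c("usa", "canada", "nz")
postfixes <- 3:5

# Every prefix is paired with every postfix: usa_3, usa_4, usa_5, canada_3, ...
col_names <- paste(rep(prefixes, each = length(postfixes)), postfixes, sep = "_")
col_k     <- rep(postfixes, times = length(prefixes))

# One list element per new column, named with the target column name
new_cols <- map(col_k, ~ sum(1 + .x)) %>%
  set_names(col_names)

# !!! splices the named list into mutate() as name = value arguments
result <- df %>% mutate(!!!new_cols)
```

Because the names live on the list rather than in the `mutate()` call, no per-column code has to be written by hand.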
I found some similar questions, but they do not match my needs:
Multivariate mutate
How to use map from purrr with dplyr::mutate to create multiple new columns based on column pairs
===== ADDITIONAL INFO =====
Let me clarify some conditions of this issue.
In my actual code, the sum(1+3), sum(1+4), ... part is replaced by as.factor(cutree(X, k=Y)), where X is the result of a cluster analysis and Y is the variable defined as 3:5 in the example. cutree() is a function that cuts the dendrogram stored in the cluster analysis result into the given number of groups.
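For readers unfamiliar with `cutree()`, here is a small base R sketch. The data (the first ten rows of the built-in `USArrests` data set) and the clustering method are assumptions made purely for illustration; they are not the data from the question.

```r
# Hypothetical stand-in for the cluster analysis result X
X <- hclust(dist(USArrests[1:10, ]), method = "ward.D2")
Y <- 3:5

# With a single k, cutree() returns one cluster label per observation
groups <- as.factor(cutree(X, k = 3))

# With a vector of k, cutree() returns a matrix with one column per k,
# and the values of k are used as the column names
all_cuts <- cutree(X, k = Y)

print(table(groups))
print(colnames(all_cuts))
```

The matrix form shows that all values of Y can be handled in a single call, though each column still needs `as.factor()` applied separately.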
As for the column names usa_3, usa_4 ... nz_5, the country name is replaced by the clustering method, such as Ward, McQuitty, median, etc. (seven methods), and the integers 3, 4, 5 are the parameter that defines where I cut the dendrogram, as explained above.
As for X in the function as.factor(cutree(X, k=Y)), the cluster analysis results are stored in several separate objects, one for each method. I realized this raises another issue: how to apply the function to each of these objects.
The actual script that I am currently using looks something like this:
cluste_number <- original_df %>% mutate(
  ## Ward
  ward_3 = as.factor(cutree(clst.ward, k = 3)),
  ward_4 = as.factor(cutree(clst.ward, k = 4)),
  ward_5 = as.factor(cutree(clst.ward, k = 5)),
  ward_6 = as.factor(cutree(clst.ward, k = 6)),
  ## Single
  sing_3 = as.factor(cutree(clst.sing, k = 3)),
  sing_4 = as.factor(cutree(clst.sing, k = 4)),
  sing_5 = as.factor(cutree(clst.sing, k = 5)),
  sing_6 = as.factor(cutree(clst.sing, k = 6)))
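The repetition in the script above could be reduced along the following lines. This is only a sketch under stated assumptions: `original_df` is replaced by a hypothetical stand-in (ten rows of the built-in `USArrests` data), the cluster results are gathered into a named list called `clusters` (a name introduced here), and dplyr and purrr are assumed to be installed.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the objects in the script
original_df <- USArrests[1:10, ]
d <- dist(original_df)

# Named list of cluster results; the names become the column prefixes
clusters <- list(
  ward = hclust(d, method = "ward.D2"),
  sing = hclust(d, method = "single")
)
ks <- 3:6

# For each method, build one factor column per k, named method_k
new_cols <- imap(clusters, function(clst, method) {
  map(ks, ~ as.factor(cutree(clst, k = .x))) %>%
    set_names(paste(method, ks, sep = "_"))
}) %>%
  unname() %>%                 # drop outer names so inner names are kept as-is
  unlist(recursive = FALSE)    # flatten one level into a single named list

# Splice the named list into mutate() as name = value arguments
cluste_number <- original_df %>% mutate(!!!new_cols)
```

Adding another method then only requires another entry in `clusters`, and the number of methods does not have to match the number of `k` values.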
I apologize for not stating the actual problem clearly at first. For the reasons above, the number of countries (usa, canada, nz) and the number of parameter values (1:3) do not match.
Also, the suggestions that use i + . do not solve the problem, because the actual operation uses the function as.factor(cutree(X, k=Y)).
Thank you for your support.