I'm new to R and was wondering if anybody could help. I have 200+ columns and one weight column, and I need to multiply each column by its associated weight to create new weighted columns for further analysis. Each column has multiple levels; gender, for example, has 2 (male and female). How do I iterate through all columns to create new variables, as I have done below for one column?

DF[,gender_w:=gender*weight/gender]
    

DF[,lapply(.SD,sum, na.rm=T),by= gender, .SDcols=c(all_weighted_column_names)]

Thanks in advance for any advice.

EDIT- more info

    DF <- data.table(Gender = c(1,2,2,1,2),
                     Age_group = c(1,5,5,4,3),
                     Question1 = c(1,0,0,1,1),
                     Question2 = c(0,1,0,1,1),
                     Weight = c(2,5,3,1,5))

I have to post dummy variables but hopefully this will help you see the picture.

In this example I need the sum of each group in each variable, but I need to weight them first. So for gender, if 1 = M and 2 = F, I don't want the raw counts of 2 males and 3 females; I need to multiply each row by its corresponding weight and then sum. The result should show 3 males and 13 females.

For Age_group, instead of Age_group_1 = 1, Age_group_3 = 1, Age_group_4 = 1 and Age_group_5 = 2, I need to produce Age_group_1 = 2, Age_group_3 = 5, Age_group_4 = 1 and Age_group_5 = 8.

Is there a way to do this and iterate through all the columns? I have 200+ columns altogether and I can't figure out an efficient way to do it.

Thanks again for your help.

It would be best if you made a dummy example with, say, 2 columns of data, one weight variable and just a few rows. Then you could share it via dput(). - sindri_baldur
You probably should melt the data.table to long format. You can dcast later on if you need wide format. - Roland
Your question is confusing. Maybe you are looking for this: DF[, paste0(head(names(DF), -1), "_w") := lapply(.SD, function(x) x * Weight), .SDcols = head(names(DF), -1)] - Roland

1 Answer

You can calculate component-wise products. Say you have a data.table:

library(data.table)

dt <- data.table(
  x = 1:10,
  y = 11:20,
  weight = 2
)

You can now weight your columns x and y using

# dt$weight is a plain vector, so it is multiplied into each selected column row by row
dt[, c("x", "y")] <- dt[, c("x", "y")] * dt$weight

This also works if you have a unique weight vector for each column. Note that this is not data.table syntax but it keeps dt's data.table structure.
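
If what you are after is the weighted count per level described in the question's edit (3 males and 13 females, and so on), a minimal sketch along the lines of the melt suggestion in the comments might look like the following. It assumes the dummy data from the question, and the names vars, long, level, weighted_counts and weighted_n are only illustrative, not anything from your real data.

```r
library(data.table)

# Dummy data from the question's edit
DF <- data.table(
  Gender    = c(1, 2, 2, 1, 2),
  Age_group = c(1, 5, 5, 4, 3),
  Question1 = c(1, 0, 0, 1, 1),
  Question2 = c(0, 1, 0, 1, 1),
  Weight    = c(2, 5, 3, 1, 5)
)

# Every column except the weight column (assumed to be called "Weight")
vars <- setdiff(names(DF), "Weight")

# Long format: one row per (respondent, variable) pair, with the weight kept alongside
long <- melt(DF, id.vars = "Weight", measure.vars = vars,
             variable.name = "variable", value.name = "level")

# Weighted count of each level within each variable
weighted_counts <- long[, .(weighted_n = sum(Weight, na.rm = TRUE)),
                        by = .(variable, level)]
setorder(weighted_counts, variable, level)
print(weighted_counts)
```

Because vars is built from names(DF), the same code should carry over to 200+ columns without listing them by hand, and dcast can reshape the result back to wide format if needed.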