4
votes

I've a large dataframe which contains 12 columns each for two types of values, Rested and Active. I want to convert the columns of each month into rows, thus bring all the month columns (Jan, Feb, Mar... ) under 'Month'

My data is as follows:

ID      L1  L2  Year    JR  FR  MR  AR  MYR JR  JLR AGR SR  OR  NR  DR  JA  FA  MA  AA  MYA JA  JLA AGA SA  OA  NA  DA
1234    89  65  2003    11  34  6   7   8   90  65  54  3   22  55  66  76  86  30  76  43  67  13  98  67  0   127 74
1234    45  76  2004    67  87  98  5   4   3   77  8   99  76  56  4   3   2   65  78  44  53  67  98  79  53  23  65

I'm trying to make it appear as below (column R represents Rested and column A represents Active. and monthly JR, FR, MR respectively means Jan Rested, Feb Rested, Mar Rested and JA, FA, MA respectively means Jan Active, Feb Active, Mar Active and etc):

So, here I'm trying to convert each of the monthly columns to rows and keeping them beside each other for R and A values by creating a new Month column.

 ID     L1  L2  Year    Month   R   A
1234    89  65  2003    Jan     11  76
1234    89  65  2003    Feb     34  86
1234    89  65  2003    Mar     6   30
1234    89  65  2003    Apr     7   76
1234    89  65  2003    May     8   43
1234    89  65  2003    Jun     90  67
1234    89  65  2003    Jul     65  13
1234    89  65  2003    Aug     54  98
1234    89  65  2003    Sep     3   67
1234    89  65  2003    Oct     22  0
1234    89  65  2003    Nov     55  127
1234    89  65  2003    Dec     66  74
1234    45  76  2004    Jan     67  3
1234    45  76  2004    Feb     87  2
1234    45  76  2004    Mar     98  65
1234    45  76  2004    Apr     5   78
1234    45  76  2004    May     4   44
1234    45  76  2004    Jun     3   53
1234    45  76  2004    Jul     77  67
1234    45  76  2004    Aug     8   98
1234    45  76  2004    Sep     99  79
1234    45  76  2004    Oct     76  53
1234    45  76  2004    Nov     56  23
1234    45  76  2004    Dec     4   65

I've tried various things like stack,melt,unlist

data_reshape <- reshape(df,direction="long", varying=list(c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA")), v.names="Precipitation", timevar="Month")

data_stacked <- stack(data, select = c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA"))

but their result is not quite expected - they are giving Jan values of all years, and then Feb values of all years, and then March values of all years, and etc. But I want to structure them in an proper monthly manner for each Year for each ID existing in the entire dataset.

How to achieve this in R?

3
Please include your data in a (easily) reproduced form. For example, you can use dput(). What would expected result look like?Roman Luštrik

3 Answers

5
votes

Here's a base reshape approach:

res <- reshape(mydf, direction="long", varying=list(5:16, 17:28), v.names=c("R", "A"), times = month.name, timevar = "Month")
res[with(res, order(ID, -L1, L2, Year)), -8]
5
votes

Here's a possible solution using the devel version of data.table

library(data.table) ## v >= 1.9.5

res <- melt(setDT(df),
            id = 1:4, ## id variables
            measure = list(5:16, 17:ncol(df)), # a list of two groups of measure variables
            variable = "Month", # The name of the additional variable
            value = c("R", "A")) # The names of the grouped variables

setorder(res, ID, -L1, L2, Year) ## Reordering the data to match the desired output
res[, Month := month.abb[Month]] ## You don't really need this part as you already have the months numbers

#       ID L1 L2 Year Month  R   A
#  1: 1234 89 65 2003   Jan 11  76
#  2: 1234 89 65 2003   Feb 34  86
#  3: 1234 89 65 2003   Mar  6  30
#  4: 1234 89 65 2003   Apr  7  76
#  5: 1234 89 65 2003   May  8  43
#  6: 1234 89 65 2003   Jun 90  67
#  7: 1234 89 65 2003   Jul 65  13
#  8: 1234 89 65 2003   Aug 54  98
#  9: 1234 89 65 2003   Sep  3  67
# 10: 1234 89 65 2003   Oct 22   0
# 11: 1234 89 65 2003   Nov 55 127
# 12: 1234 89 65 2003   Dec 66  74
# 13: 1234 45 76 2004   Jan 67   3
# 14: 1234 45 76 2004   Feb 87   2
# 15: 1234 45 76 2004   Mar 98  65
# 16: 1234 45 76 2004   Apr  5  78
# 17: 1234 45 76 2004   May  4  44
# 18: 1234 45 76 2004   Jun  3  53
# 19: 1234 45 76 2004   Jul 77  67
# 20: 1234 45 76 2004   Aug  8  98
# 21: 1234 45 76 2004   Sep 99  79
# 22: 1234 45 76 2004   Oct 76  53
# 23: 1234 45 76 2004   Nov 56  23
# 24: 1234 45 76 2004   Dec  4  65

Installation instructions:

library(devtools)
install_github("Rdatatable/data.table", build_vignettes = FALSE)
3
votes

This is an inelegant solution, but I'm going to post it just to show how problems can be solved with basic tools without relying on high level functions when the task doesn't necessarily require them. I think that the more tools you have, the more you can approach correctly to problems. Here we are:

 #extract the data part
 data<-t(as.matrix(df[,5:28]))
 #build the data.frame cbinding the needed columns
 res<-cbind(df[rep(1:nrow(df),each=12),1:4],  #this repeats the first 4 columns 12 times each
       Month=month.abb, #the month column
       R=as.vector(data[1:12,]), # the R column, obtained from the first 12 rows of data
       A=as.vector(data[13:24,])) #as above
 rownames(res)<-NULL #just to remove the row names