I have a wide dataset in which each row (an individual) provides up to three observations, each for a different date. Each observation consists of a date, a description, and a number of minutes. Individuals may provide as many observations as they wish, so the same individual may appear in more than one row with additional observations.

Test data are here:

library(RCurl)
fwt <- getURL("https://raw.githubusercontent.com/bac3917/Cauldron/master/fwt.csv")
fwt <- read.csv(text = fwt)

Converting the columns to the proper formats:

library(lubridate)
fwt$date1 <- as.Date(fwt$date1, format = "%m/%d/%Y")
fwt$date2 <- as.Date(fwt$date2, format = "%m/%d/%Y")
fwt$date3 <- as.Date(fwt$date3, format = "%m/%d/%Y")
# condense dataset; 3 sets of columns into 1
cols <- names(fwt) %in% c("naecy1_2", "naecy1_1", "naecy1_3", "naecy1_4", "naecy1_5", "naecy1_6",
                          "naecy2_2", "naecy2_1", "naecy2_3", "naecy2_4", "naecy2_5", "naecy2_6",
                          "naecy3_2", "naecy3_1", "naecy3_3", "naecy3_4", "naecy3_5", "naecy3_6")

fwt[cols] <- lapply(fwt[cols], as.numeric)  # convert all minutes columns to numeric
fwt[cols][is.na(fwt[cols])] <- 0            # replace missing values in those columns with 0

Essentially there are three sets of date/description/minutes that need to be stacked into a long format. I'd like the data to look like this when restructured:

Name   Date  NAECY1  NAECY2  NAECY3  NAECY4  NAECY5  NAECY6

I've tried reshape2 and tidyr but cannot figure this one out. Ideas, anyone?

Thank you...

1 Answer

Here's a quick solution:

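library(magrittr)  # provides the %>% pipe used below

# Column-name templates; %d is filled in with the set number (1, 2 or 3)
cols <- c("name", "date%d", "descr%d", "naecy%d_1", "naecy%d_2", "naecy%d_3", "naecy%d_4", "naecy%d_5", "naecy%d_6")
# Common column names for the stacked (long) result
cols_renamed <- "Name   Date Descr  NAECY1  NAECY2  NAECY3  NAECY4  NAECY5  NAECY6" %>% strsplit("\\W+") %>% unlist

# Extract each set of columns, give it the common names, then stack the three sets
new_fwt <- lapply(1:3, function(i) {
  df <- fwt[, sprintf(cols, i)]
  colnames(df) <- cols_renamed
  df
}) %>% do.call(rbind, .)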
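
To make the stacking idea easier to check without downloading the file, here is a minimal base-R sketch of the same approach on a small made-up data frame. The `toy` data, its values, and the names `long_names`, `blocks` and `toy_long` are all hypothetical; only two sets with two minutes columns each are used to keep it short. It builds the column names with `paste0()` rather than `sprintf()`, and it drops rows whose date is missing, which assumes that an empty observation slot has an `NA` date.

```r
# Hypothetical toy data mimicking the wide layout in the question;
# the values are invented for illustration only.
toy <- data.frame(
  name     = c("Ann", "Bob"),
  date1    = as.Date(c("2016-01-05", "2016-01-07")),
  descr1   = c("reading", "math"),
  naecy1_1 = c(10, 0),
  naecy1_2 = c(0, 20),
  date2    = as.Date(c("2016-02-01", NA)),
  descr2   = c("art", NA),
  naecy2_1 = c(5, NA),
  naecy2_2 = c(15, NA),
  stringsAsFactors = FALSE
)

# Common column names for the long result
long_names <- c("Name", "Date", "Descr", "NAECY1", "NAECY2")

# Take one block of columns per observation set, rename it to the
# common headers, and collect the blocks in a list.
blocks <- lapply(1:2, function(i) {
  wide_names <- c("name",
                  paste0(c("date", "descr"), i),
                  paste0("naecy", i, "_", 1:2))
  block <- toy[, wide_names]
  names(block) <- long_names
  block
})

# Stack the blocks on top of each other.
toy_long <- do.call(rbind, blocks)

# Assumption: a missing date marks an unused observation slot, so drop it.
toy_long <- toy_long[!is.na(toy_long$Date), ]
rownames(toy_long) <- NULL

print(toy_long)
```

On the real data, the same pattern would use `1:3` for the sets and `1:6` for the minutes columns. Whether rows with a missing date should be dropped depends on how empty slots are recorded in `fwt`.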