I have a data frame containing longitudinal data that looks like this:

test

    Names     hr1    hr2    hr3    hr4      workhr_bin
      41     80     76     70     60               7
      42     80     74     75     NA               8
      43     85     NA     60     65               6
      44     NA     NA     NA     60               3
      45     80     70     NA     NA               8
      46     NA     NA     NA     60               3

The hr1, hr2, hr3 and hr4 variables hold the hours of service reported at repeated time intervals for the subjects in the "Names" column. The "workhr_bin" column holds bins obtained with the quantile function. There are ten bins in total, 1:10.

I am trying to generate multiple spaghetti plots of hours, faceted by bin. Essentially it should yield 10 plots: one for the data in bin 1, another for the data in bin 2, and so on.

I tried:

head(melt(test[,c(2:6)]))

But I end up with the workhr_bin variable gone and a data frame like this instead:

  variable value
1      hr1    80
2      hr1    80
3      hr1    85
4      hr1    NA
5      hr1    80
6      hr1    NA

I've also tried:

melt(test, id.var = "Names")

and the workhr_bin variable is gone again:

   Names variable value
    41   hr1      80.00
    42   hr1      80.00
    43   hr1      85.00

I tried using lattice and ggplot2, but I cannot get my data into the right format to generate 10 spaghetti plots, each representing the samples within one bin.

Essentially I need a data frame with:

Names    variable     value    workhr_bin
    41   hr1      80.00             7
    42   hr1      80.00             8
    43   hr1      85.00             6

Then I would like to create bin-faceted, multicolored spaghetti plots with "variable" on the x-axis (consisting of hr1, hr2, hr3, hr4) and the corresponding "value" on the y-axis.

1 Answer

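You should study the documentation of reshape2::melt. Every column you want to keep as its own column in the long format has to be listed in id.vars; any column not listed is treated as a measured variable and stacked into variable/value. With id.var = "Names", workhr_bin is not lost but ends up as rows further down the melted data, which the first few rows do not show.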

DF <- read.table(text="    Names     hr1    hr2    hr3    hr4      workhr_bin
      41     80     76     70     60               7
      42     80     74     75     NA               8
      43     85     NA     60     65               6
      44     NA     NA     NA     60               3
      45     80     70     NA     NA               8
      46     NA     NA     NA     60               3", header=TRUE)

library(reshape2)
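# keep both Names and workhr_bin as id columns; hr1..hr4 are stacked into variable/value
DF_melt <- melt(DF, id.vars=c("Names", "workhr_bin"))
# make time numeric: "hr1" -> 1, ..., "hr4" -> 4
DF_melt$variable <- as.numeric(gsub("hr", "", DF_melt$variable))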

library(ggplot2)
p <- ggplot(DF_melt, aes(x=variable, y=value, color=factor(Names))) +
  geom_line() +
  geom_point() +
  facet_grid(workhr_bin ~ .)

print(p)

(This results in warnings from ggplot2, most likely because the sample contains NA values and some subjects have too few observations to draw a line. I assume your real dataset won't have this problem.)
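
As a minimal sketch of why the attempts in the question appeared to lose workhr_bin, the following compares melt with one id column against melt with two. The data frame is rebuilt by hand from the sample in the question, and the names long_one_id and long_two_ids are illustrative only.

```r
library(reshape2)

# Sample data copied from the question
DF <- data.frame(
  Names      = 41:46,
  hr1        = c(80, 80, 85, NA, 80, NA),
  hr2        = c(76, 74, NA, NA, 70, NA),
  hr3        = c(70, 75, 60, NA, NA, NA),
  hr4        = c(60, NA, 65, 60, NA, 60),
  workhr_bin = c(7, 8, 6, 3, 8, 3)
)

# Only "Names" is an id column: workhr_bin is treated as a measured
# variable, so it becomes one more level of `variable` (the last rows).
long_one_id <- melt(DF, id.vars = "Names")
levels(long_one_id$variable)

# Both columns are id columns: workhr_bin stays a column of its own.
long_two_ids <- melt(DF, id.vars = c("Names", "workhr_bin"))
names(long_two_ids)
```

In the first result, workhr_bin is still present but only as values of the variable column, which is why head() on it shows nothing but hr1 rows.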