I am looking for an R function to find the first and last values in a vector, similar to min for the minimum and max for the maximum. I know I can calculate the length of the vector and go from there, but that does not work for what I need it for.
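For a plain vector, the length-based approach mentioned above looks roughly like the sketch below. The vector x is a made-up example, and head and tail from base R are shown as an alternative to indexing by length.

```r
# Hypothetical example vector (not part of the dataset below)
x <- c(300, 230, 400)

# First and last element via indexing
first_by_index <- x[1]
last_by_index  <- x[length(x)]

# Same result with base R helpers; n = 1 keeps a single element
first_by_head <- head(x, n = 1)
last_by_tail  <- tail(x, n = 1)
```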
I have the following dataset (in reality much larger):
a<-(c("2013-02-25","2013-03-13","2013-04-24","2013-05-12","2013-07-12","2013-08-11","actual_exam_date"))
b<-c(300,230,400,NA,NA,NA,"2013-04-30")
c<-c(NA,260,410,420,NA,NA,"2013-05-30")
d<-c(300,230,400,NA,370,390,"2013-08-30")
df<-as.data.frame(rbind(b,c,d))
colnames(df)<-a
rownames(df)<-(c("student 1","student 2","student 3"))
df$student_id <- row.names(df)
library('reshape2')
df2 <- melt(df, id.vars = c('student_id','actual_exam_date'),
            variable.name = 'pretest_date',
            value.name = 'pretest_score')
df2 <- df2[!is.na(df2$pretest_score),]
df2$actual_exam_date <- as.Date(df2$actual_exam_date)
df2$pretest_date <- as.Date(df2$pretest_date)
df2$days_before_exam <- as.integer(df2$actual_exam_date - df2$pretest_date)
df2$pretest_score <- as.numeric(df2$pretest_score)
df2
The way I was able to calculate the maximum scores for each student was this:
aggregate(pretest_score ~ student_id, df2, max)
Now I am looking to identify the first and last pretest scores for each student, to calculate the difference between them. Is there a way to do it using aggregate?
Replace max with head or tail in aggregate(pretest_score ~ student_id, df2, max)
In the above example, it produces three columns of pretest scores, not one. Not sure why. – Oposum
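One likely reason for getting more than one value per student is that head and tail return up to six elements by default (n = 6). Passing n = 1 through aggregate restricts each summary to a single value. The sketch below is one possible approach, not a definitive one. It rebuilds a small long-format df2 from the non-missing scores in the question so that it runs on its own, and the names first_score, last_score, res and score_change are made up for illustration.

```r
# Long-format data equivalent to the non-missing rows of df2 in the question
df2 <- data.frame(
  student_id    = c(rep("student 1", 3), rep("student 2", 3), rep("student 3", 5)),
  pretest_date  = as.Date(c("2013-02-25", "2013-03-13", "2013-04-24",
                            "2013-03-13", "2013-04-24", "2013-05-12",
                            "2013-02-25", "2013-03-13", "2013-04-24",
                            "2013-07-12", "2013-08-11")),
  pretest_score = c(300, 230, 400, 260, 410, 420, 300, 230, 400, 370, 390)
)

# Sort so that "first" and "last" follow pretest_date within each student
df2 <- df2[order(df2$student_id, df2$pretest_date), ]

# Extra arguments after FUN are passed on to it, so n = 1 reaches head()/tail()
first_score <- aggregate(pretest_score ~ student_id, df2, head, n = 1)
last_score  <- aggregate(pretest_score ~ student_id, df2, tail, n = 1)

# Put both in one table and compute the change from first to last pretest
res <- merge(first_score, last_score, by = "student_id",
             suffixes = c("_first", "_last"))
res$score_change <- res$pretest_score_last - res$pretest_score_first
res
```

Sorting first matters because head and tail only see the rows in the order they appear within each group.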