Functions for finding first and last variables in the vector

Question

I am looking for an R function to find first and last variables in the vector similar to min for minimal and max for maximal. I know I can calculate the length of the vector, and go from there but it does not work for what I need it for.

I have a following dataset (in reality much larger):

a<-(c("2013-02-25","2013-03-13","2013-04-24","2013-05-12","2013-07-12","2013-08-11","actual_exam_date"))
b<-c(300,230,400,NA,NA,NA,"2013-04-30")
c<-c(NA,260,410,420,NA,NA,"2013-05-30")
d<-c(300,230,400,NA,370,390,"2013-08-30")
df<-as.data.frame(rbind(b,c,d))
colnames(df)<-a
rownames(df)<-(c("student 1","student 2","student 3"))
df$student_id <- row.names(df)
library('reshape2')
df2 <- melt(df, id.vars = c('student_id','actual_exam_date'),
                variable.name = 'pretest_date',
                value.name = 'pretest_score')
df2 <- df2[!is.na(df2$pretest_score),]
df2$actual_exam_date <- as.Date(df2$actual_exam_date)
df2$pretest_date <- as.Date(df2$pretest_date)
df2$days_before_exam <- as.integer(df2$actual_exam_date - df2$pretest_date)
df2$pretest_score <- as.numeric(df2$pretest_score)
df2

The way I was able to calculate the maximum scores for each student was this:

aggregate(pretest_score ~ student_id, df2, max)

Now I am looking to identify the first and last pretest scores for each student, to calculate the difference between them. Is there a way to do it using aggregate?

@Jilber if I subsetitute max with head or tail in aggregate(pretest_score ~ student_id, df2, max) in the above example, it produces three, not one column of pretest scores..not sure why — Oposum
Now that I think about my answer I do not know why I guessed that the first and the last should be equal to the min and max... Even though you should be more precise about first and last. "First" and "last" in relation to the date or first and last row of the dateset? — SabDeM

David Arenburg David Arenburg · Accepted Answer · 2016-01-25T22:43:04

Just another suggestion using data.table which allows you both order (if needed) and get the result without specifying anonymous functions and in a single line

library(data.table)
setDT(df2)[order(pretest_date), diff(pretest_score[c(1, .N)]), keyby = student_id]
#    student_id  V1
# 1:  student 1 100
# 2:  student 2 160
# 3:  student 3  90

Functions for finding first and last variables in the vector

5 Answers