I am fitting a linear model to this data:
data <- data.frame(
  Student_ID = c(1,1,1,2,2,3,3,3,3,3,4,4,4,5,6,6,7,7,7,8,8),
  Years_Attended = c(1991,1992,1995,1992,1993,1991,1992,1993,1994,1995,1993,1994,1995,1995,1993,1995,1990,1995,2000,1995,1996),
  Class = c("A","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","C","C","C","C","C"),
  marks = c(50,55,46,44,60,66,67,80,91,90,70,75,76,77,77,82,89,88,88,64,65))
The goal is to create a new column, marks.change, that holds the change in marks per year for each student, i.e., the slope of a linear regression of marks on Years_Attended. I fit the model as follows:
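library(dplyr)

data2 <- data %>%
  group_by(Student_ID) %>%
  summarise(
    # number of non-missing marks for this student
    Good.marks = sum(!is.na(marks)),
    # slope of marks on year; 0 when there are fewer than two marks
    marks.change = ifelse(Good.marks > 1,
                          summary(lm(marks ~ Years_Attended))$coefficients[2, 1], 0),
    # Student_ID is kept automatically because it is the grouping column
    Class = unique(Class)
  )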
This code works fine. However, instead of using all the years at once, I would like to fit the same model (i.e., the part where I say “marks.change =…”) separately for each interval between consecutive years and then average the results. That is, for each student I would like to fit the model on 1991 and 1992 only, then on 1992 and 1993, then on 1993 and 1994, and so on up to the final year, and store the average of these slopes in a new column called marks.change.part2.
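To make the calculation concrete, here is a rough base R sketch of what I am after. It rests on a few assumptions of mine: an "interval" means two consecutive observed years for the same student (so a gap such as 1992 to 1995 counts as one interval), a student with fewer than two marks gets 0 as in marks.change, and no student has the same year twice. The helper name pairwise_slope_mean and the result name data3 are made up for illustration.

```r
# Same data as above, repeated so the sketch runs on its own.
data <- data.frame(
  Student_ID = c(1,1,1,2,2,3,3,3,3,3,4,4,4,5,6,6,7,7,7,8,8),
  Years_Attended = c(1991,1992,1995,1992,1993,1991,1992,1993,1994,1995,
                     1993,1994,1995,1995,1993,1995,1990,1995,2000,1995,1996),
  Class = c("A","A","A","A","A","A","A","A","A","A",
            "B","B","B","B","B","B","C","C","C","C","C"),
  marks = c(50,55,46,44,60,66,67,80,91,90,70,75,76,77,77,82,89,88,88,64,65))

# Hypothetical helper: fit lm() on each pair of consecutive observed years
# for one student and return the mean of the fitted slopes.
pairwise_slope_mean <- function(years, marks) {
  ok <- !is.na(years) & !is.na(marks)   # drop missing observations
  years <- years[ok]
  marks <- marks[ok]
  ord <- order(years)                   # make sure years are increasing
  years <- years[ord]
  marks <- marks[ord]
  n <- length(marks)
  if (n < 2) return(0)                  # assumption: same fallback as marks.change
  slopes <- vapply(seq_len(n - 1), function(i) {
    idx <- c(i, i + 1)                  # one interval: observation i and i + 1
    unname(coef(lm(marks[idx] ~ years[idx]))[2])
  }, numeric(1))
  mean(slopes)
}

# Apply the helper to each student separately.
res <- sapply(split(data, data$Student_ID), function(d) {
  pairwise_slope_mean(d$Years_Attended, d$marks)
})

data3 <- data.frame(Student_ID = as.numeric(names(res)),
                    marks.change.part2 = unname(res))
```

For example, Student 1 has slopes of 5 (1991 to 1992) and -3 (1992 to 1995), so marks.change.part2 should come out at about 1. Since a line through two points has slope equal to the difference in marks divided by the difference in years, the lm() call inside the helper could presumably be replaced by diff(marks) / diff(years), but I kept lm() to mirror the original model.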
Is there an easier way to automate this?