1
votes

I split the data set into training and test sets as follows:

# Randomly assign each row to group 1 or 2 (length.out also covers an odd number of rows)
splitdata <- split(sb, sample(rep(1:2, length.out = nrow(sb))))
test <- splitdata[[1]]
train <- splitdata[[2]]

sb is the name of the original data set, so this gives a 50/50 split between training and test.

Then I fitted a glm using the training set.

fitglm <- glm(num_claims ~ year + vt + va + public + pri_bil + persist + penalty_pts + num_veh + num_drivers + married + gender + driver_age + credit + col_ded + car_den, family = poisson, data = train)

Now I want to predict with this GLM, say for the next 10 observations.

I am having trouble specifying newdata in predict().

I tried:

pred<-predict(fitglm,newdata=data.frame(train),type="response", se.fit=T)

This gives as many predictions as there are rows in the training set, not 10.

Finally, how can I plot these predictions with confidence intervals?

Thank you for the help

1
A reproducible example tinyurl.com/reproducible-000 would be strongly preferred. Also, you say what you tried, but you don't say what went wrong: did you get an error (if so, what was it)? Were the results wrong/not what you expected (if so, how do you know)? - Ben Bolker
+ You do mean to predict on the test data, not the training data, I presume? And what do you mean by "the next 10 observations"? - Stephen Henderson

1 Answer

5
votes

If you are asking how to construct predictions for the next 10 observations, taken from the test set, then:

pred10 <- predict(fitglm, newdata = test[1:10, ], type = "response", se.fit = TRUE)
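
For the plotting part of the question, here is a minimal sketch rather than a definitive recipe. Since sb is not available, it uses simulated data; the names toy, x, y, fit, new10 and ci are made up for illustration. One common approach, assumed here, is to predict on the link scale, form approximate 95% Wald intervals there, and back-transform with the inverse link so the limits stay positive. These are intervals for the predicted mean, not prediction intervals for individual counts.

```r
# Simulated stand-in for sb; toy, x and y are hypothetical names
set.seed(1)
toy <- data.frame(x = runif(200))
toy$y <- rpois(200, lambda = exp(0.5 + 1.2 * toy$x))

# 50/50 split by sampled row indices
idx   <- sample(nrow(toy), size = nrow(toy) %/% 2)
train <- toy[idx, ]
test  <- toy[-idx, ]

fit <- glm(y ~ x, family = poisson, data = train)

# Predict on the link (log) scale for the first 10 test rows
new10 <- test[1:10, ]
p     <- predict(fit, newdata = new10, type = "link", se.fit = TRUE)

# Approximate 95% Wald interval on the link scale, then back-transform
z       <- qnorm(0.975)
linkinv <- family(fit)$linkinv
ci <- data.frame(
  fit   = linkinv(p$fit),
  lower = linkinv(p$fit - z * p$se.fit),
  upper = linkinv(p$fit + z * p$se.fit)
)

# Points for the predicted means, vertical bars for the intervals
obs <- seq_len(nrow(ci))
plot(obs, ci$fit, pch = 19, ylim = range(ci$lower, ci$upper),
     xlab = "Observation", ylab = "Predicted mean count")
arrows(obs, ci$lower, obs, ci$upper, angle = 90, code = 3, length = 0.05)
```

With the real model, the same steps should apply after replacing fit with fitglm and new10 with test[1:10, ].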