I'm working on a project that would show the potential influence a group of events have on an outcome. I'm using the glmnet() package, specifically using the Poisson feature. Here's my code:
# de <- data imported from sql connection
x <- model.matrix(~.,data = de[,2:7])
y <- (de[,1])
reg <- cv.glmnet(x,y, family = "poisson", alpha = 1)
reg1 <- glmnet(x,y, family = "poisson", alpha = 1)
**Co <- coef(?reg or reg1?,s=???)**
summ <- summary(Co)
c <- data.frame(Name= rownames(Co)[summ$i],
Lambda= summ$x)
c2 <- c[with(c, order(-Lambda)), ]
The beginning imports a large amount of data from my database in SQL. I then put it in matrix format and separate the response from the predictors.
This is where I'm confused: I can't figure out exactly what the difference is between the glmnet() function and the cv.glmnet() function. I realize that the cv.glmnet() function is a k-fold cross-validation of glmnet(), but what exactly does that mean in practical terms? They provide the same value for lambda, but I want to make sure I'm not missing something important about the difference between the two.
I'm also unclear as to why it runs fine when I specify alpha=1 (supposedly the default), but not if I leave it out?
Thanks in advance!
plot(reg)
. – Rolandfit$lambda.min
and use it with thes=lambda.min
parameter in all calls topredict()
,coef()
etc. – smci