3
votes

I have a dataset with about 1,000 features and about 30,000 rows. Most of the values are zeros, so I am currently storing the data in a sparse matrix. What I would like to do is perform column-wise logistic regression: each feature, on its own, against the dependent variable.

My question is: how do you perform logistic regression on sparse matrices? I stumbled upon the glmnet package, but it requires a minimum of 2 columns. Here is some sample code:

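library(glmnet)
# One-column predictor matrix and a response of matching length
x = matrix(rnorm(100*1), 100, 1)
y = rnorm(100)
# Fails: glmnet expects x to have at least 2 columns
glmnet(x, y)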

This gives me an error. I was wondering if there is any other package that I might have missed?

Any help will be appreciated. Thanks all

1
Why not use lapply? Would that be of use? - LyzandeR
I think my question was not clear. I have rephrased it. Apologies for the confusion - Abhi
It can happen with a sparse.model.matrix but I don't know if this is what you are looking for. - LyzandeR

1 Answer

2
votes

This is more of a workaround than a solution. You can add a column of 1s (cbind(1, x)) to the one-column matrix. This new column will be used to estimate the intercept, so you have to set the argument intercept = FALSE.

glmnet(cbind(1, x), y, intercept = FALSE)
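
For the column-wise logistic case in the question, here is a rough sketch that follows the lapply idea from the comments. It swaps glmnet for base R's glm, on the assumption that unpenalized per-feature fits are acceptable. The toy data (X, y) and the names fits and slopes are made up for illustration. Only one column is converted to a dense vector at a time, so the full matrix stays sparse.

```r
library(Matrix)

set.seed(1)
# Hypothetical toy data: 200 rows, 5 sparse features, binary response
X <- rsparsematrix(200, 5, density = 0.1)
y <- rbinom(200, 1, 0.5)

# Fit one logistic regression per column.
# X[, j] extracts a single column as an ordinary numeric vector.
fits <- lapply(seq_len(ncol(X)), function(j) {
  xj <- as.numeric(X[, j])
  glm(y ~ xj, family = binomial())
})

# Collect the slope estimate for each feature
slopes <- vapply(fits, function(f) unname(coef(f)["xj"]), numeric(1))
```

With 1,000 columns this is simply 1,000 small glm fits. If a penalized fit is wanted instead, the cbind(1, x) workaround above could be tried inside the same loop with family = "binomial", though that variant is untested here.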