0
votes

I'm working on a data set and want to use some of the following variables to predict "Operatieduur". All the predictors have been converted to factors.

LogicFit <- train(Operatieduur ~ Anesthesioloog + Aorta_chirurgie + Benadering +
                    Chirurg + Operatietype, data = TrainData,
                  method="glm", family="binomial")

Here I use the "train" function from the caret package to fit a logistic regression with glm. When I ran this code, I got the error message:

1: model fit failed for Resample01: parameter=none Error in eval(family$initialize) : y values must be 0 <= y <= 1

I googled it and found that the cause is that the response "Operatieduur" is a continuous numerical value (it's a duration). So how should I modify the call to use the predictors (they are all categorical) to predict a continuous numerical value? Can a logistic function do that?

1
If your predictors were continuous, it might make sense to fit a logistic curve using nls() (i.e., assume that there is a minimum value (possibly fixed to 0) and a maximum value (usually not fixed to 1), and that increases in any of the predictors lead to sigmoid increasing curves). Can you give us more context or tell us more about your variables? - Ben Bolker

1 Answer

2
votes

Logistic regression predicts categories (class membership), not continuous numerical values. If you want to predict a continuous numerical variable, even from categorical predictors, use ordinary linear regression. Depending on the number of categories in your predictor variables, you may want to consider one-hot/dummy encoding.
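
As a rough sketch of what that could look like, the snippet below fits an ordinary linear regression with base R's lm() on simulated data. The data frame, the factor levels ("A", "B", "C", "CABG", "AVR") and the effect sizes are invented stand-ins for illustration, not the asker's data, and only two of the predictors are used. Note that R's formula interface dummy-encodes factor predictors automatically, so manual encoding is usually not needed here.

```r
# Simulated stand-in for TrainData (hypothetical levels and values)
set.seed(1)
n <- 200
TrainData <- data.frame(
  Chirurg      = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  Operatietype = factor(sample(c("CABG", "AVR"), n, replace = TRUE))
)

# Continuous response: a made-up duration in minutes
TrainData$Operatieduur <- 180 +
  30 * (TrainData$Chirurg == "B") +
  60 * (TrainData$Operatietype == "AVR") +
  rnorm(n, sd = 20)

# Ordinary linear regression; factors are expanded to dummy variables
LinFit <- lm(Operatieduur ~ Chirurg + Operatietype, data = TrainData)
print(summary(LinFit))

# Predicted duration for a new case with known factor levels
NewData <- data.frame(
  Chirurg      = factor("B",   levels = levels(TrainData$Chirurg)),
  Operatietype = factor("AVR", levels = levels(TrainData$Operatietype))
)
Pred <- predict(LinFit, newdata = NewData)
print(Pred)

# With caret, the analogous call would swap the method (and drop family):
# train(Operatieduur ~ Chirurg + Operatietype, data = TrainData, method = "lm")
```

The coefficients are then differences in expected duration relative to each factor's reference level. If the durations are strongly right-skewed, modelling log(Operatieduur) instead may be worth considering, but that is a modelling choice to check against the actual data.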