0
votes

Sorry for the newbie question; I am new to R and couldn't find an answer anywhere. I am using the caret package. I have a data set of 3000 observations sampled from a larger data set, and I am trying to train a naive Bayes (NB) classifier on it with the following code:

model_nb_2002= train(trainingdata_2002$CLA_2.CANCELER ~., data=trainingdata_2002, method="nb",trControl=fitCtrl, metric="Accuracy")

but I always get the following error message:

In eval(expr, envir, enclos) : model fit failed for Fold10.Rep05: usekernel=FALSE, fL=0 Error in NaiveBayes.default(x, y, usekernel = param$usekernel, fL = param$fL, : Zero variances for at least one class in variables: NUM_0.HH_IM_HAUS10, NUM_0.HH_IM_HAUS12, NUM_0.HH_IM_HAUS13, NUM_0.HH_IM_HAUS137, NUM_0.HH_IM_HAUS14, NUM_0.HH_IM_HAUS15, NUM_0.HH_IM_HAUS16, NUM_0.HH_IM_HAUS17, NUM_0.HH_IM_HAUS18, NUM_0.HH_IM_HAUS19, NUM_0.HH_IM_HAUS20, NUM_0.HH_IM_HAUS21, NUM_0.HH_IM_HAUS22, NUM_0.HH_IM_HAUS23, NUM_0.HH_IM_HAUS24, NUM_0.HH_IM_HAUS25, NUM_0.HH_IM_HAUS26, NUM_0.HH_IM_HAUS27, NUM_0.HH_IM_HAUS28, NUM_0.HH_IM_HAUS29, NUM_0.HH_IM_HAUS30, NUM_0.HH_IM_HAUS31, NUM_0.HH_IM_HAUS32, NUM_0.HH_IM_HAUS33, NUM_0.HH_IM_HAUS34, NUM_0.HH_IM_HAUS35, NUM_0.HH_IM_HAUS36, NUM_0.HH_IM_HAUS37, NUM_0.HH_IM_HAUS38, NUM_0.HH_IM_HAUS39, NUM_0.HH_IM_HAUS40, NUM_0.HH_IM_HAUS41, NUM_0.HH_IM_HAUS42, NUM_0.HH_IM_HAUS43, NUM_0.HH_IM_HAUS44, NUM_0.HH_IM_HAUS45, NUM_0.HH_IM_HAUS46, NUM_0.HH_IM_HAUS47, NUM_0.HH_IM_HAUS49, NUM_0.HH_IM_HAUS52, NUM_0. [... truncated]

I have no idea which attribute is causing this problem. If I understand it correctly, at least one attribute has zero variance within a class, so the classifier cannot estimate the variance it needs for prediction. Any help would be appreciated.

2
The formula argument doesn't need an explicit call to variables through the $ operator. Try train(CLA_2.CANCELER ~ ., ...). Can you show at least the summary of your data? - Roman Luštrik
@RomanLuštrik Here is the summary. - mesba713

2 Answers

0
votes

@Roman Here is the summary:

  ID        NOM_N.PAYMENT_TYPE NUM_0.POWER_CONSUMPTION
 Min.   :   39   Min.   :1.000      Min.   :    0          
 1st Qu.:10053   1st Qu.:1.000      1st Qu.: 1304          
 Median :20409   Median :2.000      Median : 2170          
 Mean   :19849   Mean   :1.738      Mean   : 2597          
 3rd Qu.:29781   3rd Qu.:2.000      3rd Qu.: 3452          
 Max.   :38580   Max.   :3.000      Max.   :31062          

 NUM_0.HH_IM_HAUS NUM_0.GEWERBE_IM_HAUS ORD_R.REGIOTYP
 1      :711      0      :1309          ?:136         
 2      :304      1      : 283          1: 22         
 ?      :136      ?      : 136          2:156         
 3      :135      2      :  49          3:295         
 4      :125      3      :  16          4:363         
 5      : 97      4      :   3          5:263         
 (Other):293      (Other):   5          6:566         
 ORD_R.KAUFKRAFT ORD_R.STRTYP ORD_R.BEBAU  ORD_P.STATUS
 2      :373     ?: 136       ?:158       1      :250  
 3      :282     1:1000       1:999       2      :248  
 4      :263     2: 125       2:361       3      :229  
 1      :262     3: 349       3:246       4      :226  
 5      :204     4: 180       4: 30       5      :192  
 ?      :136     5:  11       5:  7       6      :148  
 (Other):281                              (Other):508  
 ORD_P.BONITAET  ORD_P.ANTDT  ORD_P.ALTERSTR ORD_P.FAMILIEN
 1      :338    9      :373   4      :594    7      :259   
 2      :289    8      :320   5      :534    6      :243   
 4      :198    7      :254   6      :232    5      :232   
 3      :195    6      :233   3      :213    8      :207   
 5      :170    5      :170   ?      :136    4      :161   
 ?      :136    ?      :136   2      : 48    2      :156   
 (Other):475    (Other):315   (Other): 44    (Other):543   
  ORD_P.PKW_DI ORD_P.PKW_LEIST ORD_P.KLBUS ORD_P.GEBRAUCHT
 10     :454   1      :592     ?:374       6      :245    
 9      :269   2      :401     1:779       4      :223    
 8      :210   3      :252     2:436       5      :217    
 7      :158   4      :160     3:212       2      :215    
 6      :152   ?      :136                 3      :205    
 ?      :136   5      : 89                 7      :192    
 (Other):422   (Other):171                 (Other):504    
 ORD_P.GELAEND ORD_P.PSYCHONOMICS_1 ORD_P.PSYCHONOMICS_2
 ?:136         ?:388                ?:367               
 1:848         1:342                1:427               
 2:354         2:227                2:192               
 3:294         3:254                3:178               
 4:169         4:232                4:248               
               5:358                5:389               

 ORD_P.PSYCHONOMICS_3 ORD_P.PSYCHONOMICS_4
 ?:388                ?:386               
 1:154                1:293               
 2:276                2:263               
 3:282                3:256               
 4:368                4:282               
 5:333                5:321               

 ORD_P.PSYCHONOMICS_5 ORD_P.PSYCHONOMICS_6
 ?:367                ?:386               
 1:541                1:287               
 2:271                2:249               
 3:203                3:354               
 4:180                4:231               
 5:239                5:294               

 ORD_P.PSYCHONOMICS_7 ORD_P.PSYCHONOMICS_8  ORD_P.PHARM1
 ?:388                ?:388                7      :365  
 1:331                1:425                ?      :343  
 2:277                2:355                6      :261  
 3:253                3:259                1      :216  
 4:298                4:195                5      :200  
 5:254                5:179                4      :159  
                                           (Other):257  
  ORD_P.PHARM2  ORD_P.PHARM3  ORD_P.PHARM4  ORD_P.PHARM5
 1      :663   7      :566   ?      :343   ?      :343  
 2      :383   ?      :343   1      :329   1      :329  
 ?      :343   1      :289   4      :311   2      :213  
 3      :147   6      :156   5      :300   6      :208  
 4      :106   3      :154   2      :201   5      :205  
 6      : 64   2      :115   3      :137   3      :201  
 (Other): 95   (Other):178   (Other):180   (Other):302  
  ORD_P.PHARM6 CLA_2.CANCELER
 7      :534   1: 165        
 6      :427   2:1636        
 ?      :343                 
 5      :176                 
 4      : 97                 
 3      : 81                 
 (Other):143
-1
votes

Convert the variables to factors instead of numeric. If possible, read the data in as factors rather than numeric in the first place.

trainingdata_2002$NOM_N.PAYMENT_TYPE <- as.factor(trainingdata_2002$NOM_N.PAYMENT_TYPE)
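The names in the error (for example NUM_0.HH_IM_HAUS10) look like dummy columns built from the levels of the factor NUM_0.HH_IM_HAUS, which is what a formula interface typically produces. A rare level gives a dummy column that is constant within one class, hence the zero variance. Below is a minimal base-R sketch of how one might list such columns. The data frame `toy` and its columns `cls`, `house` and `power` are made up for illustration and only stand in for trainingdata_2002.

```r
# Hypothetical toy data standing in for trainingdata_2002.
toy <- data.frame(
  cls   = factor(c("1", "1", "1", "2", "2", "2")),
  house = factor(c("1", "2", "1", "1", "2", "10")),
  power = c(1300, 2100, 2500, 900, 3400, 1800)
)

# Expand factors into dummy columns, as a formula interface would,
# and drop the intercept column.
x <- model.matrix(cls ~ ., data = toy)[, -1, drop = FALSE]

# Variance of every column within each class (rows = classes).
class_var <- apply(x, 2, function(col) tapply(col, toy$cls, var))

# Columns that are constant within at least one class.
zero_var_cols <- colnames(class_var)[apply(class_var == 0, 2, any)]
print(zero_var_cols)
```

Here level "10" of `house` never occurs in class "1", so its dummy column is flagged. If the same thing is happening in your data, it may help to pass the predictors and the outcome separately, i.e. train(x = ..., y = ...) instead of a formula, so that factors are not expanded into dummy columns. I have not run this against your data, so treat it as something to try rather than a confirmed fix.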