I am building a GBM model using H2O. The training data is randomly split into 70% development data (DEV) and 30% in-time validation data (VAL). The training data has a 1.4% bad rate, and I also need to assign a weight to each observation (the data has a weight column). What I observe is that the model built with weights performs much better on DEV than the model built without weights. The model built with weights also shows a large performance gap between DEV and VAL. For instance, the model built with weights shows the following top 10% capture rates:
DEV: 56%
Validation: 25%
The model built without weights shows the following top 10% capture rates:
DEV: 35%
Validation: 23%
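To make the metric explicit, here is a minimal sketch of how a top 10% capture rate can be computed, in plain Python rather than through any H2O call. The function name `top_capture_rate` and the choice to define the top 10% by cumulative weight are assumptions for illustration; the numbers above may have been computed differently.

```python
def top_capture_rate(scores, labels, weights=None, top_frac=0.10):
    """Share of (weighted) bads that fall in the top `top_frac` of scores.

    Hypothetical helper: labels are 1 for bad, 0 for good.
    """
    if weights is None:
        weights = [1.0] * len(scores)
    # Rank observations from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_weight = sum(weights)
    total_bad = sum(w * y for w, y in zip(weights, labels))
    # The top bucket is defined by cumulative weight, not by row count.
    cutoff = top_frac * total_weight
    seen_weight = 0.0
    captured_bad = 0.0
    for i in order:
        if seen_weight >= cutoff:
            break
        seen_weight += weights[i]
        captured_bad += weights[i] * labels[i]
    return captured_bad / total_bad


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
weights = [1, 1, 1, 1, 8, 1, 1, 1, 1, 1]

unweighted_rate = top_capture_rate(scores, labels)
weighted_rate = top_capture_rate(scores, labels, weights)
print(unweighted_rate, weighted_rate)
```

The same scores give different capture rates depending on whether weights enter the calculation, so the weighted and unweighted models are only comparable if the metric is computed the same way for both.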
It seems that using weights in this case helped model performance on both development and in-time validation data. How exactly are weights used in H2O? When weights are used in model building, does the larger performance gap between DEV and VAL indicate greater instability of the GBM model in H2O?
In the log loss chart, the blue curve is DEV and the orange curve is VAL. In the unweighted case, the log loss for DEV and VAL starts from the same point, whereas in the weighted case it starts from two different points. How should I interpret this log loss chart, and why do weights in H2O GBM produce such different log loss output?
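For reference, the sketch below shows what a weighted log loss looks like under the assumption that it is the weight-normalized average of the per-row losses, and that the first point on the curve corresponds to a constant prediction equal to the (weighted) bad rate. Both assumptions, and the helper names `weighted_log_loss` and `constant_prior`, are for illustration only and are not taken from H2O's code.

```python
import math


def weighted_log_loss(labels, probs, weights=None, eps=1e-15):
    """Weight-normalized binary log loss (hypothetical helper)."""
    if weights is None:
        weights = [1.0] * len(labels)
    total = 0.0
    for y, p, w in zip(labels, probs, weights):
        # Clip probabilities so log() never receives 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


def constant_prior(labels, weights=None):
    """(Weighted) mean of the labels, used as the starting prediction."""
    if weights is None:
        weights = [1.0] * len(labels)
    return sum(w * y for w, y in zip(weights, labels)) / sum(weights)


labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
weights = [5.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# Starting point without weights: every row gets the plain bad rate.
p0 = constant_prior(labels)
start_unweighted = weighted_log_loss(labels, [p0] * len(labels))

# Starting point with weights: the prior and the metric both use the weights.
p0_w = constant_prior(labels, weights)
start_weighted = weighted_log_loss(labels, [p0_w] * len(labels), weights)

# Mixed case: weighted prior, but the metric evaluated without weights.
start_mixed = weighted_log_loss(labels, [p0_w] * len(labels))

print(start_unweighted, start_weighted, start_mixed)
```

Under these assumptions, the starting log loss depends on whether weights enter the prior, the metric, or both, so two frames can start from different points if the weights are applied to them differently or have a different distribution in each frame.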