I have a dataset in which each instance has a different weight; in my case the weights differ by orders of magnitude. I want to train a regression model that takes these weights into account. So far I've tried the following model types: 'BOOSTED_TREE_CLASSIFIER', 'BOOSTED_TREE_REGRESSOR', 'DNN_CLASSIFIER', 'DNN_REGRESSOR'. In the documentation (XGB Models, DNN Models) there seems to be no way to designate a column as the weights. (It is possible to define "class weights" for classifiers, but that is a totally different thing.)
As a workaround, I created an up-sampled dataset instead of using weights. However, this approach has two major disadvantages:
- It drastically increases the dataset size, which translates directly into higher model training costs.
- Because the weights differ so much, I am forced to remove samples with low weights in order to keep the up-sampled dataset at a "reasonable size" (about 10-20 times larger than the original dataset). As a result, an important piece of the signal is lost.
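To make the size problem concrete, here is a minimal sketch of the kind of up-sampling I mean. It is only an illustration under assumptions: the `upsample` helper, the `min_weight` cutoff, and the toy rows and weights are all hypothetical, not my actual pipeline or data.

```python
def upsample(rows, weights, min_weight):
    """Replicate each row round(w / min_weight) times.

    Rows whose weight is below min_weight are dropped entirely
    (hypothetical cutoff; this is where signal gets lost).
    """
    out = []
    for row, w in zip(rows, weights):
        if w < min_weight:
            continue  # low-weight row is discarded
        copies = round(w / min_weight)
        out.extend([row] * copies)
    return out


# Toy data: weights span five orders of magnitude (made-up values).
rows = ["a", "b", "c", "d"]
weights = [0.01, 1.0, 10.0, 1000.0]

# Keeping every row means scaling by the smallest weight.
full = upsample(rows, weights, min_weight=0.01)

# Raising the cutoff shrinks the result but drops row "a".
trimmed = upsample(rows, weights, min_weight=1.0)

print(len(rows), len(full), len(trimmed))
```

With these toy numbers, keeping every row turns 4 rows into 101101, while raising the cutoff to 1.0 gives 1011 rows but discards the lowest-weight row completely.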
Most DNN and XGBoost libraries I have seen do support a SAMPLE_WEIGHT parameter, for both classification and regression models (here is an example).
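For context, my understanding is that a per-row weight simply scales each row's contribution to the training loss. The sketch below assumes a weighted mean squared error purely for illustration (the real objective depends on the library and model type); `weighted_mse` is a hypothetical helper, not a function from any library, and the numbers are made up.

```python
import math


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error where each row counts in proportion to its weight."""
    total = sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return total / sum(weights)


# Toy targets, predictions, and integer weights (made-up values).
y_true = [1.0, 2.0, 4.0]
y_pred = [1.5, 2.0, 3.0]
weights = [3, 1, 2]

# Loss computed directly from the weights, with no extra rows.
weighted = weighted_mse(y_true, y_pred, weights)

# The same loss computed on an up-sampled copy with every weight set to 1.
rep_true = [t for t, w in zip(y_true, weights) for _ in range(w)]
rep_pred = [p for p, w in zip(y_pred, weights) for _ in range(w)]
replicated = weighted_mse(rep_true, rep_pred, [1] * len(rep_true))

print(math.isclose(weighted, replicated))
```

For integer weights the two losses agree, which is why a weights column would let me avoid materializing the replicated rows at all.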
Is there a way to use sample weights with XGBoost and DNN models (both regressor and classifier model types) in BigQuery ML?