SageMaker XGBoost hyperparameter tuning versus XGBoost python package

Question

I am trying to tune the hyperparameters of an XGBoost model. I started with AWS SageMaker hyperparameter tuning, using the following fixed hyperparameters and parameter ranges:

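from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

# xgb is the SageMaker estimator for the XGBoost image, created earlier (not shown)
xgb.set_hyperparameters(eval_metric='auc',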
                        objective='binary:logistic',
                        early_stopping_rounds=500,
                        rate_drop=0.1,
                        colsample_bytree=0.8,
                        subsample=0.75,
                        min_child_weight=0)

hyperparameter_ranges = {'eta': ContinuousParameter(0.01, 0.3),
                         'lambda': ContinuousParameter(0.1, 2),
                         'alpha': ContinuousParameter(0.5, 2),
                         'max_depth': IntegerParameter(5, 10),
                         'num_round': IntegerParameter(500, 2000)}

objective_metric_name = 'validation:auc'

tuner = HyperparameterTuner(xgb,
                            objective_metric_name,
                            hyperparameter_ranges,
                            max_jobs=10,  
                            max_parallel_jobs=3,
                            tags=[{'Key': 'Application', 'Value': 'cxxx'}])

The best model from the tuning job had the following hyperparameters:

{
  "alpha": "1.4009334471163981",
  "eta": "0.05726016655019904",
  "lambda": "1.2070623852474922",
  "max_depth": "7",
  "num_round": "1052"
}

Out of curiosity, I plugged these hyperparameters into the xgboost Python package, like this:

import xgboost as xgb  # here xgb is the xgboost module, not the SageMaker estimator above

xgb_model = xgb.XGBClassifier(max_depth=7,
                              silent=False,
                              random_state=42,
                              n_estimators=1052,
                              learning_rate=0.05726016655019904,
                              objective='binary:logistic',
                              verbosity=1,
                              reg_alpha=1.4009334471163981,
                              reg_lambda=1.2070623852474922,
                              rate_drop=0.1,
                              colsample_bytree=0.8,
                              subsample=0.75,
                              min_child_weight=0)

I retrained the model and found that the result from the Python package is better than the one from SageMaker:

xgboost Python package (AUC on validation set): 0.766
SageMaker best model (AUC on validation set): 0.751

Why does SageMaker perform worse here? If SageMaker usually performs worse than the xgboost Python package, how do people usually do XGBoost hyperparameter tuning? Thanks for any hints!

Are you using the same training set and test set for both SageMaker XGBoost and Python XGBoost? — Varsha
Yes, I am using the same training and test sets for both. — CathyQian

Eric Kim · Accepted Answer · 2020-02-27T21:00:06

My first guess is that the two runs use different versions of XGBoost. Which image are you using? The script-mode-enabled open source XGBoost image uses version 0.90.
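
One way to check this guess is to compare the xgboost version installed locally with the version in the SageMaker image tag. The snippet below is only a rough sketch and not something from the thread itself: the helper names `installed_version` and `same_release` are made up for illustration, and the tag `0.90-1` is an assumed example of an image tag for the 0.90 release mentioned above. Substitute the tag of the image you actually used.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_release(local_version, image_tag):
    """Compare only the major.minor part of a local version and an image tag.

    The tag is assumed to look like '<xgboost version>-<build>', e.g. '0.90-1'.
    """
    def major_minor(version):
        # Drop the assumed '-<build>' suffix, then keep the first two components.
        return tuple(version.split("-")[0].split(".")[:2])

    return major_minor(local_version) == major_minor(image_tag)


local = installed_version("xgboost")
if local is None:
    print("xgboost is not installed in this environment")
else:
    # '0.90-1' is an assumed image tag; replace it with your own.
    print("local xgboost:", local, "| matches image:", same_release(local, "0.90-1"))
```

If the two versions differ, retraining locally with the version pinned to match the image would show whether the AUC gap comes from the library version rather than from the tuning itself.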
