0 votes

I'm following up after this tutorial and wondering if it is possible to combine script mode with hyperparameter tuning. When I try to do so, HPT.fit() runs my script repeatedly (the main() function and then _xgb_train()), but I don't know how to pass the hyperparameters the tuner chooses to the training function. Any ideas?

2 Answers

0 votes

I'm not sure I understand the problem completely; it would be helpful if you could paste the code you used for the HPO setup. Meanwhile, you can take a look at the example below, which explains how to set up hyperparameter tuning. Although it doesn't use script mode, the process should remain the same.

https://github.com/aws/amazon-sagemaker-examples/blob/main/hyperparameter_tuning/xgboost_direct_marketing/hpo_xgboost_direct_marketing_sagemaker_python_sdk.ipynb
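
At a high level, the setup in that notebook boils down to something like the following. This is a condensed sketch, not the notebook verbatim; the role, bucket, and version strings are placeholders you would replace:

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.tuner import ContinuousParameter, IntegerParameter, HyperparameterTuner

session = sagemaker.Session()

# Built-in XGBoost algorithm (no script mode): the container trains directly
xgb = Estimator(
    image_uri=sagemaker.image_uris.retrieve(
        "xgboost", session.boto_region_name, version="1.5-1"
    ),
    role="<your-execution-role>",             # placeholder
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<your-bucket>/output",  # placeholder
)
xgb.set_hyperparameters(objective="binary:logistic", num_round=100)

tuner = HyperparameterTuner(
    xgb,
    objective_metric_name="validation:auc",  # emitted by the built-in algorithm
    hyperparameter_ranges={
        "eta": ContinuousParameter(0, 1),
        "max_depth": IntegerParameter(1, 10),
    },
    objective_type="Maximize",
    max_jobs=9,
    max_parallel_jobs=3,
)

tuner.fit({
    "train": TrainingInput("s3://<your-bucket>/train", content_type="csv"),
    "validation": TrainingInput("s3://<your-bucket>/validation", content_type="csv"),
})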

0 votes

Yes, it is possible to use script mode in hyperparameter tuning jobs. The tutorial Hyperparameter Tuning with the SageMaker TensorFlow Container provides a concrete example of how that works.

Overall the steps look like the following, and I will quote examples from the tutorial to clarify my answer:

  1. Define an estimator and point it to your training script (this is what makes it script mode):
from sagemaker.tensorflow import TensorFlow

estimator = TensorFlow(
    entry_point="train.py",
    source_dir="code",  # directory of your training script
    role=role,
    framework_version="2.3.1",
    model_dir="/opt/ml/model",
    py_version="py37",
    instance_type="ml.m5.4xlarge",
    instance_count=1,
    volume_size=250,
    hyperparameters={
        "batch-size": 512,
        "epochs": 4,
    },
)
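
In script mode, SageMaker passes the hyperparameters (both the static ones above and the values the tuner picks for each trial) to your script as command-line arguments, so the script reads them with argparse and forwards them to its training function; this is exactly how the tuned values reach your code. A rough sketch of the relevant part of a train.py, where the argument names mirror the hyperparameter keys above:

# train.py (sketch): SageMaker script mode passes hyperparameters as
# command-line arguments, including the values the tuner chooses per trial
import argparse
import os


def parse_args():
    parser = argparse.ArgumentParser()
    # static hyperparameters set on the estimator above
    parser.add_argument("--batch-size", type=int, default=512)
    parser.add_argument("--epochs", type=int, default=4)
    # hyperparameter the tuner varies for each training job
    parser.add_argument("--learning-rate", type=float, default=1e-3)
    # locations provided by SageMaker via environment variables
    parser.add_argument("--model_dir", type=str, default=os.environ.get("SM_MODEL_DIR"))
    # assumes a data channel named "training"; adjust to your channel names
    parser.add_argument("--train", type=str, default=os.environ.get("SM_CHANNEL_TRAINING"))
    # parse_known_args tolerates any extra arguments SageMaker appends
    return parser.parse_known_args()


if __name__ == "__main__":
    args, _ = parse_args()
    # ... build the model, then call your training function with
    # args.learning_rate, args.batch_size, args.epochs, etc.
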
  2. Define the range(s) of the hyperparameter(s) you want to optimize:
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

hyperparameter_ranges = {"learning-rate": ContinuousParameter(1e-4, 1e-3)}
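
Note that ContinuousParameter is just one of the available range types; sagemaker.tuner also provides IntegerParameter and CategoricalParameter, so you could widen the search like this (an illustrative combination, not part of the tutorial):

from sagemaker.tuner import CategoricalParameter, ContinuousParameter, IntegerParameter

hyperparameter_ranges = {
    "learning-rate": ContinuousParameter(1e-4, 1e-3),
    "batch-size": CategoricalParameter([128, 256, 512]),
    "epochs": IntegerParameter(2, 8),
}
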
  3. Set up the objective metric name, type, and metric definitions.

The metric_definitions list contains one or more regular expressions that SageMaker uses to parse the training log and extract the metric to optimize:

objective_metric_name = "average test loss"
objective_type = "Minimize"
metric_definitions = [
    {
        "Name": "average test loss",
        "Regex": "Test Loss: ([0-9\\.]+)",
    }
]
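
For that regex to match anything, the training script has to print the metric to the training log in the same format. In this example, train.py would emit a line like the one below after evaluation (a sketch, not the tutorial's exact code):

# in train.py, after evaluating on the test set;
# the printed line must match the regex "Test Loss: ([0-9\\.]+)"
print("Test Loss: {:.6f}".format(test_loss))
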
  4. Use all of the above to set up and run a tuning job:
tuner = HyperparameterTuner(
    estimator,
    objective_metric_name,
    hyperparameter_ranges,
    metric_definitions,
    max_jobs=3,
    max_parallel_jobs=3,
    objective_type=objective_type,
)

# "channels" holds the training/validation data inputs defined earlier in the tutorial
tuner.fit(inputs=channels)
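
After the tuning job completes, you can inspect which trial won and which hyperparameter values it used, for example (assuming SageMaker Python SDK v2):

# name of the best training job found by the tuner
print(tuner.best_training_job())

# per-trial hyperparameters and objective values as a pandas DataFrame
df = tuner.analytics().dataframe()
print(df.sort_values("FinalObjectiveValue").head())  # lowest loss first for a Minimize objective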

The tutorial I linked to above gives a reproducible example of how all these steps work together. You can customize it to fit your needs.