2
votes

https://cloud.google.com/ml/docs/concepts/training-overview mentions the following:

If your trainer application has any dependencies that are not already on the default virtual machines that Cloud ML uses, you must package them and upload them to a Google Cloud Storage location as well.

What is "already on the default virtual machines that Cloud ML uses"? I couldn't find this info anywhere.

Incidentally, are there any published specs of the machine types here? https://cloud.google.com/ml/reference/rest/v1beta1/projects.jobs#ScaleTier


1 Answer

4
votes

The list of packages pre-installed on the Cloud ML machines is still being documented. In the meantime, here is an informal list of the packages and their versions:

numpy==1.10.4
pandas==0.17.1
scipy==0.17.0
scikit-learn==0.17.0
sympy==0.7.6.1
statsmodels==0.6.1
oauth2client==2.2.0
httplib2==0.9.2
python-dateutil==2.5.0
argparse==1.2.1
six==1.10.0
PyYAML==3.11
wrapt==1.10.8
crcmod==1.7
google-api-python-client==1.5.1
python-json-logger==0.1.5
gcloud==0.18.1
subprocess32==3.2.7
wheel==0.30.0a0
WebOb==1.6.2
Paste==2.0.3
tornado==4.3
grpcio==1.0.1
requests==2.9.1
webapp2==3.0.0b1
bs4==0.0.1
Pillow==3.4.1
nltk==3.2.1
python-snappy==0.5
google-cloud-dataflow==0.5.1
google-cloud-logging==0.22.0
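
One way to use the list above is to check a trainer's requirements against it and see what still has to be packaged and uploaded. The following is only a minimal sketch: `PREINSTALLED` copies a few entries from the informal list, `packages_to_upload` and `my-custom-lib` are hypothetical names invented for illustration, and the check handles only plain `name` or `name==version` lines. It is not part of any Cloud ML tooling.

```python
# Hypothetical helper, not a Cloud ML API. PREINSTALLED holds a few
# entries copied from the informal list above; extend it as needed.
PREINSTALLED = {
    "numpy": "1.10.4",
    "pandas": "0.17.1",
    "scipy": "0.17.0",
    "scikit-learn": "0.17.0",
    "six": "1.10.0",
    "requests": "2.9.1",
    "Pillow": "3.4.1",
    "nltk": "3.2.1",
}


def normalize(name):
    """Compare package names case-insensitively, treating '_' as '-'."""
    return name.strip().lower().replace("_", "-")


def packages_to_upload(requirements, preinstalled=PREINSTALLED):
    """Return requirement lines that the informal list does not satisfy.

    Assumption: each line is either 'name' or 'name==version'.
    A bare name is satisfied by any pre-installed version; a pinned
    version is satisfied only by an exact match.
    """
    known = {normalize(k): v for k, v in preinstalled.items()}
    missing = []
    for line in requirements:
        line = line.strip()
        name, _, version = line.partition("==")
        have = known.get(normalize(name))
        if have is None or (version and version != have):
            missing.append(line)
    return missing


if __name__ == "__main__":
    wanted = ["numpy==1.10.4", "pandas", "my-custom-lib", "scipy==0.18.0"]
    print(packages_to_upload(wanted))
```

Anything this returns would need to be packaged and uploaded to Cloud Storage, as the documentation quoted in the question describes. Because the list is informal, treat the result as a hint and not as a guarantee.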

As for the machine types, no specs have been published.