0 votes

I would like to run quite a big Docker image (~6 GB). I can build the Docker image from a config file using the Google Cloud Platform Cloud Shell:

gcloud builds submit --timeout=36000 --tag gcr.io/docker-ml-dl-xxxx/docker-anaconda-env-ml-dl

This works perfectly fine and I can see the build is successful:
https://console.cloud.google.com/cloud-build/

I can also see my image in the Container Registry:
https://console.cloud.google.com/gcr/images/docker-ml-dl-xxxxx

So far so good. The issue is when I try to run this image from Cloud Shell:

xxxxx@cloudshell:~ (docker-ml-dl-xxxxx)$ docker run gcr.io/docker-ml-dl-xxxxx/docker-anaconda-env-ml-dl
Unable to find image 'gcr.io/docker-ml-dl-xxxx/docker-anaconda-env-ml-dl:latest' locally
latest: Pulling from docker-ml-dl-xxxx/docker-anaconda-env-ml-dl
993c50d47469: Pull complete
c71c2bfd82ad: Pull complete
05fbbe050330: Pull complete
5586ce1e5329: Pull complete
1faf1ec50c57: Pull complete
fda25b84aec7: Pull complete
b5b4ca70f42c: Extracting [=======================>                           ]    708MB/1.522GB
0088935a1845: Download complete
36f80eb6aa84: Download complete
b08b38d2d4a3: Download complete
5ae3364fe2cf: Download complete
25da48fc753b: Downloading [==================================================>]  5.857GB/5.857GB
302cfeb76ade: Download complete
1f6d69ed4c84: Download complete
58c798a01f92: Download complete
docker: write /var/lib/docker/tmp/GetImageBlob997013344: no space left on device.
See 'docker run --help'.
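For reference, the remaining disk space in the Cloud Shell session can be checked before pulling (generic diagnostics, not specific to this image):

df -h $HOME
docker system df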

OK, so my Docker image is too big to be run from Cloud Shell.
Is this correct?

What are the other/best options? (To be 100% clear, I can run the Docker image on my Mac.)

  • creating a custom VM with 10 GB of storage
  • installing all the software needed on this VM: docker, gcloud, ... (see the sketch after this list)
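As a rough sketch of that option (the instance name, zone, machine type, and disk size below are placeholders I would still need to validate), the VM could even be created directly from the image using Container-Optimized OS, which ships with Docker preinstalled:

gcloud compute instances create-with-container docker-ml-dl-vm \
    --zone=us-central1-a \
    --machine-type=n1-standard-4 \
    --boot-disk-size=50GB \
    --container-image=gcr.io/docker-ml-dl-xxxxx/docker-anaconda-env-ml-dl

That way the image would be pulled onto the VM's boot disk rather than into Cloud Shell's limited home directory.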

I need to develop and run machine learning and deep learning code (this is the exploration phase, not the deployment phase with Kubernetes).

Is this the best way to work on the cloud?

6 GB is ridiculously huge for a Docker image. Does it really need to be that big? – SiHa
Question has nothing to do with machine-learning - kindly do not spam the tag (removed). – desertnaut
Yes, this is huge. I have an Anaconda env with a lot of packages and config files (spaCy, NLTK). Clearly this needs to be optimized, but that is not the question for the moment. And yes, this is not an ML topic, but I want to know the best way to do exploratory ML work on the cloud. The focus is on ML, not software dev on a 300 MB RedHat Docker image. That is why I added the tag, but it was not a good idea. Thanks for the clean-up. – Dr. Fabien Tarrade
Cloud Shell provides you with 5 GB of total disk space. Why are you trying to run a container in Cloud Shell? – John Hanley
What is your proposal for developing DL NLP code that needs an Anaconda Python env including a lot of ML packages like TensorFlow, Keras, SHAP, and LIME (which seems to be a few GB; don't ask me why, I am looking into that as well), and for which the data are stored in BigQuery (200 GB)? I am not talking about running the code in production. I am new to the cloud; I know how I would do this on a Hadoop data lake. Looking forward to best practices and proposals. – Dr. Fabien Tarrade

1 Answer

1 vote

The Docker image is too big to run on Cloud Shell. You could run it on Kubernetes or Compute Engine instead, but since you're still in the early stages and you've already said you can run the tools you need locally, this might not be necessary for your needs. Looking to the future, when you're more concerned with performance, you might want to consider a solution such as Cloud ML Engine or BigQuery ML.
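For illustration, if you later go the Cloud ML Engine (AI Platform) route, a training job that uses a custom container can be submitted from the same image roughly like this (the job name and region are placeholders, and custom containers may require the beta gcloud track):

gcloud beta ai-platform jobs submit training ml_dl_exploration_job \
    --region=us-central1 \
    --scale-tier=BASIC \
    --master-image-uri=gcr.io/docker-ml-dl-xxxxx/docker-anaconda-env-ml-dl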