
I am trying to follow this blog post from Google using their new Cloud ML tools.

https://cloud.google.com/blog/big-data/2016/12/how-to-train-and-classify-images-using-google-cloud-machine-learning-and-cloud-dataflow

Running from within their provided Docker container:

docker pull gcr.io/cloud-datalab/datalab:local
docker run -it -p "127.0.0.1:8080:8080" \
  --entrypoint=/bin/bash \
  gcr.io/cloud-datalab/datalab:local

starting from: root@9e93221352d8:~/google-cloud-ml/samples/flowers#

To run the first preprocessing step:

Assign appropriate values.

PROJECT=$(gcloud config list project --format "value(core.project)")
JOB_ID="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
BUCKET="gs://${PROJECT}-ml"
GCS_PATH="${BUCKET}/${USER}/${JOB_ID}"
DICT_FILE=gs://cloud-ml-data/img/flower_photos/dict.txt
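
It may be worth confirming up front that the ${BUCKET} bucket actually exists; a minimal gsutil check (the mb line is only needed if the bucket is missing, and the location is just an illustrative choice):

# list bucket-level details; fails if the bucket does not exist
gsutil ls -b "${BUCKET}"
# create it if missing (illustrative location)
# gsutil mb -l us-central1 "${BUCKET}"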

Preprocess the eval set.

python trainer/preprocess.py \
  --input_dict "$DICT_FILE" \
  --input_path "gs://cloud-ml-data/img/flower_photos/eval_set.csv" \
  --output_path "${GCS_PATH}/preproc/eval" \
  --cloud

returns

(27042c30421ec530): Workflow failed. Causes: (70e56dda0121e0fa): One or more access checks for temp location or staged files failed. Please refer to other error messages for details. For more information on security and permissions, please see https://cloud.google.com/dataflow/security-and-permissions.

Heading to the console, the logs read:

(531d956bf99b5f27): Staged package cloudml.latest.tar.gz at location 'gs://api-project-773889352370-ml/flowers__20170106_123249/preproc/staging/flowers-20170106-123312.1483705994.201001/cloudml.latest.tar.gz' is inaccessible.

I tried authenticating again with

gcloud beta auth application-default login

and getting the key from the browser. Nothing seems wrong there.
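
For completeness, a quick way to double-check both the active account and the application-default credentials inside the container (the JSON path is the usual gcloud default location, an assumption about this image):

# show which accounts gcloud knows about and which is active
gcloud auth list
# application-default credentials written by the login command above
ls ~/.config/gcloud/application_default_credentials.json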

I have successfully run the MNIST Cloud ML tutorial, so there are no authentication issues communicating with Google Compute Engine.

I can confirm the path to my bucket is correct:

root@9e93221352d8:~/google-cloud-ml/samples/flowers# echo ${GCS_PATH}
gs://api-project-773889352370-ml//flowers__20170106_165608

but no folder flowers__20170106_165608 is ever created (presumably due to permissions).
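
A simple way to confirm that nothing at all is being written under the job path (run in the same shell, so GCS_PATH is still set):

# recursive listing; staged packages would show up here if they existed
gsutil ls -r "${GCS_PATH}"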

Does Dataflow need separate credentials? I went to the console and made sure the Dataflow API is enabled for my account. Is anything needed beyond the following configuration? (A bucket-ACL check is sketched after the config output below.)

root@9e93221352d8:~/google-cloud-ml/samples/flowers# gcloud config list
Your active configuration is: [default]

[component_manager]
disable_update_check = True
[compute]
region = us-central1
zone = us-central1-a
[core]
account = #### <- scrubbed for SO; it's correct.
project = api-project-773889352370
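
One thing worth checking beyond the gcloud config is the bucket ACL itself, since the staging error is about object access. A hedged sketch, using the service-account naming from the Dataflow security docs linked in the error message:

# show who can read/write the -ml bucket; the Dataflow service accounts
# need access to the staging location
gsutil acl get "${BUCKET}"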

Edit: Added to show the service accounts tab in the console: [screenshot: service accounts tab]

Edit: Accepted the answer below. I'm accepting it because Jeremy Lewi is correct: the problem is not that Dataflow lacks permissions, but that the GCS object was never created in the first place. Looking at the preprocess logs you can see:

[screenshot: preprocess log output]

The tutorial Google has published is probably not well configured for the free tier; I'm guessing it distributes work to too many instances and exceeds the CPU quota. If I cannot solve it, I will open a correctly framed question.
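
If the CPU-quota hypothesis is right, capping the Dataflow worker count might help. This assumes preprocess.py forwards unrecognized flags to the Beam pipeline options (a common parse_known_args pattern, not verified for this sample):

# --max_num_workers is a standard Dataflow pipeline option; whether the
# sample passes it through is an assumption
python trainer/preprocess.py \
  --input_dict "$DICT_FILE" \
  --input_path "gs://cloud-ml-data/img/flower_photos/eval_set.csv" \
  --output_path "${GCS_PATH}/preproc/eval" \
  --cloud \
  --max_num_workers 2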

Does the GCS object gs://api-project-773889352370-ml/flowers__20170106_123249/preproc/staging/flowers-20170106-123312.1483705994.201001/cloudml.latest.tar.gz exist? If it exists check its permissions using gsutil getacl gs://api-project-773889352370-ml/flowers__20170106_123249/preproc/staging/flowers-20170106-123312.1483705994.201001/cloudml.latest.tar.gz. Do the Dataflow service accounts <project-number>@cloudservices.gserviceaccount.com and <project-number>[email protected] have access? - Jeremy Lewi
No, it does not exist; the object should be created, presumably, by trainer/preprocess.py. My working hypothesis is that the prior step is supposed to write such a file, it can't because of permissions, and the error is thrown when it looks for it. - bw4sz
Edited to show the service accounts tab from cloud console. Is something else needed? - bw4sz
Yes; it looks like the package should be downloaded from gs://cloud-ml/sdk/cloudml.latest.tar.gz and then uploaded to GCS when preprocess.py runs the Dataflow pipeline. The logs from preprocess.py should provide output indicating that the package is being uploaded and possible errors. You might need to add the line logging.getLogger().setLevel(logging.INFO) to preprocess.py right before the call to main to get more verbose logging. - Jeremy Lewi

1 Answer


Please see the information about service accounts at the link provided by the error message. I suspect the service account is not authorized correctly to view the staged file.
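
If that turns out to be the case, a sketch of how access could be granted (replace <project-number> with the numeric project number; the second account is the Compute Engine default service account commonly used by Dataflow workers):

# grant write access on the staging bucket to the Dataflow service accounts
gsutil acl ch -u <project-number>@cloudservices.gserviceaccount.com:W gs://api-project-773889352370-ml
gsutil acl ch -u <project-number>-compute@developer.gserviceaccount.com:W gs://api-project-773889352370-ml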