I am having no trouble sshing into a Google Cloud Compute Engine VM, but I am unable to ssh into the master node of a Google Cloud Dataproc cluster.
Specifically,
gcloud compute ssh my-vm
works just fine, while
gcloud compute ssh mycluster-m
fails with the error message:
[email protected]: Permission denied (publickey).
ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].
The Compute Engine VM and the Dataproc cluster are in the same project. I understand from the error message that the problem is related to the ssh keys, but I am not sure how to fix it. I checked the ssh keys in the project via the Cloud Console and they are correct, and I tried the usual
gcloud auth login
to reset the gcloud project login details.
Any hints on how to fix this?
Edits: I am trying to ssh from my own machine, not from the Cloud Console. That's a good point; I will try that and see whether it works. But in the end I want to use this to connect to a Jupyter notebook from my local computer, so it would not solve the issue of being unable to ssh from my machine to the VM.
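For context, the kind of tunnel this is meant to enable could be sketched roughly as follows. The script only builds and prints a gcloud compute ssh command with local port forwarding; it does not run it. The zone us-central1-b, the port 8123, and the helper name build_tunnel_cmd are assumptions for illustration, not values taken from the actual cluster. Everything after the bare -- is passed through to ssh.

```shell
#!/usr/bin/env bash
# Sketch only: prints a port-forwarding command instead of executing it.

NODE="mycluster-m"     # master node name from the failing command above
ZONE="us-central1-b"   # assumption: replace with the cluster's real zone
PORT=8123              # assumption: port the notebook server listens on

# Hypothetical helper: assembles the gcloud command line as a string.
# -N tells ssh not to run a remote command; -L forwards a local port
# to the same port on the master node.
build_tunnel_cmd() {
  local node="$1" zone="$2" port="$3"
  echo "gcloud compute ssh ${node} --zone=${zone} -- -N -L ${port}:localhost:${port}"
}

build_tunnel_cmd "$NODE" "$ZONE" "$PORT"
```

The printed command goes through the same gcloud compute ssh path as the failing one, so it would only work once the publickey error is resolved.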
Concerning the command to create the Dataproc cluster: I use tools from the hail dataproc Python library, but these are basically just convenience wrappers around the gcloud compute commands, and those commands are what is failing. The command I used to create the Dataproc cluster was:
gcloud beta dataproc clusters create \
test \
--image-version=1.4-debian9 \
--properties=^|||^spark:spark.task.maxFailures=20|||spark:spark.driver.extraJavaOptions=-Xss4M|||spark:spark.executor.extraJavaOptions=-Xss4M|||spark:spark.speculation=true|||hdfs:dfs.replication=1|||dataproc:dataproc.logging.stackdriver.enable=false|||dataproc:dataproc.monitoring.stackdriver.enable=false|||spark:spark.driver.memory=41g \
--initialization-actions=gs://hail-common/hailctl/dataproc/0.2.53/init_notebook.py \
--metadata=^|||^WHEEL=gs://hail-common/hailctl/dataproc/0.2.53/hail-0.2.53-py3-none-any.whl|||PKGS=aiohttp>=3.6,<3.7|aiohttp_session>=2.7,<2.8|asyncinit>=0.2.4,<0.3|bokeh>1.1,<1.3|decorator<5|dill>=0.3.1.1,<0.4|gcsfs==0.2.1|humanize==1.0.0|hurry.filesize==0.9|nest_asyncio|numpy<2|pandas>0.24,<0.26|parsimonious<0.9|PyJWT|python-json-logger==0.1.11|requests>=2.21.0,<2.21.1|scipy>1.2,<1.4|tabulate==0.8.3|tqdm==4.42.1|google-cloud-storage==1.25.* \
--master-machine-type=n1-highmem-8 \
--master-boot-disk-size=100GB \
--num-master-local-ssds=0 \
--num-preemptible-workers=0 \
--num-worker-local-ssds=0 \
--num-workers=2 \
--preemptible-worker-boot-disk-size=40GB \
--worker-boot-disk-size=40GB \
--worker-machine-type=n1-standard-8 \
--initialization-action-timeout=20m \
--labels=creator=my_name \
--max-idle=10m