3
votes

I am following the guide linked below to create a custom image for Dataproc version 1.5.21-debian10: https://cloud.google.com/dataproc/docs/guides/dataproc-images

Following that guide, I tried the customization script below:

#! /usr/bin/bash

apt-get -y update <-- This ends in error command not found

apt install python3-pip -y <-- E: Unable to locate package

python3.7 -m pip install numpy <-- /usr/bin/python3.7: No module named pip

If I run pip install numpy instead, the package is installed for Python 2.7.

What can I do to install it for Python 3.7?


2 Answers

4
votes

Dataproc 1.5 images use Conda and Python 3 by default. To install packages into the Conda environment, use Conda's own conda binary rather than the system tools:

/opt/conda/miniconda3/bin/conda install numpy

Note that using pip to install packages into a Conda environment is discouraged, but you can still do it if necessary:

/opt/conda/miniconda3/bin/pip install numpy
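
For a customization script, the two commands above could be combined roughly as follows. This is only a sketch: choose_installer and the DRY_RUN switch are hypothetical names made up for illustration, and the default path is simply the one quoted above, so check it against your image before relying on it.

```shell
#!/bin/bash
# Sketch only: CONDA_BIN defaults to the path quoted in the answer above.
CONDA_BIN="${CONDA_BIN:-/opt/conda/miniconda3/bin}"

# Hypothetical helper: print the install command to use for a given bin directory.
# Prefers conda, falls back to that directory's pip, fails if neither is executable.
choose_installer() {
  local bin_dir="$1"
  if [ -x "${bin_dir}/conda" ]; then
    echo "${bin_dir}/conda install -y"
  elif [ -x "${bin_dir}/pip" ]; then
    echo "${bin_dir}/pip install"
  else
    return 1
  fi
}

if installer="$(choose_installer "${CONDA_BIN}")"; then
  # Dry run by default; set DRY_RUN=0 to actually install.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: ${installer} numpy"
  else
    # Unquoted on purpose so the command splits into words (assumes no spaces in the path).
    ${installer} numpy
  fi
else
  echo "no conda or pip found under ${CONDA_BIN}" >&2
fi
```

Using full paths avoids depending on whichever pip or python happens to come first on PATH while the image is being built.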
0
votes

You should use pip3 instead of pip to use the Python 3.7 env.

pip3 install numpy
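
Before relying on pip3, it may help to check what each command actually resolves to on the image, since the question shows pip pointing at Python 2.7. A small sketch (describe_cmd is a hypothetical helper name, not part of any tool):

```shell
#!/bin/bash
# Hypothetical helper: report where a command resolves on PATH, or say it is missing.
describe_cmd() {
  local name="$1"
  if command -v "$name" >/dev/null 2>&1; then
    echo "$name -> $(command -v "$name")"
  else
    echo "$name -> not found"
  fi
}

for cmd in python3 pip pip3; do
  describe_cmd "$cmd"
done

# pip prints the Python version it is bound to as part of its --version output.
if command -v pip3 >/dev/null 2>&1; then
  pip3 --version || true
fi
```

If pip3 turns out to be missing or bound to a different interpreter than expected, the full Conda paths from the other answer are the safer choice.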