1
votes

I'm using EMR and wanted to use Jupyter (IPython), so I added this bootstrap action to the cluster: s3://elasticmapreduce.bootstrapactions/ipython-notebook/install-ipython-notebook

I set up port tunneling to access Jupyter from my local machine and that works fine, but it asks for a login password. I tried leaving it empty and tried hadoop, but no luck. Does anybody know what the Jupyter password is?

1 Answer

0
votes

I ran into this problem as well when I used the same bootstrap action. I tried adding Args=[--password, jupyter], which I also could not get working. That came from this AWS forum post:

Name='Install Jupyter notebook',Path="s3://aws-bigdata-blog/artifacts/aws-blog-emr-jupyter/install-jupyter-emr5.sh",Args=[--r,--julia,--toree,--torch,--ruby,--ds-packages,--ml-packages,--python-packages,'ggplot nilearn',--port,8880,--password,jupyter,--jupyterhub,--jupyterhub-port,8001,--cached-install,--notebook-dir,s3://<your-s3-bucket>/notebooks/,--copy-samples]

What I did instead was follow these instructions for installing Anaconda directly on the EMR instance using the CLI. If you follow the first part you should be able to get it up and running. To summarize:

  • SSH into your master EMR instance using the .pem file you saved
  • Once you're there, download Anaconda using superuser privileges: sudo wget http://repo.continuum.io/archive/Anaconda3-4.1.1-Linux-x86_64.sh. Then run the installer: bash Anaconda3-4.1.1-Linux-x86_64.sh
  • Make sure you're using the Anaconda version of Python: which python
  • If you're not, reload your shell configuration: source .bashrc
  • Now make a jupyter config file: jupyter notebook --generate-config
  • cd into the jupyter folder: cd ~/.jupyter/
  • update the config file: vi jupyter_notebook_config.py
  • In the config file add the following lines:

    c = get_config()
    c.NotebookApp.ip = '*'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = 6789  # pick whichever port you want

  • exit out of the config editor and run jupyter via: jupyter notebook

  • this should run a notebook with no active kernels (for now). But it will give you the token you're looking for: http://localhost:6789/?token=xxxxxx

  • Leave this running and open a new terminal window. Now you'll want to tunnel to the EMR instance per this AWS blog post. Make the port the same as the one you specified in the config file (6789 in this example): ssh -o ServerAliveInterval=10 -i <<credentials.pem>> -N -L 6789:<<master-public-dns-name>>:6789 hadoop@<<master-public-dns-name>>

  • Opening localhost:6789 in the browser should prompt you with the jupyter page to enter your password or token. Enter the token that was generated in the above step and you should be good to go.
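
The two places where it's easy to slip are the tunnel port (it has to match the port in the config file) and copying the token out of the startup URL. Here is a minimal Python sketch of both, just as an illustration. The helper names build_tunnel_command and token_from_url are made up for this example, and the key file name and host name in the usage part are placeholders, not real values.

```python
import shlex
from urllib.parse import parse_qs, urlparse


def build_tunnel_command(key_path, master_dns, port, user="hadoop"):
    """Return the ssh argument list for a local port forward to Jupyter.

    The same port is used on both ends of the tunnel so the URL that
    Jupyter prints (http://localhost:<port>/?token=...) works unchanged
    in a local browser.
    """
    return [
        "ssh",
        "-o", "ServerAliveInterval=10",
        "-i", key_path,
        "-N",
        "-L", f"{port}:{master_dns}:{port}",
        f"{user}@{master_dns}",
    ]


def token_from_url(url):
    """Return the token query parameter of a Jupyter startup URL, or None."""
    values = parse_qs(urlparse(url).query).get("token")
    return values[0] if values else None


if __name__ == "__main__":
    # Placeholder key file and host name; substitute your own values.
    cmd = build_tunnel_command("credentials.pem", "master-public-dns-name", 6789)
    print(shlex.join(cmd))  # shlex.join needs Python 3.8+
    print(token_from_url("http://localhost:6789/?token=xxxxxx"))
```

Paste the printed ssh command into the second terminal, then use the extracted token on the login page.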

Hope this helps! There might be a less convoluted way, but this is what ended up working for me.