2
votes

I am following this tutorial (Spark cluster in YARN mode inside a Docker container) to launch Zeppelin against a Spark cluster in YARN mode. However, I am stuck at step 4: I can't find conf/zeppelin-env.sh in my Docker container, so I have nowhere to put the further configuration. I tried adding Zeppelin's conf folder myself, but without success so far. Apart from that, the Zeppelin notebook is also not reachable on localhost:9001.

I am very new to distributed systems, so it would be great if someone could help me start Zeppelin on a Spark cluster in YARN mode.

Here is my docker-compose file, which is meant to let Zeppelin talk to the Spark cluster.

version: '2'
services:
  sparkmaster:
    build: .
    container_name: sparkmaster
    ports:
      - "8080:8080"
      - "7077:7077"
      - "8888:8888"
      - "8081:8081"
      - "8082:8082"
      - "5050:5050"
      - "5051:5051"
      - "4040:4040"
  zeppelin:
    image: dylanmei/zeppelin
    container_name: zeppelin-notebook
    env_file:
      - ./hadoop.env
    environment:
      ZEPPELIN_PORT: 9001
      CORE_CONF_fs_defaultFS: "hdfs://namenode:8020"
      HADOOP_CONF_DIR_fs_defaultFS: "hdfs://namenode:8020"
      SPARK_MASTER: "spark://spark-master:7077"
      MASTER: "yarn-client"
      SPARK_HOME: spark-master
      ZEPPELIN_JAVA_OPTS: >-
        -Dspark.driver.memory=1g
        -Dspark.executor.memory=2g
    ports:
      - 9001:9001
    volumes:
      - ./data:/usr/zeppelin/data
      - ./notebooks:/usr/zeppelin/notebook

1 Answer

1
votes

This is the Dockerfile you used to launch the standalone Spark cluster.

But there is no Zeppelin instance inside the container, so you have to use Zeppelin on your local machine.

Please download Zeppelin and run it on your local machine.
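As a rough sketch of what that could look like once Zeppelin is unpacked locally, the script below writes a conf/zeppelin-env.sh with the settings from your compose file. The function name write_zeppelin_env, the default install directory ./zeppelin, and the /path/to/... values are placeholders I made up for illustration; the exported variable names (ZEPPELIN_PORT, MASTER, SPARK_HOME, HADOOP_CONF_DIR) are the ones Zeppelin's zeppelin-env.sh.template uses. MASTER=yarn-client is copied from your question; if the cluster really is standalone, it would be a spark://host:7077 URL instead.

```shell
#!/bin/sh
# Sketch only: write conf/zeppelin-env.sh for a locally unpacked Zeppelin.
# write_zeppelin_env is a hypothetical helper, not part of Zeppelin.

write_zeppelin_env() {
    zeppelin_home="$1"
    mkdir -p "$zeppelin_home/conf"
    # Quoted heredoc delimiter: the lines are written literally, no expansion.
    cat > "$zeppelin_home/conf/zeppelin-env.sh" <<'EOF'
# Port the Zeppelin web UI listens on (value taken from the question).
export ZEPPELIN_PORT=9001
# Spark master; yarn-client is taken from the question's compose file.
export MASTER=yarn-client
# Placeholder paths: point these at your own Spark and Hadoop config.
export SPARK_HOME=/path/to/spark
export HADOOP_CONF_DIR=/path/to/hadoop/conf
EOF
}

# Assumption: Zeppelin was unpacked to ./zeppelin unless ZEPPELIN_HOME is set.
write_zeppelin_env "${ZEPPELIN_HOME:-./zeppelin}"
```

After that, Zeppelin is started with bin/zeppelin-daemon.sh start from the Zeppelin directory, and the notebook should then be served on the port set above.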