I'm trying to run the Hue Docker image from gethue/hue, but it seems to ignore the configuration I give it and always looks for HDFS on localhost instead of the Docker container I point it to.

Here is some context:

  1. I'm using the following docker-compose file to launch an HDFS cluster:
  hdfs-namenode:
    image: bde2020/hadoop-namenode:1.1.0-hadoop2.7.1-java8
    hostname: namenode
    environment:
      - CLUSTER_NAME=davidov
    ports:
      - "8020:8020"
      - "50070:50070"
    volumes:
      - ./data/hdfs/namenode:/hadoop/dfs/name
    env_file:
      - ./hadoop.env


  hdfs-datanode1:
    image: bde2020/hadoop-datanode:1.1.0-hadoop2.7.1-java8
    depends_on:
      - hdfs-namenode
    links:
      - hdfs-namenode:namenode
    volumes:
      - ./data/hdfs/datanode1:/hadoop/dfs/data
    env_file:
      - ./hadoop.env

This launches images from BigDataEurope, which are already properly configured, including:

- the activation of webhdfs (in /etc/hadoop/hdfs-site.xml):
  - dfs.webhdfs.enabled set to true
- the hue proxy user (in /etc/hadoop/core-site.xml): 
  - hadoop.proxyuser.hue.hosts set to *
  - hadoop.proxyuser.hue.groups set to *

Then, I launch Hue following their instructions:

First, I launch a bash prompt inside the docker container:

docker run -it -p 8888:8888 gethue/hue:latest bash

Then, I modify desktop/conf/pseudo-distributed.ini to point to the correct hadoop "node" (in my case a docker container with the address 172.30.0.2):

[hadoop]

  # Configuration for HDFS NameNode
  # ------------------------------------------------------------------------
  [[hdfs_clusters]]
    # HA support by using HttpFs

    [[[default]]]
      # Enter the filesystem uri
      fs_defaultfs=hdfs://172.30.0.2:8020

      # NameNode logical name.
      ## logical_name=

      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      # Default port is 14000 for HttpFs.
      ## webhdfs_url=http://172.30.0.2:50070/webhdfs/v1

      # Change this if your HDFS cluster is Kerberos-secured
      ## security_enabled=false

      # In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
      # have to be verified against certificate authority
      ## ssl_cert_ca_verify=True

And then I launch hue using the following command (still inside the hue container):

./build/env/bin/hue runserver_plus 0.0.0.0:8888

I then point my browser to localhost:8888, create a new user ('hdfs' in my case), and launch the HDFS file browser module. I then get the following error message:

Cannot access: /user/hdfs/.
HTTPConnectionPool(host='localhost', port=50070): Max retries exceeded with url: /webhdfs/v1/user/hdfs?op=GETFILESTATUS&user.name=hue&doas=hdfs (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 99] Cannot assign requested address',))

The interesting bit is that it still tries to connect to localhost (which of course cannot work), even though I modified its config file to point to 172.30.0.2.

Googling the issue, I found another config file: desktop/conf.dist/hue.ini. I tried modifying this one and launching Hue again, but got the same result.

Does anyone know how I could correctly configure Hue in my case?

Thanks in advance for your help.

Regards,

Laurent.

1 Answer

Your one-off docker run command is not on the same network as the docker-compose containers.

You would need something like this, replacing [projectname] with the name of the folder you ran docker-compose up in:

docker run -ti -p 8888:8888 --network="[projectname]_default" gethue/hue bash
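If you are unsure what the network is called, here is a small sketch of how the name is put together. The folder name `myproject` is purely hypothetical, and Compose may normalize the folder name (for example by lowercasing it), so treat `docker network ls` as the authoritative check rather than this guess.

```shell
# Hypothetical folder name where `docker-compose up` was run (an assumption,
# substitute your own).
project_dir="myproject"

# By default Compose creates one network per project, named <project>_default.
network="${project_dir}_default"

# Print the resulting command instead of running it, so it can be reviewed first.
echo "docker run -ti -p 8888:8888 --network=\"${network}\" gethue/hue bash"

# To confirm the real network name on your machine:
#   docker network ls
```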

I would suggest also using Docker Compose for the Hue container, with a volume mount for the INI files under desktop/conf/, where you can simply specify

fs_defaultfs=hdfs://namenode:8020

(since you put hostname: namenode in the compose file)

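You'll also need to uncomment the WebHDFS line (webhdfs_url, by removing the leading ##) for your changes to take effect. While it is commented out, Hue appears to fall back to localhost:50070, which matches the host in your error message.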

Hue merges all INI files found in its conf folder.
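As a rough sketch, the two INI changes above (pointing fs_defaultfs at namenode and uncommenting webhdfs_url) could be scripted like this. It works on a throwaway sample file rather than the real desktop/conf/pseudo-distributed.ini, and the sample's localhost values are my assumption about what the shipped file contains, so check your own copy before adapting it.

```shell
# Throwaway sample standing in for desktop/conf/pseudo-distributed.ini
# (contents are an assumption, trimmed to the two relevant keys).
conf=$(mktemp)
cat > "$conf" <<'EOF'
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://localhost:8020
      ## webhdfs_url=http://localhost:50070/webhdfs/v1
EOF

# 1) Point fs_defaultfs at the namenode hostname from the compose file.
# 2) Drop the leading "##" from webhdfs_url and point it at the same host.
# Indentation is captured and re-emitted so the nesting stays readable.
sed -E \
  -e 's|^([[:space:]]*)fs_defaultfs=.*|\1fs_defaultfs=hdfs://namenode:8020|' \
  -e 's|^([[:space:]]*)##[[:space:]]*webhdfs_url=.*|\1webhdfs_url=http://namenode:50070/webhdfs/v1|' \
  "$conf" > "$conf.new" && mv "$conf.new" "$conf"

# Show the result for review.
cat "$conf"
```

Writing to a temporary file and moving it into place avoids `sed -i`, whose syntax differs between GNU and BSD sed.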