4
votes

What is the correct way of setting up a static public IP (or IP range) for a Databricks workspace in Azure? What would be the simplest working solution?

I would like to be able to whitelist the Databricks IP on an FTP server (running outside of Azure), which will be accessed by some jobs. Databricks is already running within a VNet, so I tried the following scenarios (a quick egress-IP check is sketched after the list):

  1. NAT gateway - when the gateway is associated with the public subnet, clusters fail to start with the error "Network Configuration Failure" and the details "[Nat gateway] cannot be deployed on subnet containing Basic SKU Public IP addresses or Basic SKU Load Balancer. NIC".
  2. Using a firewall and route table - as described here - this works partially (I could not get Python packages installed - SSLError(SSLError("bad handshake: SysCallError(-1, 'Unexpected EOF')"))). The problem is that it's quite pricey, ~1€ per hour.
  3. Routing traffic through an NVA - as described here - I did not manage to get it working, and it also seems a bit too complicated for my simple deployment.
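Whichever approach ends up pinning the egress address, a quick check from a notebook shows which public IP the cluster is currently leaving Azure with (and therefore what would have to be whitelisted on the FTP server). A minimal sketch, assuming outbound HTTPS is allowed and using the public api.ipify.org echo service:

```python
# Minimal egress-IP check, run from a Databricks notebook or job.
# api.ipify.org simply echoes the caller's public IP.
import requests

public_ip = requests.get("https://api.ipify.org", timeout=10).text
print(f"Cluster egress IP: {public_ip}")
```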

2 Answers

1
votes

I've also tried #1 and #2 without any luck.

The problem with #3 and IP forwarding is that, although you can create a third "NVA" subnet (connected to the Internet through NAT) in the Azure Databricks VNet and get packets from the databricks-public subnet forwarded to that "NVA" subnet, those packets will not go any further: the Azure NAT attached to the "NVA" subnet will not accept them, since they originate from another subnet.

You can spin up a VM with Windows Server acting as an NVA router, but that architecture looks ridiculous for such a simple requirement. Nevertheless, here is roughly how it can work:

  1. The Databricks VNet has 3 subnets: databricks-private (10.139.64.0/18), databricks-public (10.139.0.0/18) and a new one called "NVA" (10.139.128.0/24)
  2. An Azure NAT with a public IP is attached to the "NVA" subnet
  3. A Windows Server VM is created with 2 network interfaces: one attached to the "NVA" subnet (for example, static IP 10.139.128.4) and the other attached to the "databricks-public" subnet (for example, static IP 10.139.0.4)
  4. Remote Access with the "Router" role is installed and configured on the Windows Server VM. The interface with the "NVA" IP (10.139.128.4) is selected as the one connected to the Internet when configuring NAT in "Routing and Remote Access"
  5. An Azure routing table is created and attached to the "databricks-public" subnet. Among others, it has a route 0.0.0.0/0 -> 10.139.0.4 (Virtual Appliance); a sketch of this step is shown right after this list
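As an illustration of step 5, here is a minimal sketch of creating that route table and attaching it to the databricks-public subnet with the azure-mgmt-network Python SDK. The subscription ID, resource group, VNet name and region are placeholders for your own (VNet-injected) deployment:

```python
# Sketch: create a route table that forces 0.0.0.0/0 through the NVA
# (10.139.0.4) and attach it to the databricks-public subnet.
# Subscription ID, resource group, VNet name and region are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"
resource_group = "<resource-group-with-the-databricks-vnet>"
vnet_name = "<databricks-vnet>"

client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

# Route table with a single user-defined route pointing at the NVA's private IP
route_table = client.route_tables.begin_create_or_update(
    resource_group,
    "databricks-nva-rt",
    {
        "location": "westeurope",
        "routes": [{
            "name": "default-via-nva",
            "address_prefix": "0.0.0.0/0",
            "next_hop_type": "VirtualAppliance",
            "next_hop_ip_address": "10.139.0.4",
        }],
    },
).result()

# Associate the route table with the existing databricks-public subnet
subnet = client.subnets.get(resource_group, vnet_name, "databricks-public")
subnet.route_table = route_table
client.subnets.begin_create_or_update(
    resource_group, vnet_name, "databricks-public", subnet
).result()
```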

The packet flow in the final set-up looks like this:

  1. A worker VM in the databricks-public subnet sends a packet to the Internet
  2. Instead of going to the Internet directly, the packet follows the forced route 0.0.0.0/0 -> 10.139.0.4 (the Windows Server VM)
  3. The Windows VM translates (NAT) the packet originating from the 10.139.0.0/18 subnet into a packet originating from the 10.139.128.0/24 subnet
  4. Azure NAT picks up that packet and performs NAT one more time (assigning a public IP as the origin of the packet)
The last step (and the Azure NAT) is not strictly required, but I just didn't want to have the VM directly exposed to the Internet.
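Once this egress path is in place and its public IP has been whitelisted on the FTP server, the Databricks jobs can connect as usual. A minimal sketch with Python's ftplib, where the host and credentials are hypothetical placeholders:

```python
# Sketch: reach the whitelisted FTP server from a Databricks job.
# ftp.example.com and the credentials are hypothetical placeholders;
# in practice the credentials would come from a Databricks secret scope.
from ftplib import FTP

with FTP("ftp.example.com", timeout=30) as ftp:
    ftp.login(user="job-user", passwd="secret")
    print(ftp.nlst())  # list files in the current working directory
```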

0
votes

According to the product team at Microsoft:

  • support for NAT Gateway is on their roadmap, with no ETA so far (it seems the Databricks deployment uses Basic SKU public IPs, while NAT Gateway supports only Standard SKU public IPs),
  • the firewall is the officially recommended solution, and an NVA is a plausible workaround.