4
votes

Background

  • I created an Airflow webserver using a Cloud Composer environment on Google Cloud Platform: 3 nodes, image version composer-1.10.0-airflow-1.10.6, machine type n1-standard-1.
  • I have not yet configured any networks for this environment.
  • Airflow works fine for simple test DAGs:

[Screenshot: Airflow webserver]

The problem

  • I wrote a ping_ip DAG to determine whether a physical machine (i.e. my laptop) is connected to the internet. (Code: https://pastebin.com/FSBPNnkP)
  • I tested the Python code that pings the machine locally (via response = os.system("ping -c 1 " + ip_address)) and it returned 0, i.e. the network is active.
  • When I moved this code into an Airflow DAG, it ran without errors, but this time it returned 256 for the same IP address.

Here's the DAG code in a pastebin: https://pastebin.com/FSBPNnkP

Here are the Airflow Logs for the triggered DAG pasted above:

[2020-04-28 07:59:35,671] {base_task_runner.py:115} INFO - Job 2514: Subtask ping_ip 1 packets transmitted, 0 received, 100% packet loss, time 0ms
[2020-04-28 07:59:35,673] {base_task_runner.py:115} INFO - Job 2514: Subtask ping_ip [2020-04-28 07:59:35,672] {logging_mixin.py:112} INFO - Network Error.
[2020-04-28 07:59:35,674] {base_task_runner.py:115} INFO - Job 2514: Subtask ping_ip [2020-04-28 07:59:35,672] {python_operator.py:114} INFO - Done. Returned value was: ('Network Error.', 256)
  • I guess my server has networking issues when reaching external IPs.
  • Does anybody know how to ping an external IP from within an Airflow service managed by GCP?
  • The end goal is to create a DAG that prompts a physical machine to run a Python script. I thought this process should start with a simple sub-DAG that checks whether the machine is connected to the internet. If I'm going about this the wrong way, please let me know.
3
Can you normally ping your laptop from external services? - esqew
I think the right approach might be to look into DNS Services for pods. i.e. cloud.google.com/solutions/… - user2992169
@user2992169 I'd look into firewall rules first - allow ping on the device via specific port and then update your script to use that port? stackoverflow.com/questions/17903859 - sgoley

3 Answers

1
votes

What worked for me is removing the response part. Here's the code:

import os
def ping_ip():
    ip_address = "8.8.8.8"  # My laptop IP
    response = os.system("ping -c 1 " + ip_address)

    if response == 0:
        pingstatus = "Network Active."
    else:
        pingstatus = "Network Error."
    print("\n *** Network status for IP Address=%s is : ***" % ip_address)
    print(pingstatus)

    return pingstatus

print(ping_ip())
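
For comparison, here is a minimal sketch of the same check using subprocess.run instead of os.system, assuming Python 3 and a Linux-style ping that accepts -c. On Unix, os.system returns a wait status rather than the plain exit code, while subprocess.run exposes the exit code directly as returncode. The helper build_ping_command and the timeout_s parameter are names made up for this example; they are not part of the original DAG.

```python
import subprocess


def build_ping_command(ip_address, count=1):
    """Return the argument list for a Linux-style ping (-c = packet count)."""
    return ["ping", "-c", str(count), ip_address]


def ping_ip(ip_address, timeout_s=10):
    """Ping once and return (status message, ping's own exit code)."""
    try:
        result = subprocess.run(
            build_ping_command(ip_address),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,  # hard limit enforced by Python, not by ping
        )
    except FileNotFoundError:
        # The ping binary is not installed on this machine.
        return ("ping not installed.", None)
    except subprocess.TimeoutExpired:
        return ("Network Error (timed out).", None)

    status = "Network Active." if result.returncode == 0 else "Network Error."
    return (status, result.returncode)
```

Calling ping_ip("8.8.8.8") returns a tuple such as the one in the Airflow log, except that the second element is the exit code itself rather than the wait status.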
1
votes

Let me give my opinion.

By default, Composer uses the default network, which contains a firewall rule that allows the ICMP protocol (ping). So I think any public IP should be reachable. For example, when PyPI packages are installed you usually don't configure anything special, and the PyPI repositories are accessible.

Having Internet access does not necessarily mean that a machine has a public IP; e.g. it can be behind NAT or some other network configuration (networking is not my expertise). To make sure you are specifying the public address of your Internet connection, you can use tools like https://www.myip.com/, where you will see the public IP (e.g. 189.226.116.31) and your host IP (e.g. 10.0.0.30). If you get something similar, you will need to use the public one.

If you are using the host IP, it probably works locally because that IP is reachable from within the same private network, so the traffic never leaves the network. The Composer nodes, where your uploaded DAG runs, are completely outside your local network.

I didn't find what the ping code 256 could mean, but if you are using the correct public IP, you can try increasing the response timeout with -W; it may simply be taking longer to reach the IP.
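
A quick way to tell the two kinds of address apart is Python's standard ipaddress module. This is only a sketch; describe_ip is a name invented for the example, and the two addresses are the sample values mentioned above.

```python
import ipaddress


def describe_ip(ip_string):
    """Classify an address as private (not publicly routable) or public."""
    ip = ipaddress.ip_address(ip_string)
    # is_private is True for RFC 1918 ranges such as 10.0.0.0/8, and also
    # for loopback and other reserved, non-routable ranges.
    return "private" if ip.is_private else "public"


for candidate in ("10.0.0.30", "189.226.116.31"):
    print(candidate, "->", describe_ip(candidate))
```

An address reported as private cannot be reached from the Composer nodes unless the two networks are explicitly connected.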
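
One likely reading of the 256: on Unix, os.system returns a wait status, not the exit code itself, and the exit code sits in the high byte. The sketch below decodes it by hand; exit_code_from_wait_status is a name made up for this example. With the iputils ping found on most Linux systems, exit code 1 means that no reply was received, which would match the "100% packet loss" line in the log.

```python
def exit_code_from_wait_status(status):
    """Decode a POSIX wait status as returned by os.system() on Unix.

    The low 7 bits hold the terminating signal (0 if the process exited
    normally); the high byte holds the exit code.
    """
    signal_number = status & 0x7F
    if signal_number:
        return -signal_number  # process was killed by a signal
    return (status >> 8) & 0xFF


print(exit_code_from_wait_status(256))  # 1
```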

0
votes

The VMs created by Composer are unlikely to have "ping" installed, since they use standard images. I think you are invoking the Linux "ping" command and it fails because it is not installed on the node/VM, so you need to change your implementation to "ping" the server another way.
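
One possible "other way" is a plain TCP connection attempt, which needs no external binary. This is only a sketch: tcp_reachable is a hypothetical helper, and it assumes the target machine has some TCP port open and reachable (for example SSH on port 22), which the question does not confirm.

```python
import socket


def tcp_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Note that this tests something different from ICMP ping: a host can answer ping and still refuse the TCP connection, or the other way round.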

You can SSH into the Composer node VMs, install "ping", and then rerun the DAG. Even if that works, I would not consider it a clean solution and it will not scale, but it is okay for a pilot.

Lastly, if your goal is to execute a Python script, have you thought of using a PythonOperator from within a DAG? If you want to decouple the execution of the Python script from the DAG itself, an alternative is a Pub/Sub + Cloud Functions combination.

Another probable cause of being unable to reach external IPs is misconfigured firewall rules. To fix this you must:

  • Define an egress firewall rule that allows ping (ICMP) to your destination IP, and attach the firewall rule to a "tag".
  • Make sure you attach the same "tag" to the VMs/nodes created for Composer.