Hi, I'm not sure whether anyone has come across this situation before. I have both Azure and AWS environments. I have a Spark cluster running on Azure Databricks and a Python/PySpark script that I want to run on it. In this script I want to write some data into an AWS Redshift cluster, which I plan to do using the psycopg2 library. Where can I find the IP address of the Azure Databricks Spark cluster so that I can whitelist it in the security group of the AWS Redshift cluster? I think that at the moment I cannot write to Redshift because the script is running on the Azure Databricks cluster and the Redshift security group does not allow requests coming from that cluster.
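To check whether the security group really is the problem, a plain TCP reachability test from a notebook cell can help. This is only a minimal sketch: the function name `can_reach` and the host in the usage comment are hypothetical, and 5439 is assumed because it is the default Redshift port (yours may differ).

```python
import socket


def can_reach(host, port=5439, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (typical when a security group silently drops
        # the packets), refused connections, and DNS lookup failures.
        return False


# Hypothetical usage; replace the host with your own Redshift endpoint:
# can_reach("my-cluster.abc123.us-east-1.redshift.amazonaws.com")
```

If this returns False while the same check succeeds from a machine that is already whitelisted, the block is most likely at the network level rather than in psycopg2 or the credentials.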
1 Answer
I have a similar use case: connecting from Azure Databricks to AWS RDS. I need to whitelist the Azure Databricks IPs in the AWS security group attached to RDS. Databricks assigns dynamic IPs to a cluster, so the address changes each time the cluster is restarted.
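To see which address the cluster is currently using, one option is to ask an external IP echo service from the driver and compare the result with the ranges already in the security group. The sketch below rests on assumptions: the helper names are hypothetical, `https://checkip.amazonaws.com` is just one echo service that returns the caller's address as plain text, and the CIDR in the usage comment is a documentation range, not a real one.

```python
import ipaddress
import urllib.request


def current_public_ip(echo_url="https://checkip.amazonaws.com"):
    """Return the public IP this machine egresses from, as reported by an echo service."""
    with urllib.request.urlopen(echo_url, timeout=10) as resp:
        return resp.read().decode("utf-8").strip()


def is_allowlisted(ip, allowed_cidrs):
    """Return True if `ip` falls inside any of the CIDR ranges in `allowed_cidrs`."""
    addr = ipaddress.ip_address(ip)
    # strict=False tolerates ranges written with host bits set, e.g. "203.0.113.7/24".
    return any(addr in ipaddress.ip_network(cidr, strict=False) for cidr in allowed_cidrs)


# Hypothetical usage in a notebook cell (runs on the driver node):
# ip = current_public_ip()
# print(ip, is_allowlisted(ip, ["203.0.113.0/24"]))
```

This only reports the driver's outbound address at the moment it runs. Worker nodes may egress from other addresses depending on how the workspace networking is set up, and the result can change after a restart.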
I am trying to get the following solution to work:
- Create a public IP address in the Azure portal
- Associate the public IP address with a virtual machine
https://docs.microsoft.com/en-us/azure/virtual-network/associate-public-ip-address-vm#azure-portal
Currently I am getting an error saying that I do not have permission to update the VNet associated with Databricks.
This is the simplest solution I could come up with. If this doesn't work, the next option is to try a site-to-site connection to set up a tunnel between Azure and AWS. This would allow traffic from all the dynamic IPs to be authorised for read and write operations on AWS.