What am I trying to do?

Glue-Athena-like process.

  1. Data in S3
  2. AWS Glue (create metadata tables)
  3. Tables can be queried using Athena via boto3 (Python library)

Problem I am facing in Azure Cloud

Trying to replicate the above process using Azure Synapse Analytics:

  1. Data in linked Azure Storage container
  2. Azure Data Factory (create external tables)
  3. How do I run T-SQL queries on the external tables using Python?

Is there a Python library for making T-SQL calls to the external tables created in an Azure Synapse workspace?

1 Answer

Yes. pyodbc works with Synapse. It's not perfect, but I use it.

https://docs.microsoft.com/en-us/azure/azure-sql/database/connect-query-python
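
For what it's worth, here is a minimal sketch of what a query from Python might look like. The helper names (build_connection_string, run_query), the workspace, database, table, and credentials are all hypothetical placeholders, and SQL authentication is assumed. The host name pattern assumes the serverless SQL endpoint (<workspace>-ondemand.sql.azuresynapse.net); a dedicated pool would use <workspace>.sql.azuresynapse.net instead.

```python
def build_connection_string(server, database, user, password,
                            driver="ODBC Driver 17 for SQL Server"):
    """Assemble an ODBC connection string for a Synapse SQL endpoint.

    Assumes SQL authentication; values containing ';' or '}' would need
    extra escaping that this sketch does not handle.
    """
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": f"tcp:{server},1433",
        "DATABASE": database,
        "UID": user,
        "PWD": password,
        "Encrypt": "yes",
        "TrustServerCertificate": "no",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def run_query(conn_str, sql, *params):
    """Run a T-SQL query and return the rows as a list of dicts."""
    # Imported here so the helper above can be used without the ODBC driver.
    import pyodbc

    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.execute(sql, *params)
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        conn.close()


# Example usage (placeholders, not real resources):
# conn_str = build_connection_string(
#     "myworkspace-ondemand.sql.azuresynapse.net", "mydb", "sqluser", "secret")
# rows = run_query(conn_str, "SELECT TOP 10 * FROM dbo.my_external_table")
```

Parameters can be passed with ? markers, e.g. run_query(conn_str, "SELECT * FROM dbo.my_external_table WHERE id = ?", 42), which avoids building SQL by string concatenation.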

Note that installing it can be a bit tricky. You need the Python package, but also the ODBC driver and the apt package unixodbc-dev.

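Here is the part of my Dockerfile that does it on Ubuntu 18.04:

# System packages: unixodbc-dev is needed to build pyodbc. curl and gnupg are
# used by the next step (added here in case the base image does not ship them).
RUN apt update && apt install -y libpq-dev unixodbc-dev apt-transport-https ca-certificates curl gnupg

# Register Microsoft's apt repository, then install the ODBC driver (msodbcsql17).
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add - \
    && curl https://packages.microsoft.com/config/ubuntu/18.04/prod.list >> /etc/apt/sources.list.d/mssql-release.list \
    && apt update && ACCEPT_EULA=Y apt install -y msodbcsql17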
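
Once the image is built and pyodbc is installed, one way to confirm that the driver was registered is to list what pyodbc can see. This is only a rough sketch: the function names are made up for illustration, and the name filter assumes Microsoft's usual "ODBC Driver NN for SQL Server" naming.

```python
def find_sql_server_drivers(driver_names):
    """Return the names that look like Microsoft ODBC drivers for SQL Server."""
    return [
        name for name in driver_names
        if name.startswith("ODBC Driver") and name.endswith("for SQL Server")
    ]


def installed_sql_server_drivers():
    """Ask pyodbc which ODBC drivers are registered and filter them."""
    # Imported here so the filter above can be used without pyodbc installed.
    import pyodbc

    return find_sql_server_drivers(pyodbc.drivers())


# Example usage inside the container:
# print(installed_sql_server_drivers())
```

If the list comes back empty, the msodbcsql17 install step most likely did not run or did not succeed.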