We are developing Tableau dashboards and deploying the workbooks on a EC2 windows instance in AWS. One of the data source is the company SQL server inside firewall. The server is managed by IT and we only have read permission to one of the databases. Now the solution is to build workbook on Tableau desktop locally by connecting to the company SQL server. Before publishing the workbooks to Tableau server, the data are extracted from data sources. The static data got uploaded with workbooks when published.
Instead of linking to static extracted data on Tableau server, we would like to set up a database on AWS (e.g. Postgresql), probably on the same instance and push the data from company SQL server to AWS database.
There may be a way to push directly from SQL server to postgres on AWS. But since we don't have much control of the server plus the IT folks are probably not willing to push data to external, this will not be an option. What I can think of is as follows:
- Set up Postgres on AWS instance and create the tables with same schemas as the ones in SQL server.
- Extract data from SQL server and save as CSV files. One table per file.
- Enable file system sharing on AWS windows instance. So the instance can read files from local file system directly.
- Load data from CSV to Postgres tables.
- Set up the data connection on Tableau Server on AWS to read data from Postgres.
I don't know if others have come across a situation like this and what their solutions are. But I think this is not a uncommon scenario. One change would be to have both local Tableau Desktop and AWS Tableau Server connect to Postgres on AWS. Not sure if local Tableau could access Postgres on AWS though.
We also want to automate the whole process as much as possible. On local server, I can probably run a Python script as cron job to frequently export data from SQL server and save to CSVs. On the server side, something similar will be run to load data from CSV to Postgres. If the files are big, though, it may be pretty slow to import data from CSV to postgres. But there is no better way to transfer files from local to AWS EC2 instance programmatically since it is Windows instance.
I am open to any suggestions.