I’d be grateful for some help with automating Tableau server data extract refreshes. This seems straightforward if your data source is a database server but I want to refresh from a csv file on the Tableau server.
Our setup is that we have a Unix data server which produces two csv files each week (1GB total size) which we zip, transfer to Tableau desktop and use to create tableau data extracts (tde files). In Tableau desktop, we then use “Edit data source” to point our workbook at the new tde files, and we publish the workbook to the Tableau server (running Windows) as a twbx. (And as a further complication, we also use one of the tde files from the previous week as a third data source).
We want to automate this process, so ideally I would run a script on the Tableau server to copy the zip file from the data server to the Tableau server, unzip, extract and point the workbook on the server to the new tdes. This doesn’t seem to be possible, so what is my best option?
Here are some approaches I’ve considered:
Run scripts on Tableau desktop to recreate what we do manually (using the tableau data extract command line utility to do the extracts and tabcmd to republish the workbook). This requires someone to turn on a laptop at the right time each week, so is not really automation.
Run scripts on the Tableau server to copy the files over then run Tableau data extract command line utility on Tableau server to produce the tde – doesn’t work because that utility is part of Tableau desktop, not Tableau server.
As 2 but use tabcmd to produce the new tdes on Tableau server – doesn’t work because tabcmd doesn’t have the “—original-file” option that Tableau desktop does.
Write a program that uses the Tableau data extract API to produce the tde, and run this on the Tableau server. I’m not sure how much effort that would be but more importantly, the tde doesn’t appear to be available on the server as a tde (it is stored in another form).
Restructure how the data is treated on our data server: we have MySQL running on that server, and could put the csv data into that db, then run a conventional extract refresh with tabcmd on the Tableau server. I think this would work, but would presumably mean we are transferring 1GB data (or more?) through the SQL queries, rather than the 50 MB zip. Also we’d have to create a "last week" snapshot of the data for our 3rd data source). The data doesn’t lend itself to incremental updates - old records can change as well as new records being added. This is all more of a workaround than I would like to do.
Run a database server on the Tableau server, and use scripts to put the csvs in there each week, then use that database as the data source. Even more of a crazy work around, just because Tableau doesn’t want to play with csvs.
Am I missing something? Is there an easier way? I’ve ruled out 2,3 and 4 above, but would 1, 5 or 6 actually work?
We're using Tableau server 8.2.
Many thanks to anyone who can help.