1
votes

We are finally moving from Excel and .csv files to databases. Currently, most of my Tableau files are connected to large .csv files (.twbx).

Is there any performance differences between PostgreSQL and MySQL in Tableau? Which would you choose if you were starting from scratch?

Right now, I am using pandas to join files together and creating a new .csv file based on the join.(Example, I take a 10mil row file and drop duplicates and create a primary key, then I join it with the same key on a 5mil row file, then I export the new 'Consolidated' file to .csv and connect Tableau to it. Sometimes the joins are complicated involving dates or times and several columns).

I assume I can create a view in a database and then connect to that view rather than creating a separate file, correct? Each of my files could instead be a separate table which should save space and allow me to query dates rather than reading the whole file into memory with pandas.

Some of the people using the RDMS would be completely new to databases in general (dashboards here are just Excel files, no normalization, formulas in the raw data sheet, etc.. it's a mess) so hopefully either choice has some good documentation to lesson the learning curve (inserting new data and selecting data mainly, not the actual database design).

2

2 Answers

2
votes

Both will work fine with Tableau. In fact, Tableau's internal data engine is based on Postgres.

Between the two, I think Postgres is more suitable for a central data warehouse. MySQL doesn’t allow certain SQL methods such as Common Table Expressions and Window Functions.

Also, if you’re already using Pandas, Postgres has a built-in Python extension called PL/Python.

However, if you’re looking to store a small amount of data and get to it really fast without using advanced SQL, MySQL would be a fine choice but Postgres will give you a few more options moving forward.

1
votes

As stated, either database will work and Tableau is basically agnostic to the type of database that you use. Check out https://www.tableau.com/products/techspecs for a full list of all native (inbuilt & optimized) connections that Tableau Server and Desktop offer. But, if your database isn't on that list you can always connect over ODBC.

Personally, I prefer postgres over mysql (I find it really easy to use psycopg2 to write to postgres from python), but mileage will vary.