Spotfire and BigQuery

Question

I am quite puzzled by the BigQuery connector in Spotfire. It is taking an extremely long time to import my dataset in-memory.

My configuration: Spotfire on an AWS Windows instance (8 vCPU, 32 GB RAM). Dataset: 50 GB, >100M rows, on BigQuery.

Yes, I should use in-database for such a large dataset, push the queries to BigQuery, and use Spotfire only for display, but that is not my question today.

Today I am trying to understand how the import works and why it is taking so long. This import job started 21 hours ago and is still not finished. The server's resources (CPU, disk, network) are barely used.

Testing done:

I tried importing data from Redshift and it was much faster (14 min for 22 GB).
I checked the resources used during import: network speed (Redshift ~370Mbs, BQ ~8Mbs for 30 min), CPU (Redshift ~25%, BQ <5%), RAM (Redshift and BQ ~27 GB), disk write (Redshift 30Mbs, BQ 5MBs).
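
As a rough sanity check on these figures, here is a minimal sketch of the arithmetic. It assumes the dataset sizes are decimal gigabytes and that "Mbs" means megabits per second; both are assumptions about the numbers above, and the function names are hypothetical helpers, not part of Spotfire or BigQuery.

```python
def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mbps megabits per second.

    Assumes decimal units: 1 GB = 1e9 bytes, 1 Mb = 1e6 bits.
    """
    bits = size_gb * 1e9 * 8
    seconds = bits / (rate_mbps * 1e6)
    return seconds / 3600


def average_mbps(size_gb: float, minutes: float) -> float:
    """Average throughput in megabits per second for a completed transfer."""
    bits = size_gb * 1e9 * 8
    return bits / (minutes * 60) / 1e6


# BigQuery: 50 GB at the observed ~8 Mb/s (assumed to hold for the whole import)
bq_hours = transfer_hours(50, 8)

# Redshift: 22 GB imported in 14 minutes
redshift_avg = average_mbps(22, 14)

print(f"BigQuery, 50 GB at 8 Mb/s: {bq_hours:.1f} hours")
print(f"Redshift average throughput: {redshift_avg:.0f} Mb/s")
```

Under those assumptions, if the ~8 Mb/s rate held for the whole import, the network transfer alone would take roughly 14 hours, which is the same order of magnitude as the 21+ hours observed. That suggests the import is bound by how slowly rows arrive from BigQuery rather than by anything on the server.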

I really don't understand what Spotfire is actually doing all this time while importing the dataset from BQ into memory. The server resources appear to be unused, and there is no status indication apart from the elapsed time.

Do any Spotfire experts have insight into what's happening? Is the BigQuery connector not meant to be used for in-memory analysis? What is the actual limiting factor in the implementation?

Thanks!

Thomas Blomberg · Accepted Answer · 2021-04-13T15:26:19

We had an issue which is fixed in the Spotfire versions below:

TS 10.10.3 LTS HF-014
TS 11.2.0 HF-002

Please also vote and comment on the idea of using the Storage API when extracting data from BigQuery:

https://ideas.tibco.com/ideas/TS-I-7890

Thanks,

Thomas Blomberg
Senior Product Manager
TIBCO Spotfire
