I have noticed consistently slow BigQuery performance (between 30 seconds to 1 minute response time) when query more 400 million rows from multiple sharded tables.
I have run the queries 3 times in different time of day (afternoon, late evening and morning), notice response time has been consistently slow. The query uses a group-by string field that may have a lot of unique values and then sort by sum of another integer value in descending order and finally return only the top 10.
I have done performance timing testing on same schema and same query but storing all the data in one to five tables and noticed performance was always under 10 seconds.
What is the expected response time for querying dataset with 400 millions to 2 billions rows sharded in 7 to 90 sharded tables? Can sharding data in more tables cause slower query performance? FYI, each of the of the shard table has at least 24 million to 144 million rows. They are not very small tables.