count distinct concat in BigQuery

Question

I have tried PostgreSQL:count distinct (col1,col2,col3,col4,col5) in BigQuery :Count distinct concat(col1,col2,col3,col4,col5)

My scenario is I need to get same result as PostgreSQL in BigQuery

Though this scenario works on 3 columns ,I am not getting same value as PostgreSQL for 5 columns.

sample query:

 select col1,
count(distinct concat((col1,col2,col3,col4,col5)
from table A
group by col1

when I remove distinct and concat, simple count(col1,col2,col3,col4,col5) gives exact value as populated in PostgreSQL. But i need to have distinct of these columns. Is there any way to achieve this? and does bigquery concat works differently?

Occam's razor: the data is different. Do you know the answer in advance? If not did you prove the data in the separate databases is actually the same? — Belayer

Mikhail Berlyant Mikhail Berlyant · Accepted Answer · 2019-12-09T18:29:48

Below few options for BigQuery Standard SQL

#standardSQL
SELECT col1,
  COUNT(DISTINCT TO_JSON_STRING((col1,col2,col3,col4,col5)))
FROM A
GROUP BY col1

OR

#standardSQL
SELECT col1,
  COUNT(DISTINCT FORMAT('%T', [col1,col2,col3,col4,col5]))
FROM A
GROUP BY col1

count distinct concat in BigQuery

2 Answers