0
votes

I have a BigQuery Table whose structure is given below:

file_id    |    file_name    |    error_time_stamp
_ _ _ _ _ _| _ _ _ _ _ _ _ _ | _ _ _ _ _ _ _ _ _ _ _ _

   1       |    abcd.txt     |    2015-08-19 19:29:22
   2       |    efgh.txt     |    2015-08-19 19:31:25
   1       |    abcd.txt     |    2015-08-19 19:32:20
   2       |    efgh.txt     |    2015-08-19 19:33:40
   2       |    efgh.txt     |    2015-08-19 19:34:36

I want to query this table to get the row count per file_id, sorted descending. I use this query:

SELECT file_id, file_name, COUNT(file_id) AS counter
FROM [dataset.tablename]
GROUP BY file_id, file_name
ORDER BY counter DESC
LIMIT 1000

This query works fine and gives the desired result, shown below.

file_id    |    file_name    |    counter
_ _ _ _ _ _| _ _ _ _ _ _ _ _ | _ _ _ _ _ _

   2       |    efgh.txt     |     3
   1       |    abcd.txt     |     2

Now I want to count only the rows whose error_time_stamp is greater than 19:30:00. (The column is a timestamp; the datetime format shown here is just for readability. I actually query with an epoch timestamp.)

SELECT pp, ss, COUNT(pp) AS counter
FROM (
  SELECT file_id AS pp, file_name AS ss
  FROM [dataset.tablename]
  WHERE error_time_stamp >= 1440076613
) AS main_data
GROUP BY pp, ss
ORDER BY counter DESC
LIMIT 1000

Now the expected count for file_id 1 is 1, but I still get 2. What is wrong with my query?

2
count(case when err_t_stamp >= ... then 1 end) - Pரதீப்
Note that the epoch time you wrote is not 19:30, no matter which GMT offset you use. Your query seems fine; try writing it like Indian suggested, with COUNT(CASE WHEN ...). - Balinti
Your epoch timestamp evaluates to 19:17:57, not 19:30. Use the correct epoch. - aadi
@Balinti the timestamp was just a sample, not exactly what I am running. I calculate the timestamp on the server with datetime.now() + timedelta(minutes=-30) to get the time 30 minutes ago, since I only want the report for the last 30 minutes. - Sriram
@Indian in which query should I add this, the 1st or the 2nd? - Sriram
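For reference, the server-side cutoff computation described in the comments can be sketched in Python. The microsecond scaling at the end reflects how legacy BigQuery reads a bare integer compared against a TIMESTAMP (an assumption worth verifying against your own table):

```python
from datetime import datetime, timedelta, timezone

# Cutoff 30 minutes ago, in UTC so it lines up with BigQuery's timestamps
cutoff = datetime.now(timezone.utc) - timedelta(minutes=30)

epoch_seconds = int(cutoff.timestamp())
# Legacy BigQuery treats a bare integer compared to a TIMESTAMP as
# MICROseconds since the epoch, so scale the seconds value up before
# using it in the WHERE clause.
epoch_micros = epoch_seconds * 1_000_000
```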

2 Answers

1
votes

BigQuery timestamps have microsecond resolution, and a bare integer in a comparison is read as microseconds since the epoch, so your seconds-based value is treated as a very small/early timestamp and every row passes the filter.

I'd recommend using SEC_TO_TIMESTAMP or TIMESTAMP when filtering. For example, you could write:

WHERE error_time_stamp >= SEC_TO_TIMESTAMP(1440076613)

Or:

WHERE error_time_stamp >= TIMESTAMP('2015-08-19 19:30:00')
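To see why the raw-integer filter matched every row: read as microseconds, 1440076613 lands only about 24 minutes after the Unix epoch. A quick check in Python (standard library only):

```python
from datetime import datetime, timezone

raw = 1440076613

# Read as seconds since the epoch: August 2015, as the asker intended.
as_seconds = datetime.fromtimestamp(raw, tz=timezone.utc)

# Read as microseconds (legacy BigQuery's integer-to-TIMESTAMP rule):
# barely 24 minutes into 1970, so every error_time_stamp passes >=.
as_micros = datetime.fromtimestamp(raw / 1_000_000, tz=timezone.utc)

print(as_seconds)  # a 2015-08-20 timestamp
print(as_micros)   # a 1970-01-01 timestamp
```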
0
votes

First, the timestamp you are providing here, 1440076613, corresponds to 2015-08-20 (13:16:53 UTC) when read as seconds, not to TIMESTAMP('2015-08-19 19:30:00').

Second, BigQuery legacy SQL interprets a bare integer compared against a TIMESTAMP as microseconds since the epoch, which puts 1440076613 early in 1970, so every row's timestamp is greater than it.

If you want both counts in the same query, you could do the following:

SELECT
    file_id,
    file_name,
    COUNT(file_id) AS counter,
    SUM(IF(error_time_stamp >= TIMESTAMP('2015-08-19 19:30:00'), 1, 0)) AS counter_over_19_30
FROM [dataset.tablename] 
GROUP BY file_id, file_name 
ORDER BY counter_over_19_30 DESC 
LIMIT 1000
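The conditional-count logic above can be sanity-checked against the sample rows from the question; this Python sketch mirrors the GROUP BY plus SUM(IF(...)) behavior:

```python
from datetime import datetime

# Sample rows from the question: (file_id, file_name, error_time_stamp)
rows = [
    (1, "abcd.txt", datetime(2015, 8, 19, 19, 29, 22)),
    (2, "efgh.txt", datetime(2015, 8, 19, 19, 31, 25)),
    (1, "abcd.txt", datetime(2015, 8, 19, 19, 32, 20)),
    (2, "efgh.txt", datetime(2015, 8, 19, 19, 33, 40)),
    (2, "efgh.txt", datetime(2015, 8, 19, 19, 34, 36)),
]
cutoff = datetime(2015, 8, 19, 19, 30, 0)

# For each (file_id, file_name) group: (total rows, rows at/after cutoff)
counts = {}
for file_id, file_name, ts in rows:
    total, after = counts.get((file_id, file_name), (0, 0))
    counts[(file_id, file_name)] = (total + 1, after + (ts >= cutoff))

print(counts)
# {(1, 'abcd.txt'): (2, 1), (2, 'efgh.txt'): (3, 3)}
```

This matches the expected result in the question: file_id 1 has 2 rows total but only 1 after 19:30.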