0
votes

I have a BigQuery Table whose structure is given below:

file_id    |    file_name    |    error_time_stamp
_ _ _ _ _ _| _ _ _ _ _ _ _ _ | _ _ _ _ _ _ _ _ _ _ _ _

   1       |    abcd.txt     |    2015-08-19 19:29:22
   2       |    efgh.txt     |    2015-08-19 19:31:25
   1       |    abcd.txt     |    2015-08-19 19:32:20
   2       |    efgh.txt     |    2015-08-19 19:33:40
   2       |    efgh.txt     |    2015-08-19 19:34:36

I want to query this table to get the row count per file_id, sorted descending. I use this query:

SELECT file_id, file_name, COUNT(file_id) AS counter
FROM [dataset.tablename]
GROUP BY file_id, file_name
ORDER BY counter DESC
LIMIT 1000

This query works fine and gives the desired result, shown below.

file_id    |    file_name    |    counter
_ _ _ _ _ _| _ _ _ _ _ _ _ _ | _ _ _ _ _ _

   2       |    efgh.txt     |     3
   1       |    abcd.txt     |     2

Now I want to count only the rows whose error_time_stamp is greater than 19:30:00. (The column is a timestamp; the datetime format shown here is just for readability. I actually query with an epoch timestamp.)

SELECT pp, ss, COUNT(pp) AS counter
FROM (
  SELECT file_id AS pp, file_name AS ss
  FROM [dataset.tablename]
  WHERE error_time_stamp >= 1440076613
) AS main_data
GROUP BY pp, ss
ORDER BY counter DESC
LIMIT 1000

Now the expected count for file_id 1 is 1, but I still get 2. What is wrong with my query?

2
count(case when err_t_stamp >= ... then 1 end) - Pரதீப்
Note that the epoch time you wrote is not 19:30, no matter which GMT offset you use. Your query seems fine; try writing it like Indian suggested, with COUNT(CASE WHEN ...). - Balinti
Your epoch timestamp evaluates to 19:17:57, not 19:30. Use the correct epoch. - aadi
@Balinti the timestamp was just a sample, not exactly what I am running. I calculate the timestamp on the server with datetime.now() + timedelta(minutes=-30) to get the time 30 minutes ago, since I only want the report for the last 30 minutes. - Sriram
@Indian in which query should I add this, the 1st or the 2nd? - Sriram
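For reference, the server-side cutoff computation described in the comments can be sketched in Python. The microsecond scaling at the end reflects how legacy BigQuery reads a bare integer compared against a TIMESTAMP (an assumption worth verifying against your own table):

```python
from datetime import datetime, timedelta, timezone

# Cutoff 30 minutes ago, in UTC so it lines up with BigQuery's timestamps
cutoff = datetime.now(timezone.utc) - timedelta(minutes=30)

epoch_seconds = int(cutoff.timestamp())
# Legacy BigQuery treats a bare integer compared to a TIMESTAMP as
# MICROseconds since the epoch, so scale the seconds value up before
# using it in the WHERE clause.
epoch_micros = epoch_seconds * 1_000_000
```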

2 Answers

1
votes

BigQuery timestamps have microsecond resolution, and a bare integer in a comparison is read as microseconds since the epoch, so your seconds-based value is treated as a very small/early timestamp and every row passes the filter.

I'd recommend using SEC_TO_TIMESTAMP or TIMESTAMP when filtering. For example, you could write:

WHERE error_time_stamp >= SEC_TO_TIMESTAMP(1440076613)

Or:

WHERE error_time_stamp >= TIMESTAMP('2015-08-19 19:30:00')
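To see why the raw-integer filter matched every row: read as microseconds, 1440076613 lands only about 24 minutes after the Unix epoch. A quick check in Python (standard library only):

```python
from datetime import datetime, timezone

raw = 1440076613

# Read as seconds since the epoch: August 2015, as the asker intended.
as_seconds = datetime.fromtimestamp(raw, tz=timezone.utc)

# Read as microseconds (legacy BigQuery's integer-to-TIMESTAMP rule):
# barely 24 minutes into 1970, so every error_time_stamp passes >=.
as_micros = datetime.fromtimestamp(raw / 1_000_000, tz=timezone.utc)

print(as_seconds)  # a 2015-08-20 timestamp
print(as_micros)   # a 1970-01-01 timestamp
```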
0
votes

First, the timestamp you are providing here, 1440076613, corresponds to 2015-08-20 (13:16:53 UTC) when read as seconds, not to TIMESTAMP('2015-08-19 19:30:00').

Second, BigQuery legacy SQL interprets a bare integer compared against a TIMESTAMP as microseconds since the epoch, which puts 1440076613 early in 1970, so every row's timestamp is greater than it.

If you want both counts in the same query, you could do the following:

SELECT
    file_id,
    file_name,
    COUNT(file_id) AS counter,
    SUM(IF(error_time_stamp >= TIMESTAMP('2015-08-19 19:30:00'), 1, 0)) AS counter_over_19_30
FROM [dataset.tablename] 
GROUP BY file_id, file_name 
ORDER BY counter_over_19_30 DESC 
LIMIT 1000
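The conditional-count logic above can be sanity-checked against the sample rows from the question; this Python sketch mirrors the GROUP BY plus SUM(IF(...)) behavior:

```python
from datetime import datetime

# Sample rows from the question: (file_id, file_name, error_time_stamp)
rows = [
    (1, "abcd.txt", datetime(2015, 8, 19, 19, 29, 22)),
    (2, "efgh.txt", datetime(2015, 8, 19, 19, 31, 25)),
    (1, "abcd.txt", datetime(2015, 8, 19, 19, 32, 20)),
    (2, "efgh.txt", datetime(2015, 8, 19, 19, 33, 40)),
    (2, "efgh.txt", datetime(2015, 8, 19, 19, 34, 36)),
]
cutoff = datetime(2015, 8, 19, 19, 30, 0)

# For each (file_id, file_name) group: (total rows, rows at/after cutoff)
counts = {}
for file_id, file_name, ts in rows:
    total, after = counts.get((file_id, file_name), (0, 0))
    counts[(file_id, file_name)] = (total + 1, after + (ts >= cutoff))

print(counts)
# {(1, 'abcd.txt'): (2, 1), (2, 'efgh.txt'): (3, 3)}
```

This matches the expected result in the question: file_id 1 has 2 rows total but only 1 after 19:30.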