1
votes

I am using BigQuery to query Google Analytics and pull out Transaction ID's by time. I am having difficulty creating a usable datetime field with the columns provided. Using the query below, the closest I have been able to get is "2019-02-11T19:00:00" Which does not include the minutes. (seconds are unimportant)

SELECT PARSE_DATETIME('%Y%m%d%H', newtime) AS datetime, orderid
FROM
(SELECT CONCAT(date, newhour, newminute) AS newtime
, Orderid
FROM
(SELECT date
, CAST(h.hour AS STRING) AS NewHour
, CAST(h.minute AS STRING) AS Newminute
, h.transaction.transactionId AS Orderid
FROM `analyticsdata*` LEFT JOIN UNNEST(hits) as h
WHERE h.transaction.transactionId!= ''))

The problems I have faced so far: The hits.minute column returns a single digit if < 10 causing the parse to fail on any fields with only 13 digits. I am unable to left pad as the hits field is a nested array.

Below is the data as it appears without formatting:

| Date | Hour | Minute | Order|

|20191010 | 12 | 7 | 12345|

How I would like it to appear:

| datetime | Order |

|2019-10-10T12:07:00 | 12345 |

2
Please provide sample data and desired results. - Gordon Linoff

2 Answers

2
votes

I always use visitstarttime*1000 + hits.time to get the session timestamp in milliseconds and add the milliseconds since start of the session. You can feed this into TIMESTAMP_MILLIS() to get a UTC timestamp which you can re-format into a string with a timezone of your choice.

Here is a running example:

SELECT 
  fullVisitorId -- identify user
  ,visitstarttime -- identify session per user 
  ,hitnumber -- identify hit

  ,TIMESTAMP_MILLIS(visitstarttime*1000 + time) as hitTimeUTC -- type: timestamp
  ,FORMAT_TIMESTAMP('%c in %Z', TIMESTAMP_MILLIS(visitstarttime*1000 + time), 'Europe/Berlin') as hitTimeBerlin -- type: string

  ,DATETIME(parse_date('%Y%m%d',date),time(hour,minute,0)) localTime

FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170420`, UNNEST(hits) AS h
WHERE 
  ecommerceAction.action_type='6' -- is purchase
LIMIT 100

It's important to note that hits.hour and hits.minute are in local time! Meaning the time you set in the view settings in the Google Analytics interface. So you have to use it carefully if you want to compare over data from different time zones.

0
votes

You need to include the minute:

SELECT PARSE_DATETIME('%Y%m%d%H%M', newtime) AS datetime, orderid

Note: This assumes that the hour and minute values are zero-padded, so 01, not 1.

You may simply want:

SELECT DATETIME_ADD(PARSE_DATETIME('%Y%m%d', date), INTERVAL h.hour * 60 + h.minute minute) AS datetime