0
votes

I'm trying to use the public GitHub dataset on BigQuery to count events - PushEvents, in this case - on a per repository basis over time.

SELECT COUNT(*)
FROM [githubarchive:github.timeline]
WHERE type = 'PushEvent' 
    AND repository_name = "account/repo"
GROUP BY pushed_at
ORDER BY pushed_at DESC

Basically just retrieve the count for a specified repo and event type, group the count by date and return the list. BigQuery validates the following, but then fails the query with a:

Field 'pushed_at' not found.

As far as I can tell from GitHub's PushEvent documentation, however, pushed_at is an available field. Anybody have examples of related queries that execute properly? Any suggestions as to what's being done incorrectly here?

1

1 Answers

1
votes

The field is called repository_pushed_at, and you also probably meant to include it in the SELECT list, i.e.

SELECT repository_pushed_at, COUNT(*)
FROM [githubarchive:github.timeline]
WHERE type = 'PushEvent' 
    AND repository_name = "account/repo"
GROUP BY repository_pushed_at
ORDER BY repository_pushed_at DESC