2
votes

In Google BigQuery, if I run a query against a view, it tells me how much data was scanned to answer it (BQ bills as a function of this).

However, is there a way to see how much data I have in the BQ environment in general, without querying any of the many views present in the project? My goal is to measure the amount of data in BQ on a daily basis.

1
Do you want to get the size of the dataset, or the storage size of each single table? – Tushar
To complement Mikhail's answer, there is also this nifty little tool – bqdu.info – Graham Polley
@Tushar, the goal is to measure dataset sizes day by day. As a related issue, is there a query that will show me all objects housed under a project? I can see them in the UI on the left (a bunch of 8-digit integer identifiers, in my case) but was wondering if I could fetch them using a programmed query. – Mark Ginsburg

1 Answer

4
votes

Hope the below gives you an idea of how to quickly check your inventory for a specific dataset.
You can extend this logic to union multiple datasets and do whatever aggregation you are interested in:

#legacySQL
-- __TABLES__ is a meta-table exposing per-table metadata for a dataset.
-- creation_time and last_modified_time are milliseconds since the epoch.
SELECT table_id,
    DATE(creation_time/1000) AS creation_date,
    DATE(last_modified_time/1000) AS last_modified_date,
    row_count,
    size_bytes,
    CASE
        WHEN type = 1 THEN 'table'
        WHEN type = 2 THEN 'view'
        WHEN type = 3 THEN 'external'
        ELSE '?'
    END AS type,
    TIMESTAMP(creation_time/1000) AS creation_time,
    TIMESTAMP(last_modified_time/1000) AS last_modified_time,
    dataset_id,
    project_id
-- In legacy SQL, a comma between tables in FROM means UNION ALL.
FROM [project.dataset1.__TABLES__],
     [project.dataset2.__TABLES__],
     [project.dataset3.__TABLES__],
     [project.dataset4.__TABLES__],
     [project.dataset5.__TABLES__]

Depending on the size of the datasets (in terms of the number of tables in them), at some point the above query can start complaining, so you might need to batch your stats. Hope this helps.
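If you want to track this day by day from a script, a minimal sketch of the aggregation step in pure Python (the row shape mirrors what the query above returns; the helper names here are my own, not part of any BigQuery client library):

```python
from datetime import datetime, timezone

def summarize_tables(rows):
    """Aggregate __TABLES__-style rows into per-dataset storage stats.

    Each row is assumed to be a dict with the fields the query above
    returns: dataset_id, size_bytes, row_count, and creation_time in
    milliseconds since the epoch (the __TABLES__ convention).
    """
    stats = {}
    for r in rows:
        s = stats.setdefault(r["dataset_id"], {"tables": 0, "bytes": 0, "rows": 0})
        s["tables"] += 1
        s["bytes"] += r["size_bytes"]
        s["rows"] += r["row_count"]
    return stats

def ms_to_date(ms):
    """Convert a millisecond epoch value (e.g. creation_time) to a UTC date string."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
```

Run this once a day (fetching the rows in per-dataset batches, per the caveat above) and store the totals to get the daily measurement the question asks for.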