1
votes

I have a bigquery table whose data is loaded from AVRO files on GCS. This is NOT an external table.

One of the fields in every AVRO object is created (date with a long type) and I'd like to use this field to partition the table.

What is the best way to do this?

Thanks

1
Why is AVRO important? Is this a federated table in GCS, or is this statement of the question true: "This is NOT an external table" ?Felipe Hoffa

1 Answers

1
votes

Two issues that prevent from using created as a partition column:

  • The avro file defines the schema during loading time. There is only one option to partition at this step: select Partition By Ingestion Time, however, most probably will include another field for this purpose.

  • The field created is long. This value seems to contain a Datetime. If it was Integer you will be able to use Integer Range partitioned tables somehow. But in this case, you would need to convert the long value into a Date/Timestamp to use date/timestamp partitioned tables.

So, from my opinion you can try:

  1. Importing the data as it is into a first table.
  2. Create a second empty table partitioned by created of type TIMESTAMP.
  3. Execute a query reading from the first table and applying a timestamp function on created like TIMESTAMP_SECONDS (or TIMESTAMP_MILLIS) to transform the value to a TIMESTAMP, so each value you insert will be partioned.