0
votes

I have a partitioned table (has date partitions for one year) in my prod-environment and i want to copy all these partitions in another partitioned table residing in my dev-environment. How can i do this in one go ?

I have tried this command but it only copies one partition at a time.

bq cp -a ':._20180605' :.

I can chose multiple tables at a time from source table separated by comma but then if I don't specify destination table date partition as stated below then it creates new table with the same name ..rather then appending data in previously created partitioned table

bq cp -a ':._20180605' :.

How can I copy all the partitions from source to destination table in one go? Is there any query or bq command to do that ?

1
Your source tables look like daily tables, one table per day, instead of BigQuery partitioned tables, cloud.google.com/bigquery/docs/partitioned-tables. Is your destination table a BigQuery partitioned table? - Hua Zhang
If you want the replica of table1 present in one dataset to another then, simply copy the table. No need to create the destination table beforehand. OR if you want to forcefully overwrite the data in destination table then use -f flag in bq command. - Priya Agarwal

1 Answers

0
votes

According to the documentation for Copying multiple source tables: "Source tables must be specified as a comma-separated list."

Therefore I can suggest you 3 alternatives:

First: In case your partitioned tables contains a column with the partition date, you can create a bq command to select the partition dates you want, save the list in a file and execute it with a bash command. Therefore, you would be able to append all the tables to the desired destination table. The syntax would be as the following:

bq query --format=csv --nouse_legacy_sql '
SELECT
  CONCAT('bq cp -a <sourceproj>:<dataset>.<table>$', partition_name, ' <testproj>:<dataset>.Partitioned_Destination_Table')
FROM (
  SELECT
    DISTINCT FORMAT_DATETIME('%Y%m%d',
      CAST(_PARTITIONDATE AS datetime)) partition_name
  FROM
    `<sourceproj>.<dataset>.<table>`
  WHERE
    _PARTITIONTIME >= "start_date"
    AND _PARTITIONTIME < "end_date")' > output.csv

Then, execute the file with all the bq copy commands.

bash output.csv

You can read more about bq command lines here.

Second: In the shell script make a list of all timestamps partitions and query then, appending each table in a already created partitioned destination table. The syntax would be:

tables=("20200107"  "20200106"  "20200105"  "20200104")

   for val in ${tables[*]}; do
        bq cp -a <project1>:<dataset1>.<table1_$val> <project2>:<dataset2>.<Partitioned_Destination_Table>
   done

Third: If your data set contains only the tables you want to copy, you can copy a the whole dataset, you can read more about it here, it would be as the following:

 bq mk --transfer_config --project_id=[PROJECT_ID] --data_source=[DATA_SOURCE] --target_dataset=[DATASET] --display_name=[NAME] --params='[PARAMETERS]'

I hope it helps.