
I have a table in Google BigQuery (GBQ) with almost 3 million records (rows) so far, built from data coming out of a MySQL database every day. The data is inserted into the GBQ table using a Python pandas DataFrame (`.to_gbq()`).
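For context, here is a minimal sketch of that kind of daily incremental load, assuming a hypothetical `updated_at` column and a remembered watermark from the previous run; the extract is simulated with an in-memory DataFrame, and the actual upload call (`DataFrame.to_gbq`) is left commented out since it needs BigQuery credentials:

```python
import pandas as pd

# Simulated extract from MySQL; in practice this would be something like
# pd.read_sql("SELECT * FROM orders WHERE updated_at > %s", conn, params=[watermark])
rows = pd.DataFrame({
    "id": [1, 2, 3],
    "amount": [10.0, 20.0, 30.0],
    "updated_at": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
})

last_sync = pd.Timestamp("2023-01-01")  # watermark saved by the previous run
changed = rows[rows["updated_at"] > last_sync]

# Ship only the changed rows (requires pandas-gbq and GCP credentials):
# changed.to_gbq("dataset.orders", project_id="my-project", if_exists="append")
print(len(changed))  # rows to upload this run
```

Note that an append-only load like this handles new and updated rows but not deletes, which is one reason the binlog approach in the answer below exists.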

What is the optimal way to sync changes from MySQL to GBQ, in this direction, with Python?

Can you add some code? Please check here on how to ask good questions: stackoverflow.com/help/how-to-ask - rainer
I have not developed the code yet; I am figuring out where to start. I am looking for a general approach, not the details. - Matu Mlkz

1 Answer


Several ways to import data from MySQL to BigQuery that might suit your needs are described in this article. For example, binlog replication:

This approach (sometimes referred to as change data capture - CDC) utilizes MySQL's binlog. MySQL's binlog keeps an ordered log of every DELETE, INSERT, and UPDATE operation, as well as Data Definition Language (DDL) changes performed on the database. After an initial dump of the current state of the MySQL database, the binlog changes are continuously streamed and loaded into Google BigQuery.
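To make the replay idea concrete, here is a small self-contained sketch (pure pandas, no MySQL or BigQuery connection; the table, column names, and event format are made up for illustration) of applying a stream of binlog-style INSERT/UPDATE/DELETE events to an initial snapshot:

```python
import pandas as pd

# Initial dump of the MySQL table (the snapshot step), keyed by primary key.
state = pd.DataFrame({"id": [1, 2], "name": ["alice", "bob"]}).set_index("id")

# Binlog-style change events, in commit order (a real reader, e.g. the
# python-mysql-replication package, would stream these from the server).
events = [
    {"op": "INSERT", "row": {"id": 3, "name": "carol"}},
    {"op": "UPDATE", "row": {"id": 1, "name": "alicia"}},
    {"op": "DELETE", "row": {"id": 2}},
]

for ev in events:
    key = ev["row"]["id"]
    if ev["op"] == "DELETE":
        state = state.drop(key)
    else:
        # INSERT and UPDATE both upsert the latest row image.
        state.loc[key, "name"] = ev["row"]["name"]

# After replay, the DataFrame mirrors the current MySQL table; in a real
# pipeline this merged state (or the raw events) would be loaded into BigQuery.
print(sorted(state.index.tolist()))
```

The same upsert-or-delete logic is what CDC tools apply on the BigQuery side, typically via staging tables and periodic MERGE statements rather than in-memory DataFrames.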

Seems to be exactly what you are searching for.