I have just joined a new company as a data engineer, working on batch ETL pipelines on Google Cloud Platform (GCP). My team's data scientist recently handed me a data model (a .py file written in Python 3.6). It has a main function that I can call to get a dataframe as output.
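For context, the interface looks roughly like this (the module, function body, and column names here are placeholders, not the actual code):

```python
# data_model.py -- the data scientist's file (names are placeholders)
import pandas as pd

def main() -> pd.DataFrame:
    # ... model logic I would rather not re-implement ...
    return pd.DataFrame({"id": [1, 2], "score": [0.92, 0.41]})
```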
I intend to append this dataframe to a BigQuery table. Is there any way I can simply import this main function and integrate it into an Apache Beam (Dataflow) pipeline, without having to re-code the data model as a PTransform? Or would I be better off using Cloud Scheduler and Cloud Functions to achieve the same thing?
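To make the question concrete, this is the kind of thing I was imagining: wrapping the imported function in a Beam pipeline and writing the rows to BigQuery. This is only a rough sketch (the table spec and import are placeholders), and I have no idea whether it is idiomatic:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

from data_model import main as run_model  # placeholder import

def expand_dataframe(_):
    # Call the model once and emit each row as a dict,
    # which is the format WriteToBigQuery expects.
    df = run_model()
    for row in df.to_dict(orient="records"):
        yield row

with beam.Pipeline(options=PipelineOptions()) as pipeline:
    (pipeline
     | "Seed" >> beam.Create([None])  # single dummy element so the model runs once
     | "RunModel" >> beam.FlatMap(expand_dataframe)
     | "AppendToBQ" >> beam.io.WriteToBigQuery(
           "my-project:my_dataset.my_table",  # placeholder table spec
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
           create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER))
```

I am also not sure whether importing `data_model` like this would even work once the pipeline runs on Dataflow workers rather than on my machine.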
I am a complete beginner with Dataflow and Apache Beam, so any help or links to guides would be greatly appreciated!