1
votes

According to https://cloud.google.com/dataproc/docs/concepts/connectors/bigquery, the connector uses the BigQuery Storage API to read data over gRPC. However, I couldn't find any Storage API or gRPC usage in the source code here: https://github.com/GoogleCloudDataproc/spark-bigquery-connector/tree/master/connector/src/main/scala

My questions are: 1. Could anyone show me the source code where the connector uses the Storage API and makes gRPC calls? 2. Does Dataset<Row> df = session.read().format("bigquery").load() work through the BigQuery Storage API? If not, how can I read from BigQuery into Spark using the BigQuery Storage API?

1
Do you mean how to use it? If so, looking at the examples folder is always the natural choice. If not, I misunderstood your question, which I guess is the case. - UninformedUser

1 Answer

4
votes
  1. The Spark BigQuery Connector uses only the BigQuery Storage API for reads; you can see it here, for example.

  2. Yes, Dataset<Row> df = session.read().format("bigquery").load() works through the BigQuery Storage API.
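
One practical note on point 2: load() with no argument needs to know which table to read, and the connector takes that from its "table" option. Below is a minimal sketch of assembling that option in plain Java. The class name BigQueryReadOptions, the helper forTable, and the project, dataset and table names are all hypothetical placeholders; only the "table" option key and the "bigquery" format name come from the connector. The Spark call itself is left as a comment, since it assumes Spark and the connector are on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the connector: it only builds the
// option map that would be handed to Spark's DataFrameReader.
class BigQueryReadOptions {

    // Returns reader options naming one table as "project.dataset.table".
    public static Map<String, String> forTable(String project, String dataset, String table) {
        Map<String, String> options = new LinkedHashMap<>();
        // "table" is the connector option that names the table to read.
        options.put("table", project + "." + dataset + "." + table);
        return options;
    }

    public static void main(String[] args) {
        // Placeholder identifiers; substitute your own.
        Map<String, String> options = forTable("my-project", "my_dataset", "my_table");
        System.out.println(options);

        // Assuming a SparkSession named `session` and the connector on the
        // classpath, the read from the question would then look like:
        //   Dataset<Row> df = session.read().format("bigquery").options(options).load();
    }
}
```

Nothing here selects the Storage API explicitly. As the answer says, reads through format("bigquery") already go through it, so the only thing the caller has to supply is the table.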