
We are using google-php-client-api to stream website page-view logs into a table with 9 columns of basic data types:

  • cookieid (string),
  • domain (string),
  • site_category (string),
  • site_subcategory (string),
  • querystring (string),
  • connectiontime (timestamp),
  • flag (boolean),
  • duration (integer),
  • remoteip (string)

After 10 hours of running the scripts, we observed that BigQuery API usage (for the insertAll method) had reached 300K requests, but only 35K rows had been recorded in the table during that time.

When we looked at the Google Cloud Console, approximately 299K of these 300K API requests had returned success codes; in other words, the streaming seemed to work well.

What we don't understand is how, after 299K successful requests, only 35K rows were inserted into the table.

Is this problem caused by google-php-client-api, or has BigQuery simply not saved the sent data to the table yet?

If the latter is true, how long do we need to wait before we see all of the rows sent to BigQuery?

Code used for streaming data:

    $rows = array();
    $data = json_decode($rawjson);
    $row = new Google_Service_Bigquery_TableDataInsertAllRequestRows();
    $row->setJson($data);
    $row->setInsertId(strtotime('now'));
    $rows[0] = $row;

    $req = new Google_Service_Bigquery_TableDataInsertAllRequest();
    $req->setKind('bigquery#tableDataInsertAllRequest');
    $req->setRows($rows);

    $this->service->tabledata->insertAll($projectid, $datasetid, $tableid, $req);

Thank you in advance,

Cihan

Yes, but I see that there is a 1-day limitation. According to the SO rules, I have to wait 1 more hour before the system allows me to click the tick. - Cihan Fethi Hızar
You wait one more day. The idea is to learn the process. Thanks. - Pentium10
I wrote "1 more hour" because when I was writing the above comment, 23 hours of the "1-day limitation" had already passed. Thank you, best. - Cihan Fethi Hızar

1 Answer


We resolved this issue. It was caused by this line of code:

    $row->setInsertId(strtotime('now'));

We have at least 10-20 requests per second, and the insertId sent to BigQuery was derived from the current timestamp in seconds, so every row sent within the same second carried the same insertId. BigQuery uses the insertId to de-duplicate rows, so it kept only about one row per second and silently dropped the others, even though the requests themselves returned success codes.

We removed this line, and now the numbers are consistent.
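
To illustrate the collision, here is a minimal sketch in plain PHP that does not call BigQuery at all. It assumes a simplified model in which rows sharing an insertId collapse into a single row (BigQuery's real de-duplication is best-effort, so this is only an approximation). The helper `countDistinctInsertIds`, the fixed timestamp, and the use of `random_bytes` (PHP 7+) to build a unique ID are assumptions made for the example, not part of the original code.

```php
<?php
// Hypothetical helper: counts how many rows would survive if all rows
// sharing an insertId were collapsed into one (simplified model only).
function countDistinctInsertIds(array $insertIds): int
{
    return count(array_unique($insertIds));
}

// Fixed value standing in for strtotime('now') during a single second.
$now = 1700000000;

$timestampIds = [];
$randomIds = [];

// Simulate 15 rows arriving within the same second.
for ($i = 0; $i < 15; $i++) {
    $timestampIds[] = (string) $now;          // identical for every row
    $randomIds[] = bin2hex(random_bytes(16)); // 128 random bits per row
}

$survivingWithTimestamp = countDistinctInsertIds($timestampIds);
$survivingWithRandom = countDistinctInsertIds($randomIds);

echo "timestamp IDs: {$survivingWithTimestamp} of 15 rows kept\n";
echo "random IDs: {$survivingWithRandom} of 15 rows kept\n";
```

If de-duplication on retry is still wanted, one option (again, an assumption rather than what we deployed) would be to pass a per-row unique value such as `bin2hex(random_bytes(16))` to `setInsertId` instead of removing the call.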