I am digging into the user bucket feature in Google Analytics to find out which client IDs are in the treatment and control groups of my campaign experiments in Google Ads. Client ID is a custom dimension with index 27 in my Google Analytics settings. I am following the developer guide here: https://ga-dev-tools.appspot.com/dimensions-metrics-explorer/
I am trying to fetch the (date, client_id, user_bucket, user) values through the Google Analytics API, but the API seems to return only about 50% of the total data.
Here is the request I use to check (date, user). Its numbers match the GA UI, which is good.
    return (
        analytics.reports()
        .batchGet(
            body={
                "reportRequests": [
                    {
                        "viewId": VIEW_ID,
                        "pageSize": "100000",
                        "pageToken": pageToken,
                        "dateRanges": [
                            {"startDate": dateRange[0], "endDate": dateRange[1]}
                        ],
                        "metrics": [
                            {"expression": "ga:users"},
                        ],
                        "dimensions": [
                            {"name": "ga:date"},
                        ],
                    }
                ]
            }
        )
        .execute()
    )
Output
However, when I add client_id and user_bucket as dimensions, the user count drops by about 50%.
    return (
        analytics.reports()
        .batchGet(
            body={
                "reportRequests": [
                    {
                        "viewId": VIEW_ID,
                        "pageSize": "100000",
                        "pageToken": pageToken,
                        "dateRanges": [
                            {"startDate": dateRange[0], "endDate": dateRange[1]}
                        ],
                        "metrics": [
                            {"expression": "ga:users"},
                        ],
                        "dimensions": [
                            {"name": "ga:date"},
                            {"name": "ga:dimension27"},
                            {"name": "ga:userBucket"},
                        ],
                    }
                ]
            }
        )
        .execute()
    )
The resulting output is
I then aggregated the client_id rows to the date level, and the totals do not match the user counts from the first request. I also cannot figure out why ga_user has a constant value of 2 in every row (I would expect 1). Thanks!
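To make the date-level comparison concrete, here is a minimal sketch of the aggregation described above. It assumes the usual batchGet response layout (reports, then data, then rows, each row carrying "dimensions" and "metrics") and the dimension order ga:date, ga:dimension27, ga:userBucket. The function name aggregate_by_date and the sample rows are made up for illustration and are not real data from my view.

```python
from collections import defaultdict


def aggregate_by_date(responses):
    """Collapse (date, client_id, user_bucket) rows to one entry per date.

    `responses` is an iterable of batchGet response dicts, one per page.
    Assumes dimensions are ordered ga:date, ga:dimension27, ga:userBucket
    and that ga:users is the only metric requested.
    """
    client_ids = defaultdict(set)  # date -> distinct client IDs seen
    users_sum = defaultdict(int)   # date -> sum of ga:users over all rows
    for response in responses:
        for report in response.get("reports", []):
            for row in report.get("data", {}).get("rows", []):
                date, client_id, _bucket = row["dimensions"]
                client_ids[date].add(client_id)
                users_sum[date] += int(row["metrics"][0]["values"][0])
    return {
        date: {"distinct_client_ids": len(ids), "ga_users_sum": users_sum[date]}
        for date, ids in client_ids.items()
    }


# Invented sample page, only to show the expected shape of the input.
sample_page = {
    "reports": [{
        "data": {
            "rows": [
                {"dimensions": ["20200101", "111.222", "7"],
                 "metrics": [{"values": ["2"]}]},
                {"dimensions": ["20200101", "333.444", "42"],
                 "metrics": [{"values": ["2"]}]},
                {"dimensions": ["20200102", "111.222", "7"],
                 "metrics": [{"values": ["2"]}]},
            ]
        }
    }]
}

summary = aggregate_by_date([sample_page])
print(summary)
```

Keeping the distinct client ID count separate from the summed ga:users value makes it easier to see which of the two is being compared against the date-only report, since they differ whenever a row reports more than one user.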


