1
votes

I am new to bigquery. Have a couple of rookie questions --

Is there any way to do a select top x * query, laid out kind of like the preview pane in table details? It can be a lot easier to understand when you can visually see the data and structure.

You can create a unique visit ID by concatenating VisitId and FullVisitorId. Why doesn't this match count of sessions? How do unique visits differ from sessions definitionally?

Thanks!

1
You can do "Limit" on your query, but I wouldn't do that. Limit actually scans your full table anyway, so it will cost you the WHOLE table's worth of processing data. About the session thing, I'm unclear... why would you concatenate both? Visit ID isn't unique already?Patrice
Hi Patrice :) So there is no good recommended way to view full results of a where statement? Here is the standard GA export schema: support.google.com/analytics/answer/3437719?hl=en This is what it says for visit id: "An identifier for this visit. This is part of the value usually stored as the _utmb cookie. This is only unique to the visitor. For a completely unique visit id, you should use a combination of fullVisitorId and visitId." However, when I test count(*) rows for sessions and count distinct concac vistor+visit id, they don't match (sessions higher).amysong
Hey ^^. hmmm that does seem weird. Can you shoot your queries so I can see what's up? The way you word it seems like distinct and sessions should be identical.Patrice
When I run select count(distinct string(visitid)+fullvisitorid) from [mydata.ga_sessions_20150125] I get 1.2 million rows, and doing a count(*) yields 1.3 million rows. The london helmets data actually does match (63 unique visits and 63 sessions), but that is much lower sample size than my data set. I was also under the impression that unique visits and sessions would be identical -- but since they're not, I'm wondering what google's definitional difference of them is.amysong
Oddly enough, when I run select string(visitid)+fullvisitorid as a, count(*) as records from [mydata.ga_sessions_20150125] group by a order by records desc it gives me the correct rows of records (1.3M), and the records column has all 1's. I feel like I must be missing something obvious!amysong

1 Answers

3
votes

COUNT(DISTINCT field[, N]) is a statistical approximation. For counts less than N, it is exact.

To get an exact count for large values, use count(*) on a group each by, however this may be a much slower query.

See COUNT documentation at https://cloud.google.com/bigquery/query-reference.