0
votes

I'm using Google Data Studio to create a report analyzing specific referral sites. My data source is my site Google Analytics.

I want to replace the Full Referrer (e.g. of the format webaddress.com/page-name-one) with a text only value (i.e Page name one), so that it's clearer to see in the report which page is which in my charts and tables.

I've used the below formulae in the calculated fields, but none of them seem to change Full Referrer to match what I need it to. Data studio recognizes them all as valid formulae too.

I've anonymised my examples, but it has the same principles. I've tried:

REGEXP_REPLACE(Full Referrer,"[webaddress\\.com\\/page\\-name\\-one].*","Page name one")

REGEXP_REPLACE(Full Referrer, 'webaddress.com/page-name-one', 'Page name one')

REGEXP_REPLACE(Full Referrer, 'webaddress\\.com\\/page\\-name\\-one', 'Page name one')

REGEXP_REPLACE(Full Referrer, 'name', 'Page name one')

REGEXP_REPLACE(Full Referrer, 'page-name-one', 'Page name one')

REGEXP_REPLACE(Full Referrer, 'page\\-name\\-one', 'Page name one')
1

1 Answers

1
votes

In testing this on one of my own GA data sources, I was able to achieve this using one of your patterns:

REGEXP_REPLACE(Full Referrer,'webaddress.com/page-name-one','Page name one')

It should be noted, however, that the . should be properly escaped (either by \ or wrapping it in a character class like [.]; see re2 syntax for details). Because you have to double-backslash, I also prefer to use something Data Studio borrowed from BigQuery (sort of an undocumented feature), which is the regular expression string type (r"" or r''). When using this, you only have to single-backslash (unless you want a literal backslash):

REGEXP_REPLACE(Full Referrer,r'webaddress\.com/page-name-one','Page name one')

Because you're using REGEXP_REPLACE, anything before or after your match string will still exist after the replacement—meaning that for a Full Referrer of "m.facebook.com/l", REGEXP_REPLACE(Full Referrer,r'facebook\.com','FB') would return "m.FB/l"

So your pattern above will match the value anywhere in the string, which likely isn't what you want. To anchor it to the beginning, use the ^ (start of string) assertion:

REGEXP_REPLACE(Full Referrer,r'^webaddress\.com/page-name-one','Page name one')

If you want to only match that exact value of Full Referrer (i.e. not including any additional path levels), make sure to use the $ (end of string) assertion as well:

REGEXP_REPLACE(Full Referrer,r'^webaddress\.com/page-name-one$','Page name one')

Keep in mind that if you're doing this in the data source as a calculated field, you aren't actually changing the original metric—you're working on a copy of it. So you need to replace Full Referrer with whatever you named your calculated field in the data source.

Often you're wanting to do this for a bunch of sites or pages, so you can use CASE and REGEXP_MATCH to handle all this logic in a single field:

CASE
  WHEN REGEXP_MATCH(Full Referrer,r'^webaddress\.com/page-name-one$') THEN 'Page name one'
  WHEN REGEXP_MATCH(Full Referrer,r'^site2\.com/example$') THEN 'S2 Example'
  ELSE Full Referrer
END

These matches are done in order, so you can even match a specific page or pages, and then still provide a different value for anything on that domain that you didn't match:

CASE
  WHEN REGEXP_MATCH(Full Referrer,r'^site\.com/$') THEN 'Site - Home'
  WHEN REGEXP_MATCH(Full Referrer,r'^site\.com/about$') THEN 'Site - About'
  WHEN REGEXP_MATCH(Full Referrer,r'^site\.com/') THEN 'Site - (other)'
  ELSE Full Referrer
END

You can also use the ELSE if you want to bucket all of the unmatched values into an "other" grouping instead of just leaving the original value.

Another thing to remember is that due to shared fields in GA, things like Source (utm_source) also show up in Full Referrer, so you could be seeing values there that you wouldn't normally expect. Often you can get rid of these by also filtering to only the Default Channel Grouping of "Referral".

If your patterns still aren't matching, please update the question with some additional details such as what the output actually is, whether there's an error message, etc.—and also whether you're doing this as a calculated field in the data source or the "Create Field" button on a single chart.