1
votes

I used Google Data Studio to combine a large website sitemap containing the Page URL with the Google Analytics performance data for the same Page URLs going back five years. I am attempting now to identify the navigational depth of each of those URLs based on their length and the number of forward slashes in the URL (/). Navigational depth being 0 for the home page and 1 for pages beneath it, etc.

I have a segment of code I've been working on to do this:

LENGTH(REGEXP_REPLACE(Page, "[^\\/]/g", "")) - 1

Briefly, if I understand the code correctly I've put together, I am seeking to identify all characters in the Page value that are not forward slashes (/) and using REGEXP_REPLACE, substitute them with (nothing), leaving essentially only the forward slashes remaining and then counting their length to determine how many there are and thus the navigational depth minus 1 to remove the first in all Page URLs. /g is a global variable that I've included because it doesn't work otherwise in an REGEX creators I've been using to test the code. The double escaped \\ is a particularity of Google Data Studio as best as I can tell.

The field in question contains data that looks like this:

/testing-data/page-123/page-321/name-of-page/

My code should result in the result 4 (5-1) but is currently resulting in 1.

2

2 Answers

0
votes

The Calculated Field below does the trick where / represents the phrase that's being searched for:

DIVIDE((LENGTH(Page) - LENGTH(REGEXP_REPLACE(Page, "/", ""))) , LENGTH("/"))-1

To break down the above, using pseudocode (and the string /testing-data/page-123/page-321/name-of-page/):

  • DIVIDE((LENGTH(All Characters) - LENGTH(All Characters Excluding /)) , LENGTH(of /)-1

In regards to the calculation, the values at each stage would be:

  • DIVIDE((45 - 40) , 1)-1

Editable Google Data Studio Report (Embedded Google Sheets Data Source) and a GIF to elaborate:

0
votes

It turned out that I the following variant of the code I was using worked:

LENGTH(REGEXP_REPLACE(Page, "[^/]", "")) - 1

The problem I encountered that led to the value being set at 1 was apparently not because of the calculation but due to the setting on the Calculated field as it was being brought in under the Blended Data section of Google Data Studio. It had been carrying it over as "Count Unique" instead of a simple Sum data type. As a result, it was calculating the number properly in the REGEX equation, just not in the table I was viewing.

I attempted to run the code you provided @Nimantha, and that worked as well. So,

There are two answers:

DIVIDE((LENGTH(All Characters) - LENGTH(All Characters Excluding /)) , LENGTH(of /)-1

and

LENGTH(REGEXP_REPLACE(Page, "[^/]", "")) - 1

Thanks for kicking me in the right direction. Once I realized two logical code snippets couldn't be the reason, I looked elsewhere for the cause. Much appreciated.