2
votes

Just started using Google Data Studio, and ran into a problem.

We have a custom downloads report, which displays the full URL of file, and then file name; for example:

https://website.com/-/media/rak/files/productdocuments/filename.pdf

We also have files in a different location; for example:

https://website.com/-/media/rak/files/rak-library/filename.pdf

I'm trying to combine both to display in the downloads report, but with only the file name showing - we want to strip out the separate website URLs.

From trawling various websites, including this one, I've got the below CASE statement to partially work. It will run the first line, but does not seem to want to run the second line.

The Event Action field is what is displaying the full URL and file name.

I'm not 100% sure what the .* does, but it seems like a partial match, not exact? When I didn't have this in, it could not find either URLs, so presumed that REGEXP_MATCH is looking for an exact match, and I needed to not have the file name included in the path.

--

Main function

CASE
  WHEN REGEXP_MATCH(Event Action, '[https://website.com/-/media/rak/files/productdocuments/].*' ) THEN Remove TXT 01
  WHEN REGEXP_MATCH(Event Action, '[https://website.com/-/media/rak/files/rak-library/].*' ) THEN Remove TXT 02
  ELSE Event Action
END

Remove TXT 01

REGEXP_REPLACE(Event Action, 'https://website.com/-/media/rak/files/productdocuments/', '' )

Remove TXT 02

REGEXP_REPLACE(Event Action, 'https://website.com/-/media/rak/files/rak-library/', '' )

--

I'm not 100% sure if it is user error, but if anyone can supply an idea as to how to get this to work, I'd be very grateful!

1

1 Answers

0
votes

It can be achieved by first creating the Extract_Text formula below, and then creating the Main Formula:

Formula 1: Extract_Text

REGEXP_EXTRACT(Event Action, "/([\\w-]+\\.\\w+)\\]?$")

Formula 2: Main Formula

CASE
  WHEN REGEXP_MATCH(Event Action, "\\[?https://website\\.com/-/media/rak/files/(productdocuments|rak-library)/[\\w-]+\\.\\w+\\]?") THEN Extract_Text
  ELSE Event Action
END

Formula Breakdown

  1. Extract_Text: Captures a range of characters a-zA-Z0-9_- after the last / in the sequence: [characters].[characters]
  2. Main Formula: If the pattern doesn't match, then values from the Event Action field are extracted.

Created a Google Data Studio Report (Google Sheets Embedded) to demonstrate.
EDIT: The formula now captures - to extract product-name-01.pdf

enter image description here