0
votes

I'm having a hard time getting the Google Sheets =IMPORTXML formula to return the value inside the <itunes:email> tag of an RSS feed.

Let's use this feed, for example: https://feeds.soundcloud.com/users/soundcloud:users:27444293/sounds.rss

When using this formula:

=IMPORTXML("https://feeds.soundcloud.com/users/soundcloud:users:27444293/sounds.rss","//channel")

I get everything inside the <channel> tag (as expected). Interestingly, the <itunes:email> tag is nested inside the <itunes:owner> tag along with <itunes:name>, so Google Sheets puts those two values in one cell. It looks like this: "WKXL - NH Talk [email protected]" because the <itunes:name> value is "WKXL - NH Talk" and the <itunes:email> value is "[email protected]".

When I try to isolate this using this formula: =IMPORTXML("https://feeds.soundcloud.com/users/soundcloud:users:27444293/sounds.rss","//channel/itunes:owner")

I get an error that "Imported content is empty."

Is there a way to isolate JUST the <itunes:email> value? I'm not very familiar with XPath.

Any help is greatly appreciated!!

2 Answers

3
votes

The <itunes:owner> element is in an XML namespace, which is denoted by the itunes: prefix and associated with a URI (you can find that URI in the XML document).

Normally, you would declare that namespace before using XPath, but IMPORTXML does not have any facility for that. So what you can do is refer to the element by its local name, i.e. owner, ignoring the namespace.

//channel/*[local-name() = 'owner']

*[local-name() = 'owner'] can be read as "any element (*) whose local-name() is owner". You'd have to do this work-around for every XML element that is in a namespace.
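For comparison, here is a rough sketch of what "declaring the namespace" looks like in a tool that supports it, using Python's standard xml.etree.ElementTree. The sample feed, the owner_email helper, and the placeholder address are made up for illustration; the namespace URI is the usual iTunes podcast one, but check the xmlns:itunes attribute in your own feed.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical, trimmed-down feed for illustration; a real feed has many more elements.
SAMPLE_FEED = """<rss version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <title>Example Show</title>
    <itunes:owner>
      <itunes:name>Example Owner</itunes:name>
      <itunes:email>owner@example.com</itunes:email>
    </itunes:owner>
  </channel>
</rss>
"""

# Prefix -> URI mapping. The URI must match the one declared in the feed
# (assumed here to be the common iTunes podcast namespace).
NAMESPACES = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}


def owner_email(feed_xml: str) -> Optional[str]:
    """Return the text of <itunes:email> inside <itunes:owner>, or None if absent."""
    root = ET.fromstring(feed_xml)  # root is the <rss> element
    # With the namespace declared, the prefixed path can be used directly.
    return root.findtext("channel/itunes:owner/itunes:email", namespaces=NAMESPACES)


print(owner_email(SAMPLE_FEED))
```

Since IMPORTXML offers no equivalent of the NAMESPACES mapping, the local-name() test is the practical substitute inside Sheets.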

0
votes

Try

=IMPORTXML("https://feeds.soundcloud.com/users/soundcloud:users:27444293/sounds.rss","//channel//*[local-name()='owner']/*[local-name()='email']")