2
votes

I'm attempting to use IMPORTXML in Google Sheets to retrieve a specific piece of text, but I'm having trouble despite a lot of searching for answers.

Hoping someone here can help correct the problem.

The page is https://www.afi.com.au, and the text I want to import is in the pink box on that page (screenshot not included here).

Here's where I'm at with the formula. I've attempted to build the XPath, but it doesn't like what I have. I'm sure someone here will spot the problem in a heartbeat:

=IMPORTXML("http://www.afi.com.au","//body[@class='entry-homepage type-homepage']/div[@class='page page-home']/div[@class='page__item']/div[@class='js-evo7-component']/div[@class='hero']/div[@class='hero__inner-root rellax']/div[@class='hero__inner']/div[@class='container']/div[@class='grid']/div[@class='grid__item one-third palm-one-whole']/div[@class='hero__share-price']/div[@class='price-number']//text()")
@Wristy Manchego Although the question has already been closed, can I propose a workaround for retrieving the value you need as another answer? - Tanaike
Yeah, of course mate. I’m still hunting for a solution to import this value into a sheet. - Wristy Manchego
@Wristy Manchego Thank you for replying. I proposed a modified formula as an answer. Could you please confirm it? If I misunderstood your question and that was not the result you want, I apologize. - Tanaike
Thanks for this. I certainly will when I next get a chance. On initial viewing, it looks like you’re on the money. - Wristy Manchego
@Wristy Manchego Thank you for replying. If that was useful, I'm glad. - Tanaike

2 Answers

1
votes

How about this workaround? Here, the data is retrieved using an XPath, and the value is then extracted from it using a regular expression. The value is held in the data-config attribute of the page's HTML, and it seems to be updated each time the page is retrieved, so I used this method. The modified formula is as follows. Please think of this as just one of several possible answers.

Sample formula:

In this sample formula, http://www.afi.com.au is put in the cell "A1".

=REGEXEXTRACT(IMPORTXML(A1,"//div[@class='js-evo7-component']/@data-config"),"netAssetBacking"":{""price"":""([\d.]+)")
  1. Retrieve the data-config attribute value with IMPORTXML(), using the XPath //div[@class='js-evo7-component']/@data-config.
  2. Extract the price from that value with REGEXEXTRACT(), using the regular expression netAssetBacking"":{""price"":""([\d.]+).
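
For anyone who wants to check the regular expression outside of Sheets, here is a minimal Python sketch of step 2 only. The sample string is a made-up stand-in for the data-config value (its shape is assumed from the regular expression above, and the numbers are invented), and extract_price is a hypothetical helper name. Note that the doubled quotes in the Sheets formula are only string escaping, so the pattern here uses single quotes.

```python
import re
from typing import Optional

# Made-up stand-in for the data-config attribute value. The real JSON on the
# page is assumed to contain a "netAssetBacking" object with a "price" string.
SAMPLE_DATA_CONFIG = '{"sharePrice":{"price":"7.01"},"netAssetBacking":{"price":"6.12"}}'

# Same pattern as in the Sheets formula, with the "" escapes undone.
# The "{" is escaped here so it is always treated as a literal brace.
PRICE_PATTERN = re.compile(r'netAssetBacking":\{"price":"([\d.]+)')


def extract_price(data_config: str) -> Optional[str]:
    """Return the first captured price string, or None if nothing matches."""
    match = PRICE_PATTERN.search(data_config)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_price(SAMPLE_DATA_CONFIG))
```

Like REGEXEXTRACT(), this returns text rather than a number, so in Sheets the result may need wrapping in VALUE() if it is going to be used in calculations.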


0
votes

That won't be possible. The piece of information you are trying to scrape is controlled by JavaScript, and Google Sheets can't run JS at all. You can test this simply by disabling JS on the given website.
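
To illustrate the point, here is a rough Python sketch of what a static parser sees. The HTML snippet is hypothetical (it is not copied from the site) and only assumes that the price element is empty in the served markup until a script fills it in. PriceTextCollector and static_price_text are made-up names for this example.

```python
from html.parser import HTMLParser

# Hypothetical served HTML: the price element is empty until a script runs.
SERVED_HTML = """
<div class="hero__share-price">
  <div class="price-number"></div>
</div>
<script>document.querySelector('.price-number').textContent = '6.12';</script>
"""


class PriceTextCollector(HTMLParser):
    """Collects text found inside <div class="price-number">, without running scripts."""

    def __init__(self):
        super().__init__()
        self._depth = 0  # greater than 0 while inside the price-number div
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            self._depth += 1
        elif tag == "div" and ("class", "price-number") in attrs:
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth and data.strip():
            self.text.append(data.strip())


def static_price_text(html: str) -> str:
    """Return the text a non-JS parser finds in the price element."""
    parser = PriceTextCollector()
    parser.feed(html)
    return "".join(parser.text)


if __name__ == "__main__":
    # The script is never executed, so the element's text stays empty.
    print(repr(static_price_text(SERVED_HTML)))
```

An XPath ending in //text() on such an element has nothing to return, because the text only exists after the script has run in a browser.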
