0
votes

I try to import data into a google spreadsheet using importxml, but every time I try I get the same error: ”The imported xml content cannot be parsed”. (The original message is german, I translated.)

The data I want do import is located in the h1 of a amazon product page. As an example I use the following: https://www.amazon.de/Harry-Potter-verwunschene-Special-Rehearsal/dp/3551559007/.

Do you understand why the following does not work?

=importXml("https://www.amazon.de/Harry-Potter-verwunschene-Special-Rehearsal/dp/3551559007/"; "//h1/span[1]")

I appreciate every helping hand!

1
Stackoverflow is for programming questions. Google spreadsheet questions are more suited for WebApps. - MikeJRamsey56

1 Answers

0
votes

SO it's now being loaded by javascript so if you take that last number on the end of your URL 3551559007 and append it onto this URL string https://www.amazon.de/gp/search-inside/service-data?method=getBookData&asin= you will see the raw JSON where the data is - and you can parse it from there:

https://www.amazon.de/gp/search-inside/service-data?method=getBookData&asin=3551559007

so in this case you can basically us this formula to get out the title:

=REGEXREPLACE(CONCATENATE(IMPORTDATA(A1)),"(^.*title:"")(.*)(""type:""html.*)","$2")

enter image description here

or you can find the import json custom function and set that all up - the above will be quicker.