
I am new to IMPORTXML and am having trouble scraping product data into a Google Spreadsheet with it.

The image element on the webpage looks like this:

<div class="pd-img"><img src="https://img-trendyol.mncdn.com/Assets/ProductImages/oa/47/4778846/1/1032019101285_2_org.jpg" alt="" style="width: 78px; height: 114px; min-width: 78px; min-height: 114px;"></div>

When I try to import "//div[contains(@class,'pd-img')]/img/@src", it doesn't return the image link at all.

After reading the page source, I figured out that this XPath query:

"//div/img/@src"

does return the link, but duplicated and together with 4 other entries (6 cells in total).

Product link I am working with: https://www.trendyol.com/u-s-polo-assn/erkek-gomlek-g081sz004-000-855736-p-4778846?fbclid=IwAR1pOVpTNOyelKsgVpTQZJ0FRrb_37R-HlI_gm0XWb_ka9RaPGTO8JZZpZc

What I need is an IMPORTXML formula that returns only the product image from the product page.


1 Answer


Maybe try wrapping the import in QUERY, so that only the first value starting with 'http' is kept:

=QUERY(IMPORTXML(
 "https://www.trendyol.com/u-s-polo-assn/erkek-gomlek-g081sz004-000-855736-p-4778846?fbclid=IwAR1pOVpTNOyelKsgVpTQZJ0FRrb_37R-HlI_gm0XWb_ka9RaPGTO8JZZpZc", 
 "//div/img/@src"), 
 "where Col1 starts with 'http' limit 1", 0)

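For anyone unfamiliar with QUERY, the clause "where Col1 starts with 'http' limit 1" can be read roughly as the Python sketch below. This only illustrates the filtering logic and is not how Sheets evaluates it. The function name `first_http_value` and the four placeholder paths are hypothetical, since the question does not say what the other four cells contained. Only the image URL is taken from the question.

```python
from typing import List, Optional


def first_http_value(values: List[str]) -> Optional[str]:
    """Return the first value that starts with 'http', or None if there is none."""
    for value in values:
        if value.startswith("http"):
            return value
    return None


# Image URL copied from the question.
IMAGE_URL = (
    "https://img-trendyol.mncdn.com/Assets/ProductImages/"
    "oa/47/4778846/1/1032019101285_2_org.jpg"
)

# Hypothetical stand-in for the 6 cells returned by "//div/img/@src":
# four made-up placeholder entries plus the duplicated image link.
cells = [
    "/placeholder/a.png",
    "/placeholder/b.png",
    "/placeholder/c.png",
    "/placeholder/d.png",
    IMAGE_URL,
    IMAGE_URL,
]

print(first_http_value(cells))
```

If the other cells also start with 'http', this filter alone would not single out the product image, and a stricter condition (for example, matching part of the image path) would be needed.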