Web Scraping Returns Empty Value Using XPath in Scrapy

Question

Really need the help from this community.

My question is that when I used the code

response.xpath("//div[contains(@class,'check-prices-widget-not-sponsored')]/a/div[contains(@class,'check-prices-widget-not-sponsored-link')]").extract()

to extract the vendor name in the Scrapy shell, the output is empty. I don't know why that happens. Could the problem be that the website updates this information dynamically?

The URL for this page is https://cruiseline.com/cruise/7-night-bahamas-florida-new-york-roundtrip-32860, and what I need is the vendor name and the price for each vendor. The attached picture is a screenshot of the browser's Inspect panel.

Really appreciate the help!

gangabass · Accepted Answer · 2018-02-11T11:04:44

You should always check the page's HTML source in your browser (usually with Ctrl+U).

That way you'll find that the information you want is embedded in JavaScript variables as JSON:

var partnerPrices = [{"pool":"9a316391b6550eef969c8559c14a380f","partner":"ncl.com","priority":0,"currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Suite":1199,"Balcony":699,"Oceanview":629},....
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com","value":"cruises.com","logo":"\/images\/partner-logo-cruises-sm.png","logo_sprite":"partner-logo-cruises-com"},...

So you need to import json, extract the JSON strings from response.body (using re or another method), and then pass them to json.loads() so you can iterate through the two arrays.
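A minimal sketch of that approach, not a tested solution for the live site. It assumes the variables appear in the page as `var name = <JSON>`, and it uses `json.JSONDecoder.raw_decode` as the "another method" variant, so the regex only has to locate where the JSON starts. The helper name `extract_js_json` and the `SAMPLE_HTML` string are hypothetical; the sample is a shortened, made-up stand-in shaped like the snippet above.

```python
import json
import re


def extract_js_json(html, var_name):
    """Return the JSON value assigned to `var <var_name> = ...`, or None if absent."""
    # Find the assignment; match.end() points at the first character of the value.
    match = re.search(r"var\s+" + re.escape(var_name) + r"\s*=\s*", html)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (for example the trailing ";").
    value, _end = json.JSONDecoder().raw_decode(html, match.end())
    return value


# Hypothetical, shortened sample; the real page holds much more data.
SAMPLE_HTML = """
<script>
var partnerPrices = [{"partner":"ncl.com","currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Balcony":699}}}}];
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com","value":"cruises.com"}];
</script>
"""

partner_prices = extract_js_json(SAMPLE_HTML, "partnerPrices")
sponsored_partners = extract_js_json(SAMPLE_HTML, "sponsored_partners")

# Iterate through the two arrays.
for entry in partner_prices:
    for cruise_id, dates in entry["data"].items():
        for date, cabins in dates.items():
            print(entry["partner"], cruise_id, date, cabins)

for partner in sponsored_partners:
    print(partner["code"], partner["name"])
```

Inside a spider callback or the Scrapy shell you would pass `response.text` (the decoded string) instead of `SAMPLE_HTML`; `response.body` is bytes and would need decoding first.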
