1
votes

Really need the help from this community.

My question is that when I used the code

========================================================================= response.xpath("//div[contains(@class,'check-prices-widget-not-sponsored')]/a/div[contains(@class,'check-prices-widget-not-sponsored-link')]").extract() enter image description here

to extract the vendor name in scrapy shell, the output is empty. I really did not know why that happened, and it seems to me that the problem might be the website info is updating dynamically?

The url for this web scraping is: https://cruiseline.com/cruise/7-night-bahamas-florida-new-york-roundtrip-32860, and what I need is the Vendor name and Price for each vendor. Besides the attached pic is the screenshot of "the inspect".

Really appreciate the help!

1

1 Answers

1
votes

You need to always check HTML source code in your browser (usually with Ctrl+U).

This way you'll find that information you want is embedded inside Javascript variables using JSON:

var partnerPrices = [{"pool":"9a316391b6550eef969c8559c14a380f","partner":"ncl.com","priority":0,"currency":"USD","data":{"32860":{"2018-02-25":{"Inside":579,"Suite":1199,"Balcony":699,"Oceanview":629},....
var sponsored_partners = [{"code":"CDCNA","name":"cruises.com","value":"cruises.com","logo":"\/images\/partner-logo-cruises-sm.png","logo_sprite":"partner-logo-cruises-com"},...

So you need to import json, parse response.body (using re or another method) and next json.loads() parsed JSON strings to iterate through two arrays.