I'm 99% sure something is wrong with my hxs.select calls on this website: I cannot extract anything. When I run the following code, I get no errors, but neither title nor link gets populated. Any help?
def parse(self, response):
    self.log("\n\n\n We got data! \n\n\n")
    hxs = HtmlXPathSelector(response)
    sites = hxs.select('//div[@class=\'footer\']')
    items = []
    for site in sites:
        item = CarrierItem()
        item['title'] = site.select('.//a/text()').extract()
        item['link'] = site.select('.//a/@href').extract()
        items.append(item)
    return items
Is there a way I can debug this? I also tried the scrapy shell command with a URL, but when I enter view(response) in the shell it simply returns True and a text file opens instead of my web browser.
>>> response.url
'https://qvpweb01.ciq.labs.att.com:8080/dis/login.jsp'
>>> hxs.select('//div')
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'NoneType' object has no attribute 'select'
>>> view(response)
True
>>> hxs.select('//body')
Traceback (most recent call last):
File "", line 1, in
AttributeError: 'NoneType' object has no attribute 'select'
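One way to narrow this down is to test the same extraction on a raw HTML string, independent of Scrapy's selectors. The following is only a minimal sketch using the standard library's html.parser, not Scrapy's API; FooterLinkParser and footer_links are hypothetical names introduced for illustration, and the exact class match is an assumption meant to mirror the XPath //div[@class='footer'].

```python
from html.parser import HTMLParser


class FooterLinkParser(HTMLParser):
    """Collect (text, href) pairs for <a href=...> tags inside <div class="footer">."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0        # 0 means we are outside any footer div
        self.current_href = None  # href of the <a> currently being read
        self.current_text = []    # text fragments seen inside that <a>
        self.links = []           # collected (text, href) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.div_depth > 0:
                # Nested div inside the footer: track depth so we know when the footer ends.
                self.div_depth += 1
            elif attrs.get("class") == "footer":
                # Exact match on the class attribute, like @class='footer' in the XPath.
                self.div_depth = 1
        elif tag == "a" and self.div_depth > 0:
            # Anchors without an href are skipped because current_href stays None.
            self.current_href = attrs.get("href")
            self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag == "a" and self.current_href is not None:
            text = "".join(self.current_text).strip()
            self.links.append((text, self.current_href))
            self.current_href = None


def footer_links(body):
    """Return the footer links found in an HTML string (hypothetical helper)."""
    parser = FooterLinkParser()
    parser.feed(body)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Made-up sample markup; replace it with the decoded body of the real response.
    sample = '<html><body><div class="footer"><a href="/about">About</a></div></body></html>'
    print(footer_links(sample))
```

If this returns an empty list for the body the spider actually receives, the page probably does not contain the footer div at all (for example, because it is a login page or is not served as HTML), which would point at the response rather than at the XPath.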
What does response.body look like? - Blender
print sites and see what is printed during crawling. - alecxe