
I have a Scrapy spider that crawls two quantities for each item. The problem is that I have to convert the values with float(), so whenever one of the crawled fields is empty I get an error, and the spider stops crawling items on that page and goes straight to the next page.

Is there any way to tell Scrapy to keep crawling after an error? This is the code of my spider. Thanks!

def parse(self, response):
    for sel in response.xpath('//li[@class="oneclass"]'):
        item = exampleItem()
        item['quant1'] = float(sel.xpath('a/div/span[@class="exampleclass"]/span[@class="amount"]/text()').extract()[0])
        item['quant2'] = float(sel.xpath('div[@class="otherexampleclass"]/input/@max').extract()[0])
        yield item

1 Answer


You could wrap it in a try/except block:

def parse(self, response):
    for sel in response.xpath('//li[@class="oneclass"]'):
        try:
            item = exampleItem()
            item['quant1'] = float(sel.xpath('a/div/span[@class="exampleclass"]/span[@class="amount"]/text()').extract()[0])
            item['quant2'] = float(sel.xpath('div[@class="otherexampleclass"]/input/@max').extract()[0])
            yield item
        except (IndexError, TypeError, ValueError):
            # Skip this listing and keep crawling the rest of the page
            self.logger.warning("could not crawl %s", sel)
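If you would rather avoid the exception altogether, you could extract the values first and skip the item when either one is missing. A minimal sketch, assuming a Scrapy version that provides extract_first() (the class names and XPaths are taken from the question):

    def parse(self, response):
        for sel in response.xpath('//li[@class="oneclass"]'):
            # extract_first() returns None instead of raising when nothing matches
            quant1 = sel.xpath('a/div/span[@class="exampleclass"]/span[@class="amount"]/text()').extract_first()
            quant2 = sel.xpath('div[@class="otherexampleclass"]/input/@max').extract_first()
            if quant1 is None or quant2 is None:
                continue  # skip incomplete listings, keep crawling the rest of the page
            item = exampleItem()
            item['quant1'] = float(quant1)
            item['quant2'] = float(quant2)
            yield item

Note that float() can still raise ValueError on malformed text, so the try/except version above is the more robust of the two.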