I am following the simple Scrapy crawler tutorial from the official Scrapy website but I am getting some errors. This is my first time doing anything like this, so I am completely new to it. I need to implement a web crawler in my application and found that Scrapy fits my needs, so I started with the tutorial and ran into the error pasted below. Can anyone please explain what is wrong with the code?
THIS IS MY CRAWLER CODE
from scrapy.spider import Spider

class DmozSpider(Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        open(filename, 'wb').write(response.body)
THIS IS THE ERROR I AM GETTING
2014-02-04 10:45:51+0530 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2014-02-04 10:45:51+0530 [dmoz] DEBUG: Crawled (200) <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: None)
ERROR: Spider error processing <GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 1178, in mainLoop
    self.runUntilCurrent()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 800, in runUntilCurrent
    call.func(*call.args, **call.kw)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 362, in callback
    self._startRunCallbacks(result)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 458, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 545, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/spider.py", line 56, in parse
    raise NotImplementedError
exceptions.NotImplementedError: