I am trying to run Scrapy with DjangoItem. When I crawl with my spider, I get the error 'ExampleDotComItem does not support field: title'. I have created multiple projects and tried to get it to work, but I always get the same error. I found this tutorial and downloaded its source code; after running it, I get the same error:

Traceback (most recent call last):
  File "c:\programdata\anaconda3\lib\site-packages\twisted\internet\defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "C:\Users\A\Desktop\django1.7-scrapy1.0.3-master\example_bot\example_bot\spiders\example.py", line 12, in parse
    return ExampleDotComItem(title=title, description=description)
  File "c:\programdata\anaconda3\lib\site-packages\scrapy_djangoitem\__init__.py", line 29, in __init__
    super(DjangoItem, self).__init__(*args, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\scrapy\item.py", line 56, in __init__
    self[k] = v
  File "c:\programdata\anaconda3\lib\site-packages\scrapy\item.py", line 66, in __setitem__
    (self.__class__.__name__, key))
KeyError: 'ExampleDotComItem does not support field: title'

Project structure:

├───django1.7-scrapy1.0.3-master
   ├───example_bot
   │   └───example_bot
   │       ├───spiders
   │       │   └───__pycache__
   │       └───__pycache__
   └───example_project
       ├───app
       │   ├───migrations
       │   │   └───__pycache__
       │   └───__pycache__
       └───example_project
           └───__pycache__

My Django Model:

from django.db import models

class ExampleDotCom(models.Model):
    title = models.CharField(max_length=255)
    description = models.CharField(max_length=255)

    def __str__(self):
        return self.title

My "example" Spider:

# Spider is the current name of the deprecated BaseSpider alias.
from scrapy.spiders import Spider
from example_bot.items import ExampleDotComItem

class ExampleSpider(Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = ['http://www.example.com/']

    def parse(self, response):
        title = response.xpath('//title/text()').extract()[0]
        description = response.xpath('//body/div/p/text()').extract()[0]
        return ExampleDotComItem(title=title, description=description)

items.py:

from scrapy_djangoitem import DjangoItem
from app.models import ExampleDotCom

class ExampleDotComItem(DjangoItem):
    django_model = ExampleDotCom

pipelines.py:

class ExPipeline(object):
    def process_item(self, item, spider):
        print(item)
        item.save()
        return item

settings.py:

import os
import sys

DJANGO_PROJECT_PATH = '/Users/A/DESKTOP/django1.7-scrapy1.0.3-master/example_project'
DJANGO_SETTINGS_MODULE = 'example_project.settings'  # Assuming your Django project's name is example_project

sys.path.insert(0, DJANGO_PROJECT_PATH)
os.environ['DJANGO_SETTINGS_MODULE'] = DJANGO_SETTINGS_MODULE

# Django has to be configured before any module that imports the models is loaded.
import django
django.setup()

BOT_NAME = 'example_bot'
SPIDER_MODULES = ['example_bot.spiders']

ITEM_PIPELINES = {
    'example_bot.pipelines.ExPipeline': 1000,
}

1 Answer

Can you show your Django model? The error most likely means that title isn't defined on your ExampleDotCom model.

If the field is there, perhaps you still need to run your Django migrations?
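
One way to narrow this down is to compare the field names the spider passes in with the fields the item class actually declares. The sketch below is only an illustration: StrictItem and BrokenItem are hypothetical stand-ins that imitate the check seen in the traceback (they are not Scrapy's or scrapy_djangoitem's real classes), and missing_fields is a helper name I made up.

```python
class StrictItem(dict):
    """Simplified stand-in for a Scrapy item: only declared fields may be set."""

    fields = {}

    def __init__(self, **kwargs):
        super().__init__()
        for key, value in kwargs.items():
            self[key] = value  # goes through __setitem__ below

    def __setitem__(self, key, value):
        # Same kind of check that raises the KeyError in the traceback.
        if key not in self.fields:
            raise KeyError(
                "%s does not support field: %s" % (type(self).__name__, key)
            )
        super().__setitem__(key, value)


class BrokenItem(StrictItem):
    # Hypothetical situation: "title" never made it into the declared fields.
    fields = {"id": {}, "description": {}}


def missing_fields(item_cls, wanted):
    """Return the names in `wanted` that `item_cls` does not declare."""
    return sorted(set(wanted) - set(item_cls.fields))


if __name__ == "__main__":
    print(missing_fields(BrokenItem, ["title", "description"]))
    try:
        BrokenItem(title="Example Domain", description="Some text")
    except KeyError as exc:
        print("KeyError:", exc)
```

In the real project, assuming the item class exposes its declared fields through the usual fields attribute, the same comparison could be run from a Python shell inside example_bot with missing_fields(ExampleDotComItem, ["title", "description"]). If title shows up as missing there, the item is not picking up the model's fields, which points at the model or the way it is imported rather than at the spider.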