I have created a spider that crawls news. I want to run that spider and also schedule it. It lives inside a Django project: the spider crawls the data and saves it to the database, and Django then reads the same data to display it. Here's my spider:
`import scrapy
from scrapy.spiders import CrawlSpider

from ..items import NewsScraperItem  # assumes the default Scrapy layout (items.py one package above spiders/)


class NewsSpider(CrawlSpider):
    name = "news"
    start_urls = ['https://zeenews.india.com/latest-news']

    def start_requests(self):
        urls = ['https://zeenews.india.com/latest-news']
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        item = NewsScraperItem()
        data = response.css('div.sec-con-box')
        item['headlines'] = data.css('h3::text').extract_first()
        item['content'] = data.css('p::text').extract_first()
        return item`
items.py:
`import scrapy
from scrapy_djangoitem import DjangoItem
from news.models import LatestNews


class NewsScraperItem(DjangoItem):
    # define the fields for your item here like:
    # name = scrapy.Field()
    django_model = LatestNews`
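To make the "run and schedule" part concrete, here is a minimal sketch of one possible approach, not a definitive setup. It launches the standard `scrapy crawl news` command in a child process at a fixed interval. Running each crawl in its own process avoids restarting Twisted's reactor, which cannot be restarted within a single Python process. The helper names (`build_crawl_command`, `run_spider`, `run_periodically`) and the project path are hypothetical. The sketch also assumes the Scrapy project is already configured so that `news.models` can be imported when the spider runs.

```python
import subprocess
import time
from typing import Callable, List, Optional


def build_crawl_command(spider_name: str) -> List[str]:
    """Return the argv for Scrapy's `scrapy crawl <spider>` command."""
    return ["scrapy", "crawl", spider_name]


def run_spider(spider_name: str, project_dir: str) -> int:
    """Run one crawl in a child process and return its exit code.

    project_dir is assumed to be the directory that contains scrapy.cfg.
    """
    completed = subprocess.run(build_crawl_command(spider_name), cwd=project_dir)
    return completed.returncode


def run_periodically(
    job: Callable[[], object],
    interval_seconds: float,
    max_runs: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call job repeatedly, sleeping interval_seconds between calls.

    Runs forever when max_runs is None; otherwise stops after max_runs
    calls. Returns the number of completed runs.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        # Only sleep if another run is still to come.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

For example, `run_periodically(lambda: run_spider("news", "/path/to/scrapy_project"), interval_seconds=3600)` would crawl once an hour, where the path is a placeholder. A system scheduler such as cron calling `scrapy crawl news` directly is a common alternative that does not need a long-running Python process.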