
Within a single Scrapy project, I've developed multiple spiders, all stored in the same spiders folder.

I run each spider from the command line with: scrapy crawl spidername

However, I've noticed that running this command loads all the spiders in the project, even though they are stored in separate Python files with different file names, different spider names, different class names and different output names.

I noticed this because I generate a CSV within the code: despite running just one spider, the output CSVs of all the other spiders are generated as well, but with a file size of 0 KB. Similarly, if I run spider1 and get the required output CSV for spider1, then run spider2, the output CSV of spider1 gets rewritten and becomes empty.

Is there any way I can stop this from happening?

My CSV code:

  csvfile = open('test2.csv', 'w')
  printHeader = True

  def to_csv(self, item):
    if self.csvfile:
      strWrite = ''
      if self.printHeader:
        strWrite += 'Item_Type,Image_File,'
        strWrite += 'abc'
        strWrite += 'xyz'
        self.printHeader = False
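What causes this: a class body executes when its module is imported, not when the spider runs, and Scrapy imports every spider module in the project in order to discover spiders by name. So a class-level `open(..., 'w')` creates and truncates the file for every spider, even the ones that never crawl. A minimal sketch outside Scrapy (the class names are illustrative, not real spiders):

```python
import os

# Class bodies run at import time. Merely defining these classes
# opens (and truncates) both files, even though no instance exists.
class SpiderA:  # illustrative stand-in, not a real scrapy.Spider subclass
    csvfile = open('spider_a.csv', 'w')

class SpiderB:
    csvfile = open('spider_b.csv', 'w')

# Both files now exist on disk and are empty, although neither "ran".
print(os.path.getsize('spider_a.csv'))  # 0
print(os.path.getsize('spider_b.csv'))  # 0
```

This is exactly the 0 KB symptom: loading the project truncates every spider's output file, whichever spider you actually crawl.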
Are you using Spider.__init__ to create the files? - nramirezuy
@nramirezuy Not sure what you mean, but I create the project using the command: scrapy startproject exampleproject - quasarseeker

1 Answer


I fixed this by adding a condition to the CSV generator:

  csvfile = None
  printHeader = True

  def to_csv(self, item):
    # Open the file only when the first item arrives, so spiders
    # that never run never create (or truncate) their CSV.
    if self.printHeader:
      self.csvfile = open('test2.csv', 'w')
    if self.csvfile:
      strWrite = ''
      if self.printHeader:
        strWrite += 'Item_Type,Image_File,'
        strWrite += 'abc'
        strWrite += 'xyz'
        self.printHeader = False
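The same lazy-open idea can be written more robustly with the standard-library csv module, using `csvfile is None` as the "not yet opened" test instead of reusing the header flag. A sketch under assumed names (`MySpider`, the column names, and `close_file` are illustrative, not the asker's actual code):

```python
import csv

class MySpider:  # hypothetical name; in practice a scrapy.Spider subclass
    csvfile = None
    writer = None

    def to_csv(self, item):
        # Lazy open: the file is created on the first item only, so
        # spiders that never yield an item never touch the disk.
        if self.csvfile is None:
            self.csvfile = open('test2.csv', 'w', newline='')
            self.writer = csv.writer(self.csvfile)
            self.writer.writerow(['Item_Type', 'Image_File', 'abc', 'xyz'])
        self.writer.writerow([item.get('Item_Type'), item.get('Image_File'),
                              item.get('abc'), item.get('xyz')])

    def close_file(self):
        if self.csvfile:
            self.csvfile.close()
```

In practice, Scrapy's built-in feed exports (e.g. `scrapy crawl spidername -o output.csv`) write one CSV per run without any hand-written file handling, which avoids this class of bug entirely.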