I have a CsvWriter class that is intended to write a single common output file...
import csv

class CsvWriter:
    def __init__(self, outputFileName):
        self.outputFile = outputFileName

    def m_harvestFromAllCsvFiles(self):  # <-- NOTE THIS METHOD
        with open(self.outputFile, 'wb') as outputCsvFile:
            wr = csv.writer(outputCsvFile, delimiter=",", quoting=csv.QUOTE_ALL)
            # Load 3 CSV files into memory...
            readerA = CsvReader("fileA.csv")
            readerB = CsvReader("fileB.csv")
            readerC = CsvReader("fileC.csv")
            readerA.m_getValues(wr, "attributeA", "attributeZ")
            readerB.m_getValues(wr, "attributeF", "attributeG")
            readerC.m_getValues(wr, "attributeM", "attributeS")
I also have a CsvReader class that reads and stores the contents of a CSV file in memory (self.csvFileArray) for each instance (there can be more than one instance)...
import csv

class CsvReader:
    def __init__(self, inputFileName):
        self.csvFileArray = []
        self.csvHeader = []
        self.csvHeaderDictionary = {}
        with open(inputFileName, 'rU') as csvFile:
            for idx, row in enumerate(csv.reader(csvFile, delimiter=',')):
                if idx == 0:
                    self.csvHeader = row
                self.csvFileArray.append(row)
        for idx, key in enumerate(self.csvHeader):
            self.csvHeaderDictionary[key] = idx

    ...

    def m_getValues(self, csvWriter, attributeList):  # <-- NOTE THIS METHOD
        ...
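A minimal sketch of how m_getValues might write through the writer it receives (shown in Python 3 style; the simplified constructor, the variadic attribute names, and the file contents are illustrative assumptions, not the original implementation):

```python
import csv

class CsvReader:
    """Simplified stand-in for the reader: loads a whole CSV into memory."""
    def __init__(self, inputFileName):
        with open(inputFileName, newline='') as csvFile:  # Python 3 open
            self.csvFileArray = list(csv.reader(csvFile))
        self.csvHeader = self.csvFileArray[0]
        self.csvHeaderDictionary = {k: i for i, k in enumerate(self.csvHeader)}

    def m_getValues(self, csvWriter, *attributeNames):
        # Resolve each requested attribute name to its column index once.
        cols = [self.csvHeaderDictionary[name] for name in attributeNames]
        # Write the selected columns of every data row straight to the
        # shared writer -- no intermediate structure is built or returned.
        for row in self.csvFileArray[1:]:
            csvWriter.writerow([row[c] for c in cols])
```

Because the method only calls writerow on whatever writer it is handed, it never needs to know which file (or buffer) the writer is attached to.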
In short, the goal is to open up different CSV files using the CsvReader, where each file that is read into memory may have different attributes, and then harvest specific attributes from each.
It is too slow to query each CsvReader from the CsvWriter class and then write the common output file from the CsvWriter. The steps would be...
1. CsvWriter asks the first CsvReader for its data
1a. First CsvReader collects the data into a structure and returns it back to the CsvWriter
2. CsvWriter receives and writes the data to the common output file
3. CsvWriter asks the second CsvReader for its data
3a. Second CsvReader collects the data into a structure and returns it back to the CsvWriter
4. CsvWriter receives and writes the data to the common output file
5. CsvWriter asks the third CsvReader for its data
5a. Third CsvReader collects the data into a structure and returns it back to the CsvWriter
6. CsvWriter receives and writes the data to the common output file
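The round-trip above reduces to this pattern (the SlowReader class and getValuesAsList helper are hypothetical stand-ins): each reader materialises an intermediate list, which the writer then traverses a second time.

```python
import csv, io

class SlowReader:
    """Hypothetical reader that returns its rows as an intermediate list."""
    def __init__(self, rows):
        self.rows = rows

    def getValuesAsList(self):
        # Builds and returns a copy of every row before anything is written.
        return [list(row) for row in self.rows]

# Writer side: receive the structure, then write it -- two passes over
# the same data, plus the allocation cost of each temporary list.
buf = io.StringIO()  # stands in for the common output file
wr = csv.writer(buf)
for reader in (SlowReader([["1", "2"]]), SlowReader([["3", "4"]])):
    data = reader.getValuesAsList()   # steps 1a/3a/5a: collect and return
    for row in data:                  # steps 2/4/6: write to common file
        wr.writerow(row)
```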
It would be MUCH faster to just pass the writer object (with its file handle), "wr", to each CsvReader and have it write the common file directly...
1. CsvWriter tells CsvReader A to write data directly to common output file
2. CsvWriter tells CsvReader B to write data directly to common output file
3. CsvWriter tells CsvReader C to write data directly to common output file
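The direct-write variant can be sketched as follows (function and variable names are illustrative). Because Python passes object references rather than copies, the writer parameter inside each "reader" is the very same writer object the caller created, so every writerow call lands in the one shared file:

```python
import csv, io

def harvest_into(csvWriter, rows):
    # csvWriter is a reference to the caller's writer, not a copy:
    # rows written here appear in the caller's output file/buffer.
    for row in rows:
        csvWriter.writerow(row)

buf = io.StringIO()  # stands in for the common output file
wr = csv.writer(buf, quoting=csv.QUOTE_ALL)
harvest_into(wr, [["a", "z"]])   # "reader A"
harvest_into(wr, [["f", "g"]])   # "reader B"
harvest_into(wr, [["m", "s"]])   # "reader C"
```

The only caveat is lifetime, not ownership: the underlying file must still be open when the callee writes, so all the delegation has to happen inside the with block that opened it.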
MY QUESTION: Given that the writer object (the file handle) is instantiated in the CsvWriter instance object, is it safe in Python to pass the writer object to other instances/objects, where it was not created (e.g. the CsvReader instances)? [Refer to CsvReader method "m_getValues(csvWriter, attributeList)" that receives the writer as an argument.] If it's not safe, why not and what's the proper way to handle this problem?
CsvWriter.__init__ after the with block and so this wouldn't work. - user8651755