I receive a well formated csv file, with double-quotes around text fields that contain commas.
Alas, I need to load it into SQL Server, which, as far as I have learned (please tell me how I am wrong here) cannot handle quote-enclosed fields that contain the delimiter.
So, I would like to write a python script which will a) convert the file to pipe-delimited, and b) strip whatever pipes exist in the fields (my sense is that commas are more common, so I'd like to save them, plus I also have some numeric fields that might, at least in the future, contain commas).
Here is the code that I have to do a:
import csv
import sys
source_file=sys.argv[1]
good_file=sys.argv[2]
bad_file=sys.argv[3]
with open(source_file, 'r') as csv_file:
csv_reader = csv.DictReader(csv_file)
with open(good_file, 'w') as new_file:
csv_writer = csv.DictWriter(new_file, csv_reader.fieldnames, delimiter='|')
headers = dict( (n,n) for n in csv_reader.fieldnames)
csv_writer.writerow(headers)
for line in csv_reader:
csv_writer.writerow(str.replace(line, '|', ' '))
How can I augment it to do b?
ps--I am using python 2.6, IIRC.