I am learning Python and come from a C background. Apologies if my problem is naive, too simple, or not researched enough.
In the code below I want to practice, for future problems, removing specific rows with the help of the 'set' data structure. The first issue: the rows fail to match the contents of the removal set.
The second issue is an error in the output (o/p). It can be reproduced by enabling the block that is currently disabled inside the triple quotes, in place of the csv.writer code.
The trimmed data file is marks_trim.csv:
"Anaconda Systems Campus Placement",,,,,,
"Conducted on:",,,"30 Feb 2011",,,
"Sno","Math","CS","GK","Prog","Comm","Sel"
1,"NA","NA","NA",4,0,0
The script, test_list_in_set.py:
import csv, sys, re, random, os, time, io, StringIO
datfile = sys.argv[1]
outfileName = sys.argv[2]
outfile = open(outfileName, "w")
count = 0
removal_list = set()
tmp = list()
i=0
re_pattern = "\d+"
with open(datfile, 'r') as fp:
    reader1 = csv.reader(fp)
    for row in reader1:
        if re.match(re_pattern, row[0]):
            for cols in row:
                removal_list.add(tuple(cols)) #as tuple is hashable
            print "::row>>>>>>",row
            print "::removal_list>>>>>>>>",removal_list
convert = list(removal_list)
print "<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>"
print convert
f = open(datfile, 'r')
reader2 = csv.reader(f)
print ""
print "Removal List Starts"
print removal_list
print "Removal List Ends\n"
new_a_buf = io.BytesIO() # StringIO.StringIO() : both 'io' & StringIO' work
writer = csv.writer(new_a_buf)
rr =""
j = 0
for row in reader2:
    if row not in convert: # removal_list: not used as list not hashable
        writer.writerow(row) #outfile.write(new_a_buf)
'''
    #below code using char array isn't used as it doesn't copy structure of csv file
    for cols in row: #at indentation level of "if row not in convert", stated above
        if cols not in convert: # removal_list: not used as list not hashable
            for j in range(0,len(cols)):
                rr+=cols[j] #at indentation level of "if cols not in convert:"
        outfile.write(rr) # at the indentation level of 'if'
        print "<<<<<<<<<<<<<<<<", rr
f = open(outfile, 'r')
reader2 = csv.reader(f)
'''
new_a_buf.seek(0)
reader2 = csv.reader(new_a_buf)
for row in reader2:
    print row
Problem/Issue:
The error common to both approaches (char array and csv.writer object) is that the output still contains the rows that should be deleted, i.e. the rows whose contents occur in removal_list.
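To make the mismatch easier to see, here is a minimal sketch of what the set ends up holding for the sample data row. It is written with Python 3 style print calls (the script above is Python 2), and the names cell_tuples and row_tuples are only illustrative; they are not in my script. It assumes the loop stores tuple(cols) per cell, as posted.

```python
import csv

# The single data row from marks_trim.csv, parsed the same way the script does.
row = next(csv.reader(['1,"NA","NA","NA",4,0,0']))

# What the posted loop builds: tuple(cols) turns each cell (a string)
# into a tuple of its characters, e.g. "NA" -> ('N', 'A').
cell_tuples = set(tuple(cell) for cell in row)

# For comparison only (an assumption, not in the script): the whole row as one tuple.
row_tuples = {tuple(row)}

# The script compares a list (row) against those character tuples.
row_in_cells = row in list(cell_tuples)

# A tuple-to-tuple comparison of the whole row.
row_in_rows = tuple(row) in row_tuples

print(row)
print(sorted(cell_tuples))
print(row_in_cells, row_in_rows)
```

So the set holds per-cell character tuples, while the membership test is done with a whole row that is a list.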
Separately, the approach that uses a char array to retrieve the remaining rows fails with this error:
Traceback (most recent call last):
  File "test_list_in_set.py", line 51, in <module>
    f = open(outfile, 'r')
TypeError: coercing to Unicode: need string or buffer, file found
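The traceback can be reproduced in isolation. The sketch below assumes the cause is that outfile is an open file object while outfileName is the string. The scratch path out_name is hypothetical and stands in for sys.argv[2]. Under Python 3 the TypeError message is worded differently from the Python 2 one above.

```python
import os
import tempfile

# Hypothetical scratch file standing in for sys.argv[2] (outfileName).
out_name = os.path.join(tempfile.mkdtemp(), "out.csv")

# Same pattern as the script: a file object opened for writing.
out_file = open(out_name, "w")
out_file.write("1,NA\n")
out_file.close()

# Passing the file object to open() raises TypeError.
try:
    open(out_file, "r")
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Passing the file name (a string) works.
with open(out_name, "r") as fh:
    content = fh.read()

print(error_name)
print(repr(content))
```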