0
votes

I am learning Python, coming from a C language background. Apologies if my problem is naive or too simple, or if I have not worked on it enough.

In the code below, I want to practice removing specific rows with the 'set' data structure, for use in future problems. The first issue is that the rows never match the contents of the removal set.

The second issue is an error in the output. It can be reproduced by enabling the triple-quoted block instead.

The trimmed data file is marks_trim.csv:

"Anaconda Systems Campus Placement",,,,,,
"Conducted on:",,,"30 Feb 2011",,,
"Sno","Math","CS","GK","Prog","Comm","Sel"
1,"NA","NA","NA",4,0,0


import csv, sys, re, random, os, time, io, StringIO

datfile = sys.argv[1]
outfileName = sys.argv[2]
outfile = open(outfileName, "w")

count = 0
removal_list = set()
tmp = list()
i = 0
re_pattern = "\d+"

with open(datfile, 'r') as fp:
    reader1 = csv.reader(fp)
    for row in reader1:
        if re.match(re_pattern, row[0]):
            for cols in row:
                removal_list.add(tuple(cols))  # as tuple is hashable

print "::row>>>>>>", row
print "::removal_list>>>>>>>>", removal_list

convert = list(removal_list)
print "<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>"
print convert

f = open(datfile, 'r')
reader2 = csv.reader(f)

print ""
print "Removal List Starts"
print removal_list
print "Removal List Ends\n"

new_a_buf = io.BytesIO()  # StringIO.StringIO() : both 'io' & 'StringIO' work
writer = csv.writer(new_a_buf)
rr = ""
j = 0

for row in reader2:
    if row not in convert:   # removal_list: not used as list not hashable
        writer.writerow(row)  # outfile.write(new_a_buf)

'''
# below code using char array isn't used as it doesn't copy structure of csv file

    for cols in row:  # at indentation level of "if row not in convert", stated above
        if cols not in convert:   # removal_list: not used as list not hashable
            for j in range(0, len(cols)):
                rr += cols[j]  # at indentation level of "if cols not in convert:"
    outfile.write(rr)  # at the indentation level of 'if'
    print "<<<<<<<<<<<<<<<<", rr

f = open(outfile, 'r')
reader2 = csv.reader(f)
'''

new_a_buf.seek(0)
reader2 = csv.reader(new_a_buf)
for row in reader2:
    print row

Problem/Issue:

The error common to both approaches (char array and csv.writer object) is that the output still contains the rows that should be deleted, i.e. the rows recorded in removal_list.

In addition, the char array approach for retrieving the remaining rows fails with this error:

Traceback (most recent call last):
  File "test_list_in_set.py", line 51, in <module>
    f = open(outfile, 'r')
TypeError: coercing to Unicode: need string or buffer, file found

2
Try to make the question short and brief. - Riad
What Riad said. And please fix the formatting / indenting of your code samples. - PM 2Ring

2 Answers

1
votes

I didn't read through all of that code, but most of it doesn't seem relevant. The error comes from opening a file: open takes a filename, but you are passing it outfile, which is already an open file object. Close that file first, then pass outfileName to open.
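A minimal sketch of that fix, written in Python 3 syntax (the question itself uses Python 2 print statements). The helper name reopen_for_reading, the temporary path, and the sample row are hypothetical and only there to make the example self-contained.

```python
import csv
import os
import tempfile


def reopen_for_reading(handle, path):
    """Close an already-open file object, then reopen the same path for reading."""
    handle.close()  # flush pending writes and release the handle
    return open(path, "r", newline="")  # open() wants the path, not the file object


# Hypothetical stand-in for outfileName (sys.argv[2] in the question).
outfileName = os.path.join(tempfile.mkdtemp(), "out.csv")

outfile = open(outfileName, "w", newline="")
csv.writer(outfile).writerow(["Sno", "Math", "CS"])

# open(outfile, "r") would raise TypeError here, because outfile is a file object.
f = reopen_for_reading(outfile, outfileName)
rows = list(csv.reader(f))
f.close()
print(rows)
```

Closing before reopening also makes sure the buffered writes have reached the file before it is read back.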

0
votes

Got it, sadly by myself. Apart from no longer storing individual columns, I changed removal_list from a set to a list and append each whole row to it with removal_list.append(row).

Now the rows match!
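For illustration, here is why the original comparison never matched: csv.reader yields each row as a list of strings, while tuple(cols) turns a single cell into a tuple of its characters, so no row can ever equal a stored entry. The sketch below is in Python 3 syntax (the question uses Python 2) and reads the sample data from an in-memory string; the helper name filter_rows is hypothetical. It assumes you still want a set, so it stores tuple(row); a plain list filled with append(row), as described above, behaves the same way for this check.

```python
import csv
import io
import re

SAMPLE = '''"Anaconda Systems Campus Placement",,,,,,
"Conducted on:",,,"30 Feb 2011",,,
"Sno","Math","CS","GK","Prog","Comm","Sel"
1,"NA","NA","NA",4,0,0
'''


def filter_rows(text, pattern=r"\d+"):
    """Return the rows whose first column does not start with digits."""
    # First pass: collect whole rows to remove (tuples are hashable, lists are not).
    removal = {tuple(row) for row in csv.reader(io.StringIO(text))
               if row and re.match(pattern, row[0])}
    # Second pass: keep every row that is not in the removal set.
    return [row for row in csv.reader(io.StringIO(text))
            if tuple(row) not in removal]


# tuple() on a single cell splits it into characters, which is the original bug.
print(tuple("NA"))
for row in filter_rows(SAMPLE):
    print(row)
```

Converting each row to a tuple on both sides keeps the membership test a constant-time set lookup instead of a scan through a list.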