I have a .csv file laid out like this:

name1 name2 name3
value1 value2 value3
value4 value5 value6
value7 value8 value9

I need a way in Python 3 to build a dictionary whose keys are the header names (name1, name2, name3) and whose values are the sums of all the values underneath each header, e.g. name1 : (value1 + value4 + value7).

So far I've come up with:

def sumColumns(columnfile):  
    import csv  
    with open(columnfile) as csvfile:  
        rdr = csv.reader(csvfile)  
        output = {}  
        head = next(rdr)  
        total = 0  
        for column in head:  
            for row in rdr:  
                total += int(row[head.index(column)])  
            output[column] = total  
            total = 0  
        return output

I end up returning a dictionary with correct headers but something is going wrong with the sum that I can't pinpoint. One column gets summed and the rest are 0.

I think your problem is that rdr is an iterator. That means you can't re-iterate through it each of the three times `for row in rdr:` runs; only the first pass sees any rows. I'll post a solution in a sec. – Alec
Also, just a suggestion: you could initialize the "total" variable to 0 at the start of the outer for-loop and eliminate the repetition of that line of code. – Jeffrey Swan
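
To illustrate the point about rdr being an iterator, here is a minimal sketch. The in-memory CSV is made up for the demo and only stands in for the real file (assumed to be comma-separated); the names first_pass and second_pass are hypothetical too.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for the file from the question.
data = io.StringIO("name1,name2\n1,2\n3,4\n")
rdr = csv.reader(data)
head = next(rdr)  # consumes the header row

first_pass = [row for row in rdr]   # consumes every remaining row
second_pass = [row for row in rdr]  # the reader is already exhausted here

print(first_pass)   # [['1', '2'], ['3', '4']]
print(second_pass)  # []
```

This is why only the first column gets a non-zero total in the question's code: the inner loop has nothing left to read on the second and third passes of the outer loop.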

1 Answer

Definitely not my prettiest piece of code, but here it is. Basically, read the rows in a single pass and keep a running list of per-column sums, treating empty cells as 0.

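def sumColumns1(columnfile):
    import csv
    from functools import reduce  # reduce is not a builtin in Python 3
    with open(columnfile) as csvfile:
        r = csv.reader(csvfile)
        names = next(r)
        Int = lambda x: 0 if x=='' else int(x)  # treat empty cells as 0
        # Fold the rows into per-column totals, starting from a list of zeros
        # so that a file with a single data row is still converted to ints.
        sums  = reduce(lambda x,y: [ a+Int(b) for a,b in zip(x,y) ], r, [0]*len(names))
        return dict(zip(names,sums))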

In an expanded form (or one that doesn't have reduce - before someone complains):

def sumColumns1(columnfile):
    import csv
    with open(columnfile) as csvfile:
        r = csv.reader(csvfile)
        names = next(r)
        sums = [ 0 for _ in names ]
        for line in r:
            for i in range(len(sums)):
                sums[i] += int(0 if line[i]=='' else line[i])
        return dict(zip(names,sums))

Gives me the correct output. Hopefully someone comes up with something more pythonic.
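
For what it's worth, one possibly more Pythonic take is to transpose the rows with zip(*rows) and sum each column directly. This is only a sketch under a few assumptions: the file is comma-separated, every cell is an integer or empty (empty counts as 0), and the name sum_columns is my own rather than anything from the question.

```python
import csv

def sum_columns(columnfile):
    """Return {header: sum of the integer values in that column}."""
    with open(columnfile, newline="") as csvfile:
        rdr = csv.reader(csvfile)
        names = next(rdr)
        # Skip blank lines, which csv.reader yields as empty lists.
        rows = [row for row in rdr if row]
    # zip(*rows) transposes the list of rows into a sequence of columns.
    columns = zip(*rows)
    # An empty cell is '' and is counted as 0.
    return {name: sum(int(v or 0) for v in col)
            for name, col in zip(names, columns)}
```

Note that this reads all rows into memory first, and with no data rows at all it returns an empty dictionary rather than zeros, so the loop version above may still be the better fit for very large or possibly empty files.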