
I'm trying to read from a CSV file using csv.DictReader and, based on the value I read under one header (key), write the matching rows to a new CSV file using csv.DictWriter. I'm getting an error that says ValueError: could not convert string to float.

From what I understand, DictReader returns a list of strings instead of a single string, so the value can't be cast directly to float. So I tried iterating over the value and casting each item. It still gives me an error.

First code:

import csv

with open('report.csv', 'r') as openfile:               #open report
    csv_reader = csv.DictReader(openfile, delimiter='\t')

#writing to a new file start
    #sets up the output file output.csv
    with open('output.csv', 'w') as new_file:

        #hardcoding the fieldnames
        fieldnames = csv_reader.fieldnames
        fieldnames = ['header1', 'header2', 'header3']

        #setting the parameters for the output file
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t', extrasaction='ignore')
        csv_writer.writeheader()

        for line in csv_reader:           #checking every line we are reading
            header2val = line['header2']
            if float(header2val) >= 200:  #check condition
                csv_writer.writerow(line) #writes if true

then I tried iterating the cast (not sure if this is correct)


import csv

with open('report.csv', 'r') as openfile:               #open report
    csv_reader = csv.DictReader(openfile, delimiter='\t')

#writing to a new file start
    #sets up the output file output.csv
    with open('output.csv', 'w') as new_file:

        #hardcoding the fieldnames
        fieldnames = csv_reader.fieldnames
        fieldnames = ['header1', 'header2', 'header2']

        #setting the parameters for the output file
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t', extrasaction='ignore')
        csv_writer.writeheader()

        for line in csv_reader:       #checking every line we are reading
            for checkval in line['header2']: #iterate the casting
                headerval= float(checkval)
                if headerval >= 200:     #check condition
                    csv_writer.writerow(line) #writes if true

First code error message: "TypeError: float() argument must be a string or a number". The output is nevertheless correct: the header and the values written to output.csv satisfy the condition in Code 1.

Second code error message: "ValueError: could not convert string to float". In Code 2, only the headers are written.

Edit: report.csv

    header1 header2 header3 header4 header5 
1   30.35   true    true    false
2   20.35   false   true    false
3   50.35   true    true    false
4   10.35   true    true    false
5   20.35   true    true    false
6   70.35   false   true    false
7   85.26   false   true    false
8   83.39   true    true    false
9   172.11  true    true    false
10  184.99  false   true    false
11  146.11  true    true    false
12  230.28  false   true    false
13  124.42  false   true    false
14  416.15  true    true    false
15  257.27  false   true    false
16  263.39  true    true    false
17  295.0   true    true    false
18  175.35  true    true    false
19  275.62  true    true    false
20  189.08  true    true    false
21  163.05  true    true    false
22  166.66  false   true    false
23  186.9   false   true    false
24  181.42  false   true    false
25  181.18  false   true    false
26  184.12  false   true    false
27  177.27  false   true    false
28  238.61  true    true    false
29  163.88  true    true    false
30  204.12  false   false   false
31  215.22  true    true    false
32  166.41  true    true    false
33  143.49  true    true    false
34  181.31  true    true    false
35  431.25  false   false   false
36  245.3   false   false   false
37  245.89  false   false   false
38  251.72  true    true    false
39  161.89  false   false   false
40  210.83  true    true    false
41  188.25  false   false   false
42  186.48  true    true    false
43  205.49  false   false   false
44  184.07  true    true    false
45  144.83  true    true    false
46  167.21  true    true    false
47  181.11  false   false   false
48  183.73  true    true    true
49  175.57  true    true    false
Try printing header2val before you convert it to float. See if it is actually a number or not. - Sheldore
What lines are the two errors occurring on? - martineau
@martineau First code error message: "TypeError float() argument must be a string or a number" is on line 28, which is --> if float(header2val) >= 200: #check condition. Second code error message: "ValueError: could not convert string to float" is on line 29, which is --> headerval= float(checkval) - Grace B
@snakecharmerb I tried printing the value before and after casting to float. Before casting I get '30.35'; after casting I get just 30.35, without the single quotes. - Grace B
Grace: OK, that helps, but I can't reproduce the problem with the first code. Could you copy and paste a few lines from the beginning of the report.csv file into your question? By the way, shouldn't the line fieldnames = ['header1', 'header2', 'header2'] be fieldnames = ['header1', 'header2', 'header3']? - martineau

1 Answer


I think the header of the report.csv file may be incorrectly formatted, which messes up reading it with a DictReader. Here's a way to work around that and at least get the code in the first part of your question working: it hardcodes the fieldnames the reader should use and skips the header line.

import csv

input_filename = 'report.csv'
output_filename = 'output.csv'

fieldnames = ['header1', 'header2', 'header3']  # Hardcode the fieldnames.

with open(input_filename, 'r', newline='') as openfile:
    csv_reader = csv.DictReader(openfile, fieldnames=fieldnames, delimiter='\t')
    next(csv_reader)  # Skip badly formatted header.

    with open(output_filename, 'w', newline='') as new_file:
        csv_writer = csv.DictWriter(new_file, fieldnames=fieldnames, delimiter='\t',
                                    extrasaction='ignore')
        csv_writer.writeheader()

        for line in csv_reader:
            header2val = line['header2']  # Get second column.
            if float(header2val) >= 200:  # Check value.
                csv_writer.writerow(line)
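
If the TypeError from the first snippet still shows up after this, one possible cause (an assumption on my part, since the raw file isn't shown) is a short or malformed row near the end of the file: DictReader fills any missing fields with None, and float(None) raises TypeError. Here is a minimal sketch of a more defensive conversion. The to_float and filter_rows helpers and the inline sample data are hypothetical, made up for illustration only:

```python
import csv
import io


def to_float(value):
    """Return value as a float, or None if it cannot be converted."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # TypeError: value is None (short row) or a list (extra fields).
        # ValueError: value is a non-numeric string such as 'true'.
        return None


# Hypothetical stand-in for report.csv: a clean header, three data rows,
# and a trailing short row that has only one field.
sample = (
    "header1\theader2\theader3\n"
    "1\t30.35\ttrue\n"
    "12\t230.28\tfalse\n"
    "14\t416.15\ttrue\n"
    "15\n"
)


def filter_rows(text, threshold=200.0):
    """Keep the rows whose 'header2' value is a number >= threshold."""
    reader = csv.DictReader(io.StringIO(text), delimiter='\t')
    kept = []
    for row in reader:
        value = to_float(row['header2'])
        if value is not None and value >= threshold:
            kept.append(row)
    return kept


kept_rows = filter_rows(sample)
print([row['header1'] for row in kept_rows])  # ['12', '14']
```

The second snippet in the question fails for a different reason: iterating over line['header2'] walks the string one character at a time, so float() ends up being called on a non-numeric character such as '.', which raises ValueError.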