8
votes

I am trying to write a simple piece of code to read values from a CSV file with a max of 100 entries into an array of structs.

Example of a line of the CSV file:

1,Mr,James,Quigley,Director,200000,0

I use the following code to read in the values, but when I print them out they are incorrect:

for (i = 0; i < 3; i++) /* just assuming number of entries here to demonstrate problem */
{
    fscanf(f, "%d,%s,%s,%s,%s,%d,%d", &inArray[i].ID, inArray[i].salutation,
           inArray[i].firstName, inArray[i].surName, inArray[i].position,
           &inArray[i].sal, &inArray[i].deleted);
}

Then when I print out the first name, the values are all assigned to the first name:

for(j = 0; j < 3; j++) /* test by printing values*/
    {
    printf("Employee name is %s\n", inArray[j].firstName);
    } 

This prints ames,Quigley,Director,200000,0 and so on in that way. I am sure the problem is how I format the fscanf line, but I can't get it to work.

Here is my struct I'm reading into:

typedef struct Employee
    {
    int ID;
    char salutation[4];
    char firstName[21];
    char surName[31];
    char position[16];
    int sal;
    int deleted;
    } Employee;
2
%s is greedy, I think, and it reads a full word... It finds the %d, the integer part, then the ,, and then it has to read a string. , is valid in a string, so it reads until the end of the line (there is no space until then), not until the first comma... And the rest remains empty. (From this answer) - ppeterka
You've got a firstN and a firstName in the posting - which is it? Can you post the struct as well? - doctorlove
Corrected the var name and added struct - Dawson

2 Answers

17
votes

This is because %s matches any run of non-whitespace characters, commas included, so the rest of the line gets scanned into the first string. There's no "look-ahead" in the scanf() conversion specifiers; the fact that the %s is followed by a comma in the format string means nothing.
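To make the behaviour concrete, here is a minimal sketch that parses the question's sample line with a plain %s. The helper name scan_with_percent_s and the 64-byte buffer are assumptions made for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read the ID and then "the next field" using %s.
   word must point to a buffer of at least 64 bytes (assumed size).
   Returns the number of conversions sscanf() reports. */
int scan_with_percent_s(const char *line, int *id, char word[64])
{
    assert(line != NULL && id != NULL && word != NULL);
    /* %63s stops only at whitespace (or after 63 chars), never at ','. */
    return sscanf(line, "%d,%63s", id, word);
}
```

Called on "1,Mr,James,Quigley,Director,200000,0", this reports two conversions, and word holds everything after the first comma rather than just "Mr". In the question's code the same thing happens, except that the text overflows the 4-byte salutation array into the members that follow it.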

Use a scanset, i.e. a character group (search the manual for [).

const int got = fscanf(f, "%d,%[^,],%[^,],%[^,],%[^,],%d,%d", &inArray[i].ID,
                       inArray[i].salutation, inArray[i].firstName,
                       inArray[i].surName, inArray[i].position, &inArray[i].sal, 
                       &inArray[i].deleted);

And learn to check the return value, since I/O calls can fail! Don't depend on the data being valid unless got is 7.
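As a sketch of what checking the return value might look like, the hypothetical helper below (parse_employee is not from the original post) parses one record from a string and reports success only when all seven fields were converted. It also adds field widths to each scanset, sized from the struct in the question, so an over-long field cannot overflow its array; that addition is my assumption about what is wanted, not something the answer above prescribes.

```c
#include <assert.h>
#include <stdio.h>

/* Struct copied from the question. */
typedef struct Employee
{
    int ID;
    char salutation[4];
    char firstName[21];
    char surName[31];
    char position[16];
    int sal;
    int deleted;
} Employee;

/* Hypothetical helper: returns 1 if all seven fields were parsed, else 0.
   Each width is the array size minus one, leaving room for the '\0'. */
int parse_employee(const char *line, Employee *e)
{
    assert(line != NULL && e != NULL);
    const int got = sscanf(line, "%d,%3[^,],%20[^,],%30[^,],%15[^,],%d,%d",
                           &e->ID, e->salutation, e->firstName,
                           e->surName, e->position, &e->sal, &e->deleted);
    return got == 7;
}
```

With the widths in place, a field that is too long makes the following literal comma fail to match, so the call returns fewer than 7 conversions instead of writing past the array. Note also that a scanset must match at least one character, so a record with an empty field is rejected too.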

To make your program read the entire file (multiple records, i.e. lines), I would recommend loading entire lines into a (large) fixed-size buffer with fgets(), then using sscanf() on that buffer to parse out the column values. That is much easier and will ensure that you really do scan separate lines. Calling fscanf() in a loop will not, since to fscanf() a linefeed is just whitespace.
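The fgets()/sscanf() approach could be sketched roughly as follows. The function name read_employees, the 256-byte line buffer, and the decision to skip malformed lines silently are all assumptions for illustration; the field widths are derived from the struct in the question.

```c
#include <assert.h>
#include <stdio.h>

#define LINE_BUF_SIZE 256 /* assumed upper bound on line length */

/* Struct copied from the question. */
typedef struct Employee
{
    int ID;
    char salutation[4];
    char firstName[21];
    char surName[31];
    char position[16];
    int sal;
    int deleted;
} Employee;

/* Hypothetical helper: reads at most max records from f into out.
   Lines that do not yield all seven fields are skipped.
   Returns the number of records stored. */
size_t read_employees(FILE *f, Employee *out, size_t max)
{
    char line[LINE_BUF_SIZE];
    size_t n = 0;

    assert(f != NULL && out != NULL);
    while (n < max && fgets(line, sizeof line, f) != NULL)
    {
        Employee *e = &out[n];
        const int got = sscanf(line, "%d,%3[^,],%20[^,],%30[^,],%15[^,],%d,%d",
                               &e->ID, e->salutation, e->firstName,
                               e->surName, e->position, &e->sal, &e->deleted);
        if (got == 7)
            n++; /* keep the record only if every field converted */
    }
    return n;
}
```

For the question's case this would be called as read_employees(f, inArray, 100). One caveat of the fixed-size buffer: a line longer than the buffer is returned by fgets() in pieces, and each piece is then parsed (and most likely rejected) as if it were its own line.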

2
votes

Might as well post my comment as an answer:

%s reads a full word by default.

It finds the %d, the integer part, then the ,, and then it has to read a string. , is considered valid in a word (it is not a whitespace), so it reads until the end of the line (there is no whitespace until then), not until the first comma... And the rest remains empty. (From this answer)

You have to change the delimiter by specifying a scanset (it looks like a regex character class, but scanf() does not support real regular expressions):

fscanf(f, "%d,%[^,],%[^,],%[^,],%[^,],%d,%d", &inArray[i].ID, inArray[i].salutation, inArray[i].firstName, inArray[i].surName, inArray[i].position, &inArray[i].sal, &inArray[i].deleted);

Instead of %s, use %[^,], which means "grab all chars, and stop when a , is found".

EDIT

%[^,]s is wrong: the s is not part of the conversion, so the format would require a literal s right after the end of the scanset... Thanks @MichaelPotter
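A small sketch of the difference, using two hypothetical helpers (the names and the 32-byte buffers are assumptions) that differ only in the stray s:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format with a stray 's' after the scanset.
   Returns the number of conversions sscanf() reports. */
int count_fields_with_stray_s(const char *line)
{
    int id;
    char a[32], b[32];
    assert(line != NULL);
    /* After %31[^,] the next input char is ',', which does not match
       the literal 's', so scanning stops there. */
    return sscanf(line, "%d,%31[^,]s,%31[^,]", &id, a, b);
}

/* Hypothetical helper: same format without the stray 's'. */
int count_fields_with_scanset(const char *line)
{
    int id;
    char a[32], b[32];
    assert(line != NULL);
    return sscanf(line, "%d,%31[^,],%31[^,]", &id, a, b);
}
```

On the input "1,Mr,James,Quigley", the first helper returns 2 because scanning stops at the unmatched s, while the second returns 3.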

(From Changing the scanf() delimiter and Reading values from CSV file into variables)