1
votes

I'm new here. This is my first post! So I wrote code in C to take in a comma separated text file and read it into a 2D array. I used strtok() for that. It worked. Below is the code:

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
    FILE *data = fopen(argv[1], "r");

    if (data == NULL)
    {
        printf("Could not open 11.txt\n");
        return 1;
    }

    char table[20][20][3];
    char buffer[60];

    int i = 0;
    while (fscanf(data, "%s", buffer) != EOF)
    {
        int j = 0;
        for (char *s = strtok(buffer, ","); s != NULL; s = strtok(NULL, ","))
        {
            for (int k = 0; k < strlen(s) + 1; k++)
            {
                table[i][j][k] = s[k];
            }
            j++;
        }
        i++;
    }
    printf("%s\n", table[19][0]);
    return 0;
}

The data I'm trying to read into the 2D array looks like:

08,02,22,97
49,49,99,40
81,49,31,73
52,70,95,23

It is a 20x20 matrix with numbers separated by commas. The above program works fine (I'm printing out an element of this 2D array to check if the program is working). But when the numbers are separated by spaces:

08 02 22 97 
49 49 99 40 
81 49 31 73 
52 70 95 23 

and when I replace the "," with " " in the strtok() function I get a seg fault. I'm at a loss for why this is the case. Thank you for the help!

EDIT: The bug has been fixed! @Vlad From Moscow very correctly pointed out that fscanf() is not the correct function for reading a string containing white space into a buffer. He suggested using fgets() instead, which reads a whole line, white space included. I was still facing a seg fault because the first token returned by strtok() was a NULL pointer. I'm not sure why it's doing that, because when I fed strtok() an array holding the same string, without using fgets() in a while loop as shown, there were no issues:

char str[] = "08 02 22 97";

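So to fix this I put a condition in the for loop to skip to the next iteration if strtok() returned a NULL pointer. The second issue was that my buffer wasn't large enough. A space takes 1 byte, the same as any other char, but fgets() also stores the newline at the end of each line, which fscanf() with %s never did. A row of twenty two-digit numbers with 19 separators is 59 characters, so it needs 59 + 1 (newline) + 1 (terminating null) = 61 bytes. Fixing these two issues I got the code to work!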

Below is the corrected code:

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
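    /* argv[1] is only valid if a file name was actually passed */
    if (argc < 2)
    {
        printf("Usage: program <file>\n");
        return 1;
    }

    FILE *data = fopen(argv[1], "r");

    if (data == NULL)
    {
        printf("Could not open %s\n", argv[1]);
        return 1;
    }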

    char table[20][20][3];
    char buffer[61];

    int i = 0;
    while (fgets(buffer, sizeof(buffer), data) != NULL)
    {
        int j = 0;
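        /* '\n' is a delimiter too, so the newline kept by fgets() is not copied into the last token */
        for (char *s = strtok(buffer, " \n"); s != NULL; s = strtok(NULL, " \n"))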
        {
            if (s == NULL)
            {
                continue;
            }
            else
            {
                for (int k = 0; k < strlen(s) + 1; k++)
                {
                    table[i][j][k] = s[k];
                }
                j++;
            }
        }
        i++;
    }
    printf("%i\n", atoi(table[19][19]));
    return 0;
}
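
The buffer-size issue mentioned in the edit can be reproduced in isolation. The following is only a sketch, assuming each row holds twenty two-digit numbers separated by single spaces (59 characters plus a newline); count_reads is a hypothetical helper name, not part of the original program. It counts how many fgets() calls are needed to consume one such row for a given buffer size.

```c
#include <stdio.h>

/* Writes one 59-character row plus '\n' to a temporary file and counts
   how many fgets() calls it takes to consume it with a buffer of
   buf_size bytes. Returns -1 on failure or if buf_size is out of range. */
int count_reads(int buf_size)
{
    char buf[128];
    if (buf_size < 2 || buf_size > (int)sizeof buf)
        return -1;

    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* Twenty two-digit numbers, single spaces between them: 20*2 + 19 = 59 chars. */
    for (int n = 0; n < 20; n++)
        fprintf(fp, "%02d%c", n, n < 19 ? ' ' : '\n');
    rewind(fp);

    int reads = 0;
    while (fgets(buf, buf_size, fp) != NULL)
        reads++;            /* each successful call is one "line" to the caller */

    fclose(fp);
    return reads;
}
```

With a 60-byte buffer the newline is left over for a second call, so a loop that does i++ per fgets() call advances twice per line and runs past row 19 after ten lines. With 61 bytes each row is consumed in one call.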
1
Possibly unrelated: the first line in the first example ends with <COMMA>97<ENTER> ... the first line in the 2nd example ends with <SPACE>97<SPACE><ENTER> - pmg
@pmg I think I did that accidentally when sticking those numbers in here. - Arjun Singh

1 Answer

4
votes

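The function fscanf with the format specifier %s skips leading white space and then reads characters only until the next white space character. So the call in your while statement

while (fscanf(data, "%s", buffer) != EOF)

cannot read a string containing embedded white space. With space-separated data each call returns a single number, so the loop body runs once per number instead of once per line, and i soon exceeds the last valid row index 19.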

Instead use the standard C function fgets.
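
The difference between the two functions can be seen with a small sketch like the one below. It is not taken from the question's program: demo_read and DEMO_BUF are illustrative names, and the row "08 02 22 97" is just the first sample line written to a temporary file.

```c
#include <stdio.h>

#define DEMO_BUF 32

/* Writes one space-separated row to a temporary file and reads it back
   into out, either with fgets() or with fscanf("%s").
   Returns 0 on success, -1 on failure. */
int demo_read(int use_fgets, char out[DEMO_BUF])
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    fputs("08 02 22 97\n", fp);
    rewind(fp);

    int ok;
    if (use_fgets)
        ok = fgets(out, DEMO_BUF, fp) != NULL;  /* reads up to and including '\n' */
    else
        ok = fscanf(fp, "%31s", out) == 1;      /* stops at the first white space */

    fclose(fp);
    return ok ? 0 : -1;
}
```

Called with use_fgets set to 0 the buffer holds only the first number; called with 1 it holds the whole line, including the trailing newline, which then has to be stripped or treated as a delimiter when tokenizing.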

Note that instead of this for loop

for (int k = 0; k < strlen(s) + 1; k++)
{
    table[i][j][k] = s[k];
}

you could use the standard string function strcpy, for example

strcpy( table[i][j], s );
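
Since strcpy does not check the destination size, one option is to test the token length first. This is a sketch under the assumption that every cell is 3 bytes, as in char table[20][20][3]; store_token and CELL_SIZE are hypothetical names introduced only for illustration.

```c
#include <string.h>

#define CELL_SIZE 3  /* two digits plus the terminating '\0' */

/* Copies token s into a CELL_SIZE-byte cell only if it fits, terminating
   null included. Returns 1 on success, 0 if the token is too long
   (the cell is left unchanged in that case). */
int store_token(char cell[CELL_SIZE], const char *s)
{
    if (strlen(s) >= CELL_SIZE)
        return 0;
    strcpy(cell, s);
    return 1;
}
```

A call such as store_token(table[i][j], s) would then reject a token like "97\n", which needs 4 bytes, instead of writing past the end of the cell.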

Also this call

printf("%s\n", table[20][0]);

invokes undefined behavior because for an array declared like

char table[20][20][3];

the valid range of the first index is [0, 20). That is, you may not use the value 20 as the index; the last valid index is 19.
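
As a small illustration of that range, the number of rows can be derived from the array itself rather than typed as a literal. The function name last_row_index and the macro names below are made up for this sketch.

```c
#include <stddef.h>

#define ROWS 20
#define COLS 20
#define CELL 3

/* Returns the largest valid first index of an array declared as
   char table[ROWS][COLS][CELL]. */
size_t last_row_index(void)
{
    char table[ROWS][COLS][CELL];
    size_t rows = sizeof table / sizeof table[0];  /* number of rows: ROWS */
    return rows - 1;                               /* indices run from 0 to rows - 1 */
}
```

Guarding the row counter with a test such as i < rows before writing to table[i] keeps the first index inside that range.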