0
votes

Please, after reading and trying to apply solutions found on stackOverflow the problem has not been solved.

Conditional jump or move depends on uninitialised value(s): Conditional jump or move depends on uninitialised value(s): Uninitialised value was created by a heap allocation.

Error popped up by Valgrind:

error

I'm trying to implement file reading line by line and dynamically realloc an array for them.

Error on line: result = realloc(result, currLen * sizeof(char *));


void readFile(char *fileName) {
    FILE *fp = NULL;
    size_t len = 0;

    int currLen = 2;
    char **result = calloc(currLen, sizeof(char *));

    fp = fopen(fileName, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    if (result == NULL)
        exit(EXIT_FAILURE);

    int i = 0;
    while (getline(&(*(result + i)), &len, fp) != -1) {
        if (i >= currLen - 1) {
            currLen *= 2;
            result = realloc(result, currLen * sizeof(char *));
        }
        ++i;
    }

    fclose(fp);

    for (int j = 0; j < currLen; ++j) {
        free(*(result + j));
    }

    free(result);
    result = NULL;
}

int main() {
    readFile("");

    exit(EXIT_SUCCESS);
}

1
On what line of code did that error appear?Mooing Duck
Fyi, &(*(result + i)) ==> result + i . The address-of-dereference is completely pointless. Furthermore, the expansion region of the pointers referenced by the pointer-to-pointer result are indeterminate in the expanded region after the realloc. Only the original sequence from the calloc will be null-initialized (e.g will still be even after the realloc, and ultimately populated with getline acting on null population semantics).WhozCraig
getline requires the first arg to be either a pointer to NULL or a pointer to a valid memory block. Since realloc does not init the exapnded part of the memory it contains uninitalised values and hence the error complaining the getline is using such values.kaylum
@MooingDuck thanks, updated the question.Ignatella
@WhozCraig I have tried to implement what you have said, but then i get Leak_DefinitelyLost on while (getline(result + i, &len, fp) != -1) ... line May i ask you to show how this should be implemented?Ignatella

1 Answers

2
votes

In the original posted code, result undergoes an initial allocation via calloc, which zero-initializes the content therein, and being pointers, null-initializes. Later on, when expanding the sequence via realloc, no such affordance is taken. In effect if the original array looked like this:

[ NULL, NULL ]

and after adding two elements, looks like this:

[ addr1, addr2 ]

the realloc kicks in and gives you this :

[ addr1, addr2, ????, ???? ]

Adding salt to the wound, getline also requires the length argument being reflective of the allocation size present in the line. But you're carrying over the length from the prior loop iteration, so not only is the pointer wrong after the first expansion, the length is never correct after the first invocation of getline (leading to your actual crash; the rest of the problems are just not something you saw yet).

Solving all of this

  1. Use a separate pointer and length for each iteration,
  2. Ensure they're properly initialized to null,0 before the getline call
  3. If you read the line successfully, then expand the line pointer buffer.
  4. Store the pointer, discard the length, and reset both to null,0 before the next iteration.

In practice, it looks like this:

#define _POSIX_C_SOURCE  200809L
#include <stdio.h>
#include <stdlib.h>

char **readFile(const char *fileName, size_t *lines)
{
    FILE *fp = fopen(fileName, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    // initially empty, no size or capacity
    char **result = NULL;
    size_t size = 0;
    size_t capacity = 0;

    size_t len = 0;
    char *line = NULL;
    while (getline(&line, &len, fp) != -1)
    {
        if (size == capacity)
        {
            size_t new_capacity = (capacity ? 2 * capacity : 1);
            void *tmp = realloc(result, new_capacity * sizeof *result);
            if (tmp == NULL)
            {
                perror("Failed to expand lines buffer");
                exit(EXIT_FAILURE);
            }

            // recoup the expanded buffer and capacity
            result = tmp;
            capacity = new_capacity;
        }

        result[size++] = line;

        // reset these to NULL,0. they trigger the fresh allocation
        //  and size storage on the next iteration.
        line = NULL;
        len = 0;
    }

    // if getline allocated a buffer on the failure case
    //  get rid of it (didn't see that coming).
    if (line)
        free(line);

    fclose(fp);

    *lines = size;
    return result;
}

int main()
{
    size_t count = 0;
    char **lines = readFile("/usr/share/dict/words", &count);
    if (lines)
    {
        for (size_t i = 0; i < count; ++i)
        {
            fputs(lines[i], stdout);
            free(lines[i]);
        }

        free(lines);
    }

    return 0;
}

On a stock Linux/Mac system /usr/share/dict/words contains about a quarter-million words in the English language. On my stock Mac, its 235886 (yours will vary). The callers gets the line pointer and the count, and is responsible for freeing the content therein.

Output

A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
.... a ton of lines omitted ....
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton

Valgrind Summary

==17709== 
==17709== HEAP SUMMARY:
==17709==     in use at exit: 0 bytes in 0 blocks
==17709==   total heap usage: 235,909 allocs, 235,909 frees, 32,506,328 bytes allocated
==17709== 
==17709== All heap blocks were freed -- no leaks are possible
==17709== 

Alternative: Let getline reuse its buffer

There is no guarantee the buffer getline allocates matches the line length efficiently. In fact, the only guarantee is, on successful execution, the function returns the number of chars including the delimiter (but not the terminator), and the memory holds that data. The actual allocation size could be considerably more than that, and that space is effectively wasted.

To demonstrate this, consider the following. The same code, but we do NOT reset the buffer on each loop, and rather than store its pointer directly, we store a strdup of the line. Note this only works if the line does not contain embedded null chars. This allows getline to reuse its buffer, and only expand if needed, for each read. We're responsible for making the actual copy of the line data (and we do using POSIX strdup). When executed there are still no leaks, but note the valgrind summary, specifically the number of bytes allocated in comparison to the number of bytes from the previous version above.

char **readFile(const char *fileName, size_t *lines)
{
    FILE *fp = fopen(fileName, "r");
    if (fp == NULL)
        exit(EXIT_FAILURE);

    // initially empty, no size or capacity
    char **result = NULL;
    size_t size = 0;
    size_t capacity = 0;

    size_t len = 0;
    char *line = NULL;
    while (getline(&line, &len, fp) != -1)
    {
        if (size == capacity)
        {
            size_t new_capacity = (capacity ? 2 * capacity : 1);
            void *tmp = realloc(result, new_capacity * sizeof *result);
            if (tmp == NULL)
            {
                perror("Failed to expand lines buffer");
                exit(EXIT_FAILURE);
            }

            // recoup the expanded buffer and capacity
            result = tmp;
            capacity = new_capacity;
        }

        // make copy here. let getline reuse 'line'
        result[size++] = strdup(line);
    }

    // free whatever was left
    if (line)
        free(line);

    fclose(fp);

    *lines = size;
    return result;
}

Valgrind Summary

==17898== 
==17898== HEAP SUMMARY:
==17898==     in use at exit: 0 bytes in 0 blocks
==17898==   total heap usage: 235,909 allocs, 235,909 frees, 6,929,003 bytes allocated
==17898== 
==17898== All heap blocks were freed -- no leaks are possible
==17898== 

The number of allocations is the same (which tells us getline allocated a large enough buffer up front to never need expansion), but the actual total allocated space is considerably more efficient, as now we are storing strings in buffers allocated to match their length; not whatever getline stood up as a read buffer.