0
votes

I wrote a spell-checker that uses a trie to store and check words for misspellings. It loads a textfile dictionary into memory, accepts word(s) to check for spelling errors and unloads the memory. The program compiles successfully, but when run produces a segmentation fault. Based on valgrind the problem seems to be the use of an uninitialised value in my insert function.

//If word not present, inserts word into trie
//If the word is a prefix of a node, marks the "leaf node" (end-of-word node)
void insert(struct node *root, const char *word)
{
int length = strlen(word);
int index = 0;

//start from root node
struct node *tempNode = root;

for(int i = 0; i < length; i++)
{
    //if the current letter in word is a letter
    if(word[i] != '\'')

        //convert the alphabet to it's respective index number
        index = CHAR_TO_INDEX(tolower(word[i]));

    else
        //assign index number 27 (for apostrophe)
        index = INPUT_SIZE;

    //create a new node if path doesn't exist (is NULL)
    if(!(tempNode->children[index]))
        tempNode->children[index] = getNode();

        //go to next node  
        tempNode = tempNode->children[index];
        }
    //mark last node as leaf
    tempNode->isWord = true;
}

insert (places word in trie) is called by load (moves words from dictionary txt file into memory):

bool load(const char *dictionary)
{
//initialise variables
char ch;
char word[LENGTH] = "";
int counter = 0;

struct node *root = getNode();

//open file to start inserting words
FILE *file = fopen(dictionary, "r");

//load words in dictionary into memory
while (EOF)
{
    while((ch = fgetc(file)) != '\n')
    {
        word[counter] = ch;
        counter++;
    }
    //whole word found, insert word and reset counter, increment word count
    insert(root, word);
    counter = 0;
    word_Count++;
}

//close all open files if EOF is reached, else loading has failed- return false
if(EOF == true)
    {
    fileLoaded = true;
    fclose(file);
    return true;
    }
else return false;
}

and getNode() which creates a new node initialised to NULL:

    //Returns new trie node initialised to NULL
    struct node *getNode(void)
    {
    //initialise new node
    struct node *newNode = NULL;

    newNode = malloc(sizeof(struct node));

    //proceed if enough memory to allocate
    if(newNode) 
    {
    //initialise pointers
    for(int i = 0; i < INPUT_SIZE; i++)
        newNode->children[i] = NULL;

    newNode->isWord = false;
    }
    else return false;

    return newNode;
    }

The definition of struct node :

    //Returns new trie node initialised to NULL
struct node *getNode(void)
{
    //initialise new node
    struct node *newNode = NULL;

    newNode = malloc(sizeof(struct node));

    //proceed if enough memory to allocate
    if(newNode) 
    {
        //initialise pointers
        for(int i = 0; i < INPUT_SIZE; i++)
            newNode->children[i] = NULL;

        newNode->isWord = false;
    }
    else return false;

    return newNode;
}

The error according to valgrind:

    Conditional jump or move depends on uninitialised value(s)
==9334==    at 0x4011DD: insert (dictionary.c:84)
==9334==    by 0x4014D1: load (dictionary.c:188)
==9334==    by 0x40095D: main (speller.c:40)
==9334==  Uninitialised value was created by a heap allocation
==9334==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9334==    by 0x4010CB: getNode (dictionary.c:40)
==9334==    by 0x4011E7: insert (dictionary.c:85)
==9334==    by 0x4014D1: load (dictionary.c:188)
==9334==    by 0x40095D: main (speller.c:40)
==9334== 
==9334== Use of uninitialised value of size 8
==9334==    at 0x4011D5: insert (dictionary.c:84)
==9334==    by 0x4014D1: load (dictionary.c:188)
==9334==    by 0x40095D: main (speller.c:40)
==9334==  Uninitialised value was created by a heap allocation
==9334==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9334==    by 0x4010CB: getNode (dictionary.c:40)
==9334==    by 0x4011E7: insert (dictionary.c:85)
==9334==    by 0x4014D1: load (dictionary.c:188)
==9334==    by 0x40095D: main (speller.c:40)
==9334== 
==9334== Invalid read of size 8
==9334==    at 0x4011D5: insert (dictionary.c:84)
==9334==    by 0x4014D1: load (dictionary.c:188)
==9334==    by 0x40095D: main (speller.c:40)
==9334==  Address 0x91 is not stack'd, malloc'd or (recently) free'd
==9334== 
==9334== 
==9334== Process terminating with default action of signal 11 (SIGSEGV)
==9334==  Access not within mapped region at address 0x91
==9334==    at 0x4011D5: insert (dictionary.c:84)
==9334==    by 0x4014D1: load (dictionary.c:188)
==9334==    by 0x40095D: main (speller.c:40)

valgrind was run with --leak-check=full, --leak-check=full and --show-leak-kinds=all. I've tried referencing similar errors from previous posts but the difference in context makes it difficult to pinpoint what I should do. Line 84 is the one that reads if(!(tempNode->children[index])). This seems (to me) to be the root cause of the problem.

Thank you!

1
Your debugger might tell your moreJabberwocky
What is struct node? Is children an array of pointers?Elan
A minimal reproducible example might help here.Jabberwocky
I've added struct node's definition in. Thank you for asking. @ElanW.H Ang

1 Answers

0
votes
while((ch = fgetc(file)) != '\n')
{
    word[counter] = ch;
    counter++;
}
//whole word found, insert word and reset counter, increment word count
insert(root, word);

This is the part of your code that is causing Undefined Behavior.

Your string word is not null terminated but insert expects it to be (since the first thing it does is call strlen which expects a null terminated string).

Simple fix would be

word[counter] = '\0';

before call to insert assuming everything is in bounds.