3
votes

I'm very new to programming and I've been trying to finish the cs50 course without just copying other people's code and trying to understand why mine won't work. I'm currently stuck in pset5 (I have been for a couple of days now), the code compiles ok, but not works and valgrind returns:

Process terminating with default action of signal 11 (SIGSEGV)
==1697==  Access not within mapped region at address 0x0
==1697==    at 0x401A86: hash (dictionary.c:53)
==1697==    by 0x401B71: load (dictionary.c:78)
==1697==    by 0x4012BE: main (speller.c:40)

here's my code:

// Implements a dictionary's functionality

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include <strings.h>
#include "dictionary.h"

/*
** Global variable that indicates the number of words loaded in the dictionary
*/
unsigned int loadedWords = 0;

// Represents a node in a hash table
typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

// Number of buckets in hash table
const unsigned int N = 54;

// Hash table
node *table[N];

// Returns true if word is in dictionary, else false
bool check(const char *word)
{
    if (table[hash(word)]->next == NULL)
    {
        return false;
    }
    else if (strcasecmp(word, table[hash(word)]->word) == 0)
    {
        return true;
    }
    else
    {
        return check(table[hash(word)]->next->word);
    }
}

// Hashes word to a number
unsigned int hash(const char *word)
{
    char lowerCaseWord[LENGTH + 1];
    for (int i = 0; word[i]; i++)
    {
        lowerCaseWord[i] = tolower(word[i]);
    }
    unsigned int hashNumber = ((int)word[0] + (int)word[1]) % 53;
    return hashNumber;
}

// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
    FILE *file = fopen(dictionary, "r");
    if (file == NULL)
    {
        printf("Could not open\n");
        return false;
    }
    char word[LENGTH + 1];
    while (fscanf(file, "%s", word) != EOF)
    {
        node *buffer = malloc(sizeof(node));
        if (buffer == NULL)
        {
            return false;
        }
        strcpy(buffer->word, word);
        buffer->next = table[hash(word)]->next;
        table[hash(word)]->next = buffer;
        loadedWords ++;
    }
    fclose(file);
    return true;
}

// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
    return loadedWords;
}

// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{
    for (int i = 0; i < N - 1; i++)
    {
        while (table[i] != NULL)
        {
            node *tmp = table[i]->next;
            free(table[i]);
            table[i] = tmp;
        }
    }
    return true;
}

Could someone please point out why this won't work and/or how to fix this? I'm not looking for the complete solution to the problem, I'd just like to know what my error is or how my logic is wrong. Thanks.

2
I think you have to define a main function - Noman Gul
I'd just like to know what my error is. The way to find that out is to debug the code. Use a debugger to step through the code line by line as well as examine the state at the point the program crashes. - kaylum
On a side note, if malloc fails in load, you'll leak file. - Daniel Walker
Also, in the line unsigned int hashNumber = ((int)word[0] + (int)word[1]) % 53;, that should be lowerCaseWord. - Daniel Walker
Consider your bucket size of 54 against the holmes.txt file you will use as input that has 1137706 words. Ideally, you want to keep your hash-table load factor less than .7 (number of buckets filled / number of buckets). Multiply your buckets by a minimum of 1000 (better 2000) which will not preserve a load-factor of less than .7, but will dramatically reduce the number of list iterations from each bucket. - David C. Rankin

2 Answers

0
votes

regarding the OPs statement: the code compiles ok,

here is the output from the compiler. Which clearly shows that the posted code DOES NOT compile,

gcc -ggdb3 -Wall -Wextra -Wconversion -pedantic -std=gnu11 -c "untitled1.c" -o "untitled1.o" 

untitled1.c:26:7: error: variably modified ‘table’ at file scope
   26 | node *table[N];
      |       ^~~~~

untitled1.c: In function ‘check’:

untitled1.c:31:15: warning: implicit declaration of function ‘hash’ [-Wimplicit-function-declaration]
   31 |     if (table[hash(word)]->next == NULL)
      |               ^~~~

untitled1.c: At top level:

untitled1.c:46:14: error: conflicting types for ‘hash’
   46 | unsigned int hash(const char *word)
      |              ^~~~

untitled1.c:31:15: note: previous implicit declaration of ‘hash’ was here
   31 |     if (table[hash(word)]->next == NULL)
      |               ^~~~

untitled1.c: In function ‘hash’:

untitled1.c:51:28: warning: conversion from ‘int’ to ‘char’ may change value [-Wconversion]
   51 |         lowerCaseWord[i] = tolower(word[i]);
      |                            ^~~~~~~

untitled1.c:53:31: warning: conversion to ‘unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
   53 |     unsigned int hashNumber = ((int)word[0] + (int)word[1]) % 53;
      |                               ^

untitled1.c:48:10: warning: variable ‘lowerCaseWord’ set but not used [-Wunused-but-set-variable]
   48 |     char lowerCaseWord[LENGTH + 1];
      |          ^~~~~~~~~~~~~

untitled1.c: In function ‘unload’:

untitled1.c:92:23: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
   92 |     for (int i = 0; i < N - 1; i++)
      |                       ^

Compilation failed.

regarding:

char lowerCaseWord[LENGTH + 1];
for (int i = 0; word[i]; i++)
{
    lowerCaseWord[i] = tolower(word[i]);
}

the array lowerCaseWord[] is not used anywhere, so this block of code can be removed

0
votes

You keep accessing the next pointer in the first node in a list, before you know if that list is empty. Let me explain. table is an array of pointers. node *table[N]; This definition creates an array of N pointers to node. When you index into into table, you get back a pointer. So table[2] is a pointer, and it may or may not be NULL depending on whether anything has been added to that list. It is certainly NULL before load has added anything to it.

In check you have this line:

if (table[hash(word)]->next == NULL) // possible seg fault

What if the index hash returns leads to an empty list? Then this becomes NULL->next

Using the arrow operator dereferences the pointer, and dereferencing NULL is illegal. You want to check whether the current node-pointer is NULL, the next is irrelevant for now.

Using recursion here is also a bad idea. It would be faster, and more memory efficient to just loop over each node in the list.

You make the same mistake in your load function:

buffer->next = table[hash(word)]->next; // why next?
table[hash(word)]->next = buffer; // why next?

writing to buffer->next is correct here. You just called malloc to allocate space for it, so you write to the next pointer in that node. But then you take the value of table[hash(word)]->next instead of just table[hash(word)]

You appear to be confused about where the list starts. Trying to read next is never going to the first node in the list.