1
votes

I'm new to CS and could really use some help on the speller pset. I've got a basic outline that doesn't seem to seg fault, but I'm still having issues. It fails check50 in that:

  • it doesn't handle most basic words properly
  • spell-checking is not case insensitive
  • it doesn't handle substrings properly
  • and has memory errors

If I run a tester file through it, the counter only shows that there are 2 words in the dictionary, and so it spits out nearly every word in the document as misspelled (leading me to think there is an error in load, but I can't figure out where).

There's also a valgrind error. It displays the following:

HEAP SUMMARY: in use at exit: 1,300 bytes in 14 blocks. total heap use: 15 allocs, 1 frees, 5406 bytes allocated.

Any assistance would be much appreciated; I've been stuck on this for 3 days!

#include <stdbool.h>
#include <string.h>
#include <strings.h>
#include <stdio.h>
#include <stdlib.h>
#include "dictionary.h"

// Represents a node in a hash table
typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

// Number of buckets in hash table
const unsigned int N = 26;

// Hash table
node *table[N];

// Returns true if word is in dictionary else false
bool check(const char *word)
{
    int numloc = hash(word);
    node *cursor = table[numloc];
    while (cursor != NULL)
    {
        if (strcasecmp(cursor->word, word) == 0)
        {
            return true;
        }
        cursor = cursor->next;
    }
    return false;
}

// Hashes word to a number
unsigned int hash(const char *word)
{
    unsigned long hash = 5381;
    int c;
    while ((c = *word++))
    {
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
    }
    return hash % N;
}

// Loads dictionary into memory, returning true if successful else false
int counter = 0;
bool load(const char *dictionary)
{
    FILE *dictionary = fopen(dictionary, "r");
    if (dictionary == NULL)
    {
        return false;
    }
    char tempword[LENGTH + 1];
    for (int i = 0; i < N; i++)
    {
        table[i] = NULL;
    }
    while (fscanf(dict, "%s", tempword) != EOF)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            return false;
        }
        strcpy(n->word, tempword);
        int A = hash(tempword);
        if (table[A] == NULL)
        {
            table[A] = n;
            n->next = NULL;
        }
        else
        {
            n->next = table[A];
            table[A] = n;
        }
        counter++;
    }
    fclose(dictionary);
    return true;
}


// Returns number of words in dictionary if loaded else 0 if not yet loaded
unsigned int size(void)
{
    if (counter == 0)
    {
        unload();
        return 1;
    }
    else
    {
        return counter;
    }
}

// Unloads dictionary from memory, returning true if successful else false
bool unload(void)
{
    for (int i = 0; i < N; i++)
    {
        node *cursor = table[i];
        node *tmp = table[i];
        while (cursor != NULL)
        {
            cursor = cursor->next;
            free(tmp);
            tmp = cursor;
        }
        return true;
    }
    return false;
}
2
For case insensitive, you should hash only with upper or lower case letters (you choose which). tolower() or toupper() will help. - Michael Dorgan
Also, use the original commented hash * 33 instead of shift/adding. The compiler is plenty smart to use a shift/add if it is more efficient. - Michael Dorgan
Where's main and the rest of the code BTW? - Michael Dorgan
For every call to a heap allocation function: calloc() malloc() realloc() there must be a call to free() before the program exits, using the same pointer values as returned by the heap allocation functions. - user3629249
when compiling, always enable the warnings. then fix those warnings. ( for gcc, at a minimum use: -Wall -Wextra -Wconversion -pedantic -std=gnu11 ) Note: other compilers use different options to produce the same results. - user3629249

2 Answers

0
votes

what are the contents of dictionary.h? As it is, the posted code does not compile!

Here are the results of a run through the compiler, intermixed with some suggestions on how to fix the problems.

The compile statement:

gcc -ggdb3 -Wall -Wextra -Wconversion -pedantic -std=gnu11 -c "untitled1.c" -o "untitled1.o"

The output messages from the compiler:

untitled1.c:20:7: error: variably modified ‘table’ at file scope
   20 | node *table[N];
      |       ^~~~~

this is not C++.. Therefore suggest replacing: 

const unsigned int N = 26;

with

#define N 26

so the table[[] will be properly declared

untitled1.c: In function ‘check’:

untitled1.c:25:18: warning: implicit declaration of function ‘hash’ [-Wimplicit-function-declaration]
   25 |     int numloc = hash(word);
      |                  ^~~~

At this point in the code, the compiler does not know the prototype for the function: hash(). Suggest moving the hash() function before the function that contains this statement.

untitled1.c: At top level:

untitled1.c:39:14: error: conflicting types for ‘hash’
   39 | unsigned int hash(const char *word)
      |              ^~~~

untitled1.c:25:18: note: previous implicit declaration of ‘hash’ was here
   25 |     int numloc = hash(word);
      |                  ^~~~

untitled1.c: In function ‘hash’:

untitled1.c:45:37: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
   45 |         hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
      |                                     ^

do NOT name a local variable the same as a function name. In the above case, the compiler thinks the code is recursively calling the function: hash()

untitled1.c:47:17: warning: conversion from ‘long unsigned int’ to ‘unsigned int’ may change value [-Wconversion]
   47 |     return hash % N;
      |            ~~~~~^~~

untitled1.c: In function ‘load’:

untitled1.c:54:11: error: ‘dictionary’ redeclared as different kind of symbol
   54 |     FILE *dictionary = fopen(dictionary, "r");
      |           ^~~~~~~~~~

dictionary is already declared as a character string. So use some other variable name here.

untitled1.c:52:23: note: previous definition of ‘dictionary’ was here
   52 | bool load(const char *dictionary)
      |           ~~~~~~~~~~~~^~~~~~~~~~

untitled1.c:54:30: warning: passing argument 1 of ‘fopen’ from incompatible pointer type [-Wincompatible-pointer-types]
   54 |     FILE *dictionary = fopen(dictionary, "r");
      |                              ^~~~~~~~~~
      |                              |
      |                              FILE * {aka struct _IO_FILE *}

In file included from untitled1.c:4:

/usr/include/stdio.h:246:14: note: expected ‘const char * restrict’ but argument is of type ‘FILE *’ {aka ‘struct _IO_FILE *’}
  246 | extern FILE *fopen (const char *__restrict __filename,
      |              ^~~~~

untitled1.c:60:23: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
   60 |     for (int i = 0; i < N; i++)
      |                 
  ^

do not compare signed and unsigned values. It 'may' work, but you cannot depend on it working.

untitled1.c:64:19: error: ‘dict’ undeclared (first use in this function)
   64 |     while (fscanf(dict, "%s", tempword) != EOF)
      |                   ^~~~

the dict needs to be of type FILE * but no such variable was ever set via a call to fopen()

untitled1.c:64:19: note: each undeclared identifier is reported only once for each function it appears in

untitled1.c:72:17: warning: conversion to ‘int’ from ‘unsigned int’ may change the sign of the result [-Wsign-conversion]
   72 |         int A = hash(tempword);
      |                 ^~~~

untitled1.c: In function ‘size’:

untitled1.c:100:16: warning: conversion to ‘unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
  100 |         return counter;
      |                ^~~~~~~

untitled1.c: In function ‘unload’:

untitled1.c:107:23: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
  107 |     for (int i = 0; i < N; i++)
      |                       ^

Compilation failed.
0
votes

The reason you have memory leakage is because at the very last step of unloading, you haven't freed up the last node. Try out this version of a while loop. You want to have free(temp) as last line of the loop so it doesn't get initialized/takes up memory if cursor = NULL

bool unload(void)
{
    for (int i = 0; i < N; i++)
    {
        node *cursor = table[i];

        while (cursor != NULL)
        {
            node *tmp = cursor;
            cursor = cursor->next;
            free(tmp);
        }
        free(cursor);
        return true;
    }
    return false;
}