0
votes

I am new to C so I am having troubles with making a hash table and malloc-ing spaces.

I am doing an anagram solver. Right now I am still at the step where I create the hash table for this program. I am trying to test my insert function to see if it is working properly by calling the function once with some random arguments.

However, I kept getting segmentation faults, and I used valgrind to track down where it crashes.

Can you point out what am I missing?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int hash(char *word)
{
   int h = 0;
   int i, j;

   char *A;
   char *a;
   // an array of 26 slots for 26 uppercase letters in the alphabet
   A = (char *)malloc(26 * sizeof(char));
   // an array of 26 slots for 26 lowercase letters in the alphabet
   a = (char *)malloc(26 * sizeof(char));

   for (i = 0; i < 26; i++) {
      A[i] = (char)(i + 65); // fill the array from A to Z
      a[i] = (char)(i + 97); // fill the array from a to z
   }

   for (i = 0; i < strlen(word); i++) {
      for (j = 0; j < 26; j++) {
         // upper and lower case have the same hash value
         if (word[i] == A[j] || word[i] == a[j]) {
            h += j; // get the hash value of the word
            break;
         }
      }
   }

   return h;
}

typedef struct Entry {
   char *word;
   int len;
   struct Entry *next;
} Entry;

#define TABLE_SIZE 20 // test number

Entry *table[TABLE_SIZE] = { NULL };

void init() {
   // create memory spaces for each element
   struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));

   int i;

   // initialize 
   for (i = 0; i < TABLE_SIZE; i++) {
      en->word = "";
      en->len = 0;
      en->next = table[i];
      table[i] = en;
   }
}

void insertElement(char *word, int len) {
   int h = hash(word);
   int i = 0;

   // check if value has already existed
   while(i < TABLE_SIZE && (strcmp(table[h]->word, "") != 0)) {

      // !!!! NEXT LINE IS WHERE IT CRASHES !!!

      if (strcmp(table[h]->word, word) == 0) { // found
         table[h]->len = len;

         return; // exit function and skip the rest
      }

      i++; // increment loop index 
   }

   // found empty element
   if (strcmp(table[h]->word, "") == 0) {
      struct Entry *en;

      en->word = word;
      en->len = len;
      en->next = table[h];
      table[h] = en;
   }
}

int main() {
   init(); // initialize hash table

   // test call
   insertElement("kkj\0", 2);

   int i;

   for ( i=0; i < 10; i++)
   {
      printf("%d: ", i);

      struct Entry *enTemp = table[i];

      while (enTemp->next != NULL)
      {
         printf("Word: %s, Len:%d)", enTemp->word, enTemp->len);
         enTemp = enTemp->next;
      }

      printf("\n");
   }

   return 0;
}
1
Suspect you want a copy: en->word = strdup(word);chux - Reinstate Monica
Note: This only makes 1 en struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));chux - Reinstate Monica
regarding the system function" malloc() 1) do not cast the returned value 2) always check (!=NULL) the returned value to assure the operation was successfuluser3629249
regarding this line: 'insertElement("kkj\0", 2);' the literal "kkj\0" is not formatted correctly. when defining a char array, such as this literal, the compiler will automatically append a '\0' so this literal will, in memory, be: 'k','k','j','\0','\0' which is not what is needed.user3629249

1 Answers

2
votes

It's not necessary to cast the return value from malloc, and doing so can mask other errors.

The following lines malloc memory which is never freed, so there's a memory leak in your hash function.

// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));

sizeof(char) is guaranteed to be 1, by definition, so it's not necessary to multiply by sizeof(char).

Your code also assume ascii layout of the characters, which is not guaranteed.

In the init() function, you have

// create memory spaces for each element
struct Entry *en = (struct Entry *)malloc(sizeof(struct Entry));

does not do what the comment says. It only allocates enough memory for one struct Entry. Perhaps you meant to put this inside the loop.

For a fixed table size you could also just have an array of struct Entry directly rather than an array of pointers to such. I.e.

struct Entry table[TABLE_SIZE] = { 0 };

And then you wouldn't need to malloc memory for the entries themselves, just the contents.

In your initialization loop

for (i = 0; i < TABLE_SIZE; i++) {
    en->word = "";
    en->len = 0;
    en->next = table[i];
    table[i] = en;
}

each en->next is set to itself, and all of the table elements are set to the same value. The first time through the loop, en->next is set to table[0], which at this point is NULL due to your static initializer. table[0] is then set to en.

The second time through the loop, en->next is set to table[1], which is also null. And en hasn't changed, it is still pointing to the result from the earlier malloc. table[1] is then set to en, which is the same value you had before. So, when you're done, every element of table is set to the same value, and en->next is NULL.

I haven't traced through the hash function, but I don't immediately see anything limiting the use of the hash value to possible indexes of table. When I tested it, "kkj\0" (btw, String literals in C are already null terminated, so the \0 isn't needed.) had a hash value of 29, which is outside the valid indexes of table. So you are accessing memory outside the limits of the table array. At that point all bets are off and pretty much anything can happen. A seg fault in this case is actually a good result, since it's immediately obvious something's wrong. You need to take the hash value modulo the table size to fix the array bounds issue, i.e. h % TABLE_SIZE.