1
votes

I've been busy trying to code a trie ordered tree data structure in C. My program reads in words in a sentence one at a time from a .txt, then it stores each word in a trie without duplicates. It then grabs all other words in that sentence and stores them in a subtrie of the word that was stored. For example if we had the following sentence: "contribute to open source. " My code does the following...

            root
  ab'c'defghijklmn'o'pqr's''t'uvwxyz
    'o'           'p'   'o''o'-subtrie-> "contribute", "open", "source"
    'n'           'e'   'u'
    't'           'n'   'r'
    'r'                 'c'-subtrie->"contribute", "to", "open", 
    'i'                     
    'b'
    'u'
    't'
    'e'-subtrie-> "to", "open", "source" 

I have successfully been able to insert words into the trie and also the subtries. I have thoroughly tested this so I am pretty confident everything works the way it was intended to. However, i cant seem to figure out the algorithem to print the trie and subtrie alphabetically.

Here's the struct i am using

typedef struct TrieNode
{
     // number of times this string occurs in the corpus
 int count;

// 26 TrieNode pointers, one for each letter of the alphabet
struct TrieNode *children[26];

// the co-occurrence subtrie for this string
struct TrieNode *subtrie;
} TrieNode;

here is the function i wrote to insert tries. The parameters are the root of the trie, a char array of the word i want to insert, size of the word i am inserting, z = -1 initially.

TrieNode *trieInsert(TrieNode *root, char *wordArray, int sizeOfWord, int z){

    z++;
    int x1, j, index; 
    char c1 = wordArray[z];     

    //INSERT char second level 
    // do alphaNum conversions and check uper or lower case for lev1Char
    x1 = char2Num(c1);
        if(x1 >26 ){
        printf("ERRRRRRRRRRRRRRRRRRrr:line475");
    return root;
    }

    //base case
    if( sizeOfWord == z ) 
    return root; 

    //check to see if we already inserted this 
    if( root->children[x1] == NULL ){ 

    //we insert second letter 
    root->children[x1] = malloc(sizeof(struct TrieNode) );
            root->children[x1]->subtrie = NULL;
    root->children[x1]->count = 0;  
    //set children of newly malloced to null 
    for(j = 0; j < 27; j++)
        root->children[x1]->children[j] = NULL;

    }

    //increment word count on last char of word 
    if((sizeOfWord - 1) == z) 
    root->children[x1]->count++;    

    return trieInsert(root->children[x1], wordArray, sizeOfWord, z);

}

Here is the code i cant figure out. It was to print the trie alphabetically, however, it's output is incorrect.

void printTrieAlefBet( TrieNode *root ){

    int i; 

    if( root->subtrie != NULL){
        printf(" (%d)", root->count);
        return;  
    }


    for( i = 0; i < 27; i++)
        if( root->children[i] != NULL){
            printTrieAlefBet(root->children[i]);
            printf("%c", num2Char(i, 0) );
        }


}

Any thoughts would be greatly appreciated!

1
Sorry my explanation on how my tries and subtries were implemented in my code didn't print how I expected. :( .... here's the wiki link. they better express what tries are. en.wikipedia.org/wiki/Trie - ShowLove
I hope the source part of the trie works from an 'e' that's missing in the diagram and not the 'c' as shown. :D - Jonathan Leffler
What is the output you are getting, and what is the output you'd like to get? Are you sure you don't just want to printf("%c", num2Char(i, 0) ); before recursively calling printTrieAlefBet()? - Claudiu
it prints the following (2)a (1)era (1)edise (1)sseldnuo (1)yb (1)d (1)lassol (1)dnammoc (1)yac (1)tre (1)riapsed (1)ra (1)de (1)m (1)nworf (1)fl (1)dna (1)trae (1)ih (1)i (1)gnik (1)dna (1)sg (1)leve (1)se (1)sselef (1)pi (1)en (1)kool (1)te (1)ythgi ... - ShowLove
I would like it to print a (2) an (1) and (8) antique (1) appear (1) away (1) bare (1) beside (1) boundless (1) by (1) bysshe (1) cold (1) colossal (1) command (1) decay (1) desert (1) despair (1) far (1) fed (1) from (1) frown (1) half (1) hand (1) heart (1) i (1) in (1) is (1) its (1) king (1) kings (1) land (1) legs (1) level (1) lies (1) lifeless (1) lip (1) lone (1) look (1) met (1) mighty (1) mocked (1) my (2) name (1) near (1) nothing (1) of (4) on (4) ozymandias (2) passions (1) pedestal (1) percy (1) read (1) remains (1) round (1) said (1) sand (1) sands (1) ... - ShowLove

1 Answers

3
votes

Got it to work. Here's the code.
Special thanks to prof Sean Szumlanski!

// Helper function called by `printTrie()`.
void printTrieHelper(TrieNode *root, char *buffer, int k)
{
    int i;
    if (root == NULL)
        return;

    if (root->count > 0)
        printf("%s (%d)\n", buffer, root->count);

    buffer[k + 1] = '\0';

    for (i = 0; i < 26; i++)
    {
        buffer[k] = 'a' + (i - 1);

        printTrieHelper(root->children[i], buffer, k + 1);
    }

    buffer[k] = '\0';
}

// If printing a subtrie, the second parameter should be 1; otherwise, 0.
void printTrie(TrieNode *root, int useSubtrieFormatting)
{
    char buffer[1026];

    if (useSubtrieFormatting)
    {
        strcpy(buffer, "- ");
        printTrieHelper(root, buffer, 2);
    }

    else
    {
        strcpy(buffer, 'and');
        printTrieHelper(root, buffer, 0);
    }
}