1
votes

Having trouble understanding an explanation of how to find the height of a binary search tree using a recursive function of type int.

For any binary search tree, given a pointer to the root node of the tree, where Node is defined conventionally as follows...

struct Node {
    int data;
    Node* left;
    Node* right;
}
Node* root;

we can use the following recursive function of type int to give us the height of the binary search tree... (the function 'max' just takes two ints and returns the greater of the two)

int findHeight(Node* root){
    if(root == NULL){
        return -1;
    }
    return max(findHeight(root->left),findHeight(root->right)) + 1;
}

int max(int a, int b){
    if (a > b)
        return a;
    else
        return b;
}

My understanding is that findHeight(root->left) finds the height of the left subtree of root, findHeight(root->right) finds the height of the right subtree of root, and then max returns whichever is greater, since that will be the overall height of the tree, plus one to include the edge that connects root to that subtree.

I'm pretty new to recursion, so I decided to just unpack one of the recursive function calls and try to understand it. I wrote out findHeight(root->left) in a sort of pseudo-code way, with the word "PAUSE" in it every time the function called itself, to represent what was happening on the stack and indented the next function call to show a new stack frame... (in my sample BST the height of the left subtree was 2)

int fH(Node* root){
    if(root == NULL)
        return -1;

    fH(root->left);
    PAUSE

    int fH(Node* root){
        if root == NULL)
            return -1;

        fH(root->left);
        PAUSE

        int fHR(Node*root){
            if (root == NULL)
                return -1;

            fH(root->left);
            PAUSE

            int fHR(Node* root){
                if (root == NULL)
                    return -1;

                fH(root->left);
                PAUSE

                int fHR(Node* root){
                if(root==NULL)
                    return -1;
                }
            }
        }
    }
}

The function works correctly, but I don't understand how the return value of the function is growing from -1, which is what the final execution of the function returns, to 2. Do recursive int functions add 1 to their return value every time they return to their preceding stack frame?

I don't see anything in that short function that increments the return value of the function. The only thing I see is that it's -1 for the final call.

If anyone could try to explain this to me I would really appreciate the help. Thank you very much.

6
"Do recursive int functions add 1 to their return value every time they return to their preceding stack frame?" In general? No. But your findHeight() function does -- see the + 1 at the end of the return line.MrEricSir

6 Answers

2
votes

The function findHeight returns the currently found max height.

So The max call is reasonable with that in mind.

enter image description here

In this action reenactment we get the following function calls.

  1. Call to node 1. We expect to find 2 more depth on the left hand side. One more on the right hand side. So max, will hopefully decide on the left hand side. max calls Node2 and node3
  2. The call to 2 is discussed in point 4.
  3. The call to 3 returns 0. max of -1 and -1, +1. Max is -1, add one to that.
  4. The findHeight to node 2, will have the max of(left node (-1)), and the right node 4) Item 5 + 1.
  5. max of -1, -1 +1 = 0. So this makes max of Point 4 evaluate to one.

So in the end the function findHeight says, this is the end of my current inspection (0 height). The previous function calls can add what ever they like to my one.

In my opinion this function can be written in a more intuitive way.

Maybe

int treeHeight(Node* node){
    if (!node) // We are not a node.
        return 0;
    int weAreANode = 1;  // Height 1.
    int heightLeft = treeHeight(node->left);
    int heightRight = treeHeight(node->right);
    int theHighestSubtreeHeight = std::max(heightLeft, heightRight);

    return weAreANode + theHighestSubtreeHeight;
}

I really just wanted to paint stuff again.

1
votes

My understanding of your question is how is the height incremented for each recursive call of findHeight, let me know if that's not quite right.

In general in C++ there isn't specific syntax to define a function as recursive and is instead left to the programmer to use their function recursively through the arguments and return values of their function.

With this specific function it looks like the increment is happening at the end of this line:

    return findHeight(max(findHeight(root->left),findHeight(root->right)) + 1;

The +1 at the end of the return function increments the recursive call explicitly, there isn't any implicit increment being done by the int function other than the fact that it returns an int.

1
votes

Your pseudo-code misses what happens after the PAUSE. What's missing is

  • the call fH(root->right) (even if root->right is a nullptr)
  • computing max of the two recursive calls and
  • adding 1 to the result.

Particularly, the last point is essential to count the height levels of the tree.

If you add these points to your pseudo-code you may see that it resembles the structure of your specific tree.

1
votes

Consider this tree,

     5
   /   \
  3     10
 / \   /  
2   4 8

The function will be called with root which is 5.

Then it will be called with 3 , 2 (left child 3) and then 4 (right child 3).

2's height evaluates to max(-1, -1) + 1 = 0 and returned as the result of 3's call to findHeight(root->left). The same happens with 4.

The call to 3 (5's findHeight(root->left)) now evaluates to max(0, 0) + 1 = 1.

So 5's left child has a height of 1.

The other child (right child) follows the same process where the call to 10's left child returns a height of 0 and it's right child returns a height of -1. max(0, -1) + 1 = 1 which will be returned to 5's call to findHeight(root->right).

Now back at the top, 5's height is evaluated as max(1,1) + 1 = 2.

1
votes

So considering your example, let's say you have root node as 'A' and left node as 'B' and right node as 'C'. The program should return '1'-

Here is the stack trace-

findHeight(A)       
findHeight(B)       
POP findHeight(NULL)    returns -1

The findHeight(NULL) is for the B's left node, which is NULL and returns -1

Now the stack has-

findHeight(A)       
findHeight(B)       max(-1, /*code yet to execute*/) + 1
POP findHeight(NULL)    returns -1 

The findHeight(NULL) is for the B's right node, which is NULL and returns -1.

Now the stack would be like-

findHeight(A)       
POP findHeight(B)       max(-1, -1) + 1 -> returns 0 

At this point the control is back to your BST's root node i.e. the node 'A' and after executing right node's (Node C's) left node, the stack would be like-

findHeight(A)       max(0, /*code yet to execute*/) + 1
findHeight(C)
findHeight(NULL)    returns -1

The findHeight(NULL) is for the C's left node, which is NULL and returns -1

Now the stack has-

findHeight(A)       max(0, /*code yet to execute*/) + 1       
findHeight(C)       max(-1, /*code yet to execute*/) + 1
POP findHeight(NULL)    returns -1

The findHeight(NULL) is for the B's right node, which is NULL and returns -1.

Now the stack would be like-

findHeight(A)       max(0, /*code yet to execute*/) + 1      
POP findHeight(C)       max(-1, -1) + 1 -> returns 0 

After completion of findHeight(C), the control goes back to findHeight(A) and now the stack has-

findHeight(A)       max(0, 0) + 1 -> returns 1

You have to add + 1 where you are explictly calling the function from your main.

1
votes

There are too many things happening on that return line. Lets rewrite the code like this:

int findHeight(Node* root){
    if(root == NULL){
        return -1;
    }
    int left = findHeight(root->left);
    int right = findHeight(root->right);
    int maxval = max(left,right);
    return maxval + 1;
}

int max(int a, int b){
    if (a > b)
        return a;
    else
        return b;
}

Now if we expand the calls you can see that at a leaf on the very bottom the left is -1 and the right is -1 and then maxval is -1 and it will return 0 (because there is nothing below the leaf). Then after that bottom call has returned 0 the maxval of the next level up will be 0 and the value returned will be 1 (because the leaf is below).

Let's look at one that only has left children a couple levels deep:

int findHeight(Node* root){
    if(root == NULL) return -1; // not taken

    int left = findHeight(Node* root->left){
        if(root == NULL) return -1; // not taken

        int left = findHeight(Node* root->left){
            if(root == NULL) return -1; // not taken

            int left = findHeight(Node* root->left){
                if(root == NULL) return -1; // return here, rest of code doesn't happen
            } // left = -1 at this point (no left children below here)

            int right = findHeight(Node* root->right){
                if(root == NULL) return -1; // return here, rest of code doesn't happen
            } // right = -1 at this point (no right children)

            int maxval = max(left,right); // left = -1, right  = -1
            return maxval + 1; // -1 + 1 = 0
        } // left = 0 at this point (1 left child below here)

        int right = findHeight(Node* root->right){
            if(root == NULL) return -1; // return here, rest of code doesn't happen
        } // right = -1 at this point (no right children)

        int maxval = max(left,right); // left = 0, right = -1;
        return maxval + 1; // 0 + 1 = 1
    } // left = 1 at this point (2 left child below here)

    int right = findHeight(Node* root->right){
        if(root == NULL) return -1; // return here, rest of code doesn't happen
    } // right = -1 at this point (no right children)

    int maxval = max(left,right); // left = 1, right = -1
    return maxval + 1; // 1 + 1 = 2
}