1
votes

I have a question about memory allocation, specifically the difference between char *buffer; buffer = malloc(5); and char buffer[5].

My test code is below:

#include <stdio.h>
void echo1();
void echo2();

int main() {
    echo1();
    printf("\n");
    echo2();
    printf("Back to Main \n");
    return 0;
}

void echo1() {
    int i;
    unsigned long addr;
    char *buffer;
    printf("[Echo1] Buffer      : %p \n", buffer);
    printf("[Echo1] &Buffer     : %p \n", &buffer);
    buffer = malloc(5);
    printf("[Echo1] &Buffer     : %p \n", &buffer);
    printf("[Echo1] Buffer      : %p \n", buffer);
    for (i = 0; i < 5; i++) {
        buffer[i] = 'A';
    }
    printf("[Echo1] Buffer[]    : %s \n", buffer); 
    addr = &buffer;
    printf("[Echo1] *Buffer     : %p \n", *buffer); 
    printf("[Echo1] addr        : %p \n", addr); 
    printf("[Echo1] &addr       : %p \n", &addr); 
}

void echo2() {
    int i;
    unsigned long addr;
    char buffer[5];
    printf("[Echo2] Buffer      : %p \n", buffer);
    printf("[Echo2] &Buffer     : %p \n", &buffer);
    for (i = 0; i < 5; i++) {
        buffer[i] = 'A';
    }
    printf("[Echo2] Buffer[]    : %s \n", buffer); 
    addr = &buffer;
    printf("[Echo2] *Buffer     : %p \n", *buffer); 
    printf("[Echo2] addr        : %p \n", addr); 
    printf("[Echo2] &addr       : %p \n", &addr); 
}

I compiled it on Ubuntu with gcc test.c -o test -m32 (I'm using a 64-bit machine). The following figure shows the result. [figure: program output]

I think I understand the echo1() case (except for why addr is above buffer), and I made a figure to show what happens in memory. [figure: memory layout for echo1()]

But I have no idea what happens in the echo2() case.

So I want to ask the following questions:

  1. In echo1(), I declared addr before buffer, so why is addr placed above buffer on the stack? (Since the stack moves upward, shouldn't the program first push addr and then push buffer above it?)
  2. In echo2(), why do buffer and &buffer give the same value? Doesn't this mean buffer is pointing to itself?
  3. In echo2(), addr's address and addr's content (the address of buffer) are separated by 12 bytes. What is stored in them?
  4. Can anyone make a figure to help me understand what happened in echo2()?

Thanks a lot.

3
When you printf with %s, the argument must be a null-terminated string. You never added a null byte to the buffer. - Barmar
I was trying to cause a buffer overflow... And I also don't know why the program correctly output AAAAA and nothing else... Shouldn't some other characters be output since I didn't add a null at the end? - Emma
Just because you didn't add it doesn't mean there isn't one there. You got lucky. - Barmar
Local variables aren't pushed one at a time on the stack. When a function is called, space is allocated on the stack for all the local variables. The locations of each variable within the stack frame are arbitrary. The compiler may rearrange them to minimize wasted space between variables, for instance. - Barmar

3 Answers

4
votes

Why the same?

The address of an array on the stack and the address of its first element are the same. The difference shows up when you increment the resulting pointers, because they point to types of different sizes.

What's in between?

The address of a pointer is a stack address while the address returned by malloc() and stored in the pointer is from the heap.

Note: Your code also exhibits many sources of undefined behavior, for example

  • Printing *Buffer with the "%p" specifier.
  • Printing uninitialized pointers.
  • Printing unsigned long addr with the "%p" specifier, which expects a void *.
  • Printing an array that is not null-terminated with the "%s" specifier.

So you can't expect any particular behavior from echo2(), or from the program as a whole.

2
votes

Can anyone make a figure to help me understand what happened in echo2()?

The best way to answer this whole question is to compile with gcc's -O0 flag to produce unoptimized code, then either step through it in a debugger or add -S and read the assembly that gcc generates for it.

  1. In echo1(), I declared addr before buffer, so why is addr placed above buffer on the stack? (Since the stack moves upward, shouldn't the program first push addr and then push buffer above it?)

But you didn't assign a value to addr until after you assigned one to buffer, so the compiler doesn't have to allocate memory for it until later.

See iharob's good answer for some other points.

1
votes

In echo1(), I declared addr before buffer, so why is addr placed above buffer on the stack?

Because the compiler doesn't have to put automatic variables onto the stack in the order in which they're declared, and, apparently, chose not to. (In fact, it doesn't have to put them onto the stack at all - it could put them in registers if they fit in a register.)

In echo2(), why do buffer and &buffer give the same value? Doesn't this mean buffer is pointing to itself?

It means that buffer is an array. In most contexts, the expression buffer evaluates to a pointer to the first element of buffer, i.e. to &buffer[0], while &buffer evaluates to a pointer to the whole array. The address of the array is the same as the address of its first element.

buffer and &buffer have different data types, however ("pointer to char" vs. "pointer to array of 5 chars").

In echo2(), addr's address and addr's content (the address of buffer) are separated by 12 bytes. What is stored in them?

Whatever the compiler decided to put there in the stack frame. The layout of a stack frame is not under the direct control of a C (or C++) programmer. Perhaps the compiler decided to put the address following the array on a 4-byte boundary, rather than putting the first element on a 4-byte boundary. Perhaps, as you weren't compiling with optimization turned on, it didn't put i into a register, but put it on the stack.

C is not a programming language that specifies a strict mapping between automatic variables and locations on the stack; the compiler is free to put variables wherever it chooses (including, as I said above, into registers if that's possible).