151
votes

Consider:

char amessage[] = "now is the time";
char *pmessage = "now is the time";

I read from The C Programming Language, 2nd Edition that the above two statements don't do the same thing.

I always thought that an array is an convenient way to manipulate pointers to store some data, but this is clearly not the case... What are the "non-trivial" differences between arrays and pointers in C?

14
I may be misremembering this, but I'd like to point out that that you can use the [] notation on pointers and the * notation on arrays. The only big difference from the code's point of view is that amessage's value cannot change, so amessage++ should fail (but I believe *(amessage+1) will succeed. There are other differences internally I believe, but they almost never actually matter.Bill K
Oh, and generally (not in the cases you mentioned), arrays automatically allocate memory, pointers you have to allocate your own memory. Yours should both just point to blocks of memory that were allocated as part of the program loading.Bill K
Along with the K&R (which is a great book, by the way) I'd suggest you read pw2.netcom.com/~tjensen/ptr/cpoint.htm - in interim.amaterasu
Closing this as duplicate since we had two "canonical" FAQ threads about this very same question.Lundin

14 Answers

113
votes

True, but it's a subtle difference. Essentially, the former:

char amessage[] = "now is the time";

Defines an array whose members live in the current scope's stack space, whereas:

char *pmessage = "now is the time";

Defines a pointer that lives in the current scope's stack space, but that references memory elsewhere (in this one, "now is the time" is stored elsewhere in memory, commonly a string table).

Also, note that because the data belonging to the second definition (the explicit pointer) is not stored in the current scope's stack space, it is unspecified exactly where it will be stored and should not be modified.

Edit: As pointed out by Mark, GMan, and Pavel, there is also a difference when the address-of operator is used on either of these variables. For instance, &pmessage returns a pointer of type char**, or a pointer to a pointer to chars, whereas &amessage returns a pointer of type char(*)[16], or a pointer to an array of 16 chars (which, like a char**, needs to be dereferenced twice as litb points out).

160
votes

Here's a hypothetical memory map, showing the results of the two declarations:

                0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07
    0x00008000:  'n'   'o'   'w'   ' '   'i'   's'   ' '   't'
    0x00008008:  'h'   'e'   ' '   't'   'i'   'm'   'e'  '\0'
        ...
amessage:
    0x00500000:  'n'   'o'   'w'   ' '   'i'   's'   ' '   't'
    0x00500008:  'h'   'e'   ' '   't'   'i'   'm'   'e'  '\0'
pmessage:
    0x00500010:  0x00  0x00  0x80  0x00

The string literal "now is the time" is stored as a 16-element array of char at memory address 0x00008000. This memory may not be writable; it's best to assume that it's not. You should never attempt to modify the contents of a string literal.

The declaration

char amessage[] = "now is the time";

allocates a 16-element array of char at memory address 0x00500000 and copies the contents of the string literal to it. This memory is writable; you can change the contents of amessage to your heart's content:

strcpy(amessage, "the time is now");

The declaration

char *pmessage = "now is the time";

allocates a single pointer to char at memory address 0x00500010 and copies the address of the string literal to it.

Since pmessage points to the string literal, it should not be used as an argument to functions that need to modify the string contents:

strcpy(amessage, pmessage); /* OKAY */
strcpy(pmessage, amessage); /* NOT OKAY */
strtok(amessage, " ");      /* OKAY */
strtok(pmessage, " ");      /* NOT OKAY */
scanf("%15s", amessage);      /* OKAY */
scanf("%15s", pmessage);      /* NOT OKAY */

and so on. If you changed pmessage to point to amessage:

pmessage = amessage;

then it can be used everywhere amessage can be used.

14
votes

An array contains the elements. A pointer points to them.

The first is a short form of saying

char amessage[16];
amessage[0] = 'n';
amessage[1] = 'o';
...
amessage[15] = '\0';

That is, it is an array that contains all the characters. The special initialization initializes it for you, and determines it size automatically. The array elements are modifiable - you may overwrite characters in it.

The second form is a pointer, that just points to the characters. It stores the characters not directly. Since the array is a string literal, you cannot take the pointer and write to where it points

char *pmessage = "now is the time";
*pmessage = 'p'; /* undefined behavior! */

This code would probably crash on your box. But it may do anything it likes, because its behavior is undefined.

7
votes

I can't add usefully to the other answers, but I will remark that in Deep C Secrets, Peter van der Linden covers this example in detail. If you are asking these kinds of questions I think you will love this book.


P.S. You can assign a new value to pmessage. You can't assign a new value to amessage; it is immutable.

7
votes

If an array is defined so that its size is available at declaration time, sizeof(p)/sizeof(type-of-array) will return the number of elements in the array.

4
votes

Along with the memory for the string "now is the time" being allocated in two different places, you should also keep in mind that the array name acts as a pointer value as opposed to a pointer variable which pmessage is. The main difference being that the pointer variable can be modified to point somewhere else and the array cannot.

char arr[] = "now is the time";
char *pchar = "later is the time";

char arr2[] = "Another String";

pchar = arr2; //Ok, pchar now points at "Another String"

arr = arr2; //Compiler Error! The array name can be used as a pointer VALUE
            //not a pointer VARIABLE
4
votes

A pointer is just a variable that holds a memory address. Notice that you are playinf with "string literals" which is another issue. Differences explained inline: Basically:

#include <stdio.h>

int main ()
{

char amessage[] = "now is the time"; /* Attention you have created a "string literal" */

char *pmessage = "now is the time";  /* You are REUSING the string literal */


/* About arrays and pointers */

pmessage = NULL; /* All right */
amessage = NULL; /* Compilation ERROR!! */

printf ("%d\n", sizeof (amessage)); /* Size of the string literal*/
printf ("%d\n", sizeof (pmessage)); /* Size of pmessage is platform dependent - size of memory bus (1,2,4,8 bytes)*/

printf ("%p, %p\n", pmessage, &pmessage);  /* These values are different !! */
printf ("%p, %p\n", amessage, &amessage);  /* These values are THE SAME!!. There is no sense in retrieving "&amessage" */


/* About string literals */

if (pmessage == amessage)
{
   printf ("A string literal is defined only once. You are sharing space");

   /* Demostration */
   "now is the time"[0] = 'W';
   printf ("You have modified both!! %s == %s \n", amessage, pmessage);
}


/* Hope it was useful*/
return 0;
}
4
votes

The first form (amessage) defines a variable (an array) which contains a copy of the string "now is the time".

The second form (pmessage) defines a variable (a pointer) that lives in a different location than any copy of the string "now is the time".

Try this program out:

#include <inttypes.h>
#include <stdio.h>

int main (int argc, char *argv [])
{
     char  amessage [] = "now is the time";
     char *pmessage    = "now is the time";

     printf("&amessage   : %#016"PRIxPTR"\n", (uintptr_t)&amessage);
     printf("&amessage[0]: %#016"PRIxPTR"\n", (uintptr_t)&amessage[0]);
     printf("&pmessage   : %#016"PRIxPTR"\n", (uintptr_t)&pmessage);
     printf("&pmessage[0]: %#016"PRIxPTR"\n", (uintptr_t)&pmessage[0]);

     printf("&\"now is the time\": %#016"PRIxPTR"\n",
            (uintptr_t)&"now is the time");

     return 0;
}

You'll see that while &amessage is equal to &amessage[0], this is not true for &pmessage and &pmessage[0]. In fact, you'll see that the string stored in amessage lives on the stack, while the string pointed at by pmessage lives elsewhere.

The last printf shows the address of the string literal. If your compiler does "string pooling" then there will be only one copy of the string "now is the time" -- and you'll see that its address is not the same as the address of amessage. This is because amessage gets a copy of the string when it is initialized.

In the end, the point is that amessage stores the string in its own memory (on the stack, in this example), while pmessage points to the string which is stored elsewhere.

4
votes

differences between char pointer and array

C99 N1256 draft

There are two different uses of character string literals:

  1. Initialize char[]:

    char c[] = "abc";      
    

    This is "more magic", and described at 6.7.8/14 "Initialization":

    An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

    So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};
    

    Like any other regular array, c can be modified.

  2. Everywhere else: it generates an:

    So when you write:

    char *c = "abc";
    

    This is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;
    

    Note the implicit cast from char[] to char *, which is always legal.

    Then if you modify c[0], you also modify __unnamed, which is UB.

    This is documented at 6.4.5 "String literals":

    5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...]

    6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

6.7.8/32 "Initialization" gives a direct example:

EXAMPLE 8: The declaration

char s[] = "abc", t[3] = "abc";

defines "plain" char array objects s and t whose elements are initialized with character string literals.

This declaration is identical to

char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration

char *p = "abc";

defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.

GCC 4.8 x86-64 ELF implementation

Program:

#include <stdio.h>

int main(void) {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

Compile and decompile:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

Output contains:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

Conclusion: GCC stores char* it in .rodata section, not in .text.

If we do the same for char[]:

 char s[] = "abc";

we obtain:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

so it gets stored in the stack (relative to %rbp).

Note however that the default linker script puts .rodata and .text in the same segment, which has execute but no write permission. This can be observed with:

readelf -l a.out

which contains:

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata
3
votes

The second one allocates the string in some read-only section of the ELF. Try the following:

#include <stdio.h>

int main(char argc, char** argv) {
    char amessage[] = "now is the time";
    char *pmessage = "now is the time";

    amessage[3] = 'S';
    printf("%s\n",amessage);

    pmessage[3] = 'S';
    printf("%s\n",pmessage);
}

and you will get a segfault on the second assignment (pmessage[3]='S').

1
votes

The above answers must have answered your question. But I would like to suggest you to read the paragraph "Embryonic C" in The Development of C Language authored by Sir Dennis Ritchie.

0
votes

For this line: char amessage[] = "now is the time";

the compiler will evaluate uses of amessage as a pointer to the start of the array holding the characters "now is the time". The compiler allocates memory for "now is the time" and initializes it with the string "now is the time". You know where that message is stored because amessage always refers to the start of that message. amessage may not be given a new value- it is not a variable, it is the name of the string "now is the time".

This line: char *pmessage = "now is the time";

declares a variable, pmessage which is initialized (given an initial value) of the starting address of the string "now is the time". Unlike amessage, pmessage can be given a new value. In this case, as in the previous case, the compiler also stores "now is the time" elsewhere in memory. For example, this will cause pmessage to point to the 'i' which begins "is the time". pmessage = pmessage + 4;

-1
votes

Here is my summary of key differences between arrays and pointers, which I made for myself:

//ATTENTION:
    //Pointer depth 1
     int    marr[]  =  {1,13,25,37,45,56};      // array is implemented as a Pointer TO THE FIRST ARRAY ELEMENT
     int*   pmarr   =  marr;                    // don't use & for assignment, because same pointer depth. Assigning Pointer = Pointer makes them equal. So pmarr points to the first ArrayElement.

     int*   point   = (marr + 1);               // ATTENTION: moves the array-pointer in memory, but by sizeof(TYPE) and not by 1 byte. The steps are equal to the type of the array-elements (here sizeof(int))

    //Pointer depth 2
     int**  ppmarr  = &pmarr;                   // use & because going one level deeper. So use the address of the pointer.

//TYPES
    //array and pointer are different, which can be seen by checking their types
    std::cout << "type of  marr is: "       << typeid(marr).name()          << std::endl;   // int*         so marr  gives a pointer to the first array element
    std::cout << "type of &marr is: "       << typeid(&marr).name()         << std::endl;   // int (*)[6]   so &marr gives a pointer to the whole array

    std::cout << "type of  pmarr is: "      << typeid(pmarr).name()         << std::endl;   // int*         so pmarr  gives a pointer to the first array element
    std::cout << "type of &pmarr is: "      << typeid(&pmarr).name()        << std::endl;   // int**        so &pmarr gives a pointer to to pointer to the first array elelemt. Because & gets us one level deeper.
-2
votes

An array is a const pointer. You cannot update its value and make it point anywhere else. While for a pointer you can do.