What exactly is a C pointer if not a memory address?

213

votes

In a reputable source about C, the following information is given after discussing the & operator:

... It's a bit unfortunate that the terminology [address of] remains, because it confuses those who don't know what addresses are about, and misleads those who do: thinking about pointers as if they were addresses usually leads to grief...

Other materials I have read (from equally reputable sources, I would say) have always unabashedly referred to pointers and the & operator as giving memory addresses. I would love to keep searching for the actuality of the matter, but it is kind of difficult when reputable sources KIND OF disagree.

Now I am slightly confused--what exactly is a pointer, then, if not a memory address?

P.S.

The author later says: ...I will continue to use the term 'address of' though, because to invent a different one [term] would be even worse.

c pointers memory-address

A pointer is a variable that holds an address. It also has its own address. This is the fundamental difference between a pointer and an array. An array effectively is an address (and by implication, its address is itself). – WhozCraig

What's your "reputable source" for the quote? – Cornstalks

The ultimate reputable source is the language standard and not books semi-derived from it and semi-pulled-from-the-author's-butt. I learned it the hard way, making almost every mistake I could and slowly building a mental model of C somewhat close to the one described by the standard, and then finally replacing said model with the standard's model. – Alexey Frunze

@thang People think pointer=integer because it is often so (x86 Linux and Windows "teach" us that), because people love generalizing, because people don't know the language standard well and because they've had little experience with radically different platforms. Those same people are likely to assume that a pointer to data and a pointer to a function can be converted to one another and data can be executed as code and code be accessed as data. This may be true on von Neumann architectures (with one address space), but it is not necessarily true on Harvard architectures (with separate code and data spaces). – Alexey Frunze

@exebook Standards are not for newbies (especially, complete ones). They aren't supposed to provide gentle introductions and multitudes of examples. They formally define something, so it can be correctly implemented by professionals. – Alexey Frunze

25 Answers

153

votes

The C standard does not define what a pointer is internally and how it works internally. This is intentional so as not to limit the number of platforms, where C can be implemented as a compiled or interpreted language.

A pointer value can be some kind of ID or handle or a combination of several IDs (say hello to x86 segments and offsets) and not necessarily a real memory address. This ID could be anything, even a fixed-size text string. Non-address representations may be especially useful for a C interpreter.

votes

I'm not sure about your source, but the type of language you're describing comes from the C standard:

6.5.3.2 Address and indirection operators
[...]
3. The unary & operator yields the address of its operand. [...]

So... yeah, pointers point to memory addresses. At least, that is what the wording of the C standard suggests.

To say it a bit more clearly, a pointer is a variable holding the value of some address. The address of an object (which may be stored in a pointer) is returned with the unary & operator.

I can store the address "42 Wallaby Way, Sydney" in a variable (and that variable would be a "pointer" of sorts, but since that's not a memory address it's not something we'd properly call a "pointer"). Your computer has addresses for its buckets of memory. Pointers store the value of an address (i.e. a pointer stores the value "42 Wallaby Way, Sydney", which is an address).

Edit: I want to expand on Alexey Frunze's comment.

What exactly is a pointer? Let's look at the C standard:

6.2.5 Types
[...]
20. [...]
A pointer type may be derived from a function type or an object type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes called ‘‘pointer to T’’. The construction of a pointer type from a referenced type is called ‘‘pointer type derivation’’. A pointer type is a complete object type.

Essentially, pointers store a value that provides a reference to some object or function. Kind of. Pointers are intended to store a value that provides a reference to some object or function, but that's not always the case:

6.3.2.3 Pointers
[...]
5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

The above quote says that we can turn an integer into a pointer. If we do that (that is, if we stuff an integer value into a pointer instead of a specific reference to an object or function), then the pointer "might not point to an entity of the referenced type" (i.e. it may not provide a reference to an object or function). It might provide us with something else. And this is one place where you might stick some kind of handle or ID in a pointer (i.e. the pointer isn't pointing to an object; it's storing a value that represents something, but that value may not be an address).

So yes, as Alexey Frunze says, it's possible a pointer isn't storing an address to an object or function. It's possible a pointer is instead storing some kind of "handle" or ID, and you can do this by assigning some arbitrary integer value to a pointer. What this handle or ID represents depends on the system/environment/context. So long as your system/implementation can make sense of the value, you're in good shape (but that depends on the specific value and the specific system/implementation).

Normally, a pointer stores an address to an object or function. If it isn't storing an actual address (to an object or function), the result is implementation defined (meaning that exactly what happens and what the pointer now represents depends on your system and implementation, so it might be a handle or ID on a particular system, but using the same code/value on another system might crash your program).

That ended up being longer than I thought it would be...

votes

Pointer vs Variable

In this picture,

pointer_p is a pointer which is located at 0x12345, and is pointing to a variable variable_v at 0x34567.

votes

To think of a pointer as an address is an approximation. Like all approximations, it's good enough to be useful sometimes, but it's also not exact which means that relying on it causes trouble.

A pointer is like an address in that it indicates where to find an object. One immediate limitation of this analogy is that not all pointers actually contain an address. NULL is a pointer which is not an address. The content of a pointer variable can in fact be of one of three kinds:

the address of an object, which can be dereferenced (if p contains the address of x then the expression *p has the same value as x);
a null pointer, of which NULL is an example;
invalid content, which doesn't point to an object (if p doesn't hold a valid value, then *p could do anything (“undefined behavior”), with crashing the program a fairly common possibility).

Furthermore, it would be more accurate to say that a pointer (if valid and non-null) contains an address: a pointer indicates where to find an object, but there is more information tied to it.

In particular, a pointer has a type. On most platforms, the type of the pointer has no influence at runtime, but it has an influence that goes beyond the type at compile time. If p is a pointer to int (int *p;), then p + 1 points to an integer which is sizeof(int) bytes after p (assuming p + 1 is still a valid pointer). If q is a pointer to char that points to the same address as p (char *q = (char *)p;), then q + 1 is not the same address as p + 1. If you think of pointers as addresses, it is not very intuitive that the “next address” is different for different pointers to the same location.

It is possible in some environments to have multiple pointer values with different representations (different bit patterns in memory) that point to the same location in memory. You can think of these as different pointers holding the same address, or as different addresses for the same location — the metaphor isn't clear in this case. The == operator always tells you whether the two operands are pointing to the same location, so on these environments you can have p == q even though p and q have different bit patterns.

There are even environments where pointers carry other information beyond the address, such as type or permission information. You can easily go through your life as a programmer without encountering these.

There are environments where different kinds of pointers have different representations. You can think of it as different kinds of addresses having different representations. For example, some architectures have byte pointers and word pointers, or object pointers and function pointers.

All in all, thinking of pointers as addresses isn't too bad as long as you keep in mind that

it's only valid, non-null pointers that are addresses;
you can have multiple addresses for the same location;
you can't do arithmetic on addresses, and there's no order on them;
the pointer also carries type information.

Going the other way round is far more troublesome. Not everything that looks like an address can be a pointer. Somewhere deep down any pointer is represented as a bit pattern that can be read as an integer, and you can say that this integer is an address. But going the other way, not every integer is a pointer.

There are first some well-known limitations; for example, an integer that designates a location outside your program's address space can't be a valid pointer. A misaligned address doesn't make a valid pointer for a data type that requires alignment; for example, on a platform where int requires 4-byte alignment, 0x7654321 cannot be a valid int* value.

However, it goes well beyond that, because when you make a pointer into an integer, you're in for a world of trouble. A big part of this trouble is that optimizing compilers are far better at microoptimization than most programmers expect, so that their mental model of how a program works is deeply wrong. Just because you have pointers with the same address doesn't mean that they are equivalent. For example, consider the following snippet:

unsigned int x = 0;
unsigned short *p = (unsigned short*)&x;
p[0] = 1;
printf("%u = %u?\n", x, *p);

You might expect that on a run-of-the-mill machine where sizeof(int)==4 and sizeof(short)==2, this either prints 1 = 1? (little-endian) or 65536 = 1? (big-endian). But on my 64-bit Linux PC with GCC 4.4:

$ c99 -O2 -Wall a.c && ./a.out 
a.c: In function ‘main’:
a.c:6: warning: dereferencing pointer ‘p’ does break strict-aliasing rules
a.c:5: note: initialized from here
0 = 1?

GCC is kind enough to warn us what's going wrong in this simple example — in more complex examples, the compiler might not notice. Since p has a different type from &x, changing what p points to cannot affect what &x points to (outside some well-defined exceptions). Therefore the compiler is at liberty to keep the value of x in a register and not update this register as *p changes. The program dereferences two pointers to the same address and obtains two different values!

The moral of this example is that thinking of a (non-null valid) pointer as an address is fine, as long as you stay within the precise rules of the C language. The flip side of the coin is that the rules of the C language are intricate, and difficult to get an intuitive feeling for unless you know what happens under the hood. And what happens under the hood is that the tie between pointers and addresses is somewhat loose, both to support “exotic” processor architectures and to support optimizing compilers.

So think of pointers being addresses as a first step in your understanding, but don't follow that intuition too far.

votes

A pointer is a variable that HOLDS a memory address; it is not the address itself. However, you can dereference a pointer and get access to the memory location.

For example:

int q = 10; /*say q is at address 0x10203040*/
int *p = &q; /*means let p contain the address of q, which is 0x10203040*/
*p = 20; /*set whatever is at the address pointed by "p" as 20*/

That's it. It's that simple.


A program to demonstrate what I am saying and its output is here:

http://ideone.com/rcSUsb

The program:

#include <stdio.h>

int main(int argc, char *argv[])
{
  /* POINTER AS AN ADDRESS */
  int q = 10;
  int *p = &q;

  printf("address of q is %p\n", (void *)&q);
  printf("p contains %p\n", (void *)p);

  p = NULL;
  printf("NULL p now contains %p\n", (void *)p);
  return 0;
}

votes

It's difficult to tell exactly what the authors of those books mean. Whether a pointer contains an address or not depends on how you define an address and how you define a pointer.

Judging from all the answers that are written, some people assume that (1) an address must be an integer and (2) a pointer doesn't need to be one, by virtue of the specification not saying so. With these assumptions, then clearly pointers do not necessarily contain addresses.

However, we see that while (2) is probably true, (1) probably doesn't have to be true. And what to make of the fact that the & is called the address of operator as per @CornStalks's answer? Does this mean that the authors of the specification intend for a pointer to contain an address?

So can we say, pointer contains an address, but an address doesn't have to be an integer? Maybe.

I think all of this is gibberish pedantic semantic talk. It is totally worthless practically speaking. Can you think of a compiler that generates code in such a way that the value of a pointer is not an address? If so, what? That's what I thought...

I think what the author of the book (the first excerpt, which claims that pointers are not necessarily just addresses) is probably referring to is the fact that a pointer carries inherent type information.

For example,

 int x;
 int* y = &x;
 char* z = (char*)&x;

Both y and z are pointers, but y+1 and z+1 are different. If they were memory addresses, wouldn't those expressions give you the same value?

And herein lies the reason why "thinking about pointers as if they were addresses usually leads to grief". Bugs have been written because people think about pointers as if they were plain addresses.

55555 is probably not a pointer, although it may be an address, but (int*)55555 is a pointer. 55555+1 = 55556, but (int*)55555+1 is 55555 + sizeof(int), i.e. 55559 where sizeof(int) is 4.

votes

Well, a pointer is an abstraction representing a memory location. Note that the quote doesn't say that thinking about pointers as if they were memory addresses is wrong, it just says that it "usually leads to grief". In other words, it leads you to have incorrect expectations.

The most likely source of grief is certainly pointer arithmetic, which is actually one of C's strengths. If a pointer were an address, you'd expect pointer arithmetic to be address arithmetic; but it's not. For example, adding 10 to an address should give you an address that is larger by 10 addressing units; but adding 10 to a pointer increments it by 10 times the size of the kind of object it points to (its sizeof, which already includes any padding needed for alignment). With an int * on an ordinary architecture with 32-bit integers, adding 10 to it would increment it by 40 addressing units (bytes). Experienced C programmers are aware of this and put it to all kinds of good uses, but your author is evidently no fan of sloppy metaphors.

There's the additional question of how the contents of the pointer represent the memory location: As many of the answers have explained, an address is not always an int (or long). In some architectures an address is a "segment" plus an offset. A pointer might even contain just the offset into the current segment ("near" pointer), which by itself is not a unique memory address. And the pointer contents might have only an indirect relationship to a memory address as the hardware understands it. But the author of the quote cited doesn't even mention representation, so I think it was conceptual equivalence, rather than representation, that they had in mind.

votes

Here's how I've explained it to some confused people in the past: A pointer has two attributes that affect its behavior. It has a value, which is (in typical environments) a memory address, and a type, which tells you the type and size of the object that it points at.

For example, given:

union {
    int i;
    char c;
} u;

You can have three different pointers all pointing to this same object:

void *v = &u;
int *i = &u.i;
char *c = &u.c;

If you compare the values of these pointers, they're all equal:

v==i && i==c

However, if you increment each pointer, you'll see that the type that they point to becomes relevant.

i++;
c++;
// You can't perform arithmetic on a void pointer, so no v++
i != c

The variables i and c will have different values at this point, because i++ causes i to contain the address of the next-accessible integer, and c++ causes c to point to the next-addressable character. Typically, integers take up more memory than characters, so i will end up with a larger value than c after they are both incremented.

votes

You are right and sane. Normally, a pointer is just an address, so you can cast it to an integer and do any arithmetic.

But sometimes pointers are only a part of an address. On some architectures a pointer is converted to an address by adding a base value, or by combining it with another CPU register.

But these days, on PC and ARM architecture with a flat memory model and C language natively compiled, it's OK to think that a pointer is an integer address to some place in one-dimensional addressable RAM.

votes

Mark Bessey already said it, but this needs to be re-emphasised until understood.

A pointer has as much to do with a variable as the literal 3 does.

A pointer is a tuple of a value (of an address) and a type (with additional properties, such as read only). The type (and the additional parameters if any) can further define or restrict the context; e.g. __far ptr, __near ptr: what is the context of the address: stack, heap, linear address, offset from somewhere, physical memory or what.

It's the property of type that makes pointer arithmetic a bit different to integer arithmetic.

The counterexamples of a pointer not being a variable are too many to ignore:

fopen returning a FILE pointer. (where's the variable)
stack pointer or frame pointer being typically unaddressable registers

*(int *)0x1231330 = 13; -- casting an arbitrary integer value to a pointer_of_integer type and writing/reading an integer without ever introducing a variable

In the lifetime of a C-program there will be many other instances of temporary pointers that do not have addresses -- and therefore they are not variables, but expressions/values with a compile time associated type.

votes

A pointer, like any other variable in C, is fundamentally a collection of bits which may be represented by one or more concatenated unsigned char values (as with any other type of variable, sizeof(some_variable) will indicate the number of unsigned char values). What makes a pointer different from other variables is that a C compiler will interpret the bits in a pointer as identifying, somehow, a place where a variable may be stored. In C, unlike some other languages, it is possible to request space for multiple variables, and then convert a pointer to any value in that set into a pointer to any other variable within that set.

Many compilers implement pointers by using their bits to store actual machine addresses, but that is not the only possible implementation. An implementation could keep one array--not accessible to user code--listing the hardware address and allocated size of all of the memory objects (sets of variables) which a program was using, and have each pointer contain an index into that array along with an offset from that index. Such a design would allow a system to not only restrict code to only operating upon memory that it owned, but also ensure that a pointer to one memory item could not be accidentally converted into a pointer to another memory item (in a system that uses hardware addresses, if foo and bar are arrays of 10 items that are stored consecutively in memory, a pointer to the "eleventh" item of foo might instead point to the first item of bar, but in a system where each "pointer" is an object ID and an offset, the system could trap if code tried to index a pointer to foo beyond its allocated range). It would also be possible for such a system to eliminate memory-fragmentation problems, since the physical addresses associated with any pointers could be moved around.

Note that while pointers are somewhat abstract, they're not quite abstract enough to allow a fully-standards-compliant C compiler to implement a garbage collector. The C standard specifies that every variable, including pointers, is represented as a sequence of unsigned char values. Given any variable, one can decompose it into a sequence of numbers and later convert that sequence of numbers back into a variable of the original type. Consequently, it would be possible for a program to calloc some storage (receiving a pointer to it), store something there, decompose the pointer into a series of bytes, display those on the screen, and then erase all reference to them. If the program then accepted some numbers from the keyboard, reconstituted those to a pointer, and then tried to read data from that pointer, and if the user entered the same numbers that the program had earlier displayed, the program would be required to output the data that had been stored in the calloc'ed memory. Since there is no conceivable way the computer could know whether the user had made a copy of the numbers that were displayed, there would be no conceivable way the computer could know whether the aforementioned memory might ever be accessed in future.

votes

A pointer is a variable type that is natively available in C/C++ and contains a memory address. Like any other variable it has an address of its own and takes up memory (the amount is platform specific).

One problem you will see as a result of the confusion is trying to change the referent within a function by simply passing the pointer by value. This will make a copy of the pointer at function scope and any changes to where this new pointer "points" will not change the referent of the pointer at the scope that invoked the function. In order to modify the actual pointer within a function one would normally pass a pointer to a pointer.

votes

BRIEF SUMMARY (which I will also put at the top):

(0) Thinking of pointers as addresses is often a good learning tool, and is often the actual implementation for pointers to ordinary data types.

(1) But on many, perhaps most, compilers pointers to functions are not addresses, but are bigger than an address (typically 2x, sometimes more), or are actually pointers to a struct in memory that contains the addresses of the function and stuff like a constant pool.

(2) Pointers to data members and pointers to methods are often even stranger.

(3) Legacy x86 code with FAR and NEAR pointer issues

(4) Several examples, most notably the IBM AS/400, with secure "fat pointers".

I am sure you can find more.

DETAIL:

UMMPPHHH!!!!! Many of the answers so far are fairly typical "programmer weenie" answers - but not compiler weenie or hardware weenie. Since I pretend to be a hardware weenie, and often work with compiler weenies, let me throw in my two cents:

On many, probably most, C compilers, a pointer to data of type T is, in fact, the address of T.

Fine.

But, even on many of these compilers, certain pointers are NOT addresses. You can tell this by looking at sizeof(ThePointer).

For example, pointers to functions are sometimes quite a lot bigger than ordinary addresses. Or, they may involve a level of indirection. This article provides one description, involving the Intel Itanium processor, but I have seen others. Typically, to call a function you must know not only the address of the function code, but also the address of the function's constant pool - a region of memory from which constants are loaded with a single load instruction, rather than the compiler having to generate a 64 bit constant out of several Load Immediate and Shift and OR instructions. So, rather than a single 64 bit address, you need 2 64 bit addresses. Some ABIs (Application Binary Interfaces) move this around as 128 bits, whereas others use a level of indirection, with the function pointer actually being the address of a function descriptor that contains the 2 actual addresses just mentioned. Which is better? Depends on your point of view: performance, code size, and some compatibility issues - often code assumes that a pointer can be cast to a long or a long long, but may also assume that the long long is exactly 64 bits. Such code may not be standards compliant, but nevertheless customers may want it to work.

Many of us have painful memories of the old Intel x86 segmented architecture, with NEAR POINTERs and FAR POINTERS. Thankfully these are nearly extinct by now, so only a quick summary: in 16 bit real mode, the actual linear address was

LinearAddress = (SegmentRegister[SegNum].base << 4) + Offset

Whereas in protected mode, it might be

LinearAddress = SegmentRegister[SegNum].base + offset

with the resulting address being checked against a limit set in the segment. Some programs used not really standard C/C++ FAR and NEAR pointer declarations, but many just said *T --- but there were compiler and linker switches so, for example, code pointers might be near pointers, just a 32 bit offset against whatever is in the CS (Code Segment) register, while the data pointers might be FAR pointers, specifying both a 16 bit segment number and a 32 bit offset for a 48 bit value. Now, both of these quantities are certainly related to the address, but since they aren't the same size, which of them is the address? Moreover, the segments also carried permissions - read-only, read-write, executable - in addition to stuff related to the actual address.

A more interesting example, IMHO, is (or, perhaps, was) the IBM AS/400 family. This computer was one of the first to implement an OS in C++. Pointers on this machine were typically 2X the actual address size - e.g. as this presentation says, 128 bit pointers, but the actual addresses were 48-64 bits, and, again, some extra info, what is called a capability, that provided permissions such as read, write, as well as a limit to prevent buffer overflow. Yes: you can do this compatibly with C/C++ -- and if this were ubiquitous, the Chinese PLA and slavic mafia would not be hacking into so many Western computer systems. But historically most C/C++ programming has neglected security for performance. Most interestingly, the AS400 family allowed the operating system to create secure pointers, that could be given to unprivileged code, but which the unprivileged code could not forge or tamper with. Again, security, and while standards compliant, much sloppy non-standards compliant C/C++ code will not work in such a secure system. Again, there are official standards, and there are de-facto standards.

Now, I'll get off my security soapbox, and mention some other ways in which pointers (of various types) are often not really addresses: Pointers to data members, pointers to member functions (methods), and the static versions thereof are bigger than an ordinary address. As this post says:

There are many ways of solving this [problems related to single versus multiple inheritance, and virtual inheritance]. Here's how the Visual Studio compiler decides to handle it: "A pointer to a member function of a multiply-inherited class is really a structure." And they go on to say "Casting a function pointer can change its size!".

As you can probably guess from my pontificating on (in)security, I've been involved in C/C++ hardware/software projects where a pointer was treated more like a capability than a raw address.

I could go on, but I hope you get the idea.


votes

A pointer is just another variable which is used to hold the address of a memory location (usually the memory address of another variable).

votes

You can see it this way. A pointer is a value that represents an address in the addressable memory space.

votes

A pointer is just another variable that can contain a memory address, usually that of another variable. Being a variable, a pointer also has a memory address of its own.

votes

A C pointer is very similar to a memory address but with machine-dependent details abstracted away, as well as some features not found in the lower level instruction set.

For example, a C pointer is relatively richly typed. If you increment a pointer through an array of structures, it nicely jumps from one structure to the other.

Pointers are subject to conversion rules and provide compile time type checking.

There is a special "null pointer" value which is portable at the source code level, but whose representation may differ. If you assign an integer constant whose value is zero to a pointer, that pointer takes on the null pointer value. Ditto if you initialize a pointer that way.

A pointer can be used as a boolean variable: it tests true if it is other than null, and false if it is null.

In a machine language, if the null pointer is a funny address like 0xFFFFFFFF, then you may have to have explicit tests for that value. C hides that from you. Even if the null pointer is 0xFFFFFFFF, you can test it using if (ptr != 0) { /* not null! */}.

Uses of pointers which subvert the type system lead to undefined behavior, whereas similar code in machine language might be well defined. Assemblers will assemble the instructions you have written, but C compilers will optimize based on the assumption that you haven't done anything wrong. If a float *p pointer points to a long n variable, and *p = 0.0 is executed, the compiler is not required to handle this. A subsequent use of n will not necessarily read the bit pattern of the float value; instead it may be an optimized access based on the "strict aliasing" assumption that n has not been touched! That is, the assumption that the program is well-behaved, and so p should not be pointing at n.

In C, pointers to code and pointers to data are different, but on many architectures, the addresses are the same. C compilers can be developed which have "fat" pointers, even though the target architecture does not. Fat pointers means that pointers are not just machine addresses, but contain other info, such as information about the size of the object being pointed at, for bounds checking. Portably written programs will easily port to such compilers.

So you can see, there are many semantic differences between machine addresses and C pointers.

votes

Before understanding pointers we need to understand objects. Objects are entities which exist and have a location specifier called an address. A pointer is just a variable like any other variable in C, with a type called pointer, whose content is interpreted as the address of an object and which supports the following operations.

+ : A variable of type integer (usually called offset) can be added to yield a new pointer
- : A variable of type integer (usually called offset) can be subtracted to yield a new pointer
- : A variable of type pointer can be subtracted to yield an integer (usually called offset)
* : De-referencing. Retrieve the value of the variable (called address) and map to the object the address refers to.
++: It's just `+= 1`
--: It's just `-= 1`
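The operations in the list can be sketched in a few lines. The helper names index_of and last_value are made up for this example:

```c
#include <stddef.h>

/* Index of the first element equal to target, or -1 if it is absent. */
ptrdiff_t index_of(const int *base, size_t count, int target)
{
    const int *end = base + count;              /* pointer + integer -> pointer */
    for (const int *p = base; p != end; ++p) {  /* ++ is just += 1              */
        if (*p == target) {                     /* * dereferences               */
            return p - base;                    /* pointer - pointer -> offset  */
        }
    }
    return -1;
}

/* Value of the last element; the caller must pass count > 0. */
int last_value(const int *base, size_t count)
{
    const int *end = base + count;
    return *(end - 1);                          /* pointer - integer -> pointer */
}
```

Note that every offset here counts elements, not bytes: the pointer type supplies the element size.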

A pointer is classified based on the type of object it currently refers to. The only part of that information that matters is the size of the object.

Any object supports an operation, & (address of), which retrieves the location specifier (address) of the object as a pointer object type. This should abate the confusion surrounding the nomenclature: it makes sense to treat & as an operation on an object, rather than on a pointer, whose result is a pointer to the object's type.

Note: Throughout this explanation, I have left out the concept of memory.

votes

An address identifies a piece of fixed-size storage, usually one byte, as an integer. Precisely, this is called a byte address, which is also the term ISO C uses. There can be other ways to construct an address, e.g. one per bit. However, byte addresses are used so often that we usually omit "byte".

Technically, an address is never a value in C, because the definition of term "value" in (ISO) C is:

precise meaning of the contents of an object when interpreted as having a specific type

(Emphasized by me.) However, there is no such "address type" in C.

A pointer is not the same thing. A pointer is a kind of type in the C language. There are several distinct pointer types. They do not necessarily obey an identical set of language rules, e.g. the effect of ++ on a value of type int* vs. char*.

A value in C can be of a pointer type. This is called a pointer value. To be clear, a pointer value is not a pointer in the C language. But we are accustomed to mixing the two, because in C it is unlikely to be ambiguous: if we call an expression p a "pointer", it is merely a pointer value and not a type, since a named type in C is not expressed by an expression, but by a type-name or a typedef-name.

Some other things are subtle. As a C user, one should first know what "object" means:

region of data storage in the execution environment, the contents of which can represent values

An object is an entity to represent values, which are of a specific type. A pointer is an object type. So if we declare int* p;, then p means "an object of pointer type", or a "pointer object".

Note there is no "variable" normatively defined by the standard (in fact the word is never used as a noun by ISO C in normative text). However, informally, we call an object a variable, as some other languages do. (But still not exactly so, e.g. in C++ a variable can normatively be of reference type, which is not an object.) The phrases "pointer object" or "pointer variable" are sometimes treated like "pointer value" as above, with a probable slight difference. (One more set of examples is "array".)

Since pointer is a type, and address is effectively "typeless" in C, a pointer value roughly "contains" an address. And an expression of pointer type can yield an address, e.g.

ISO C11 6.5.3.2

3 The unary & operator yields the address of its operand.

Note this wording was introduced by WG14/N1256, i.e. ISO C99:TC3. The original C99 text reads

3 The unary & operator returns the address of its operand.

It reflects the committee's opinion: an address is not a pointer value returned by the unary & operator.

Despite the wording above, there is still some mess even in the standards.

ISO C11 6.6

9 An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator

ISO C++11 5.19

3 ... An address constant expression is a prvalue core constant expression of pointer type that evaluates to the address of an object with static storage duration, to the address of a function, or to a null pointer value, or a prvalue core constant expression of type std::nullptr_t. ...

(Recent C++ standard drafts use different wording, so this problem no longer exists there.)

Actually both "address constant" in C and "address constant expression" in C++ are constant expressions of pointer types (or at least "pointer-like" types since C++11).

And the builtin unary & operator is called "address-of" in C and C++; similarly, std::addressof was introduced in C++11.

This naming may cause misconceptions. The resulting expression is of pointer type, so these names should be interpreted as: the result contains/yields an address, rather than is an address.

votes

It says "because it confuses those who don't know what addresses are about" - and that's true: if you learn what addresses are about, you won't be confused. Theoretically, a pointer is a variable which points to another; practically, it holds an address, which is the address of the variable it points to. I don't know why anyone should hide this fact, it's not rocket science. If you understand pointers, you'll be one step closer to understanding how computers work. Go ahead!

votes

Come to think of it, I think it's a matter of semantics. I don't think the author is right, since the C standard refers to a pointer as holding an address of the referenced object, as others have already mentioned here. However, address!=memory address. An address can be really anything as per the C standard, although it will eventually lead to a memory address. The pointer itself can be an id, an offset + selector (x86), really anything as long as it can describe (after mapping) any memory address in the addressable space.

votes

There is one other way in which a C or C++ pointer differs from a simple memory address, due to the different pointer types, that I haven't seen in the other answers (although given their total size, I may have overlooked it). But it is probably the most important one, because even experienced C/C++ programmers can trip over it:

The compiler may assume that pointers of incompatible types do not point to the same address even if they clearly do, which may give behaviour that would not be possible with a simple pointer==address model. Consider the following code (assuming sizeof(int) = 2*sizeof(short)):

unsigned int i = 0;
unsigned short* p = (unsigned short*)&i;
p[0]=p[1]=1;

if (i == 2 + (unsigned short)(-1))
{
  // you'd expect this to execute, but it need not
}

if (i == 0)
{
  // you'd expect this not to execute, but it actually may do so
}

Note that there's an exception for char*, so manipulating values using char* is possible (although not very portable).
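A short sketch of that exception: any object may be examined byte by byte through a pointer to a character type, so the following is allowed even though the short* access above is not. The function name count_nonzero_bytes is made up for this example, and unsigned char is used so that each byte reads as a plain non-negative number:

```c
#include <stddef.h>

/* Count the non-zero bytes in the stored representation of any object.
   Reading an object through unsigned char * is exempt from the
   aliasing rule that makes the short * access undefined. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    size_t nonzero = 0;
    for (size_t k = 0; k < size; ++k) {
        if (bytes[k] != 0) {
            ++nonzero;
        }
    }
    return nonzero;
}
```

Which byte of an unsigned int holds the low-order bits still depends on the platform, which is the "not very portable" part.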

votes

Quick summary: A C address is a value, typically represented as a machine-level memory address, with a specific type.

The unqualified word "pointer" is ambiguous. C has pointer objects (variables), pointer types, pointer expressions, and pointer values.

It's very common to use the word "pointer" to mean "pointer object", and that can lead to some confusion -- which is why I try to use "pointer" as an adjective rather than as a noun.

The C standard, at least in some cases, uses the word "pointer" to mean "pointer value". For example, the description of malloc says it "returns either a null pointer or a pointer to the allocated space".

So what's an address in C? It's a pointer value, i.e., a value of some particular pointer type. (Except that a null pointer value is not necessarily referred to as an "address", since it isn't the address of anything).

The standard's description of the unary & operator says it "yields the address of its operand". Outside the C standard, the word "address" is commonly used to refer to a (physical or virtual) memory address, typically one word in size (whatever a "word" is on a given system).

A C "address" is typically implemented as a machine address -- just as a C int value is typically implemented as a machine word. But a C address (pointer value) is more than just a machine address. It's a value typically represented as a machine address, and it's a value with some specific type.

votes

A pointer value is an address. A pointer variable is an object that can store an address. This is true because that's what the standard defines a pointer to be. It's important to tell it to C novices because C novices are often unclear on the difference between a pointer and the thing it points to (that is to say, they don't know the difference between an envelope and a building). The notion of an address (every object has an address and that's what a pointer stores) is important because it sorts that out.
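The envelope/building distinction can be sketched like this (both function names are made up for this example). Writing through the pointer changes the building; reassigning a copy of the pointer only rewrites one envelope:

```c
/* Writes through the pointer: the pointed-to object changes,
   the pointer itself does not. */
void set_through(int *p, int value)
{
    *p = value;
}

/* Reassigns the function's own copy of the pointer. Neither the
   caller's pointer nor either pointed-to object is affected. */
void retarget_local(int *p, int *other)
{
    p = other;
    (void)p;  /* silence unused-value warnings; nothing else happens */
}
```

After int a = 1, b = 2; int *q = &a;, calling set_through(q, 5) leaves a equal to 5, while retarget_local(q, &b) leaves q still pointing at a.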

However, the standard talks at a particular level of abstraction. Those people the author talks about who "know what addresses are about", but who are new to C, must necessarily have learned about addresses at a different level of abstraction -- perhaps by programming assembly language. There is no guarantee that the C implementation uses the same representation for addresses as the CPU's opcodes use (referred to as "the store address" in this passage), which is the representation these people already know about.

He goes on to talk about "perfectly reasonable address manipulation". As far as the C standard is concerned there's basically no such thing as "perfectly reasonable address manipulation". Addition is defined on pointers and that is basically it. Sure, you can convert a pointer to integer, do some bitwise or arithmetic ops, and then convert it back. This is not guaranteed to work by the standard, so before writing that code you'd better know how your particular C implementation represents pointers and performs that conversion. It probably uses the address representation you expect, but if it doesn't that's your fault because you didn't read the manual. That's not confusion, it's incorrect programming procedure ;-)
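For reference, a sketch of the one integer conversion the standard does pin down: where the optional type uintptr_t exists, a valid void * converted to it and back compares equal to the original. The function name round_trip is made up for this example. Anything done to the integer in between is implementation-defined territory:

```c
#include <stdint.h>

/* Round-trips an object pointer through uintptr_t without
   modifying the integer. Only this unmodified round trip is
   covered by the standard, and only if uintptr_t is provided. */
int *round_trip(int *p)
{
    uintptr_t raw = (uintptr_t)(void *)p;  /* pointer -> integer */
    return (int *)(void *)raw;             /* integer -> pointer */
}
```

The numeric value stored in raw is whatever the implementation chooses; the sketch deliberately never inspects it.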

In short, C uses a more abstract concept of an address than the author does.

The author's concept of an address of course is also not the lowest-level word on the matter. What with virtual memory maps and physical RAM addressing across multiple chips, the number that you tell the CPU is "the store address" you want to access has basically nothing to do with where the data you want is actually located in hardware. It's all layers of indirection and representation, but the author has chosen one to privilege. If you're going to do that when talking about C, choose the C level to privilege!

Personally I don't think the author's remarks are all that helpful, except in the context of introducing C to assembly programmers. It's certainly not helpful to those coming from higher level languages to say that pointer values aren't addresses. It would be far better to acknowledge the complexity than it is to say that the CPU has the monopoly on saying what an address is and thus that C pointer values "are not" addresses. They are addresses, but they may be written in a different language from the addresses he means. Distinguishing the two things in the context of C as "address" and "store address" would be adequate, I think.

votes

Simply put, pointers are actually the offset part of the segmentation mechanism, which translates to a linear address after segmentation and then to a physical address after paging. Physical addresses are what is actually used to address your RAM.

       Selector  +--------------+         +-----------+
      ---------->|              |         |           |
                 | Segmentation | ------->|  Paging   |
        Offset   |  Mechanism   |         | Mechanism |
      ---------->|              |         |           |
                 +--------------+         +-----------+
        Virtual                   Linear                Physical