16
votes

Possible Duplicate:
Why does this C code work?
How do you use offsetof() on a struct?

I read about this offsetof macro on the Internet, but it doesn't explain what it is used for.

#define offsetof(a,b) ((int)(&(((a*)(0))->b)))

What is it trying to do and what is the advantage of using it?

4
That offsetof macro is incorrect. They should cast to size_t, not int, and they should probably subtract (char*)0 from the result before casting even though it's a null pointer constant. - Chris Lutz

4 Answers

41
votes

R.. is correct in his answer to the second part of your question: this code is not advised when using a modern C compiler.

But to answer the first part of your question, what this is actually doing is:

(
  (int)(         // 4.
    &( (         // 3.
      (a*)(0)    // 1.
     )->b )      // 2.
  )
)

Working from the inside out, this is ...

  1. Casting the value zero to the struct pointer type a*
  2. Getting the struct field b of this (illegally placed) struct object
  3. Getting the address of this b field
  4. Casting the address to an int

Conceptually, this places a struct object at memory address zero and then finds the address of a particular field. Because the struct starts at address zero, that address is equal to the field's offset. This lets you work out the offset in memory of each field in a struct, so you can write your own serializers and deserializers to convert structs to and from byte arrays.

Of course, actually dereferencing a zero pointer would crash your program, but in practice the compiler evaluates the whole expression at compile time, so no zero pointer is dereferenced at runtime.

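On many of the platforms where this macro was used, an int was the same size as a pointer (32 bits on typical 32-bit systems), so the cast to int lost no information and this actually worked. On platforms where pointers are wider than int, such as most 64-bit systems, the cast can truncate the value.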
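To make the "offset of a field" idea concrete, here is a minimal sketch that uses the standard offsetof from stddef.h rather than the hand-written macro. The struct point type and the read_int_at helper are hypothetical names invented for this illustration; they are not part of the question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct, used only for illustration. */
struct point {
    char tag;
    int x;
    int y;
};

/* Read an int member that lives `off` bytes from the start of `base`.
 * memcpy is used so the read does not depend on pointer-cast aliasing. */
int read_int_at(const void *base, size_t off)
{
    int value;
    memcpy(&value, (const unsigned char *)base + off, sizeof value);
    return value;
}
```

Given a struct point p, calling read_int_at(&p, offsetof(struct point, y)) returns the same value as p.y, which is the building block a hand-written serializer would rely on.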

17
votes

It has no advantages and should not be used, since it invokes undefined behavior (and uses the wrong type - int instead of size_t).

The C standard defines an offsetof macro in stddef.h which actually works, for cases where you need the offset of an element in a structure, such as:

#include <stddef.h>

struct foo {
    int a;
    int b;
    char *c;
};

/* Field type tags used by the descriptor table below. */
enum field_type { INT, CHARPTR };

struct struct_desc {
    const char *name;
    int type;
    size_t off;
};

static const struct struct_desc foo_desc[] = {
    { "a", INT, offsetof(struct foo, a) },
    { "b", INT, offsetof(struct foo, b) },
    { "c", CHARPTR, offsetof(struct foo, c) },
};

which would let you programmatically fill the fields of a struct foo by name, e.g. when reading a JSON file.
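As a rough sketch of how such a table might be consumed, the following looks up a field by name and writes an int through its recorded offset. The enum with INT and CHARPTR and the foo_set_int helper are assumptions made for this example; the answer does not define them, and a real JSON reader would need more type handling and error reporting.

```c
#include <stddef.h>
#include <string.h>

/* Assumed type tags; the original answer leaves INT and CHARPTR undefined. */
enum field_type { INT, CHARPTR };

struct foo {
    int a;
    int b;
    char *c;
};

struct struct_desc {
    const char *name;
    int type;
    size_t off;
};

static const struct struct_desc foo_desc[] = {
    { "a", INT, offsetof(struct foo, a) },
    { "b", INT, offsetof(struct foo, b) },
    { "c", CHARPTR, offsetof(struct foo, c) },
};

/* Hypothetical helper: set the int field called `name` to `value`.
 * Returns 0 on success, -1 if the name is unknown or not an INT field. */
int foo_set_int(struct foo *obj, const char *name, int value)
{
    size_t i;
    for (i = 0; i < sizeof foo_desc / sizeof foo_desc[0]; i++) {
        if (strcmp(foo_desc[i].name, name) == 0 && foo_desc[i].type == INT) {
            memcpy((unsigned char *)obj + foo_desc[i].off, &value, sizeof value);
            return 0;
        }
    }
    return -1;
}
```

With this in place, foo_set_int(&f, "b", 42) stores 42 in f.b, while a request for "c" or for an unknown name is rejected with -1.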

5
votes

It's finding the byte offset of a particular member of a struct. For example, if you had the following structure:

struct MyStruct
{
    double d;
    int i;
    void *p;
};

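Then you'd have offsetof(struct MyStruct, d) == 0, offsetof(struct MyStruct, i) == 8, and offsetof(struct MyStruct, p) == 12 on a typical 32-bit platform (that is, the member named d is 0 bytes from the start of the structure, etc.). On a typical 64-bit platform, p would instead be at offset 16, because an 8-byte pointer is usually aligned to 8 bytes.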
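Since the exact numbers depend on the platform's sizes and alignment rules, the simplest way to check them is to print them. This is a small sketch using the standard offsetof; the function name print_mystruct_offsets is made up for the example.

```c
#include <stddef.h>
#include <stdio.h>

struct MyStruct {
    double d;
    int i;
    void *p;
};

/* Print each member's offset and the total size.
 * The values printed depend on the platform's ABI. */
void print_mystruct_offsets(void)
{
    printf("d: %zu\n", offsetof(struct MyStruct, d));
    printf("i: %zu\n", offsetof(struct MyStruct, i));
    printf("p: %zu\n", offsetof(struct MyStruct, p));
    printf("sizeof(struct MyStruct): %zu\n", sizeof(struct MyStruct));
}
```

The only results guaranteed everywhere are that d is at offset 0 and that the members appear in declaration order; any padding between them is up to the implementation.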

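The way that it works is it pretends that an instance of your structure exists at address 0 (the ((a*)(0)) part), and then it takes the address of the intended structure member and casts it to an integer. Although dereferencing an object at address 0 would ordinarily be an error, in practice taking the address works because compilers treat the combination of the address-of operator & and the member access -> as pure address arithmetic and never read memory at address 0. Strictly speaking, the C standard does not guarantee this, which is why the offsetof from stddef.h is the safer choice.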

It's typically used for generalized serialization frameworks. If you have code for converting between some kind of wire data (e.g. bytes in a file or from the network) and in-memory data structures, it's often convenient to create a mapping from member name to member offset, so that you can serialize or deserialize values in a generic manner.
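A minimal sketch of that kind of mapping is shown below, using the standard offsetof. The struct record type, the member_map table, and the record_pack and record_unpack functions are all hypothetical names chosen for this example. It copies members in host byte order, so it illustrates the offset table only and is not a portable wire format.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, used only for this sketch. */
struct record {
    char kind;
    double value;
    int count;
};

/* One entry per member: where it lives and how many bytes it occupies. */
struct member_map {
    size_t off;
    size_t size;
};

static const struct member_map record_map[] = {
    { offsetof(struct record, kind),  sizeof(char)   },
    { offsetof(struct record, value), sizeof(double) },
    { offsetof(struct record, count), sizeof(int)    },
};

#define RECORD_MEMBERS (sizeof record_map / sizeof record_map[0])
#define RECORD_PACKED_SIZE (sizeof(char) + sizeof(double) + sizeof(int))

/* Copy each member, without padding, into `out`; returns bytes written. */
size_t record_pack(const struct record *r, unsigned char *out)
{
    size_t i, n = 0;
    for (i = 0; i < RECORD_MEMBERS; i++) {
        memcpy(out + n, (const unsigned char *)r + record_map[i].off,
               record_map[i].size);
        n += record_map[i].size;
    }
    return n;
}

/* Reverse of record_pack: fill the members of `r` from `in`;
 * returns bytes consumed. */
size_t record_unpack(struct record *r, const unsigned char *in)
{
    size_t i, n = 0;
    for (i = 0; i < RECORD_MEMBERS; i++) {
        memcpy((unsigned char *)r + record_map[i].off, in + n,
               record_map[i].size);
        n += record_map[i].size;
    }
    return n;
}
```

Because both functions walk the same table, adding a member to the struct only requires one new table entry rather than changes to the packing and unpacking code.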

-2
votes

The implementation of the offsetof macro is really irrelevant.

The C standard defines it in 7.17.3:

offsetof(type, member-designator)

which expands to an integer constant expression that has type size_t, the value of which is the offset in bytes, to the structure member (designated by member-designator), from the beginning of its structure (designated by type). The type and member designator shall be such that given static type t; then the expression &(t.member-designator) evaluates to an address constant. (If the specified member is a bit-field, the behavior is undefined.)

Trust Adam Rosenfield's answer.

R.. is completely wrong: offsetof has many uses, especially for detecting code that is not portable across platforms.

(OK, this is C++, but we use it in static, template-based compile-time assertions to make sure our data structures do not change size between platforms/versions.)
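A rough C analogue of that technique, assuming a C11 compiler, is to combine offsetof with _Static_assert. The struct file_header layout below is a hypothetical example, not the answerer's actual code, and the expected offsets assume 8-bit bytes with no unusual packing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header whose layout must not drift. */
struct file_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
    uint32_t length;
};

/* Compilation fails on any platform or revision where the layout differs. */
_Static_assert(offsetof(struct file_header, magic) == 0, "magic moved");
_Static_assert(offsetof(struct file_header, version) == 4, "version moved");
_Static_assert(offsetof(struct file_header, flags) == 6, "flags moved");
_Static_assert(offsetof(struct file_header, length) == 8, "length moved");
_Static_assert(sizeof(struct file_header) == 12, "header size changed");
```

If someone later inserts a member or changes a type, the build breaks at the assertion instead of producing files that silently stop matching the old format.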