2
votes

Here is a simple program that shows how we normally type cast struct sockaddr * to struct sockaddr_in * or struct sockaddr_in6 * while writing socket programs.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* struct sockaddr_in, struct sockaddr_in6 */
#include <arpa/inet.h>    /* ntohs() */
#include <netdb.h>

int main(void)
{
    struct addrinfo *ai;

    printf("sizeof (struct sockaddr): %zu\n", sizeof (struct sockaddr));
    printf("sizeof (struct sockaddr_in): %zu\n", sizeof (struct sockaddr_in));
    printf("sizeof (struct sockaddr_in6): %zu\n", sizeof (struct sockaddr_in6));

    if (getaddrinfo("localhost", "http", NULL, &ai) != 0) {
        printf("error\n");
        return EXIT_FAILURE;
    }

    /* Only the first result is inspected. Ports are stored in network
       byte order, so convert them with ntohs() before printing. */
    if (ai->ai_family == AF_INET) {
        struct sockaddr_in *addr = (struct sockaddr_in *) ai->ai_addr;
        printf("IPv4 port: %d\n", ntohs(addr->sin_port));
    } else if (ai->ai_family == AF_INET6) {
        struct sockaddr_in6 *addr = (struct sockaddr_in6 *) ai->ai_addr;
        printf("IPv6 port: %d\n", ntohs(addr->sin6_port));
    }

    freeaddrinfo(ai);
    return 0;
}

Beej's Guide to Network Programming also recommends this on page 10.

To deal with struct sockaddr, programmers created a parallel structure: struct sockaddr_in (“in” for “Internet”) to be used with IPv4.

And this is the important bit: a pointer to a struct sockaddr_in can be cast to a pointer to a struct sockaddr and vice-versa. So even though connect() wants a struct sockaddr*, you can still use a struct sockaddr_in and cast it at the last minute!

But from the discussion at another question, it appears that this is just a hack, not valid C code as per the C standard.

In particular, see AnT's answer that mentions,

As for the popular technique with casts between struct sockaddr *, struct sockaddr_in * and struct sockaddr_in6 * - these are just hacks that have nothing to do with C language. They just work in practice, but as far as C language is concerned, the technique is invalid.

So if this technique, which we use for socket programming and which the books also recommend, is invalid, what is the valid way to rewrite the above code so that it is also valid C code as per the C standard?

4
Beej's Guide to Network Programming is generally rock-solid on network programming techniques. It would be rare if what was contained was outright improper. - David C. Rankin
The way mentioned in your quotation is what is intended by the interface. It's been going on for over thirty years. Nothing else is necessary. - user207421

4 Answers

4
votes

So if the way we do socket programming (and what is also recommended by the books) is a hack, what is the correct way to rewrite the above code so that it is also valid C code as per the C standard?

TL;DR: continue to do what you present in your example.

The code you presented appears to be syntactically correct. It may or may not exhibit undefined behavior under some circumstances. Whether or not it does depends on the behavior of getaddrinfo().

There is no way to do this in C that meets all the functional requirements and is any better protected against undefined behavior than the standard technique you've presented. That's why it's the standard technique. The issue here is that the function must support all conceivable address types, including types that have not yet been defined. It could declare the socket address pointer as a void *, which would not require casting, but that wouldn't actually change anything about whether any given program exhibits undefined behavior.

For its part, getaddrinfo() is designed with exactly such usage in mind, so it is its problem if using the expected cast on the result allows for misbehavior. Moreover, getaddrinfo() is not part of the C standard library -- it is standardized (only) by POSIX, which also incorporates the C standard. Analyzing that function in the light of C alone therefore demonstrates an inappropriate hyperfocus. Though the casts raise some concern in light of C alone, you should expect that in the context of getaddrinfo() and other POSIX networking functions using struct sockaddr *, casting to the correct specific address type and accessing the referenced object produces reliable results.

Additionally, I think AnT's answer to your other question is oversimplified and overly negative. I'm considering whether to write a contrasting answer.

1
votes

The POSIX standard guarantees that a pointer to any kind of socket address structure can be cast to a struct sockaddr *. So you can cast a pointer to any such structure to a struct sockaddr * to pass it to bind() or connect(); the library knows which fields to check. You can also check the sa_family field of a socket address to see what it really is, assuming it contains valid data, and then cast to the appropriate pointer type. If you need to allocate a block of memory big enough to safely store any kind of socket address, use struct sockaddr_storage. A cast from a struct sockaddr_storage * to any other socket address pointer type is guaranteed to be properly aligned, and the field containing the address family is guaranteed to still work.

To obtain an IPv6 socket address from a sockaddr_in, you can convert the IPv4 address to IPv6 notation and pass it to getaddrinfo(). However, the modern lookup functions will probably give you a linked list that includes both an IPv4 and an IPv6 address anyway.

0
votes

The answer is in man getaddrinfo and sys/socket.h. man getaddrinfo provides the rationale behind using a common struct sockaddr:

Given node and service, which identify an Internet host and a service, 
getaddrinfo() returns one or more addrinfo structures, each of which 
contains an Internet address that can be specified in a call to bind(2) 
or connect(2). The getaddrinfo() function combines the functionality 
provided by the gethostbyname(3) and getservbyname(3) functions into a 
single interface, but unlike the latter functions, getaddrinfo() is 
reentrant and allows programs to eliminate IPv4-versus-IPv6 dependencies.

There is only one struct sockaddr. It appears the various types are all simply used within a transparent union to provide for any struct sockaddr_X needed. For example:

/* This is the type we use for generic socket address arguments.

   With GCC 2.7 and later, the funky union causes redeclarations or
   uses with any of the listed types to be allowed without complaint.
   G++ 2.7 does not support transparent unions so there we want the
   old-style declaration, too.  */
#if defined __cplusplus || !__GNUC_PREREQ (2, 7) || !defined __USE_GNU
# define __SOCKADDR_ARG         struct sockaddr *__restrict
# define __CONST_SOCKADDR_ARG   const struct sockaddr *
#else
/* Add more `struct sockaddr_AF' types here as necessary.
   These are all the ones I found on NetBSD and Linux.  */
# define __SOCKADDR_ALLTYPES \
  __SOCKADDR_ONETYPE (sockaddr) \
  __SOCKADDR_ONETYPE (sockaddr_at) \
  __SOCKADDR_ONETYPE (sockaddr_ax25) \
  __SOCKADDR_ONETYPE (sockaddr_dl) \
  __SOCKADDR_ONETYPE (sockaddr_eon) \
  __SOCKADDR_ONETYPE (sockaddr_in) \
  __SOCKADDR_ONETYPE (sockaddr_in6) \
  __SOCKADDR_ONETYPE (sockaddr_inarp) \
  __SOCKADDR_ONETYPE (sockaddr_ipx) \
  __SOCKADDR_ONETYPE (sockaddr_iso) \
  __SOCKADDR_ONETYPE (sockaddr_ns) \
  __SOCKADDR_ONETYPE (sockaddr_un) \
  __SOCKADDR_ONETYPE (sockaddr_x25)

# define __SOCKADDR_ONETYPE(type) struct type *__restrict __##type##__;
typedef union { __SOCKADDR_ALLTYPES
            } __SOCKADDR_ARG __attribute__ ((__transparent_union__));
# undef __SOCKADDR_ONETYPE
# define __SOCKADDR_ONETYPE(type) const struct type *__restrict __##type##__;
typedef union { __SOCKADDR_ALLTYPES
            } __CONST_SOCKADDR_ARG __attribute__ ((__transparent_union__));
# undef __SOCKADDR_ONETYPE
#endif

I haven't waded through all the macro soup, but it appears you are safe with either type.

-2
votes

Referring to this question and the linked one, "Is it legal to type-cast pointers of different struct types (e.g. struct sockaddr * to struct sockaddr_in6 *)?": these are not exactly hacks. To do what you want, if I have understood correctly, I would do something like:

struct base
{
    int a;
    char b;
    double *n;
};
struct derived
{
    struct base b; // no pointer, but the whole struct
    int c;
    int d;
};

This way, when you cast from the derived type to the base type, you can be sure that the first sizeof (struct base) bytes of the derived struct overlap the base exactly. The code works and is fully portable. Different problems call for different solutions. In my experience I have actually always preferred that base contain derived rather than the other way round, so as to get a "polymorphic" structure. But if 1) it works, 2) the people who will read the code will understand it, and 3) you find it useful, then why not? It is all up to you. C++ probably implements inheritance in exactly this way, though who can say?
Just be careful with arrays of them, so that you index with the right type, and be careful to always put the base struct in first place. (C++ also has trouble with arrays of polymorphic objects; such an array can hold only pointers to them.)