47
votes

I'm writing a point-to-point message queue system, and it has to be able to operate over UDP. I could arbitrarily pick one side or the other to be the "server" but it doesn't seem quite right since both ends are sending and receiving the same type of data from the other.

Is it possible to bind() and connect() both ends so that they send/receive only from each other? That seems like a nicely symmetric way to do it.

10
Seems a bit strange, but I don't see why not. connect() just sets the default destination address/port for the socket. (Have you tried it? If it doesn't work for some reason, just use sendto().) Personally I'd just use sendto() because otherwise you'll get confused if multiple clients connect to your server. - mpontillo

10 Answers

55
votes

Hello from the distant future which is the year 2018, to the year 2012.

There is, in fact, a practical reason for connect()ing a UDP socket (though blessed POSIX and its implementations don't in theory require you to).

An ordinary UDP socket doesn't know anything about its future destinations, so it performs a route lookup each time sendmsg() is called.

However, if connect() is called beforehand with a particular remote receiver's IP and port, the kernel can keep a reference to the route and attach it to the socket. Subsequent sendmsg() calls that do not specify a receiver then use that default destination and skip the lookup, which makes sending significantly faster. (If a call does specify a receiver, the stored setting is ignored for that call.)

Look at the lines 1070 through 1171:

if (connected)
    rt = (struct rtable *)sk_dst_check(sk, 0);

if (!rt) {
    [..skip..]

    rt = ip_route_output_flow(net, fl4, sk);

    [..skip..]
}

Until Linux kernel 4.18, this feature had been mostly limited to the IPv4 address family only. However, since 4.18-rc4 (and hopefully Linux kernel release 4.18 as well), it's fully functional with IPv6 sockets as well.

This can yield a serious performance benefit, though how much depends heavily on the OS you're using. At least on Linux, if you don't use the socket for multiple remote peers, you should give it a try.

26
votes

UDP is connectionless, so there's little sense for the OS in actually making some sort of connection.

In BSD sockets one can call connect() on a UDP socket, but this basically just sets the default destination address for send() (instead of passing it explicitly to sendto()).

bind() on a UDP socket tells the OS which local address to accept incoming packets on (packets addressed elsewhere are dropped), regardless of the kind of socket.

When receiving, you must use recvfrom() to identify which source a packet comes from. Note that if you want some sort of authentication, relying on just the addresses involved is as insecure as no lock at all. TCP connections can be hijacked, and naked UDP literally has IP spoofing written all over it. You must add some sort of HMAC.

17
votes

Here's a program that demonstrates how to bind() and connect() on the same UDP socket to a specific set of source and destination ports respectively. The program can be compiled on any Linux machine and has the following usage:

usage: ./<program_name> dst-hostname dst-udpport src-udpport

I tested this code opening two terminals. You should be able to send a message to the destination node and receive messages from it.

In terminal 1 run

./<program_name> 127.0.0.1 5555 5556

In terminal 2 run

./<program_name> 127.0.0.1 5556 5555

Even though I've only tested it on a single machine, I think it should also work on two different machines once you've set up the correct firewall settings.

Here's a description of the flow:

  1. Set up hints indicating that the destination address is for UDP (datagram) communication
  2. Use getaddrinfo to obtain the address info structure dstinfo based on argument 1 which is the destination address and argument 2 which is the destination port
  3. Create a socket with the first valid entry in dstinfo
  4. Use getaddrinfo to obtain the address info structure srcinfo primarily for the source port details
  5. Use srcinfo to bind to the socket obtained
  6. Now connect to the first valid entry of dstinfo
  7. If all is well enter the loop
  8. The loop uses a select to block on a read descriptor list which consists of the STDIN and sockfd socket created
  9. If STDIN has input, it is sent to the destination over the connected UDP socket using the sendall function
  10. If EOM is received the loop is exited.
  11. If sockfd has some data it is read through recv
  12. If recv returns -1 it is an error, which we try to decode with perror
  13. If recv returns 0, a zero-length datagram was received. With TCP this would mean the remote node has closed the connection, but UDP is connectionless, so it carries no such meaning.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <sys/select.h>
#include <sys/time.h>

#define STDIN 0

int sendall(int s, char *buf, int *len)
{
    int total = 0;        // how many bytes we've sent
    int bytesleft = *len; // how many we have left to send
    int n = 0;            // stays 0 if *len is 0 and the loop never runs

    while(total < *len) {
        n = send(s, buf+total, bytesleft, 0);
        fprintf(stdout,"Sendall: %s\n",buf+total);
        if (n == -1) { break; }
        total += n;
        bytesleft -= n;
    }

    *len = total; // return number actually sent here

    return n==-1?-1:0; // return -1 on failure, 0 on success
} 

int main(int argc, char *argv[])
{
   int sockfd = -1;   // -1 until socket() succeeds, so the cleanup path never closes a garbage descriptor
   struct addrinfo hints, *dstinfo = NULL, *srcinfo = NULL, *p = NULL;
   int rv = -1, ret = -1, len = -1,  numbytes = 0;
   struct timeval tv;
   char buffer[256] = {0};
   fd_set readfds;

   // don't care about writefds and exceptfds:
   //     select(STDIN+1, &readfds, NULL, NULL, &tv);

   if (argc != 4) {
      fprintf(stderr,"usage: %s dst-hostname dst-udpport src-udpport\n", argv[0]);
      ret = -1;
      goto LBL_RET;
   }


   memset(&hints, 0, sizeof hints);
   hints.ai_family = AF_UNSPEC;
   hints.ai_socktype = SOCK_DGRAM;        //UDP communication

   /*For destination address*/
   if ((rv = getaddrinfo(argv[1], argv[2], &hints, &dstinfo)) != 0) {
      fprintf(stderr, "getaddrinfo for dest address: %s\n", gai_strerror(rv));
      ret = 1;
      goto LBL_RET;
   }

   // loop through all the results and make a socket
   for(p = dstinfo; p != NULL; p = p->ai_next) {

      if ((sockfd = socket(p->ai_family, p->ai_socktype,
                  p->ai_protocol)) == -1) {
         perror("socket");
         continue;
      }
      /*Taking first entry from getaddrinfo*/
      break;
   }

   /*Failed to get socket to all entries*/
   if (p == NULL) {
      fprintf(stderr, "%s: Failed to get socket\n", argv[0]);
      ret = 2;
      goto LBL_RET;
   }

   /*For source address*/
   memset(&hints, 0, sizeof hints);
   hints.ai_family = p->ai_family;        // match the socket's family so bind() gets a compatible address
   hints.ai_socktype = SOCK_DGRAM;        //UDP communication
   hints.ai_flags = AI_PASSIVE;     // fill in my IP for me
   /*For source address*/
   if ((rv = getaddrinfo(NULL, argv[3], &hints, &srcinfo)) != 0) {
      fprintf(stderr, "getaddrinfo for src address: %s\n", gai_strerror(rv));
      ret = 3;
      goto LBL_RET;
   }

   /*Bind this datagram socket to source address info */
   if((rv = bind(sockfd, srcinfo->ai_addr, srcinfo->ai_addrlen)) != 0) {
      perror("bind");   // bind() reports failure through errno, not a getaddrinfo error code
      ret = 3;
      goto LBL_RET;
   }

   /*Connect this datagram socket to destination address info */
   if((rv= connect(sockfd, p->ai_addr, p->ai_addrlen)) != 0) {
      perror("connect");   // connect() reports failure through errno, not a getaddrinfo error code
      ret = 3;
      goto LBL_RET;
   }

   while(1){
      FD_ZERO(&readfds);
      FD_SET(STDIN, &readfds);
      FD_SET(sockfd, &readfds);

      /*Select timeout at 10s*/
      tv.tv_sec = 10;
      tv.tv_usec = 0;
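      if (select(sockfd + 1, &readfds, NULL, NULL, &tv) == -1) {
         /*On failure readfds is not meaningful, so bail out*/
         perror("select");
         ret = 6;
         goto LBL_RET;
      }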

      /*Obey your user, take his inputs*/
      if (FD_ISSET(STDIN, &readfds))
      {
         memset(buffer, 0, sizeof(buffer));
         len = 0;
         printf("A key was pressed!\n");
         /*Read at most sizeof(buffer) - 1 bytes so the buffer stays NUL-terminated*/
         if(0 >= (len = read(STDIN, buffer, sizeof(buffer) - 1)))
         {
            perror("read STDIN");
            ret = 4;
            goto LBL_RET;
         }

         fprintf(stdout, ">>%s\n", buffer);

         /*EOM\n implies user wants to exit*/
         if(!strcmp(buffer,"EOM\n")){
            printf("Received EOM closing\n");
            break;
         }

         /*Sendall will use send to transfer to bound sockfd*/
         if (sendall(sockfd, buffer, &len) == -1) {
            perror("sendall");
            fprintf(stderr,"%s: We only sent %d bytes because of the error!\n", argv[0], len);
            ret = 5;
            goto LBL_RET;
         }  
      }

      /*We've got something on our socket to read */
      if(FD_ISSET(sockfd, &readfds))
      {
         memset(buffer, 0, sizeof(buffer));
         printf("Received something!\n");
         /*recv reads one datagram from the connected sockfd; keep one byte for the NUL terminator*/
         numbytes = recv(sockfd, buffer, sizeof(buffer) - 1, 0);
         if(0 == numbytes){
            printf("Destination closed\n");
            break;
         }else if(-1 == numbytes){
            /*Could be an ICMP error from remote end*/
            perror("recv");
            printf("Receive error check your firewall settings\n");
            ret = 5;
            goto LBL_RET;
         }
         fprintf(stdout, "<<Number of bytes %d Message: %s\n", numbytes, buffer);
      }

      /*Heartbeat*/
      printf(".\n");
   }

   ret = 0;
LBL_RET:

   if(dstinfo)
      freeaddrinfo(dstinfo);

   if(srcinfo)
      freeaddrinfo(srcinfo);

   close(sockfd);

   return ret;
}
6
votes

Really the key is connect():

If the socket sockfd is of type SOCK_DGRAM then addr is the address to which datagrams are sent by default, and the only address from which datagrams are received.

1
votes

There is a problem in your code:

memset(&hints, 0, sizeof hints);
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_DGRAM;        //UDP communication

/*For destination address*/
if ((rv = getaddrinfo(argv[1], argv[2], &hints, &dstinfo)) 

By using AF_UNSPEC and SOCK_DGRAM only, you get a list of all the possible addresses. So, when you call socket(), the address you are using might not be the UDP one you expected. You should use

hints.ai_family = AF_INET;
hints.ai_socktype = SOCK_DGRAM;
hints.ai_protocol = IPPROTO_UDP;
hints.ai_flags = AI_PASSIVE;

instead to make sure the addrinfo you are retrieving is what you wanted.

In other words, the socket you created may not be a UDP socket, and that is the reason why it does not work.
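One way to check what getaddrinfo() actually hands back for a given set of hints is to print every candidate. This is only a diagnostic sketch (list_udp_candidates is a name made up for it); with AF_UNSPEC the list may contain IPv4 and IPv6 entries depending on the host name and system configuration.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve host:port for datagram sockets in the given address family and
 * print each candidate. Returns the number of candidates, or -1 if the
 * lookup fails or a candidate is not of type SOCK_DGRAM. */
int list_udp_candidates(const char *host, const char *port, int family)
{
    struct addrinfo hints, *res = NULL, *p;
    int count = 0;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = family;           /* AF_UNSPEC, AF_INET or AF_INET6 */
    hints.ai_socktype = SOCK_DGRAM;

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next) {
        printf("family=%s socktype=%s protocol=%d\n",
               p->ai_family == AF_INET ? "AF_INET" :
               p->ai_family == AF_INET6 ? "AF_INET6" : "other",
               p->ai_socktype == SOCK_DGRAM ? "SOCK_DGRAM" : "other",
               p->ai_protocol);
        if (p->ai_socktype != SOCK_DGRAM) {
            count = -1;
            break;
        }
        count++;
    }
    freeaddrinfo(res);
    return count;
}

int main(void)
{
    printf("AF_UNSPEC candidates: %d\n",
           list_udp_candidates("127.0.0.1", "5555", AF_UNSPEC));
    printf("AF_INET candidates:   %d\n",
           list_udp_candidates("127.0.0.1", "5555", AF_INET));
    return 0;
}
```

Running it with the host name you actually use shows whether pinning ai_family changes the first entry you would pick.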

0
votes

This page contains some great info about connected versus unconnected sockets: http://www.masterraghu.com/subjects/np/introduction/unix_network_programming_v1.3/ch08lev1sec11.html

This quote answers your question:

Normally, it is a UDP client that calls connect, but there are applications in which the UDP server communicates with a single client for a long duration (e.g., TFTP); in this case, both the client and server can call connect.

-1
votes

I have not used connect() under UDP. I feel connect() was designed for two totally different purposes under UDP vs TCP.

The man page has some brief details on the usage of connect() under UDP:

Generally, connection-based protocol (like TCP) sockets may connect() successfully only once; connectionless protocol (like UDP) sockets may use connect() multiple times to change their association.

-1
votes

YES, you can. I do it too.

And your use case is one where this is useful: both sides act as both client and server, and there is only one process on each side.

-2
votes

I'd look at it more from the idea of what UDP is providing. UDP is an 8 byte header which adds 2 byte send and receive ports (4 bytes total). These ports interact with Berkeley Sockets to provide your traditional socket interface. I.e. you can't bind to an address without a port or vice-versa.

Typically when you send a UDP packet, the source port on the sending side is ephemeral and the destination port is the port you are targeting on the remote computer. You can override this default behavior by binding first and then connecting. Now your source port and destination port can be the same, so long as that port is free on both computers.

In general this behavior (let's call it port hijacking) is frowned upon. This is because you have just limited your send side to only being able to send from one process, as opposed to working within the ephemeral model which dynamically allocates send side source ports.

Incidentally, the other four bytes of the eight-byte UDP header, length and checksum, are pretty much totally useless as they are already provided in the IP packet and a UDP header is fixed length. Like come on people, computers are pretty good at doing a little subtraction.

-4
votes

If you are a C/C++ lover, you may try route_io.

It is simple to use: create an instance and route different ports to your functions.

Example :

void read_data(rio_request_t *req);

void read_data(rio_request_t *req) {
  char *a = "CAUSE ERROR FREE INVALID";

  if (strncmp( (char*)req->in_buff->start, "ERROR", 5) == 0) {
    free(a);
  }
  // printf("%d,  %.*s\n", i++, (int) (req->in_buff->end - req->in_buff->start), req->in_buff->start);
  rio_write_output_buffer_l(req, req->in_buff->start, (req->in_buff->end - req->in_buff->start));
  // printf("%d,  %.*s\n", i++, (int) (req->out_buff->end - req->out_buff->start), req->out_buff->start);
}

int main(void) {

  rio_instance_t * instance = rio_create_routing_instance(24, NULL, NULL);
  rio_add_udp_fd(instance, 12345, read_data, 1024, NULL);
  rio_add_tcp_fd(instance, 3232, read_data, 64, NULL);

  rio_start(instance);

  return 0;
}