4
votes

I'm developing a TFTP client and server, and I want to select the UDP payload size dynamically to boost transfer performance.

I have tested it with two Linux machines (one has a gigabit Ethernet card, the other a Fast Ethernet one). I changed the MTU of the gigabit card to 2048 bytes and left the other at 1500.

I have used setsockopt(sockfd, IPPROTO_IP, IP_MTU_DISCOVER, &optval, sizeof(optval)) to set the MTU_DISCOVER flag to IP_PMTUDISC_DO.

From what I have read, this option should set the DF bit to one, so it should be possible to find the minimum MTU of the network (the MTU of the host that has the lowest MTU). However, it only gives me an error when I send a packet whose size is bigger than the MTU of the machine from which I'm sending packets.

Also, the other machine (the server in this case) doesn't receive the oversized packets (the server has an MTU of 1500). All of those UDP packets are dropped; the only thing that works is sending packets with a payload of at most 1472 bytes.
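For reference, the 1472 figure is the 1500-byte MTU minus the IPv4 and UDP headers. A minimal sketch of that arithmetic, assuming an IPv4 header without options (20 bytes); the function name is made up for illustration:

```c
/* Header sizes assumed here: IPv4 without options, plus UDP. */
enum { IPV4_HEADER_MIN = 20, UDP_HEADER = 8 };

/* Largest UDP payload that fits in a single unfragmented IPv4 packet. */
int udp_payload_for_mtu(int mtu)
{
    return mtu - IPV4_HEADER_MIN - UDP_HEADER;
}
```

With these assumptions, udp_payload_for_mtu(1500) is 1472 and udp_payload_for_mtu(2048) is 2020.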

Why do the hosts do this? From what I have read, if I send a packet larger than the MTU, the IP layer should fragment it.

3
Will the IP layer still fragment the packets if MTU discovery is on? - Konerak
Doesn't the DF bit prevent that fragmentation? - CodesInChaos
Of course. So is the sending host expected to discover the MTU itself? Or does the underlying library do that for it? - Konerak
Thanks to all for the replies. I should add something about my experiment. I have tried tracepath (a utility that does path MTU discovery). I set the MTU to 4096, and when I start tracepath my own host says that the message is too long, so tracepath reduces the message size. However, once the message gets past the local host, no other host signals the error. Tracepath continues to send 4096-byte messages, and the remote hosts happily drop them without sending an ICMP reply back. - Alex Vitale
You need to remember that ICMP delivery is best-effort, not reliable, and also that many firewalls are configured to only allow ICMP echo packets and not other types. - Ben Voigt

3 Answers

8
votes

I fail to see the problem. You are setting the "don't fragment" bit, and you send a packet smaller than the sending host's MTU but larger than the receiving host's MTU. Of course nobody will fragment here (doing so would violate the DF bit). Instead, the sending host should get an ICMP message back.

Edit: IP specifies that an ICMP error message type 3 (destination unreachable) code 4 (Fragmentation Required but DF Bit Is Set) is sent to the originating host at the point where the fragmentation would have occurred. The TCP layer handles this on its own for PMTU discovery. On connection-less sockets, Linux reports the error in the socket's error queue if the IP_RECVERR option is activated; see ip(7).
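As a rough sketch of what that can look like on a Linux UDP socket: the code below sets IP_PMTUDISC_DO, enables IP_RECVERR, and, when a send fails with EMSGSIZE, asks the kernel for the currently known path MTU through the IP_MTU option that ip(7) documents for connected sockets. The helper names are invented for illustration, the socket is assumed to be connect()ed to the peer, and the 28-byte overhead assumes an IPv4 header without options.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Assumed overhead: 20-byte IPv4 header (no options) + 8-byte UDP header. */
enum { IPV4_UDP_OVERHEAD = 20 + 8 };

/* Largest UDP payload that fits in one unfragmented packet for this MTU. */
int max_udp_payload(int mtu)
{
    return mtu > IPV4_UDP_OVERHEAD ? mtu - IPV4_UDP_OVERHEAD : 0;
}

/* Set DF on outgoing datagrams and ask the kernel to queue ICMP errors. */
int enable_pmtu_discovery(int fd)
{
    int mode = IP_PMTUDISC_DO;
    int on = 1;

    if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof mode) < 0)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof on);
}

/*
 * Hypothetical helper: send one datagram on a connect()ed UDP socket.
 * If the kernel rejects it with EMSGSIZE, store the payload size that
 * fits the currently known path MTU in *payload_limit so the caller
 * can retry with a smaller block.
 */
ssize_t send_or_learn_limit(int fd, const void *buf, size_t len,
                            int *payload_limit)
{
    ssize_t n = send(fd, buf, len, 0);

    if (n < 0 && errno == EMSGSIZE) {
        int mtu = 0;
        socklen_t optlen = sizeof mtu;

        /* IP_MTU is only valid with getsockopt() on a connected socket. */
        if (getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &optlen) == 0)
            *payload_limit = max_udp_payload(mtu);
        errno = EMSGSIZE; /* report the original failure to the caller */
    }
    return n;
}
```

A caller would typically loop: try to send, and on -1 with EMSGSIZE shrink the block size to *payload_limit and resend. Note that this only learns a smaller MTU if the ICMP "fragmentation needed" message actually reaches the sender; if it is filtered or never generated, the datagrams are lost silently and the application needs its own timeout and fallback.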

2
votes

That "DF bit" you're setting stands for "Don't Fragment". The IP layer should not be expected to fragment packets when you've told it not to.

1
votes

It is not correct to run hosts with different interface MTUs on the same subnet [1].

This is a host/network misconfiguration, and IP path MTU discovery is not expected to work correctly in this situation.

If you wish to test your application's path MTU discovery, you will need to set up multiple subnets with different MTUs, connected by a router [2]. In this situation, the router is the device that will pick up the MTU mismatch and send back an ICMP "Fragmentation Needed" error.


[1] Well, technically, same broadcast domain.
[2] The devices sold as "home routers" are really router/switches: they route between the WAN and the LAN, but switch between the Ethernet ports on the LAN. This isn't sufficient to separate networks with different MTUs.