3
votes

Can anyone suggest why this (classic) BPF program sometimes lets non-DHCP-response packets through:

# Load the Ethertype field
BPF_LD | BPF_H | BPF_ABS    12
# And reject the packet if it's not 0x0800 (IPv4)
BPF_JMP | BPF_JEQ | BPF_K   0x0800    0    8

# Load the IP protocol field
BPF_LD | BPF_B | BPF_ABS    23
# And reject the packet if it's not 17 (UDP)
BPF_JMP | BPF_JEQ | BPF_K   17        0    6

# Check that the packet has not been fragmented
BPF_LD | BPF_H | BPF_ABS    20
BPF_JMP | BPF_JSET | BPF_K  0x1fff    4    0

# Load the IP header length field
BPF_LDX | BPF_B | BPF_MSH   14
# And load that offset + 16 to get the UDP destination port
BPF_LD | BPF_IND | BPF_H    16
# And reject the packet if the destination port is not 68
BPF_JMP | BPF_JEQ | BPF_K   68        0    1

# Accept the frame
BPF_RET | BPF_K             1500
# Reject the frame
BPF_RET | BPF_K             0

It doesn't let every frame through, but under heavy network load it fails quite often. I'm testing it with this Python 3 program:

import ctypes
import struct
import socket
ETH_P_ALL = 0x0003
SO_ATTACH_FILTER = 26
SO_ATTACH_BPF = 50
filters = [
0x28, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,  0x15, 0x00, 0x00, 0x08, 0x00, 0x08, 0x00, 0x00,
0x30, 0x00, 0x00, 0x00, 0x17, 0x00, 0x00, 0x00,  0x15, 0x00, 0x00, 0x06, 0x11, 0x00, 0x00, 0x00,
0x28, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00,  0x45, 0x00, 0x04, 0x00, 0xff, 0x1f, 0x00, 0x00,
0xb1, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x00, 0x00,  0x48, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
0x15, 0x00, 0x00, 0x01, 0x44, 0x00, 0x00, 0x00,  0x06, 0x00, 0x00, 0x00, 0xdc, 0x05, 0x00, 0x00,
0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
]

filters = bytes(filters)

b = ctypes.create_string_buffer(filters)
mem_addr_of_filters = ctypes.addressof(b)
pf = struct.pack("HL", 11, mem_addr_of_filters)
pf = bytes(pf)

def main():
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    sock.bind(("eth0", ETH_P_ALL))
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, pf)
#    sock.send(req)
    sock.settimeout(1)
    try:
        data = sock.recv(1500)

        if data[35] == 0x43:
            return
        print('Packet got through: 0x{:02x} 0x{:02x}, 0x{:02x}, 0x{:02x}'.format(data[12], data[13], data
    except:
        print('Timeout')
        return
    sock.close()

for ii in range(1000):
    main()
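
To check that the byte array really encodes the listing above, it can be decoded back into (code, jt, jf, k) tuples. This is only a sketch: it assumes a little-endian host and the usual 8-byte struct sock_filter layout (16-bit code, 8-bit jt, 8-bit jf, 32-bit k). FILTER_BYTES and decode_filter are names made up for this example.

```python
import struct

# The same 88 bytes as the `filters` list in the script: 11 instructions of 8 bytes each.
FILTER_BYTES = bytes([
    0x28, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00,  0x15, 0x00, 0x00, 0x08, 0x00, 0x08, 0x00, 0x00,
    0x30, 0x00, 0x00, 0x00, 0x17, 0x00, 0x00, 0x00,  0x15, 0x00, 0x00, 0x06, 0x11, 0x00, 0x00, 0x00,
    0x28, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00,  0x45, 0x00, 0x04, 0x00, 0xff, 0x1f, 0x00, 0x00,
    0xb1, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x00, 0x00,  0x48, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
    0x15, 0x00, 0x00, 0x01, 0x44, 0x00, 0x00, 0x00,  0x06, 0x00, 0x00, 0x00, 0xdc, 0x05, 0x00, 0x00,
    0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
])


def decode_filter(raw):
    """Split raw classic-BPF bytes into (code, jt, jf, k) tuples.

    Assumption: little-endian encoding, as in the byte array above.
    """
    return list(struct.iter_unpack("<HBBI", raw))


for code, jt, jf, k in decode_filter(FILTER_BYTES):
    print("code=0x{:02x} jt={} jf={} k=0x{:x}".format(code, jt, jf, k))
```

Each decoded tuple should line up with one instruction in the listing, so the encoding itself can be ruled out as the cause.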

If I run this while SCPing a big core file to the host running the script, most iterations (though not all) receive an unwanted packet before the one-second timeout expires. Under lighter load, failures are much rarer, e.g. when just typing over an ssh link while the socket is receiving; sometimes it gets through 1000 iterations without a single failure.

The host in question runs Linux 4.9.0. The kernel has CONFIG_BPF=y.

Edit

For a simpler version of the same question, why does this BPF program let through any packets at all:

BPF_RET | BPF_K    0

Edit 2

The tests above were run on an ARM64 machine. I've retested on amd64 / Linux 5.9.0. I still see failures, though not nearly as many.

1 Answer

1
votes

I got a response on LKML explaining this.

The problem is that the filter is applied when a frame arrives on the interface and is queued on the socket, not when it is passed to userspace by recv(). Under heavy load, frames can arrive between the creation of the socket with socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL)) and the attachment of the filter with sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, pf). Those frames are already sitting unfiltered in the socket's receive queue; the filter only affects frames that arrive after it has been attached. This is also why even a filter consisting solely of BPF_RET | BPF_K 0 appears to let some packets through.

So once the filter is applied, it's necessary to "drain" any queued frames from the socket before you can rely on the filter.
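
A minimal sketch of such a drain, assuming it is called right after the setsockopt() call in the question's script. drain_socket is a hypothetical helper name, not something from the LKML reply; it switches the socket to non-blocking mode, reads until the queue is empty, then restores the previous timeout. It is meant for packet or datagram sockets like the one in the question.

```python
import socket


def drain_socket(sock, bufsize=65535):
    """Discard everything currently queued on `sock`.

    Returns the number of frames that were thrown away.
    """
    dropped = 0
    old_timeout = sock.gettimeout()
    sock.setblocking(False)
    try:
        while True:
            try:
                sock.recv(bufsize)
            except BlockingIOError:
                # Receive queue is empty: nothing left to drain.
                break
            dropped += 1
    finally:
        # Put the socket back into whatever mode the caller had set.
        sock.settimeout(old_timeout)
    return dropped
```

In the script above this would be used as drain_socket(sock) on the line after sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, pf). Note that a wanted frame arriving during the drain is discarded too. One way to narrow that window, as far as I know, is to first attach a filter that rejects everything (the single BPF_RET | BPF_K 0 instruction), drain, and only then attach the real filter.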