5
votes

I am looking into filtering incoming UDP traffic by IP address on a Linux machine, completely discarding packets that match any of the filter addresses. The set of IP addresses I am interested in changes dynamically (and frequently) and is not known a priori. Discarded packets should skip all further processing. I can grant the CAP_NET_RAW capability to the process, but I do not want to write my own driver or modify the kernel.

Background information

A practical approach I am using to represent a large set of IP addresses compactly is a Bloom filter. The same idea is already used by a dynamic packet filtering scheme implemented as a device driver:

http://luca.ntop.org/Blooms.pdf
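For readers unfamiliar with the idea, here is a minimal user-space sketch of a Bloom filter over IPv4 addresses. It is not the implementation from the paper above: the filter size, the number of probes, the mixing function (written in the style of the MurmurHash3 finalizer) and all names (`bloom_add`, `bloom_maybe_contains`, ...) are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS   (1u << 20)  /* assumed filter size in bits (power of two) */
#define BLOOM_HASHES 4u          /* assumed number of probes per address */

struct bloom {
    uint8_t bits[BLOOM_BITS / 8];
};

/* Seeded 32-bit mixer; constants follow the MurmurHash3 finalizer. */
static uint32_t mix32(uint32_t x, uint32_t seed)
{
    x ^= seed;
    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

/* Probe position i for an address, derived by double hashing. */
static uint32_t bloom_pos(uint32_t ip, uint32_t i)
{
    uint32_t h1 = mix32(ip, 0x9e3779b9u);
    uint32_t h2 = mix32(ip, 0x7f4a7c15u) | 1u;  /* force an odd stride */
    return (h1 + i * h2) & (BLOOM_BITS - 1);
}

static void bloom_clear(struct bloom *b)
{
    assert(b != NULL);
    memset(b->bits, 0, sizeof b->bits);
}

/* Record an IPv4 address (host byte order) in the filter. */
static void bloom_add(struct bloom *b, uint32_t ip)
{
    assert(b != NULL);
    for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
        uint32_t pos = bloom_pos(ip, i);
        b->bits[pos >> 3] |= (uint8_t)(1u << (pos & 7));
    }
}

/* Returns 0 if the address was definitely never added,
 * 1 if it was added or happens to be a false positive. */
static int bloom_maybe_contains(const struct bloom *b, uint32_t ip)
{
    assert(b != NULL);
    for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
        uint32_t pos = bloom_pos(ip, i);
        if (!(b->bits[pos >> 3] & (1u << (pos & 7))))
            return 0;
    }
    return 1;
}
```

A plain Bloom filter like this cannot delete entries, so a frequently changing address set would need either periodic rebuilding or a counting variant.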

However, I have user-level code and have no means to tweak the kernel or write a device driver of my own.

I also already have a solution that efficiently sniffs packets by IP address using a PF_PACKET socket and an RX_RING, as is done in netsniff-ng:

http://netsniff-ng.org/

My approach extends the capture mechanism of netsniff-ng (or tcpdump or Wireshark) with the Bloom filter principle to obtain more compact Berkeley Packet Filter (BPF) programs. This works excellently, but with one side effect: even if the filter discards a packet (so that it does not appear in the RX_RING), the packet still continues its journey through the kernel. Eventually, because there are no open sockets for much of the filtered traffic (which is mostly synthesized, as if by netsniff-ng's trafgen), ICMP destination-unreachable messages are generated.

Put differently: is there a C/C++ approach for selectively discarding traffic based on custom code (e.g. a Bloom filter) at an early stage of network stack processing?

I have looked at approaches based on iptables, but managing firewall rules via iptables-restore seems far too cumbersome for this scenario. Also, the addresses do not form contiguous ranges, so they would lead to a long list of individual addresses to test against.

Efficiency is a critical aspect due to the high volume of traffic involved.


2 Answers

3
votes

You might want to play with iptables, perhaps using the libipq and libiptc libraries.

0
votes

This answer won't help much with selectively discarding traffic, but instead of the Bloom filter, you could consider using a Patricia trie (radix tree) for representing and looking up the IP addresses. At work, we have to store and look up large sets of dynamic IP addresses and ranges, and we have found the Patricia trie to be among the most efficient structures for this.
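As a rough illustration of the lookup idea, here is a simplified sketch. Note that it is a plain binary trie with one level per address bit, not a true Patricia trie (which would additionally compress chains of single-child nodes), and all names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct trie_node {
    struct trie_node *child[2];
    int terminal;  /* nonzero if a stored prefix ends at this node */
};

static struct trie_node *trie_new(void)
{
    return calloc(1, sizeof(struct trie_node));
}

/* Insert prefix addr/len (addr in host byte order, 0 <= len <= 32).
 * Returns 0 on success, -1 if memory allocation fails. */
static int trie_insert(struct trie_node *root, uint32_t addr, int len)
{
    assert(root != NULL && len >= 0 && len <= 32);
    struct trie_node *n = root;
    for (int i = 0; i < len; i++) {
        int bit = (addr >> (31 - i)) & 1;  /* walk from the top bit down */
        if (!n->child[bit]) {
            n->child[bit] = trie_new();
            if (!n->child[bit])
                return -1;
        }
        n = n->child[bit];
    }
    n->terminal = 1;
    return 0;
}

/* Returns 1 if addr is covered by any stored prefix, else 0. */
static int trie_match(const struct trie_node *root, uint32_t addr)
{
    const struct trie_node *n = root;
    for (int i = 0; n; i++) {
        if (n->terminal)
            return 1;
        if (i == 32)
            break;
        n = n->child[(addr >> (31 - i)) & 1];
    }
    return 0;
}

/* Release the whole trie, including the root. */
static void trie_free(struct trie_node *n)
{
    if (!n)
        return;
    trie_free(n->child[0]);
    trie_free(n->child[1]);
    free(n);
}
```

Unlike a Bloom filter, this gives exact answers and handles ranges expressed as prefixes, at the cost of pointer chasing and more memory per entry.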

BTW, I tried loading the Bloom link, and it won't load for me.