0
votes

I want to bypass the Linux network stack, deliver raw packets to my own userland code, and handle them there.

I know that you can write custom drivers using PF_RING, DPDK, and others. But I cannot understand why I should write this kind of driver when I can use Netfilter, hook my module at NF_IP_PRE_ROUTING, and send the packets to userland.

It would be a great help if anyone could explain the main differences between them.


2 Answers

2
votes

There is a huge difference between DPDK and Netfilter hooks. When you use Netfilter and hook NF_IP_PRE_ROUTING, you hijack the packet flow and copy packets from kernel space to user space. This copy causes a large overhead.

When using DPDK, you are actually mapping your network card's packet buffers into a userspace memory area. Instead of the kernel taking an interrupt from the NIC and passing the packet through all its queues until it reaches NF_IP_PRE_ROUTING, which in turn copies the packet to userland upon request, DPDK lets you access the mapped packet buffers directly from userspace. This bypasses all meta-handling by the kernel and improves performance (at the cost of code complexity and security).
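To make the copy-versus-mapping distinction concrete, here is a toy model in Python. It is not DPDK or Netfilter code: a `bytearray` stands in for the NIC's packet buffer memory, and the names `nic_buffers`, `receive_by_copy`, and `receive_mapped` are made up for illustration. It only shows that a copy is a separate snapshot, while a mapped view reads the buffer itself.

```python
# Toy model only: a bytearray stands in for the NIC's packet buffer memory.
# All names here are hypothetical and are not part of DPDK or Netfilter.
PACKET_SIZE = 64
NUM_SLOTS = 4

nic_buffers = bytearray(PACKET_SIZE * NUM_SLOTS)


def slot_bounds(slot):
    """Return the (start, end) byte offsets of a packet slot."""
    start = slot * PACKET_SIZE
    return start, start + PACKET_SIZE


def receive_by_copy(slot):
    """Copy style (kernel to userspace): the caller gets its own copy."""
    start, end = slot_bounds(slot)
    return bytes(nic_buffers[start:end])


def receive_mapped(slot):
    """Mapped style: the caller gets a view onto the same memory, no copy."""
    start, end = slot_bounds(slot)
    return memoryview(nic_buffers)[start:end]


copied = receive_by_copy(1)
mapped = receive_mapped(1)

# Simulate the NIC writing a new first byte into slot 1.
nic_buffers[PACKET_SIZE] = 0xAB

print(copied[0])  # 0: the copy is a snapshot and does not see the write
print(mapped[0])  # 171: the view reads the shared buffer directly
```

The per-packet copy in `receive_by_copy` is the kind of work the mapped approach avoids. The flip side is that the application now shares memory with the device, which is where the added complexity and security cost come from.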

1
votes

There are a variety of techniques to grab raw packets and deliver them to a userspace application. The devil, as usual, is in the details.

If all we need is to deliver packets to a userspace application, it makes no difference which solution we use: libpcap, tun/tap, Netfilter, PF_RING, or whatever. All will do just fine.

But if we need to process 100 million packets per second (~30 CPU cycles per packet at 3 GHz), I don't think we have any option at the moment other than DPDK. Search for "DPDK performance report" and have a look.
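The ~30 cycles figure is just the clock rate divided by the packet rate. A minimal sketch of that arithmetic follows; the function names and the `cores` parameter are assumptions added for illustration, and the default of one core matches the single 3 GHz figure above.

```python
def cycles_per_packet(cpu_hz, packets_per_second, cores=1):
    """Average CPU cycles available per packet.

    `cores` is an illustrative assumption: it models the load being
    spread evenly over several cores running at the same clock rate.
    """
    return cpu_hz * cores / packets_per_second


def nanoseconds_per_packet(packets_per_second):
    """Average wall-clock time between packets, in nanoseconds."""
    return 1e9 / packets_per_second


budget = cycles_per_packet(cpu_hz=3e9, packets_per_second=100e6)
interval = nanoseconds_per_packet(100e6)

print(budget)    # 30.0 cycles per packet on one 3 GHz core
print(interval)  # 10.0 ns between packets
```

At roughly 10 ns per packet there is little room for per-packet system calls or copies, which is why the choice of mechanism only starts to matter at these rates.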

DPDK is a framework that works well on many platforms (x86, ARM, POWER, etc.) and supports many NICs. There is no need to write a driver; support for the most popular NICs is already there.

There is also support for managing CPU cores, huge pages, memory buffers, encryption, IP fragmentation, and so on, all designed to be able to forward 100 Mpps. If we need that performance...