2
votes

I'm trying to understand the NAPI implementation in the Linux kernel. These are my basic doubts.

1) NAPI disables further interrupts and handles the skbs by polling.

  • Who disables it?
  • Should the interrupt handler disable it?

    If yes, isn't the time gap between disabling the interrupt and running the net_rx_action softirq, where the polling is actually done, far too long?

2) By default, do all NAPI-enabled drivers disable interrupts on receiving a single frame and handle the remaining frames by polling in the bottom half? Or is there logic that switches to poll mode only once more than 32 frames have been received back to back in the IRQ handler?

3) Now, coming to shared IRQs:

  • What happens to other devices' interrupts? Another device's bottom half might not run, since that device is not in the poll_list.

1 Answer

6
votes

I wrote a comprehensive guide to understanding, tuning, and optimizing the Linux network stack which explains everything about network drivers, NAPI, and more, so check it out.

As far as your questions:

  1. Device IRQs are supposed to be disabled by the driver's IRQ handler after NAPI is enabled. Yes, there is a time gap, but it should be quite small. That is part of the tradeoff decision you must make: do you care more about throughput or latency? Depending on which, you can optimize your network stack appropriately. In any case, most NICs allow the user to increase (or decrease) the size of the ring buffer that tracks incoming network data. So, a pause is fine because packets will just be queued for processing later.

  2. It depends on the driver, but in general most drivers will enable NAPI poll mode in the IRQ handler, as soon as it is fired (usually) with a call to napi_schedule. You can find a walkthrough of how NAPI is enabled for the Intel igb driver here. Note that IRQ handlers are not necessarily fired for every single packet. You can adjust the rate at which IRQ handlers fire on most cards by using a feature called interrupt coalescing. Some NICs may not support this option.

  3. The IRQ handlers for other devices will be executed when the IRQ is fired because IRQ handlers have very high priority on the CPU. The NAPI poll loop (which runs in a SoftIRQ) will run on whichever CPU handled the device IRQ. Thus, if you have multiple NICs and multiple CPUs, you can tune the IRQ affinity of the IRQs for each NIC to prevent starving a particular NIC.

  4. As for the example you asked about in the comments:

Say NIC 1 and NIC 2 share an IRQ line. Let's assume NIC 1 is under low load and NIC 2 under high load, and NIC 1 receives an interrupt. The driver of NIC 1 would disable the interrupt until its softirq is handled; call that time gap t1. So for time t1, NIC 2's interrupts are disabled too, right?

This depends on the driver, but in the normal case, NIC 1 only disables interrupts while the IRQ handler is being executed. The call to napi_schedule tells the softirq code that it should start running if it hasn't started yet. The softirq code runs asynchronously, so no, NIC 1 does not wait for the softirq to be handled.
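To make the flow concrete, here is a rough user-space sketch in Python of the hand-off described above. It is a toy model, not kernel code: `ToyNic`, its `irq_enabled` flag, and this simplified `net_rx_action` are made up for illustration, and appending to `poll_list` merely stands in for napi_schedule. The budget of 64 is an assumed per-poll limit. The point it tries to show is that one interrupt is enough to enter poll mode, and frames arriving before the softirq runs just wait in the ring buffer.

```python
from collections import deque


class ToyNic:
    """User-space toy model of a NAPI-style NIC driver (not kernel code)."""

    def __init__(self, name, ring_size=256):
        self.name = name
        self.ring = deque()          # RX ring buffer filled by the "hardware"
        self.ring_size = ring_size
        self.irq_enabled = True      # models the device's interrupt-mask register
        self.irq_count = 0           # how many times the IRQ handler ran
        self.dropped = 0             # frames lost because the ring was full

    def receive(self, frame, poll_list):
        """Hardware side: queue a frame and raise an IRQ only if unmasked."""
        if len(self.ring) >= self.ring_size:
            self.dropped += 1
            return
        self.ring.append(frame)
        if self.irq_enabled:
            self.irq_handler(poll_list)

    def irq_handler(self, poll_list):
        """Top half: mask this device's IRQ and schedule polling, nothing more."""
        self.irq_count += 1
        self.irq_enabled = False     # ask the device to stop interrupting
        poll_list.append(self)       # stands in for napi_schedule()

    def poll(self, budget):
        """Bottom half: process up to `budget` frames from the ring."""
        done = 0
        while self.ring and done < budget:
            self.ring.popleft()
            done += 1
        return done


def net_rx_action(poll_list, budget=64):
    """Softirq side: poll each scheduled device once."""
    for _ in range(len(poll_list)):
        nic = poll_list.popleft()
        if nic.poll(budget) < budget:
            nic.irq_enabled = True   # ring drained: back to interrupt mode
        else:
            poll_list.append(nic)    # budget used up: stay in poll mode


poll_list = deque()
nic = ToyNic("nic1")
for i in range(100):                 # a burst arrives before the softirq runs
    nic.receive(i, poll_list)
print(nic.irq_count, len(nic.ring))  # prints "1 100"
while poll_list:
    net_rx_action(poll_list)
print(nic.irq_enabled, len(nic.ring))  # prints "True 0"
```

In this model the handler returns immediately after scheduling the poll, and interrupts for that device come back on only once a poll pass finishes under budget.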

Now, as far as shared IRQs go: again it depends on the device and the driver. The driver should be written in such a way that it can handle shared IRQs. If the driver disables an IRQ that is being shared, all devices sharing that IRQ will not receive interrupts. This would be bad. One way that some devices solve this is by allowing a driver to read/write to a specific register causing that specific device to stop generating interrupts. This is a preferred solution as it does not block other devices generating the same IRQ.

When IRQs are disabled for NAPI, what is meant is that the driver asks the NIC hardware to stop sending IRQs. Thus, other IRQs on the same line (for other devices) will still continue to be processed. Here's an example of how the Intel igb driver turns off IRQs for that device by writing to registers.
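The difference between masking one device and disabling the shared line can be sketched with another small Python toy model. Everything here is hypothetical (`Device`, `SharedIrqLine`, the `masked` and `line_disabled` flags); it only mimics the idea of a per-device interrupt-mask register versus turning off the whole IRQ line, and does not reflect any real driver's code.

```python
class Device:
    """Toy device with its own interrupt-mask register (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.masked = False      # per-device mask, like writing a NIC register
        self.pending = False     # device has work that wants service
        self.handled = 0         # how many times its handler did real work

    def raise_irq(self, line):
        self.pending = True
        if not self.masked:      # a masked device stays silent on the line
            line.fire()

    def handler(self):
        """Shared-IRQ handler: claim the interrupt only if it is ours."""
        if self.masked or not self.pending:
            return False         # "not my interrupt"
        self.pending = False
        self.handled += 1
        return True              # "handled"


class SharedIrqLine:
    """One IRQ line shared by several devices."""

    def __init__(self, devices):
        self.devices = devices
        self.line_disabled = False   # models disabling the whole line

    def fire(self):
        if self.line_disabled:
            return                   # nobody on the line gets serviced
        for dev in self.devices:     # every registered handler is called
            dev.handler()


nic1, nic2 = Device("nic1"), Device("nic2")
line = SharedIrqLine([nic1, nic2])

nic1.masked = True               # NIC 1 is in poll mode: device-level mask only
for _ in range(5):
    nic2.raise_irq(line)
print(nic2.handled)              # prints 5: NIC 2 is unaffected

line.line_disabled = True        # the bad alternative: disable the shared line
nic2.raise_irq(line)
print(nic2.handled)              # prints 5: NIC 2's interrupt was not serviced
```

With the device-level mask, NIC 2 keeps getting serviced while NIC 1 polls; with the line disabled, NIC 2's interrupt is left pending, which is the situation a shared-IRQ-aware driver avoids.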