Context:

Debian 64 bits.

I am writing a Linux-only userspace networking stack that I may release as open source.

Everything is ready except one last thing.

The problem:

I know about poll/select/epoll and already use them heavily, but they are too complicated for my needs and tend to add latency (a few nanoseconds, which is already too much).

The need:

A simple means of notifying an application from the kernel that packets are ready to be processed, and the reverse, using a shared mmap'd file operating as a multi-ring buffer. It should obviously not incur a context switch.

I wrote a custom driver for my NIC (and plan to write others for the big league: 1-10 Gb).

I would like two shared arrays of int and two shared arrays of char. I already have the multiprocess, non-blocking design working.

One pair (int and char) for the kernel -> app direction, and another for app -> kernel.
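For illustration, one direction of that layout could look like the sketch below: a single-producer/single-consumer ring where the int array holds packet lengths and the char array holds payloads. This is only a userspace-testable sketch using C11 atomics; `RING_SLOTS`, `SLOT_BYTES`, `struct ring`, `ring_push` and `ring_pop` are hypothetical names, and on the kernel side the same ordering would have to be expressed with the kernel's own barrier primitives rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u     /* hypothetical slot count; must be a power of two */
#define SLOT_BYTES 2048u  /* hypothetical per-packet slot size */
static_assert((RING_SLOTS & (RING_SLOTS - 1u)) == 0,
              "RING_SLOTS must be a power of two");

/* One direction (e.g. kernel -> app). A second instance serves app -> kernel. */
struct ring {
    _Atomic uint32_t head;                       /* written only by the producer */
    _Atomic uint32_t tail;                       /* written only by the consumer */
    uint32_t len[RING_SLOTS];                    /* the "int" array: packet lengths */
    unsigned char data[RING_SLOTS][SLOT_BYTES];  /* the "char" array: payloads */
};

/* Producer: copy one packet in. Returns false if the ring is full or n is too big. */
static bool ring_push(struct ring *r, const void *pkt, uint32_t n)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (n > SLOT_BYTES || head - tail == RING_SLOTS)
        return false;
    uint32_t i = head & (RING_SLOTS - 1u);
    memcpy(r->data[i], pkt, n);
    r->len[i] = n;
    /* Release: the slot contents become visible before the new head does. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer: copy one packet out. Returns false if the ring is empty. */
static bool ring_pop(struct ring *r, void *out, uint32_t *n)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;
    uint32_t i = tail & (RING_SLOTS - 1u);
    *n = r->len[i];
    memcpy(out, r->data[i], *n);
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Because each index has exactly one writer, neither side needs a lock; the only open question is how the consumer learns that `head` has moved without polling forever.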

But how do I get notified at the very moment the mmap'd region has changed? I read that msync would do it, but it is slow too. That is my problem. Mutexes lead to dead-slow code. Spinlocks tend to waste CPU cycles under overload.

I am not talking about a busy while(1) loop that reads constantly either: that wastes CPU cycles.

What do you recommend?
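One common middle ground between "always block" and "always spin" is to poll for a bounded number of iterations and only then fall back to a blocking wait, with a flag telling the producer whether a wakeup is needed at all. The sketch below models that in plain userspace with an eventfd standing in for whatever the driver would actually use on the slow path (for example a wait queue). `struct notifier`, `notify_publish`, `notify_wait` and `spin_budget` are hypothetical names, and `pending` is a stand-in for "the ring is non-empty"; treat it as an assumption-laden illustration, not a drop-in design.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

struct notifier {
    _Atomic uint32_t pending;     /* stand-in for "ring is non-empty" */
    _Atomic bool     need_wakeup; /* set by the consumer just before it sleeps */
    int              efd;         /* blocking eventfd, used only on the slow path */
};

/* Producer: publish work, then signal only if the consumer asked for it.
 * Returns true if a wakeup was actually sent (i.e. a syscall was paid). */
static bool notify_publish(struct notifier *nt)
{
    atomic_fetch_add(&nt->pending, 1);  /* seq_cst: ordered before the flag check */
    if (!atomic_exchange(&nt->need_wakeup, false))
        return false;                   /* consumer is polling: no syscall */
    uint64_t one = 1;
    return write(nt->efd, &one, sizeof one) == (ssize_t)sizeof one;
}

/* Consumer: poll up to spin_budget extra times, then sleep until signalled.
 * Returns the number of pending items claimed, or 0 if the blocking read failed. */
static uint32_t notify_wait(struct notifier *nt, unsigned spin_budget)
{
    for (;;) {
        for (unsigned i = 0; i <= spin_budget; i++) {
            uint32_t n = atomic_exchange(&nt->pending, 0);
            if (n)
                return n;               /* fast path: found work while polling */
        }
        atomic_store(&nt->need_wakeup, true);
        /* Re-check after raising the flag. This closes the window where the
         * producer published between the last poll and the flag store. */
        uint32_t n = atomic_exchange(&nt->pending, 0);
        if (n)
            return n;                   /* a stale wakeup later is harmless */
        uint64_t cnt;
        if (read(nt->efd, &cnt, sizeof cnt) != (ssize_t)sizeof cnt)
            return 0;
        /* Woken (possibly spuriously): loop and poll again. */
    }
}
```

Under sustained load the consumer never leaves the polling loop and the producer never pays for a wakeup; the blocking cost is paid only when traffic goes quiet. The price is that the first packet after an idle period sees the full wakeup latency, and spurious wakeups have to be tolerated.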

It is my last step.

Thanks

Update:

I think I will have to pay the latency of setting the interrupt mask anyway. Ideally it should be amortized over the number of packets that arrive during that latency. The first few packets after a burst will always be slower, I guess, since I obviously don't loop forever.

The worst case is when packets arrive sparsely (which is why I am aiming for saturated-link performance in the first place). That worst case will be met at times. But who cares, it is still faster than the stock kernel anyway. Trade-offs, trade-offs :)

I am not an expert, but I believe it is not possible. AFAIK, context switches (and user-mode -> kernel-mode transitions) are necessarily heavy operations. Ask on lkml.org - Basile Starynkevitch
You cannot have a process "not incur a context-switch" and wait on something to happen in the kernel unless A) you are on a multi-core / SMP system, and B) you're busy waiting (e.g. spinlocks). By definition, anything that waits without "spinning" will do a context switch and allow another process / thread to run. - Brian McFarland
Why not simply send a signal, SIGIO for example? Also, a few nanoseconds is a couple of processor instructions. It is almost impossible to reach such latency; maybe you mean microseconds? - myaut
@sergeyklyaus: SIGIO is complex and slow. I thought about my problem during the night and may have found something. We are definitely talking about nanoseconds (counting every CPU cycle), not microseconds. - Larry
@Larry That's going to be amazingly challenging on a non-RTOS operating system communicating between userspace<->kernel such as Linux - you'll never have any guarantees the kernel doesn't decide to spend time doing something else for far longer than a few ns. The best you can do is a spinlock or park the user space process on a wait queue (with e.g. a custom ioctl) - and buffer as much as you can. - nos

1 Answer

It seems like you are taking approaches that are common in networking on embedded systems based on an RTOS.

In Linux you are not supposed to write your own network stack: the Linux kernel already has a good one. You are just expected to implement a NIC device driver (in the kernel) that hands all packets over to the Linux network stack for processing.

Linux network-related components are always in the kernel, and the problems you describe go some way to explaining why that is essential for reasonable performance.

The only exception is userspace network filters (e.g., for firewalls) that may be hooked into the iptables mechanism, and those incur higher latencies on packets that are routed through them.