4
votes
  • I have a kernel module that receives an interrupt (top half) when an external GPIO pin changes.
  • After receiving the interrupt, the kernel module should wake up, or otherwise invoke, a function/thread in user space that starts processing. Time is very limited.
  • No data needs to be sent, just the signal.
  • The CPU is multicore, and the user-space app will be pinned to one core.

There are many ways to do kernel-to-user-space communication. Which one has the lowest latency, i.e. the shortest time between the ISR running and the user-space function starting?

(Side note: Yes, I can benchmark them. I'm asking because I may not know about every possible solution.)

2 Answers

4
votes

The interrupt handler could write directly into a mapped memory range provided by the process while the process busy-waits for that memory location to change. This can even be done in the top half and should give you the lowest latency possible. Make sure that the memory location is locked into physical memory, since it cannot be paged in during the ISR. The handler also runs outside the process's context, so the driver has to write through its own kernel mapping of that page rather than through the user-space address.

Similar approaches can be found with packet sockets and PACKET_MMAP, where communication between kernel and userspace takes place through a shared memory space (see the Kernel Documentation).

If you do not care about resource management through the OS (because you only have a single application waiting for external input), you could also access the hardware directly from user space (on x86, with iopl()/inb()/outb() and friends).

0
votes

A standard way for the process to wait for the kernel to wake it is to use the poll() system call and for your device driver's interrupt handler to wake any threads waiting for it.
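On the user-space side, the wait could look roughly like this. It is a sketch under assumptions: `wait_for_event` is a hypothetical helper, `fd` is an open descriptor for your device, and the driver is assumed to report `POLLIN` from its poll handler (typically by calling `poll_wait()` there and `wake_up_interruptible()` from the interrupt handler).

```c
#include <errno.h>
#include <poll.h>

/* Block until the driver signals an event on fd.
 * Returns 1 when the event is ready, 0 on timeout, or a negative errno value.
 * A timeout_ms of -1 waits indefinitely. */
int wait_for_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    do {
        /* Note: after EINTR the full timeout is restarted. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -errno;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -EIO;
}
```

Depending on how the driver tracks the pending event, the application may also need a `read()` or similar call afterwards to clear it before polling again.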

The highest-latency operation in your four bullets is waking the application thread, so if you need even lower latency, the thread has to be already awake and waiting for the event.

The lowest-latency mechanism I have used is to have the interrupt handler write a word in the application process's memory and to have a thread read that word in a loop, proceeding when the value changes. In effect, it is a spin lock between user space and kernel space. Use this mechanism when you can dedicate a CPU core to the spin lock or when you expect the waiting time to be very short.

You can use an ioctl() to pass in a user space pointer to the driver so it knows which word to update.
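From the application's point of view, the registration might look like the sketch below. The request code `GPIOEVT_IOC_SET_WORD` and the helper `register_event_word` are made up for illustration; a real driver defines its own ioctl numbers in a header shared with user space. The driver would still need to pin the page and map it into kernel space before the interrupt handler can touch it.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

/* Hypothetical request code: passes the address of the word as a 64-bit value. */
#define GPIOEVT_IOC_SET_WORD _IOW('g', 1, uint64_t)

/* Tell the driver which word to update on each interrupt.
 * The word must stay valid (and locked in memory) while it is registered.
 * Returns 0 on success or a negative errno value. */
int register_event_word(int fd, volatile uint32_t *word)
{
    uint64_t addr = (uint64_t)(uintptr_t)word;

    if (ioctl(fd, GPIOEVT_IOC_SET_WORD, &addr) != 0)
        return -errno;
    return 0;
}
```

Passing the address as a fixed-width 64-bit value keeps the ioctl layout the same for 32-bit and 64-bit processes, which is a design choice rather than a requirement.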