Two-way communication to PCIe device via /dev/mem in Linux user-space?

Question

I'm pretty sure I already know the answer to this question, since there are related questions on SO already (here, here, and here, and this was useful), but I wanted to be absolutely sure before I dive into kernel-space driver land (never been there before).

I have a PCIe device that I need to communicate with (and vice versa) from an app in Linux user space. By opening /dev/mem and then mmap'ing, I have been able to write a user-space driver built on top of pciutils that lets me mmap the BARs and successfully write data to the device. Now we need communication to go in the other direction, from the PCIe device to the Linux user app. For this to work, we believe we are going to need a large chunk (~100MB) of physically contiguous memory that never gets paged/swapped. Once allocated, that address will need to be passed to the PCIe device so it knows where to write its data (thus I don't see how this could be virtual, swappable memory). Is there any way to do this without a kernel-space driver? One idea floated here: perhaps we can open /dev/mem and then feed it an ioctl command to allocate what we need? If this is possible, I haven't been able to find any examples online yet and will need to research it more heavily.
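
For reference, the BAR mapping described above can be sketched roughly like this. It is a minimal sketch, not the actual driver: phys_window, phys_window_map and phys_window_unmap are hypothetical names, and the BAR's physical address is assumed to be known already (for example from pciutils). The one detail it shows is that mmap needs a page-aligned offset, so the address is rounded down and the remainder is added back to the returned pointer.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t, useful on 32-bit systems */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: a mapped window plus what munmap() needs later. */
struct phys_window {
    void             *map_base; /* page-aligned address returned by mmap() */
    size_t            map_len;  /* number of bytes actually mapped */
    volatile uint8_t *regs;     /* pointer to the requested offset */
};

/* Map `len` bytes starting at byte offset `phys` of the file at `path`
 * (for a BAR, `path` would be "/dev/mem" and `phys` the BAR address).
 * Returns 0 on success, -1 on failure. */
int phys_window_map(const char *path, off_t phys, size_t len,
                    struct phys_window *w)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* mmap() requires a page-aligned offset: round down, keep the rest. */
    off_t  aligned = phys & ~((off_t)page - 1);
    size_t delta   = (size_t)(phys - aligned);

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    void *p = mmap(NULL, len + delta, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, aligned);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    w->map_base = p;
    w->map_len  = len + delta;
    w->regs     = (volatile uint8_t *)p + delta;
    return 0;
}

void phys_window_unmap(struct phys_window *w)
{
    munmap(w->map_base, w->map_len);
}
```

Calling phys_window_map("/dev/mem", bar_addr, bar_len, &w) as root, where bar_addr and bar_len are placeholders for the BAR's address and size, would then give register access through w.regs. This only covers the host-to-device direction; it does not provide a buffer the device can write into.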

Assuming we need a kernel-space driver, it would be best to allocate our large chunk during boot, then use ioremap to get a kernel virtual address, then mmap from there to user space, correct? From what I've read on kmalloc, we won't get anywhere close to 100MB using that call, and vmalloc is no good since that's virtual memory. In order to allocate at boot, the driver should be statically linked into the kernel, correct? This is basically an embedded application, so portability is not a huge concern to me. A module rather than a statically linked driver could probably work, but my worry there is that memory fragmentation could prevent a physically contiguous region from being found, so I'd like to allocate it as soon as possible after power-on. Any feedback?

EDIT1: My CPU is an ARM7 architecture.

Ctx · Accepted Answer · 2016-01-19T23:25:25

Hugepages-1G

Current x86_64 processors support not only 4K and 2M pages but also 1G pages (the pdpe1gb flag in /proc/cpuinfo indicates support).
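
If you want to do that check from code rather than by eye, something along these lines should work. This is only a sketch: cpuinfo_has_flag is a made-up helper name, and it assumes the x86 layout of /proc/cpuinfo, where the feature list sits on lines beginning with "flags".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `flag` appears as a whole word on a "flags" line of the
 * cpuinfo-style file at `path`, 0 if it does not, -1 if the file cannot
 * be opened. */
int cpuinfo_has_flag(const char *path, const char *flag)
{
    assert(path != NULL && flag != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char line[8192]; /* the flags line is long, so use a generous buffer */
    int found = 0;
    while (!found && fgets(line, sizeof line, f)) {
        if (strncmp(line, "flags", 5) != 0)
            continue;
        char *list = strchr(line, ':');
        if (!list)
            continue;
        /* Split the list on whitespace and compare whole tokens. */
        for (char *tok = strtok(list + 1, " \t\n"); tok;
             tok = strtok(NULL, " \t\n")) {
            if (strcmp(tok, flag) == 0) {
                found = 1;
                break;
            }
        }
    }
    fclose(f);
    return found;
}
```

cpuinfo_has_flag("/proc/cpuinfo", "pdpe1gb") returning 1 would mean the CPU advertises 1G pages.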

These 1G pages must be reserved at kernel boot, so the boot parameters hugepagesz=1GB hugepages=1 must be specified.
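
After rebooting with those parameters, it is worth confirming that the reservation actually happened. The kernel exposes per-size counters under /sys/kernel/mm/hugepages; for 1G pages the directory is hugepages-1048576kB. Below is a small sketch for reading one of those counters; read_counter and the two macro names are my own, not part of any API.

```c
#include <assert.h>
#include <stdio.h>

/* sysfs counters for 1 GiB huge pages (1048576 kB). */
#define NR_1G_PAGES \
    "/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages"
#define FREE_1G_PAGES \
    "/sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages"

/* Read one unsigned decimal counter from the file at `path`.
 * Returns 0 on success, -1 if the file is missing or not a number. */
int read_counter(const char *path, unsigned long *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int ok = (fscanf(f, "%lu", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

If read_counter(NR_1G_PAGES, &n) succeeds and n is at least 1, the page was reserved; FREE_1G_PAGES tells you whether it is still unused.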

Then, the hugetlbfs must be mounted:

mkdir /hugetlb-1G
mount -t hugetlbfs -o pagesize=1G none /hugetlb-1G

Then open a file on that mount and mmap it:

int fd = open("/hugetlb-1G/page-1", O_CREAT | O_RDWR, 0755);
void *addr = mmap(NULL, SIZE_1G, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

You can now access 1G of physically contiguous memory at addr. To be sure it doesn't get swapped out, you can call mlock, though this is probably unnecessary because huge pages are not swapped out.
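
Put together with error handling, the open/mmap/mlock sequence might look like the sketch below. The function name map_dma_buffer is hypothetical, SIZE_1G is defined here only for illustration, and a failing mlock is treated as non-fatal and merely reported to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define SIZE_1G ((size_t)1 << 30)

/* Open (creating if needed) the file at `path` and map `len` bytes of it
 * shared and read/write. On a hugetlbfs mount, `len` must be a multiple
 * of the huge page size. Returns the mapping, or NULL on failure.
 * *locked is set to 1 if mlock() succeeded, 0 otherwise. */
void *map_dma_buffer(const char *path, size_t len, int *locked)
{
    assert(path != NULL && len > 0 && locked != NULL);

    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;

    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping keeps the file referenced */
    if (addr == MAP_FAILED)
        return NULL;

    /* Pin the pages; failure (e.g. RLIMIT_MEMLOCK) is reported, not fatal. */
    *locked = (mlock(addr, len) == 0);
    return addr;
}
```

With the mount from above, the call would be map_dma_buffer("/hugetlb-1G/page-1", SIZE_1G, &locked), and munmap(addr, SIZE_1G) releases the mapping again.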

Even if your process crashes, the huge page stays reserved for mapping as shown above, so the PCIe device will not make rogue writes into system or process memory.

You can find out the physical address by reading /proc/pid/pagemap.
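
As a rough sketch of that lookup: pagemap holds one 64-bit entry per base-size virtual page, with bit 63 set when the page is present and bits 0-54 holding the page frame number. The names pagemap_decode and virt_to_phys are my own. Note that newer kernels report the frame number as zero unless the process has CAP_SYS_ADMIN, so this sketch treats a zero frame number as "not available".

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t, needed on 32-bit systems */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define PM_PRESENT  (1ULL << 63)        /* bit 63: page present in RAM */
#define PM_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: page frame number */

/* Turn one pagemap entry into a physical address for `vaddr`.
 * Returns 0 on success, -1 if the page is not present or the frame
 * number is hidden (reported as zero). */
int pagemap_decode(uint64_t entry, uint64_t vaddr, uint64_t page_size,
                   uint64_t *phys)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t pfn = entry & PM_PFN_MASK;
    if (!(entry & PM_PRESENT) || pfn == 0)
        return -1;
    *phys = pfn * page_size + (vaddr & (page_size - 1));
    return 0;
}

/* Look up the physical address behind `vaddr` in the calling process.
 * The page should have been touched first so that it is present. */
int virt_to_phys(const void *vaddr, uint64_t *phys)
{
    uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t va = (uint64_t)(uintptr_t)vaddr;
    uint64_t entry;

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;
    /* One 8-byte entry per virtual page, indexed by virtual page number. */
    ssize_t n = pread(fd, &entry, sizeof entry,
                      (off_t)(va / page_size * sizeof entry));
    close(fd);
    if (n != (ssize_t)sizeof entry)
        return -1;
    return pagemap_decode(entry, va, page_size, phys);
}
```

Run as root, virt_to_phys(addr, &phys) on the mapped huge page should yield the address to hand to the device. If an IOMMU sits between the device and memory, the address the device must use can differ from this physical address.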
