I have an ARM Cortex-A9 quad-core device, and I'm writing a multi-process application. The processes share a single source of input: a DMA buffer that each of them maps with mmap().
I noticed that the processes take significantly longer to access the DMA memory than they do when I change the input source to an ordinary buffer allocated with malloc().
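A minimal sketch of how the two access times could be compared, assuming a simple sequential read of every byte. The helper name `time_read_pass` and its parameters are hypothetical (they are not part of my application); the same function would be called once with the mmap()'d DMA pointer and once with a malloc()'d buffer of the same size.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: reads every byte of buf once and returns the
 * elapsed wall-clock time in seconds. The volatile qualifier keeps the
 * compiler from optimizing the loads away. If checksum is non-NULL it
 * receives the byte sum, which also lets the caller verify the data. */
static double time_read_pass(const volatile uint8_t *buf, size_t size,
                             uint32_t *checksum)
{
    struct timespec start, end;
    uint32_t sum = 0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (size_t i = 0; i < size; i++)
        sum += buf[i];
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (checksum)
        *checksum = sum;

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Running several passes over each buffer and comparing the averages should give a steadier number than a single pass, since the first pass over a malloc()'d buffer also pays for page faults.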
I understand why a DMA buffer must be non-cacheable. However, I can determine when the buffer is stable (unchanged by the hardware, which is the case most of the time) and when it is dirty (the data has changed), so I thought I might get a significant speed improvement by making the memory region temporarily cacheable.
Is there a way to do that?
I'm currently using this line to map the memory:
void *buf = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phy_addr);
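For completeness, here is a slightly fuller sketch of that mapping step with error checking. It assumes `fd` is an already-open descriptor for whatever device exposes the buffer, and it rounds `phy_addr` down to a page boundary because mmap() expects a page-aligned offset. The wrapper name `map_region` and the `map_base`/`map_len` out-parameters are hypothetical, added only for illustration. It does not change how the region is cached.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper around the mmap() call above.
 * Returns a pointer to the byte at offset phy_addr, or NULL on failure.
 * *map_base and *map_len receive the values to pass to munmap() later. */
static void *map_region(int fd, off_t phy_addr, size_t size,
                        void **map_base, size_t *map_len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    /* Round the offset down to a page boundary and grow the length
     * by the same amount so the requested range is fully covered. */
    off_t aligned = phy_addr & ~((off_t)page - 1);
    size_t delta = (size_t)(phy_addr - aligned);
    size_t len = size + delta;

    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
                      fd, aligned);
    if (base == MAP_FAILED)
        return NULL;

    *map_base = base;
    *map_len = len;
    return (uint8_t *)base + delta;
}
```

Each process would call this once at startup and release the mapping with `munmap(map_base, map_len)` on exit.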
Thanks!