I'm building a data acquisition system based on the UltraScale+ FPGA equipped with arm64 CPU. The data are transmitted to RAM via DMA. The DMA buffers in the driver are reserved as below:
virt_buf[i] = dma_zalloc_coherent(&pdev->dev, BUF_SIZE, &phys_buf[i],GFP_KERNEL);
In the driver's mmap function, the mapping to the user space is done in the following way:
#ifdef ARCH_HAS_DMA_MMAP_COHERENT
printk(KERN_INFO "Mapping with dma_map_coherent DMA buffer at phys: %p virt %p\n",phys_buf[off],virt_buf[off]);
res = dma_mmap_coherent(&my_pdev->dev, vma, virt_buf[off], phys_buf[off], vsize);
#else
physical = phys_buf[off];
res=remap_pfn_range(vma,vma->vm_start, physical >> PAGE_SHIFT , vsize, pgprot_noncached(vma->vm_page_prot));
printk(KERN_INFO "Mapping with remap_pfn_range DMA buffer at phys: %p virt %p\n",physical,virt_buf[off]);
#endif
On my UltraScale+ CPU remap_pfn_range is used. In the user space application the data are read from the buffer, and currently immediately send in UDP packets with length limited to MAX_DGRAM (originally equal to 572).
int i = 0;
int bleft = nbytes;
while(i<nbytes) {
int bts = bleft < MAX_DGRAM ? bleft : MAX_DGRAM;
if (sendto(fd,&buf[nbuf][i],bts,0, res2->ai_addr,res2->ai_addrlen)==-1) {
printf("%s",strerror(errno));
exit(1);
}
bleft -= bts;
i+= bts;
}
Everything worked perfectly on the 32-bit Zynq FPGA. However, after I moved it to the 64-bit UltraScale+ FPGA, I started to receive random errors, after a few hundreds of transfers.
[ 852.703491] Unhandled fault: alignment fault (0x96000021) at 0x0000007f82635584
[ 852.710739] Internal error: : 96000021 [#4] SMP
[ 852.715235] Modules linked in: axi4s2dmov(O) ksgpio(O)
[ 852.720358] CPU: 0 PID: 1870 Comm: a4s2dmov_send Tainted: G D O 4.4.0 #3
[ 852.728001] Hardware name: ZynqMP ZCU102 RevB (DT)
[ 852.732769] task: ffffffc0718ac180 ti: ffffffc0718b8000 task.ti: ffffffc0718b8000
[ 852.740248] PC is at __copy_from_user+0x8c/0x180
[ 852.744836] LR is at copy_from_iter+0x70/0x24c
[ 852.749261] pc : [<ffffffc00039210c>] lr : [<ffffffc0003a36a8>] pstate: 80000145
[ 852.756644] sp : ffffffc0718bba40
[ 852.759935] x29: ffffffc0718bba40 x28: ffffffc06a4bae00
[ 852.765228] x27: ffffffc0718ac820 x26: 000000000000000c
[ 852.770523] x25: 0000000000000014 x24: 0000000000000000
[ 852.775818] x23: ffffffc0718bbe08 x22: ffffffc0710eba38
[ 852.781112] x21: ffffffc0718bbde8 x20: 000000000000000c
[ 852.786407] x19: 000000000000000c x18: ffffffc000823020
[ 852.791702] x17: 0000000000000000 x16: 0000000000000000
[ 852.796997] x15: 0000000000000000 x14: 00000000c0a85f32
[ 852.802292] x13: 0000000000000000 x12: 0000000000000032
[ 852.807586] x11: 0000000000000014 x10: 0000000000000014
[ 852.812881] x9 : ffffffc0718bbcf8 x8 : 000000000000000c
[ 852.818176] x7 : ffffffc0718bbdf8 x6 : ffffffc0710eba2c
[ 852.823471] x5 : ffffffc0710eba38 x4 : 0000000000000000
[ 852.828766] x3 : 000000000000000c x2 : 000000000000000c
[ 852.834061] x1 : 0000007f82635584 x0 : ffffffc0710eba2c
[ 852.839355]
[ 852.840833] Process a4s2dmov_send (pid: 1870, stack limit = 0xffffffc0718b8020)
[ 852.848134] Stack: (0xffffffc0718bba40 to 0xffffffc0718bc000)
[ 852.853858] ba40: ffffffc0718bba90 ffffffc0006a1b2c 000000000000000c ffffffc06a9bdb00
[ 852.861676] ba60: 00000000000005dc ffffffc071a0d200 0000000000000000 ffffffc0718bbdf8
[ 852.869488] ba80: 0000000000000014 ffffffc06a959000 ffffffc0718bbad0 ffffffc0006a2358
[...]
[ 853.213212] Call trace:
[ 853.215639] [<ffffffc00039210c>] __copy_from_user+0x8c/0x180
[ 853.221284] [<ffffffc0006a1b2c>] ip_generic_getfrag+0xa4/0xc4
[ 853.227011] [<ffffffc0006a2358>] __ip_append_data.isra.43+0x80c/0xa70
[ 853.233434] [<ffffffc0006a3d50>] ip_make_skb+0xc4/0x148
[ 853.238642] [<ffffffc0006c9d04>] udp_sendmsg+0x280/0x740
[ 853.243937] [<ffffffc0006d38e4>] inet_sendmsg+0x7c/0xbc
[ 853.249145] [<ffffffc000651f5c>] sock_sendmsg+0x18/0x2c
[ 853.254352] [<ffffffc000654b14>] SyS_sendto+0xb0/0xf0
[ 853.259388] [<ffffffc000084470>] el0_svc_naked+0x24/0x28
[ 853.264682] Code: a88120c7 a8c12027 a88120c7 36180062 (f8408423)
[ 853.270791] ---[ end trace 30e1cd8e2ccd56c5 ]---
Segmentation fault
root@Xilinx-ZCU102-2016_2:~#
The strange thing is, that when I simply read words from the buffer, it does not cause any alignment errors.
It seems, that the send function improperly uses the __copy_from_user function, causing unaligned memory access. The question is: is it the kernel bug, or have I done something incorrectly?
However, usually, sending the data block not starting at 8-byte boundary does not trigger the alignment error. The problem occurs with relatively low probability. I was not able to isolate the conditions that lead to the error.
I have worked around the problem by adjusting the MAX_DGRAM so that it is a multiple of 8. However I'm afraid, that the problem may reappear if the data in the mmapped buffer are submitted to more complex processing. Some people reported similar problems in arm64 architecture related to memcpy function (e.g. [https://bugs.launchpad.net/linux-linaro/+bug/1271649]).
What is the correct method for mapping of coherent DMA buffers to user space to avoid memory alignment errors?