AArch64 Advanced SIMD vector addition

Question

I am trying to add two Advanced SIMD vectors in my assembly code. I have two vectors, v0 and v1, and I want to add the upper half of v0 to the lower half of v1 and put the result in the upper half of v0. Performance is critical in my code, so I am looking for a way to do this with a single addition instruction. I know that I can move the upper half into another register and simply use the UADDL instruction.
In the AArch32 NEON instruction set, this can be done by using Dn instead of Qn. For example, in my case it can be done as: vqadd.u64 d1, d1, d2. Is there any way to do this with AArch64 Advanced SIMD instructions?

You'll have to rearrange your code to avoid the situation. Can you post the code fragment to illustrate how you've got to the point of needing to do this? — sh1

InfinitelyManic · Accepted Answer · 2016-10-07T20:16:10

As indicated by @sh1, you will need to rearrange some things.

The AArch64 equivalents of vqadd are sqadd (signed) and uqadd (unsigned). However, in their 64-bit form they will add, let's say, byte lanes 0-7 of v0 to byte lanes 0-7 of v1, which is not quite what you want. But if you can rearrange the load instruction for, let's say, v1, you can achieve the intended goal.

.data
 array:        .ascii  "73167176531330624919225119674426574742355349194934"
...
ldr x20,=array // ptr
ld1 {v0.16b, v1.16b}, [x20] // load 32 consecutive bytes: v0 = bytes 0-15, v1 = bytes 16-31
uqadd v0.8b,v1.8b,v0.8b // unsigned saturating add of byte lanes 0-7; lanes 8-15 of v0 are zeroed

...
(gdb) p $v0.b.s
$14 = {7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2}
(gdb) p $v1.b.s
$15 = {4, 9, 1, 9, 2, 2, 5, 1, 1, 9, 6, 7, 4, 4, 2, 6}
(gdb)
(gdb) p $v0.b.s
$26 = {11, 12, 2, 15, 9, 3, 12, 7, 0, 0, 0, 0, 0, 0, 0, 0}
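
To make the lane behaviour in the gdb output above concrete, here is a minimal C sketch that models uqadd v0.8b, v1.8b, v0.8b in portable code. It is not NEON code and not part of the original answer: the vreg type and the uqadd_8b and matches_gdb_output names are hypothetical, introduced only for illustration. It assumes the registers already hold the digit values that gdb prints (the conversion from ASCII is elided in the listing), and it relies on the architectural rule that writing the 64-bit form of a vector register clears the upper 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit V register as 16 byte lanes. */
typedef struct { uint8_t b[16]; } vreg;

/* Models `uqadd vd.8b, vn.8b, vm.8b`: unsigned saturating add of
   byte lanes 0-7. Lanes 8-15 of the destination are cleared, as for
   any write to the 64-bit form of a vector register. */
static vreg uqadd_8b(vreg n, vreg m)
{
    vreg d;
    memset(&d, 0, sizeof d);                 /* upper half ends up zero */
    for (int i = 0; i < 8; i++) {
        unsigned sum = (unsigned)n.b[i] + (unsigned)m.b[i];
        d.b[i] = (uint8_t)(sum > 255u ? 255u : sum);  /* saturate at 255 */
    }
    return d;
}

/* Returns 1 if the model reproduces the $26 value shown by gdb,
   starting from the $14 (v0) and $15 (v1) values. */
static int matches_gdb_output(void)
{
    vreg v0 = {{7, 3, 1, 6, 7, 1, 7, 6, 5, 3, 1, 3, 3, 0, 6, 2}};
    vreg v1 = {{4, 9, 1, 9, 2, 2, 5, 1, 1, 9, 6, 7, 4, 4, 2, 6}};
    const uint8_t expected[16] =
        {11, 12, 2, 15, 9, 3, 12, 7, 0, 0, 0, 0, 0, 0, 0, 0};

    v0 = uqadd_8b(v1, v0);                   /* uqadd v0.8b, v1.8b, v0.8b */
    return memcmp(v0.b, expected, sizeof expected) == 0;
}
```

Calling matches_gdb_output() from a small main is enough to check the model against the session above. The zeroed upper lanes are the reason the 64-bit form cannot leave a result in the upper half of v0 by itself, which is why the data has to be rearranged first.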
