I'm porting some ARM NEON code to 64-bit ARMv8, but I can't find good documentation for it.
Many features seem to be gone, and I don't know how to implement the same functionality without them.
So the general question is: where can I find a complete reference for the new SIMD implementation, including explanations of how to do the simple tasks covered in the many ARM NEON tutorials?
Some questions about particular features:
1 - How do I load a value into all the lanes of a Dx register? The old code was:
mov R0, #42
vdup.8 D0, R0
My guess is:
mov W0, #42
dup V0.8B, W0
2 - How do I load multiple Dx/Qx registers with interleaved data? In the old code this was:
vld4.8 {D0-D3}, [R0]!
But I can't find anything in the new docs.
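To pin down what the port has to preserve, here is a minimal scalar C model of the two old instructions above. It is only a sketch of their behaviour, not NEON code: the `d_reg` type and the `model_dup8` / `model_ld4_8` names are hypothetical, introduced purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit D register viewed as 8 byte lanes. */
typedef struct { uint8_t lane[8]; } d_reg;

/* Models `vdup.8 D0, R0`: the low byte of the scalar is copied to every lane. */
static d_reg model_dup8(uint32_t scalar) {
    d_reg r;
    for (size_t i = 0; i < 8; ++i)
        r.lane[i] = (uint8_t)scalar;
    return r;
}

/* Models `vld4.8 {D0-D3}, [R0]!`: reads 32 bytes, sends byte 4*i + k to
 * lane i of register k, and returns the post-incremented address. */
static const uint8_t *model_ld4_8(const uint8_t *src, d_reg out[4]) {
    assert(src != NULL && out != NULL);
    for (size_t i = 0; i < 8; ++i)
        for (size_t k = 0; k < 4; ++k)
            out[k].lane[i] = src[4 * i + k];
    return src + 32; /* the `!` writeback advances the pointer by 32 bytes */
}
```

With RGBA pixel data, for example, this de-interleaving puts all the R bytes in the first register, all the G bytes in the second, and so on, which is the behaviour any AArch64 replacement needs to match.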
I understand it's a completely new model, but it's not very well documented (or at least, I'm unable to find any reference with readable samples).
dup looks to be the equivalent of vdup; the equivalent of vld4 seems to be, perhaps unsurprisingly, ld4. It might be worth trying to track down a copy of the old "ARMv8 Instruction Set Overview" PDF - it's gone from the ARM website since the proper ARMv8-A ARM was published, but was a lot easier to skim. – Notlikethat
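If hand-written assembly is not a hard requirement, one way to sidestep the syntax differences is to use the `arm_neon.h` intrinsics, which are spelled the same for 32-bit and 64-bit targets. The sketch below assumes a compiler that defines `__ARM_NEON` and falls back to plain C elsewhere; the helper names `splat8` and `deinterleave4` are hypothetical. On AArch64 a compiler would typically lower `vdup_n_u8` to `dup` and `vld4_u8` to `ld4`, though the exact code generated is up to the compiler.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: write one byte value to all 8 lanes. */
static void splat8(uint8_t value, uint8_t out[8]) {
    assert(out != 0);
#if defined(__ARM_NEON)
    vst1_u8(out, vdup_n_u8(value));      /* broadcast, then store 8 bytes */
#else
    for (int i = 0; i < 8; ++i)          /* portable fallback */
        out[i] = value;
#endif
}

/* Hypothetical helper: split 32 interleaved bytes into four 8-byte planes
 * and return the address just past the data that was read. */
static const uint8_t *deinterleave4(const uint8_t *src, uint8_t planes[4][8]) {
    assert(src != 0);
#if defined(__ARM_NEON)
    uint8x8x4_t v = vld4_u8(src);        /* de-interleaving load */
    vst1_u8(planes[0], v.val[0]);
    vst1_u8(planes[1], v.val[1]);
    vst1_u8(planes[2], v.val[2]);
    vst1_u8(planes[3], v.val[3]);
#else
    for (int i = 0; i < 8; ++i)          /* portable fallback */
        for (int k = 0; k < 4; ++k)
            planes[k][i] = src[4 * i + k];
#endif
    return src + 32;                     /* mirrors the post-increment */
}
```

The stores to memory are only there to make the result observable; in real code the vectors would normally stay in registers between operations.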