0
votes

Thank you for reading my question.

My question is about ARM NEON.

My first question is about register size.

I'd like to know the actual SIMD register size of the Apple A6 and the Cortex-A15.

My second question is about SIMD instruction cycle counts.

I assume that many ARM processors have 64-bit NEON registers.

According to the manual, "As dual view, it's 128 bit wide".

Does this mean that if I operate on 4 x 32-bit values held in two 64-bit NEON registers, they will be processed in one cycle?

I'd like to know the difference in cycle counts between 128-bit and 64-bit NEON operations.

Thank you!

1
All this should be covered in excruciating detail in the CPU manuals, shouldn't it? I mean, that's what a CPU manual is for. - cHao
cHao // Did you find the Apple A6 manual? - Henrik
Nope. I didn't look for it. That's your job. :) - cHao
First, I asked here because I couldn't find anything related to the Apple A6. - Henrik
Second, the other question is about cycles. I need an answer from an expert with specific knowledge of cycle counts for NEON SIMD instructions. - Henrik

1 Answer

1
votes

It depends on the instruction executed.

As a general rule of thumb, simple ALU instructions require no more cycles when operating on Q registers (128-bit) than on D registers (64-bit), but multiply and/or permute instructions need twice the cycles when operating on Q registers. You should also be aware that very often the results in the lower 64 bits of Qd are available earlier than those in the upper half.
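To make the D versus Q distinction concrete, here is a minimal sketch of the question's scenario (adding 4 x 32-bit values) written once as a single Q-register operation and once as two D-register operations. The helper names `add4_q` and `add4_2d` and the `HAVE_NEON` macro are made up for this illustration. The NEON intrinsics are only used when the compiler targets NEON; otherwise a scalar fallback is compiled so the sketch still builds on other machines. It shows the two forms only and says nothing about actual cycle counts, which you would have to look up or measure for the specific core.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1   /* hypothetical macro, local to this sketch */
#else
#define HAVE_NEON 0
#endif

/* Add four 32-bit lanes with one Q-register (128-bit) operation. */
static void add4_q(const int32_t *a, const int32_t *b, int32_t *out)
{
    assert(a && b && out);
#if HAVE_NEON
    vst1q_s32(out, vaddq_s32(vld1q_s32(a), vld1q_s32(b)));
#else
    for (int i = 0; i < 4; ++i)   /* scalar fallback for non-NEON builds */
        out[i] = a[i] + b[i];
#endif
}

/* Same work split into two D-register (64-bit) operations. */
static void add4_2d(const int32_t *a, const int32_t *b, int32_t *out)
{
    assert(a && b && out);
#if HAVE_NEON
    vst1_s32(out,     vadd_s32(vld1_s32(a),     vld1_s32(b)));
    vst1_s32(out + 2, vadd_s32(vld1_s32(a + 2), vld1_s32(b + 2)));
#else
    for (int i = 0; i < 4; ++i)   /* scalar fallback for non-NEON builds */
        out[i] = a[i] + b[i];
#endif
}
```

Both functions produce the same result. Following the rule of thumb above, a simple add is the kind of instruction where the Q form should cost no more than a single D form, so the one-instruction version is usually the better choice.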

I don't think Apple's A6 behaves much differently from the "original" Cortex-A15 when it comes to cycles. And since they share the very same ISA, you can be assured that the registers are the same within the ARMv7 architecture: thirty-two 64-bit D registers, which can also be viewed as sixteen 128-bit Q registers.
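The "dual view" the question quotes can be modeled in plain C. In ARMv7 NEON, register Qn occupies the same storage as D(2n) (low half) and D(2n+1) (high half). The sketch below is only a model of that aliasing, not real register access; `q_model`, `q_low_d`, `q_high_d`, and `d_half` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one 128-bit Q register as four 32-bit lanes.
   lane[0] holds the least significant 32 bits. */
typedef struct {
    uint32_t lane[4];
} q_model;

/* Index of the D register that aliases the low 64 bits of Qn (n = 0..15). */
static int q_low_d(int q)
{
    assert(q >= 0 && q < 16);
    return 2 * q;
}

/* Index of the D register that aliases the high 64 bits of Qn. */
static int q_high_d(int q)
{
    assert(q >= 0 && q < 16);
    return 2 * q + 1;
}

/* Value seen through the D view: high == 0 selects lanes 0-1,
   any other value selects lanes 2-3. */
static uint64_t d_half(const q_model *q, int high)
{
    int base = high ? 2 : 0;
    return (uint64_t)q->lane[base] | ((uint64_t)q->lane[base + 1] << 32);
}
```

For example, Q3 aliases D6 and D7, so four 32-bit values stored in D6 and D7 are exactly the four lanes of Q3.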