
I'm also interested in _mm_cvtsi32_si128, but if there isn't one for that it's not such a big deal.

For shuffle, I know that in certain cases I can use the Neon equivalent of alignr (vext), but that by itself isn't going to cover all the situations I need to address.

1 Answer


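For _mm_shuffle_epi8 there is VTBL (the vtbl1_u8 ... vtbl4_u8 intrinsics, or vqtbl1q_u8 on AArch64). Note that the out-of-range rule differs: _mm_shuffle_epi8 zeroes a byte only when bit 7 of its index is set and otherwise uses the low 4 bits, whereas VTBL zeroes a byte for any index past the end of the table.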
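
As a rough sketch of how VTBL can stand in for _mm_shuffle_epi8, the helper below masks the indices so that the SSSE3 rule (bit 7 set gives zero, otherwise the low 4 bits select the byte) carries over. The name `shuffle_bytes16` and the array-based interface are hypothetical, chosen only for illustration; the scalar branch is a reference for non-Neon builds, not part of either instruction set.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper with _mm_shuffle_epi8 semantics:
 *   out[i] = (idx[i] & 0x80) ? 0 : src[idx[i] & 0x0F]
 * out must not alias src or idx. */
static void shuffle_bytes16(const uint8_t src[16], const uint8_t idx[16],
                            uint8_t out[16])
{
#if HAVE_NEON
    uint8x16_t table = vld1q_u8(src);
    uint8x16_t index = vld1q_u8(idx);

    /* Keep bit 7 and the low 4 bits. Indices with bit 7 set stay out of
     * range (VTBL then yields 0); all others wrap to 0..15 as on SSSE3. */
    index = vandq_u8(index, vdupq_n_u8(0x8F));

#if defined(__aarch64__)
    /* AArch64: one 16-byte table lookup. */
    uint8x16_t result = vqtbl1q_u8(table, index);
#else
    /* ARMv7: VTBL works on 8-byte halves, so do two lookups into a
     * 16-byte table built from the two halves of the source vector. */
    uint8x8x2_t t;
    t.val[0] = vget_low_u8(table);
    t.val[1] = vget_high_u8(table);
    uint8x16_t result = vcombine_u8(vtbl2_u8(t, vget_low_u8(index)),
                                    vtbl2_u8(t, vget_high_u8(index)));
#endif
    vst1q_u8(out, result);
#else
    /* Scalar reference for builds without Neon. */
    for (int i = 0; i < 16; ++i)
        out[i] = (idx[i] & 0x80) ? 0 : src[idx[i] & 0x0F];
#endif
}
```

If the index vector is known to contain only values 0..15 or values with bit 7 set, the masking step can be dropped.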

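For _mm_unpackXX_YYY the closest is probably VMOVL when the unpack is being used to widen elements (for a true interleave of two vectors, VZIP is the nearer match), but you will probably need to do a little extra work to get the equivalent functionality, e.g.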

    int32x4_t v = vld1q_s32(p);                   // load vector from p

    int64x2_t vl = vmovl_s32(vget_low_s32(v));    // sign-extend low 2 elements to 64 bits
    int64x2_t vh = vmovl_s32(vget_high_s32(v));   // sign-extend high 2 elements to 64 bits
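
To make the difference between the two uses of unpack concrete, here is a minimal sketch that shows both side by side: an interleave with VZIP, which matches what _mm_unpacklo_epi32 / _mm_unpackhi_epi32 produce, and the widening with VMOVL shown above. The helper names `unpack_epi32` and `widen_s32_to_s64` and the array-based interface are hypothetical; the scalar branches are only a reference for non-Neon builds.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: interleave like the SSE2 unpack pair.
 *   lo = {a0, b0, a1, b1}   (_mm_unpacklo_epi32)
 *   hi = {a2, b2, a3, b3}   (_mm_unpackhi_epi32) */
static void unpack_epi32(const int32_t a[4], const int32_t b[4],
                         int32_t lo[4], int32_t hi[4])
{
#if HAVE_NEON
    /* VZIP returns both interleaved halves at once. */
    int32x4x2_t z = vzipq_s32(vld1q_s32(a), vld1q_s32(b));
    vst1q_s32(lo, z.val[0]);
    vst1q_s32(hi, z.val[1]);
#else
    for (int i = 0; i < 2; ++i) {
        lo[2 * i]     = a[i];
        lo[2 * i + 1] = b[i];
        hi[2 * i]     = a[i + 2];
        hi[2 * i + 1] = b[i + 2];
    }
#endif
}

/* Hypothetical helper: widen four int32 values to int64 with sign
 * extension, i.e. the VMOVL approach from the answer. */
static void widen_s32_to_s64(const int32_t v[4], int64_t lo[2], int64_t hi[2])
{
#if HAVE_NEON
    int32x4_t x = vld1q_s32(v);
    vst1q_s64(lo, vmovl_s32(vget_low_s32(x)));
    vst1q_s64(hi, vmovl_s32(vget_high_s32(x)));
#else
    lo[0] = v[0];
    lo[1] = v[1];
    hi[0] = v[2];
    hi[1] = v[3];
#endif
}
```

On SSE, widening unsigned data is often written as an unpack against a zero vector; on Neon the unsigned VMOVL variants (for example vmovl_u16) cover that case directly, so VZIP is only needed when both inputs carry real data.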