I need to extract the non-zero values from an __m128i register. For example, I have a vector holding eight unsigned shorts:
__m128i vector = _mm_setr_epi16(40, 0, 22, 0, 0, 0, 0, 8);
I want to extract the 40, 22 and 8 with as few SSE instructions as possible. The non-zero values should then be appended, in order, to an output array:
{40, 22, 8, more values from different vectors ... }
Is it possible to shuffle them or is there a good intrinsic to extract and store?
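
To make the desired behaviour concrete, here is a minimal scalar sketch of what I am after. It is only a reference for the expected result, not the fast version I am asking for. The helper name append_nonzero_epi16 is made up for illustration, and it assumes the caller leaves room for up to eight values in the output array.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_storeu_si128 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name is an assumption, not an existing intrinsic):
   appends the non-zero 16-bit lanes of v to out, preserving lane order,
   and returns how many values were written. out must have room for 8. */
static size_t append_nonzero_epi16(__m128i v, uint16_t *out)
{
    uint16_t lanes[8];
    _mm_storeu_si128((__m128i *)lanes, v);  /* spill the register to memory */

    size_t n = 0;
    for (int i = 0; i < 8; ++i) {
        if (lanes[i] != 0)
            out[n++] = lanes[i];            /* keep only non-zero lanes */
    }
    return n;
}
```

The caller would advance its output pointer by the returned count after each vector, so that values from successive vectors end up packed back to back.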