Can someone suggest a fast function to compute the square of each pixel of an int image? I need it for iOS app development. I am working directly on the image memory, which is allocated and filled as follows:
// Allocate one int per pixel written by the loop below.
int *image_sqr_Baseaaddr = (int *) malloc(newNoOfPixels * sizeof(int));
for (int i = 0; i < newNoOfPixels; i++)
    image_sqr_Baseaaddr[i] = (int) image_scaled_Baseaaddr[i] * (int) image_scaled_Baseaaddr[i];
This plain scalar loop is obviously about the slowest way to do it. I have heard that ARM NEON intrinsics on iOS can process several values with a single instruction. Maybe that's the way to go?
The problem is that I am not very familiar with NEON and don't have enough time to learn assembly language at the moment. So it would be great if someone could post NEON intrinsics code for the problem above, or any other fast implementation in C/C++.
The only NEON intrinsics code I have been able to find online is for RGB-to-gray conversion: http://computer-vision-talks.com/2011/02/a-very-fast-bgra-to-grayscale-conversion-on-iphone/
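To make the request concrete, here is a rough sketch of the kind of routine I have in mind, not a tested or tuned implementation. It assumes the pixels are stored as 32-bit signed integers (the post does not state the type of image_scaled_Baseaaddr) and that every squared value still fits in 32 bits. The name square_pixels is hypothetical. On targets without NEON, the scalar loop handles everything.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from the original post): squares n pixels
 * from src into dst. Assumes 32-bit signed pixels whose squares do not
 * overflow int32_t, and that dst has room for n elements. */
static void square_pixels(const int32_t *src, int32_t *dst, size_t n)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Handle four 32-bit pixels per iteration. */
    for (; i + 4 <= n; i += 4) {
        int32x4_t v = vld1q_s32(src + i);      /* load 4 pixels */
        int32x4_t sq = vmulq_s32(v, v);        /* lane-wise v * v */
        vst1q_s32(dst + i, sq);                /* store 4 results */
    }
#endif

    /* Scalar loop: the leftover pixels on NEON targets,
     * or the whole buffer when NEON is unavailable. */
    for (; i < n; i++) {
        dst[i] = src[i] * src[i];
    }
}
```

With the buffers from the snippet above, the call would presumably be square_pixels(image_scaled_Baseaaddr, image_sqr_Baseaaddr, newNoOfPixels), provided both buffers really hold 32-bit ints.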