I'm reading this document: http://software.intel.com/en-us/articles/interactive-ray-tracing
and I stumbled upon these three lines of code:
The SIMD version is already quite a bit faster, but we can do better. Intel has added a fast 1/sqrt(x) function to the SSE2 instruction set. The only drawback is that its precision is limited. We need the precision, so we refine it using Newton-Rhapson:
__m128 nr = _mm_rsqrt_ps( x );
__m128 muls = _mm_mul_ps( _mm_mul_ps( x, nr ), nr );
result = _mm_mul_ps( _mm_mul_ps( half, nr ), _mm_sub_ps( three, muls ) );
This code assumes the existence of a __m128 variable named 'half' (four times 0.5f) and a variable 'three' (four times 3.0f).
I know how to use Newton Raphson to calculate a function's zero and I know how to use it to calculate the square root of a number but I just can't see how this code performs it.
Can someone explain it to me please?