Is there a way to intersect a single ray with a SIMD pack of 8 triangles without using stores, shuffles, or other similarly slow instructions? My main issue is the final part of the intersection, where I find which of the 8 triangles in the pack is nearest to the ray. Right now I store the t-values to memory and then take the minimum, which is basically a horizontal min.
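To make the step I'm asking about concrete, here is a simplified scalar sketch of the store-then-min part. It is not my actual code: the names (`PackHit`, `nearest_in_pack`, `kPackSize`, `kNoHit`) are made up for illustration, and it assumes the 8 t-values have already been stored from the SIMD register into an array, with lanes that missed set to +infinity.

```cpp
#include <array>
#include <cstddef>
#include <limits>

// Hypothetical names for illustration only.
constexpr std::size_t kPackSize = 8;
constexpr int kNoHit = -1;

struct PackHit {
    int lane;  // index of the nearest triangle in the pack, or kNoHit
    float t;   // its ray parameter, or +infinity when no lane was hit
};

// t holds the 8 per-lane ray parameters after they were stored from the
// SIMD register. Assumption: lanes that missed hold +infinity.
inline PackHit nearest_in_pack(const std::array<float, kPackSize>& t) {
    PackHit best{kNoHit, std::numeric_limits<float>::infinity()};
    for (std::size_t i = 0; i < kPackSize; ++i) {
        // Strict comparison keeps the lowest lane index on ties and
        // never selects a lane that holds +infinity.
        if (t[i] < best.t) {
            best.t = t[i];
            best.lane = static_cast<int>(i);
        }
    }
    return best;
}
```

This store plus scalar loop is the part I would like to replace with something that stays in registers, if that is possible.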
Moreover, is this overall approach sound? I'm using an 8-way BVH with single-ray traversal, as described in the paper "Stackless Multi-BVH Traversal for CPU, MIC and GPU Ray Tracing", on top of which I've added intersection of a single ray against packs of 8 triangles. Would intersecting a ray bundle with a single triangle, coupled with a binary BVH, be more suitable?
Thank you.