I would like to force my program to miss in the L1 cache on every access (or nearly every access).
My Ivy Bridge CPU has a 32 KB, 8-way set-associative L1 data cache with 64-byte lines, so there are 32 KB / (8 × 64 B) = 64 sets, each holding 8 lines. The lowest 6 bits of an address (bits 0-5) give the offset within the line, the next 6 bits (bits 6-11) select the set, and the remaining upper bits form the tag.
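To make the address split concrete, here is a minimal sketch of the arithmetic, assuming the geometry above (32 KB, 8 ways, 64-byte lines). The helper names `line_offset`, `set_index`, and `tag_of` are made up for illustration and are not part of any library.

```c
#include <stdint.h>

/* Assumed L1D geometry: 32 KB, 8-way, 64-byte lines.
   Number of sets = 32768 / (8 * 64) = 64. */
enum {
    LINE_SIZE  = 64,
    WAYS       = 8,
    CACHE_SIZE = 32 * 1024,
    NUM_SETS   = CACHE_SIZE / (WAYS * LINE_SIZE)
};

/* Bits 0-5: byte offset within the 64-byte line. */
static unsigned line_offset(uintptr_t addr)
{
    return (unsigned)(addr % LINE_SIZE);
}

/* Bits 6-11: which of the 64 sets the line maps to. */
static unsigned set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/* Bits 12 and up: the tag that distinguishes lines within a set. */
static uintptr_t tag_of(uintptr_t addr)
{
    return addr / ((uintptr_t)LINE_SIZE * NUM_SETS);
}
```

For example, the address 0x12345 splits into offset 5, set 13, and tag 0x12. Two addresses that differ by a multiple of 64 × 64 = 4096 bytes have the same set index and differ only in the tag.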
How can I make every access miss? Should I issue 8 different loads (since every set has 8 lines) to addresses that map to the same set?
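This is a rough sketch of the access pattern I have in mind, not a verified benchmark. It assumes that addresses 4096 bytes apart land in the same set (64 sets × 64 bytes), and it cycles over 16 such lines rather than 8, on the assumption that exactly 8 lines would simply fit in the set's 8 ways. The names `SET_STRIDE`, `CONFLICT_LINES`, and `walk_conflicting_lines` are my own. I have not checked how the hardware prefetchers or the actual replacement policy affect the resulting miss rate.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum {
    SET_STRIDE     = 4096, /* assumed: 64 sets * 64-byte lines */
    CONFLICT_LINES = 16    /* assumed: more lines than the 8 ways */
};

/* Read one byte from each of CONFLICT_LINES lines spaced SET_STRIDE
   bytes apart, repeating the whole walk `rounds` times. The buffer
   must be at least CONFLICT_LINES * SET_STRIDE bytes long. The
   volatile qualifier keeps the compiler from dropping the loads;
   the sum is returned so the result is observably used. */
static unsigned walk_conflicting_lines(volatile unsigned char *buf,
                                       size_t rounds)
{
    unsigned sum = 0;
    for (size_t r = 0; r < rounds; r++) {
        for (size_t i = 0; i < CONFLICT_LINES; i++) {
            sum += buf[i * SET_STRIDE];
        }
    }
    return sum;
}
```

The buffer could come from `calloc(CONFLICT_LINES, SET_STRIDE)`. Its alignment should not matter, because the stride alone keeps address bits 6-11 identical across all the touched lines.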