For 80x86 (I don't know about other architectures):
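a) The normal instruction/data/unified caches are physically tagged and behave as if they were physically indexed (L1 caches are typically indexed with address bits that lie inside the page offset, which are the same in the virtual and physical address), and are therefore unaffected by paging tricks.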
b) TLBs are virtually indexed. This means that (depending on a lot of things) your circular buffer trick can cause a lot more TLB misses than you would have seen without it, because each virtual alias of a page needs its own TLB entry. Things that could matter include: the size of the area, and the number and type of TLB entries used (4 KiB, 2 MiB or 1 GiB pages); whether the CPU prefetches TLB entries (recent CPUs do) and whether enough time is spent doing other work for the prefetched TLB entries to arrive before they're needed; and whether the CPU caches higher level paging structures (e.g. page directories) to avoid fetching every level on a TLB miss (e.g. fetching the page table entry alone because the page directory was cached, instead of fetching the PML4 entry, then the PDPT entry, then the PD entry, then the page table entry).
c) Any uop cache (e.g. as part of a loop stream detector, or the old Pentium 4 "trace cache") is virtually indexed or not indexed at all (e.g. CPU just remembers "uops from start of loop"). That won't matter unless you have multiple copies of code; and if you do have multiple copies of code it becomes complicated (e.g. if duplication causes the number of uops to exceed the size of the uop cache).
d) Branch prediction is virtually indexed. This means that if you have multiple copies of the same code it becomes complicated again (e.g. it would increase "training time" for branches that aren't statically predicted correctly; and duplication can cause the number of branches to exceed the number of branch prediction slots and result in worse branch prediction).
e) The return buffer is virtually indexed, but I can't think of how that could matter (duplicating code wouldn't increase the depth of the call graph).
f) For store buffers (used for store forwarding): if stores are on different virtual pages the CPU has to assume that a store may be aliased, regardless of whether it actually is; so the trick shouldn't make a difference here.
g) For write combining buffers; I'm honestly not sure if they're virtually indexed or physically indexed. Chances are that if it might matter you're going to run out of "write combining slots" before it actually does matter.
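
To put rough numbers on point (b), here is a minimal back-of-the-envelope sketch in C. It assumes the circular buffer trick maps the same physical pages twice, back to back, and that every virtual page touched needs its own TLB entry. The function names (`pages_covering`, `tlb_entries_touched`, `page_walk_loads`) are made up for illustration, and the results are upper bounds for a simple model, not measurements of any real CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages of size page_bytes needed to cover area_bytes (rounded up). */
size_t pages_covering(size_t area_bytes, size_t page_bytes)
{
    assert(page_bytes != 0);
    return (area_bytes + page_bytes - 1) / page_bytes;
}

/* Upper bound on the distinct TLB entries touched when walking the whole
 * buffer through every virtual mapping of it.
 * mapping_count is 1 for a plain buffer and 2 for the mirrored
 * (double-mapped) circular buffer assumed here. */
size_t tlb_entries_touched(size_t buffer_bytes, size_t page_bytes,
                           size_t mapping_count)
{
    return pages_covering(buffer_bytes, page_bytes) * mapping_count;
}

/* Paging-structure entries that must be fetched on one TLB miss, given how
 * many of the upper levels the CPU already has cached.
 * For 4-level paging with 4 KiB pages: total_levels = 4
 * (PML4, PDPT, PD, PT); cached_levels = 3 means only the page table
 * entry itself still has to be fetched. */
unsigned page_walk_loads(unsigned total_levels, unsigned cached_levels)
{
    return cached_levels >= total_levels ? 0 : total_levels - cached_levels;
}
```

Under these assumptions a 1 MiB buffer with 4 KiB pages touches up to 256 TLB entries when mapped once and up to 512 when mirrored, while a 4 MiB buffer with 2 MiB pages touches 2 and 4. A TLB miss costs 4 fetches if nothing is cached and 1 fetch if everything above the page table entry is cached.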