2
votes

When a processor pre-fetches a cache line of data, does it pre-fetch from that address forward for the full cache-line size, or does it pre-fetch half a cache line forward from that address and half a cache line backwards?

For example, assume the cache line is 4 bytes and I am pre-fetching from address 0x06. Will it fetch the bytes at 0x06, 0x07, 0x08, 0x09, or will it pre-fetch from addresses 0x04, 0x05, 0x06, 0x07?

I need this info for a program which I am writing and need to optimize.

2
I definitely think that this is highly implementation dependent. Maybe adding the specifics would help... – ppeterka
A cache line is something like 64 bytes, and it starts at the address with the lowest six bits all zero. You find the cache line of an address by masking out its lowest six bits. (Or whatever power-of-two size your cache line has.) – Kerrek SB
@KerrekSB Please make this comment an answer. – Pascal Cuoq
@KerrekSB your comment just answered my question. Okay, so in the example I gave, assuming the cache line is 4 bytes and I'm fetching at address 0x06, what I'll get in the cache will be the bytes at 0x04, 0x05, 0x06 and 0x07. The next cache line would then start at 0x08. So let's say I want to get the byte at 0x0A: I would then have 0x08, 0x09, 0x0A, 0x0B pre-fetched into the cache! – d2alphame
See this question for finding the actual size. Looks like it's 32 bytes on old Intels and 64 on contemporary ones. – Kerrek SB

2 Answers

2
votes

According to this (which is naturally Intel-specific):

"The cache line size is 32 bytes, or 256 bits. A cache line is filled by a burst of four reads on the processor’s 64-bit data bus."

This means 8 bytes are fetched in parallel from main memory. Within these 8 bytes there is no first or last: they arrive simultaneously, because they are fetched over a 64-bit-wide bus.
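The arithmetic behind the quoted figures can be sketched in C. This is only an illustration under the numbers in the quote (32-byte line, 8-byte bus); the names `LINE_BYTES`, `BUS_BYTES`, `transfers_per_line` and `chunk_index` are made up for this example, and the code says nothing about the order in which the chunks actually arrive.

```c
#include <stdint.h>

/* Figures taken from the quoted Intel text: 32-byte line, 64-bit (8-byte) bus. */
enum { LINE_BYTES = 32, BUS_BYTES = 8 };

/* Number of bus transfers needed to fill one cache line. */
static inline unsigned transfers_per_line(void)
{
    return LINE_BYTES / BUS_BYTES;
}

/* Index (0..3) of the 8-byte chunk, within its cache line, that holds addr.
 * This identifies the chunk only; it does not tell you which chunk is
 * transferred first. */
static inline unsigned chunk_index(uintptr_t addr)
{
    return (unsigned)((addr & (LINE_BYTES - 1)) / BUS_BYTES);
}
```

With these assumptions, `transfers_per_line()` gives the four reads mentioned in the quote, and `chunk_index(addr)` tells you which of the four 8-byte pieces a given byte travels in.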

It takes 4 reads to fill a cache line, and Intel does not seem to specify the order of these 4 reads, which means you're left with some choices, e.g.

  • assume that there is no specific order
  • assume the addresses are fetched from lowest to highest, or vice versa.

The first assumption is of course the safest, since the order is, as far as I can find, undocumented (so it could depend on the model or other factors).

1
votes

Cache lines have to be aligned, so if your first read (or the first transaction that misses and causes a cache line fetch) falls in the middle of a cache line, the cache will go back and read the whole line: the part before your address and the part after.

In general the cache uses a portion of the address to determine hit/miss. So if, say, the cache line was 256 bytes, then the address bits used to determine hit/miss would start at bit 8, and how big the cache was (depth and ways) would determine how many bits to look at. Using my example, if an access at address 0x123 produced a miss, then the cache line from 0x100 to 0x1FF would be read.
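As a rough sketch of that masking in C: the helper names `line_base`, `line_last` and `line_offset` are hypothetical, and the line size is passed in as a parameter that is assumed to be a power of two. Real hardware does this in the address decoding logic, not in software; the code only reproduces the arithmetic.

```c
#include <stdint.h>

/* All helpers assume line_size is a power of two (4, 32, 64, 256, ...). */

/* First address of the cache line that contains addr:
 * clear the low log2(line_size) bits. */
static inline uintptr_t line_base(uintptr_t addr, uintptr_t line_size)
{
    return addr & ~(line_size - 1);
}

/* Last address of the cache line that contains addr. */
static inline uintptr_t line_last(uintptr_t addr, uintptr_t line_size)
{
    return line_base(addr, line_size) + (line_size - 1);
}

/* Position of addr inside its cache line (the low bits that were masked off). */
static inline uintptr_t line_offset(uintptr_t addr, uintptr_t line_size)
{
    return addr & (line_size - 1);
}
```

For the 256-byte example, `line_base(0x123, 256)` is 0x100 and `line_last(0x123, 256)` is 0x1FF. For the 4-byte lines in the question, address 0x06 maps to the line 0x04 to 0x07.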

If it were the other way, that would mean a lot more logic, work and confusion. If you could start a cache line on any byte, it would be harder to determine hit/miss, and/or you could have overlapping cache lines (some item of data in more than one place) that would have to be managed, overall making the cache slower.