I am reading "What Every Programmer Should Know About Memory" and I am struggling with the notion of CPU cache tags (page 15).

If I understand correctly, each CPU cache line has a tag, which specifies which data in main memory the line corresponds to. That is, if you write to a particular line, you use the tag to find out where in RAM the contents of this line should be written. Conversely, if you read data from RAM into an L1 cache line, you compute a tag from the RAM address and store it somewhere, so that you know where the data in the L1 cache line came from. A tag is something like a pointer.

I would like to ask whether this tag is itself written somewhere in the cache line, or whether there is some special memory next to the L1 cache that stores the tag.

On my system, the L1 line size is 64 bytes and a pointer is 8 bytes. Should I aim to make frequently iterated objects in my program no larger than 64 bytes? Or, given that the size of the tag should not be greater than the size of a pointer, should I aim for 56 bytes?

1 Answer

Caches are much smaller than main memory, so you need a mechanism that maps a large number of main-memory locations onto a small number of cache entries. To do this efficiently and cheaply, what most caches do is use some number (let's say 10) of low-order bits from the address to index a cache entry. (Strictly speaking, the very lowest bits select a byte within the line, so the index bits are the ones just above that offset; this explanation glosses over that detail.) If you pick 10 index bits, you have 1024 entries in your cache. Many different memory addresses share the same 10 index bits, so you need a way to tell them apart. This is what the tag is for. When you want to store data in the cache, the entry is selected using the 10 index bits. Think of a cache line as an entry with two fields to be filled: the first is the tag, where we store the remaining address bits, and the second is where we store the data.
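To make the index/tag split concrete, here is a minimal sketch of how an address could be broken apart. It assumes a hypothetical direct-mapped cache with 64-byte lines and 1024 entries; the names `split_address`, `AddressParts`, `kOffsetBits` and `kIndexBits` are made up for illustration, and real L1 caches are usually set-associative, so the exact bit counts on your CPU will differ.

```cpp
#include <cstdint>

// Hypothetical geometry (an assumption, not any specific CPU):
// 64-byte lines -> 6 offset bits, 1024 entries -> 10 index bits.
constexpr unsigned kOffsetBits = 6;
constexpr unsigned kIndexBits = 10;

struct AddressParts {
    std::uint64_t tag;     // remaining high bits, stored next to the data
    std::uint64_t index;   // selects which cache entry to look at
    std::uint64_t offset;  // selects a byte inside the 64-byte line
};

// Split an address into tag / index / offset for the assumed geometry.
constexpr AddressParts split_address(std::uint64_t addr) {
    return AddressParts{
        addr >> (kOffsetBits + kIndexBits),
        (addr >> kOffsetBits) & ((std::uint64_t{1} << kIndexBits) - 1),
        addr & ((std::uint64_t{1} << kOffsetBits) - 1)
    };
}
```

Under these assumptions, the addresses `0x12345` and `0x22345` get the same index and offset but different tags, which is exactly the collision the tag is there to resolve.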

When someone says a cache line is xxx bytes (64 bytes in your case), they mean just the data portion. So if you want your data to be cache-line aligned, you need to pad it to 64 bytes, NOT 56!
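As a rough sketch of what padding to 64 bytes can look like in C++ (C++11 or later), `alignas` can do both the alignment and the padding for you. The `Particle` type and the `kCacheLineSize` constant below are hypothetical, and the 64 is taken from the line size given in the question rather than queried from the hardware.

```cpp
#include <cstddef>

// Assumed line size: 64 bytes, as stated in the question.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical hot object. alignas makes every instance start on a
// line boundary, and the compiler rounds sizeof up to a multiple of
// the alignment, so the 52 bytes of members are padded to 64.
struct alignas(kCacheLineSize) Particle {
    double position[3];
    double velocity[3];
    float mass;
};

static_assert(alignof(Particle) == kCacheLineSize,
              "Particle should start on a cache-line boundary");
static_assert(sizeof(Particle) == kCacheLineSize,
              "Particle should occupy exactly one cache line");
```

With this layout, each element of an array of `Particle` occupies one full line, and no space has to be reserved for the tag because the hardware keeps it outside the 64 data bytes.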

The same goes for the cache size. When the size of a cache is quoted, it typically means the size of the data portion only; the actual storage is a bit bigger because of extra bookkeeping such as the tag, the valid bit, etc.
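For a feel of how much "a bit bigger" is, the following sketch estimates the bookkeeping for a hypothetical direct-mapped cache, counting only a tag and one valid bit per line. The `CacheGeometry` struct, the helper names and the 48-bit address width are assumptions for illustration; real caches also store things like dirty, coherence and replacement state.

```cpp
#include <cstdint>

// Hypothetical description of a direct-mapped cache.
struct CacheGeometry {
    std::uint64_t data_bytes;  // quoted cache size (data only)
    std::uint64_t line_bytes;  // line size (data only)
    unsigned address_bits;     // assumed width of a physical address
};

// log2 for exact powers of two (line counts and line sizes here).
constexpr unsigned log2_exact(std::uint64_t x) {
    return x <= 1 ? 0 : 1 + log2_exact(x >> 1);
}

// Bits of bookkeeping: per line, one tag plus one valid bit.
constexpr std::uint64_t bookkeeping_bits(const CacheGeometry& g) {
    const std::uint64_t lines = g.data_bytes / g.line_bytes;
    const unsigned index_bits = log2_exact(lines);
    const unsigned offset_bits = log2_exact(g.line_bytes);
    const unsigned tag_bits = g.address_bits - index_bits - offset_bits;
    return lines * (tag_bits + 1);
}
```

For a 32 KiB cache with 64-byte lines and 48-bit addresses, this gives 512 lines with a 33-bit tag each, so 17408 bits (2176 bytes) of bookkeeping on top of the 32768 data bytes.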