1
votes

Say I have a 2MB cache and a 3MB working data set. When the cache is cold, it will experience 3MB worth of compulsory misses. After it has warmed up, however, there will be only conflict and capacity misses, so the miss rate is now 33.33%. Can someone tell me if this is correct? If not, how do I calculate the miss rate, accounting for the compulsory misses?

Thanks.


3 Answers

2
votes

The overall miss rate still depends on what the program actually does with the data and how the computer keeps the working set in cache. So access patterns, associativity, and the replacement algorithm all matter in addition to capacity. It might be true that 1/3 of the memory references following the cache warmup are misses, but that depends on more than just data and/or instruction cache capacity.

1
votes

Your 33% assumption is wrong, as Jason explained (or rather, it is missing information). It's perfectly possible to get a 100% miss rate by repeatedly going over the 3MB sequentially: with LRU-style replacement, each line is evicted before it is reused, so you are always thrashing your data.
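
To make the difference concrete, here is a minimal sketch of a line-granularity simulation. It assumes a fully associative cache with LRU replacement, and it uses scaled-down line counts (2048 and 3072, hypothetical stand-ins for 2MB and 3MB that keep the same 2:3 ratio). The function name `lru_miss_rate` and the two traces are made up for illustration. Real caches are usually set-associative and may use other replacement policies, so treat the numbers as an illustration only.

```python
import random
from collections import OrderedDict

def lru_miss_rate(trace, capacity_lines, warmup=0):
    """Miss rate of a fully associative LRU cache over a trace of line numbers.

    The first `warmup` accesses update the cache but are not counted.
    """
    cache = OrderedDict()  # keys are resident lines, oldest first
    misses = counted = 0
    for i, line in enumerate(trace):
        hit = line in cache
        if hit:
            cache.move_to_end(line)        # mark as most recently used
        else:
            if len(cache) >= capacity_lines:
                cache.popitem(last=False)  # evict the least recently used line
            cache[line] = True
        if i >= warmup:
            counted += 1
            misses += not hit
    return misses / counted

CACHE_LINES = 2048  # assumption: stands in for the 2MB cache
SET_LINES = 3072    # assumption: stands in for the 3MB working set

# Pattern 1: loop over the working set in order, four passes.
sequential = [i % SET_LINES for i in range(SET_LINES * 4)]

# Pattern 2: uniformly random lines from the same working set.
rng = random.Random(0)
uniform = [rng.randrange(SET_LINES) for _ in range(SET_LINES * 40)]

seq_rate = lru_miss_rate(sequential, CACHE_LINES, warmup=SET_LINES)
rand_rate = lru_miss_rate(uniform, CACHE_LINES, warmup=SET_LINES * 4)

print(f"cyclic sequential: {seq_rate:.2%}")
print(f"uniform random:    {rand_rate:.2%}")
```

Under these assumptions the cyclic pattern misses on every line even after warmup, while the uniformly random pattern lands near one third. Same capacity, same working set, very different miss rates.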

But to answer the question in general: it's usually more interesting to measure miss rates over a warm cache, since that represents the steady state your processor is likely to be in most of the time (i.e., any time that's not immediately after a context switch, a wakeup from powerdown, a flush, etc.). Most cache/CPU simulators offer a warmup phase for that purpose, and common benchmarks usually also measure performance only after a few warmup iterations.
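
As a rough sketch of what a warmup phase does to the reported number, the snippet below records a miss flag per access and then computes the miss rate twice: once over the whole trace and once with the first pass discarded. It assumes a fully associative LRU cache and a hypothetical working set of 1024 lines that fits in a 2048-line cache; `simulate` and the constants are illustrative names, not taken from any particular simulator.

```python
from collections import OrderedDict

def simulate(trace, capacity_lines):
    """Return one boolean per access, True where the access missed."""
    cache = OrderedDict()  # fully associative LRU, oldest line first
    missed = []
    for line in trace:
        if line in cache:
            cache.move_to_end(line)
            missed.append(False)
        else:
            if len(cache) >= capacity_lines:
                cache.popitem(last=False)  # evict least recently used
            cache[line] = True
            missed.append(True)
    return missed

CACHE_LINES = 2048  # assumption: cache capacity in lines
SET_LINES = 1024    # assumption: working set that fits in the cache
PASSES = 10

trace = [i % SET_LINES for i in range(SET_LINES * PASSES)]
missed = simulate(trace, CACHE_LINES)

# Cold measurement: count everything, including compulsory misses.
cold_rate = sum(missed) / len(missed)

# Warm measurement: discard the first pass as the warmup phase.
warm = missed[SET_LINES:]
warm_rate = sum(warm) / len(warm)

print(f"including cold start: {cold_rate:.2%}")
print(f"after warmup:         {warm_rate:.2%}")
```

The compulsory misses all fall in the first pass here, so they show up only in the cold figure, and their share shrinks as the run gets longer.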

Of course, in other cases it might be interesting to measure the cold behavior, e.g., if you're interested in exactly the phases above. Or if you're answering an exam question...

0
votes

Miss rate is defined as the number of misses divided by the number of memory accesses. (Misses per instruction is a related but different metric.)

Suppose the first piece of data of your working set is at address 0x00, and the cache can hold 4 words (of 4 bytes each) per line. When you access the data at 0x00, you will miss, but the whole line will be fetched into the cache, covering addresses 0x00, 0x04, 0x08, and 0x0C. If your next access is to 0x04, it will be a hit, since that word is already in the cache. So if you access data sequentially at 4-byte intervals, you miss once out of every 4 accesses, and your miss rate is 25%.
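
The arithmetic can be checked with a small sketch. It assumes 4-byte words, 16-byte lines, and a cache large enough that nothing is evicted during the walk; the function name `sequential_miss_rate` is made up for this example.

```python
WORD_BYTES = 4   # assumption: 4-byte words
LINE_BYTES = 16  # 4 words per cache line

def sequential_miss_rate(num_words, word_bytes=WORD_BYTES, line_bytes=LINE_BYTES):
    """Walk num_words consecutive words and count first-touch line misses."""
    resident = set()  # lines already fetched; no eviction is modeled
    misses = 0
    for i in range(num_words):
        addr = i * word_bytes
        line = addr // line_bytes  # which cache line this address falls in
        if line not in resident:
            misses += 1            # first touch of this line
            resident.add(line)
    return misses / num_words

rate = sequential_miss_rate(1024)
print(rate)  # 0.25
```

Doubling the line size to 32 bytes in this model would halve the rate to 12.5%, which is why line size matters for sequential access.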

The cache can hold many of these 4-word lines. How many, and where each line may be placed, depends on the design of the cache (associativity, number of lines, word/byte access), and those parameters in turn dictate your capacity and conflict misses.

Hope that gives you an insight into how the cache works!