9
votes

How does the Intel i7 processor's branch prediction work?

Currently, I only know of the general approach called "dynamic branch prediction".

For 1-bit predictor: The hardware always predicts a branch instruction to take the same direction it took the last time it was executed.
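To make sure I understand the 1-bit rule, here is a minimal sketch of it in Python. The function name, the `initial` parameter, and the example loop branch are my own illustrative assumptions, not anything taken from Intel documentation.

```python
def one_bit_mispredictions(outcomes, initial=False):
    """Count mispredictions of a 1-bit (last-outcome) predictor.

    outcomes: iterable of booleans, True meaning the branch was taken.
    initial:  assumed prediction before the branch has ever executed.
    """
    prediction = initial
    misses = 0
    for taken in outcomes:
        if prediction != taken:
            misses += 1
        prediction = taken  # remember only the most recent direction
    return misses


# Hypothetical loop branch: taken 9 times, then not taken on exit; run twice.
loop_branch = ([True] * 9 + [False]) * 2
print(one_bit_mispredictions(loop_branch))  # 4
```

In this model each pass through the loop costs two mispredictions: one on loop exit, and one more when the loop is entered again.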

A refined version that works better in practice is the 2-bit predictor, which was introduced to improve prediction accuracy further. In these schemes the prediction must be wrong twice before it is changed.

Does the i7 use the same kind of predictor as described above?

2
I can almost guarantee you that the full details are a company proprietary secret. - Mysticial

2 Answers

9
votes

Most of what we know about the branch predictor comes from testing. Intel has not released much in the way of details. The misprediction penalty is about 18 clock cycles, so accurate branch prediction is important.

Intel uses a two-level branch predictor. The inner level is believed to be unchanged from the Core 2 CPUs.

The outer level is more sophisticated and can even correctly predict loops with fixed counts up to 64. Two 18-bit global history buffers are used. One contains all jumps that have been taken at least once. The other contains the most important jumps. (The number of entries in these buffers is unknown.)

Note that indirect jumps and calls have their own predictor.

6
votes

The short answer is no.

I'm reasonably certain no Intel CPU has used the one-bit predictor you describe.

The original Pentium used a two-bit predictor, much like the one you describe. The four states it used were normally described as "strongly not taken", "weakly not taken", "weakly taken", and "strongly taken". Any time a branch is taken, the counter moves one step toward "strongly taken". Any time a branch is not taken, it moves one step toward "strongly not taken". It's a saturating counter, so if (for example) a branch is taken when the counter is already at "strongly taken", the counter simply doesn't change. [I should add: this is how Intel documented it, and apparently intended it to work -- if memory serves, Agner Fog and Terje Mathisen found that it really works a little differently -- and, generally, not as well as this would.]
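The documented behaviour of that counter can be sketched in a few lines of Python. This models only the textbook description above, not what the real Pentium hardware does; the function names and the choice of starting state are assumptions made for illustration.

```python
STATES = ["strongly not taken", "weakly not taken",
          "weakly taken", "strongly taken"]


def update(counter, taken):
    """Move one step toward the observed direction, saturating at 0 and 3."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)


def predict(counter):
    """Predict taken in the two upper states (2 and 3)."""
    return counter >= 2


def two_bit_mispredictions(outcomes, counter=3):
    """Count mispredictions; counter=3 (strongly taken) is an assumed start."""
    misses = 0
    for taken in outcomes:
        if predict(counter) != taken:
            misses += 1
        counter = update(counter, taken)
    return misses


# Hypothetical loop branch: taken 9 times, then not taken on exit; run twice.
loop_branch = ([True] * 9 + [False]) * 2
print(two_bit_mispredictions(loop_branch))  # 2
```

With this starting state, the single not-taken outcome at loop exit only drops the counter to "weakly taken", so the branch is still predicted taken when the loop is entered again and only the exits are mispredicted.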

As of the Pentium MMX and Pentium Pro, Intel used a somewhat more sophisticated two-level branch predictor. It added a 4-bit branch history, which it used to select one of 16 2-bit counters. This meant that if you had a repeating pattern of (for example) taken, taken, not taken, taken, it would quickly adapt to that pattern and predict all the branches correctly.
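Here is a rough Python sketch of that idea: a 4-bit history register indexing a table of 16 two-bit saturating counters. The class name, the initial counter values, and the `skip` parameter are assumptions chosen for illustration; the real hardware's table organisation is not modelled.

```python
class TwoLevelPredictor:
    """4-bit branch history selecting one of 16 two-bit counters."""

    def __init__(self, history_bits=4):
        self.mask = (1 << history_bits) - 1
        self.history = 0  # most recent outcomes, newest in the low bit
        # Assumed start: every counter at 1 ("weakly not taken").
        self.counters = [1] * (1 << history_bits)

    def predict(self):
        return self.counters[self.history] >= 2

    def update(self, taken):
        c = self.counters[self.history]
        self.counters[self.history] = min(c + 1, 3) if taken else max(c - 1, 0)
        # Shift the new outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def count_mispredictions(outcomes, skip=0):
    """Count mispredictions, ignoring the first `skip` outcomes (warm-up)."""
    predictor = TwoLevelPredictor()
    misses = 0
    for i, taken in enumerate(outcomes):
        if i >= skip and predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# The repeating pattern from the text: taken, taken, not taken, taken.
pattern = [True, True, False, True] * 50
print(count_mispredictions(pattern, skip=8))  # 0 once warmed up
```

Each of the four history values that occur in the steady state is always followed by the same outcome, so every selected counter settles on one direction. A single two-bit counter cannot do this, because it sees only a mix of taken and not-taken outcomes with no context.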

I'm not sure about the details of the branch prediction in the i7, but I think it's safe to say that it's at least as sophisticated as the Pentium Pro's was, not a throwback to the original Pentium's.