Intel CPUs Instruction Queue provides static branch prediction?

Question

Volume 3 of the Intel manuals contains the description of a hardware event counter:

BACLEAR_FORCE_IQ

Counts number of times a BACLEAR was forced by the Instruction Queue. The IQ is also responsible for providing conditional branch prediction direction based on a static scheme and dynamic data provided by the L2 Branch Prediction Unit. If the conditional branch target is not found in the Target Array and the IQ predicts that the branch is taken, then the IQ will force the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline.

I always thought the Branch Address Calculator performed the static prediction algorithm (when the Branch Target Buffer contains no entry for the branch).

Can anybody confirm which of the two is correct? I cannot find anything on this.

I deleted my answer since it wasn't helpful. But I noticed that the Intel Optimization reference manual says: "The Intel Core microarchitecture does not use the static prediction heuristic. However, to maintain consistency across Intel 64 and IA-32 processors, software should maintain the static prediction heuristic as the default." — Gabriel Southern

Surt · Accepted Answer · 2016-11-08T23:05:22

If the conditional branch target is not found in the Target Array

How can it not be found? You mask the branch address with a bit mask to find the index into the table and read out the next branch target.

Well, if after reading the result you check it and find that the branch's address does not match the tag on the entry, you have a "not taken" result.
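To make the index-and-tag check concrete, here is a minimal sketch of a direct-mapped target array. The table size, the bit split, and the names (target_array, install, lookup) are assumptions for illustration only; the real organization of Intel's Target Array is not described in the quoted text.

```python
# Toy direct-mapped branch target array. Sizes are illustrative assumptions,
# not the real hardware organization.
INDEX_BITS = 9
NUM_ENTRIES = 1 << INDEX_BITS
INDEX_MASK = NUM_ENTRIES - 1

# Each slot holds (tag, target), or None when nothing is installed.
target_array = [None] * NUM_ENTRIES


def split(addr):
    """Split a branch address into (index, tag) using a bit mask and a shift."""
    return addr & INDEX_MASK, addr >> INDEX_BITS


def install(branch_addr, target):
    """Record the target of a branch, overwriting whatever shared its slot."""
    index, tag = split(branch_addr)
    target_array[index] = (tag, target)


def lookup(branch_addr):
    """Return the stored target, or None on a miss (empty slot or tag mismatch)."""
    index, tag = split(branch_addr)
    entry = target_array[index]
    if entry is None or entry[0] != tag:
        return None
    return entry[1]


install(0x401000, 0x401080)
hit = lookup(0x401000)                   # same index, same tag
alias = lookup(0x401000 + NUM_ENTRIES)   # same index, different tag
```

Here hit is the installed target, while alias is None: the masked index always selects some slot, but the tag comparison is what turns that read into "not found".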

At this point we get to the second part of the statement.

and the IQ predicts that the branch is taken

So the branch target array says "not taken" while the IQ predicts that the branch will be taken: we have a contradiction.

To resolve the contradiction, the IQ wins: the branch target array only says "if we jump, we jump here", whereas the IQ predicts whether we jump at all, based on a lot more logic.

Hence

then the IQ will force the Branch Address Calculator to issue a BACLEAR. Each BACLEAR asserted by the BAC generates approximately an 8 cycle bubble in the instruction fetch pipeline.

That is good for a 14-19 stage pipeline. The 8 cycles apply if the IQ can read the actual target address from the instruction (combined with the PC); if the value has to be read from a register (one that is possibly not yet retired), it could take a bit longer.
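The decision described above can be sketched roughly as follows. This is a simplified model under stated assumptions: the IQ's direction prediction is reduced to the classic "backward taken, forward not taken" static heuristic (the quoted description says the real IQ also uses dynamic data from the L2 Branch Prediction Unit), and all function names are hypothetical.

```python
BACLEAR_BUBBLE_CYCLES = 8  # "approximately", per the quoted event description
MASK64 = (1 << 64) - 1


def direct_target(next_ip, rel):
    """Target of a relative branch: address of the next instruction plus displacement."""
    return (next_ip + rel) & MASK64


def static_predict_taken(branch_addr, target_addr):
    """Assumed static scheme: backward branches taken, forward branches not taken."""
    return target_addr < branch_addr


def front_end_step(branch_addr, decoded_target, target_array_hit):
    """Return (baclear_forced, bubble_cycles) for one conditional branch."""
    if target_array_hit:
        # The target array already supplied a target; nothing is forced here.
        return False, 0
    if static_predict_taken(branch_addr, decoded_target):
        # Miss in the target array, but the branch is predicted taken:
        # fetch must be re-steered, which costs the fetch bubble.
        return True, BACLEAR_BUBBLE_CYCLES
    return False, 0


# A 2-byte conditional jump at 0x1000 with displacement -0x20 (a loop back-edge).
loop_target = direct_target(0x1000 + 2, -0x20)
backward_miss = front_end_step(0x1000, loop_target, target_array_hit=False)
forward_miss = front_end_step(0x1000, 0x1100, target_array_hit=False)
```

In this model backward_miss reports a forced BACLEAR with the 8-cycle bubble, while forward_miss does not, since the branch is predicted not taken and fetch simply continues sequentially. The register-indirect case mentioned above is left out because its target cannot be computed from the instruction bytes alone.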
