[llvm-dev] Why did Intel change its static branch prediction mechanism over the years?
quekong via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 14 14:09:28 PDT 2018
(I don't know if it's allowed to ask such a question; if not, please remind me.)
I know Intel has implemented several static branch prediction mechanisms
over the years:
 * 80486 age: always predict not taken.
 * Pentium 4 age: backwards taken / forwards not taken (BTFNT); see the
sketch after this list.
 * Pentium M, Core 2: no static prediction; the outcome is effectively
random, depending on what happens to be in the corresponding BTB entry,
according to Agner's optimization guide ¹.
 * Newer CPUs like Ivy Bridge and Haswell: the behavior has become
increasingly opaque, according to Matt G's experiment ².
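As a concrete illustration of the Pentium 4-era rule, here is a minimal
C sketch (my own example, not from any Intel document): under BTFNT,
the conditional branch that closes a loop is a backward branch and is
statically predicted taken (correct for every iteration but the last),
while a check that jumps forward over an error path is statically
predicted not taken (correct whenever the input is valid).

    #include <stddef.h>

    long sum_positive(const long *v, size_t n)
    {
        long total = 0;
        for (size_t i = 0; i < n; i++) {  /* loop-back branch: backward, statically taken */
            if (v[i] < 0)                 /* error check: forward, statically not taken */
                return -1;                /* cold path, laid out after the hot loop */
            total += v[i];
        }
        return total;
    }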
And Intel seems unwilling to talk about it any more: the latest
material on the subject I could find in Intel's documentation was
written about ten years ago.
I know static branch prediction is (far?) less important than dynamic
prediction, but in quite a few situations the CPU is completely lost
and the programmer (through the compiler) is usually the best guide.
Of course, these situations are usually not the performance
bottleneck, because once a branch is executed frequently, the dynamic
predictor will capture it.
Since Intel no longer clearly states the static prediction mechanism
in its documentation, GCC's __builtin_expect() can do nothing more
than move the unlikely branch off the hot path (or the reverse for a
likely branch).
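To make that layout effect concrete, here is a minimal C sketch; the
likely/unlikely macros are the usual convention around
__builtin_expect(), not something the compiler provides itself:

    #include <stdio.h>

    /* Conventional wrappers; !!(x) normalizes any truthy value to 0 or 1. */
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)

    int process(int fd)
    {
        if (unlikely(fd < 0)) {
            /* GCC/Clang move this block out of line, so the hot path
             * below falls straight through without a taken branch. */
            fprintf(stderr, "bad descriptor\n");
            return -1;
        }
        /* hot path */
        return 0;
    }

Compiling with gcc -O2 and diffing the assembly with and without the
hint typically shows the error block pushed to the end of the
function; what the hint cannot do, as noted above, is tell an
undocumented static predictor anything.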
I am not familiar with CPU design and I don't know what mechanism
Intel uses nowadays for its static predictor; I just feel the best
static mechanism for Intel would be to clearly document, for each CPU,
"where I plan to go when the dynamic predictor fails, forward or
backward", because usually the programmer is the best guide at that
time.
APPENDIX:
¹ Agner's optimization guide:
https://www.agner.org/optimize/microarchitecture.pdf , section 3.5.
² Matt G's experiment: https://xania.org/201602/bpu-part-two