[LLVMdev] Is PIC code defeating the branch predictor?
Jonas Maebe
jonas.maebe at elis.ugent.be
Tue Jan 4 04:57:45 PST 2011
On 04 Jan 2011, at 08:30, Jakob Stoklund Olesen wrote:
> I noticed that we generate code like this for i386 PIC:
>
>     calll   L0$pb
> L0$pb:
>     popl    %eax
>     movl    %eax, -24(%ebp)    ## 4-byte Spill
>
> I worry that this defeats the return address prediction for returns
> in the function because calls and returns no longer are matched.
According to benchmarks by Apple, it's nevertheless faster on modern
x86 processors than the trampoline-based alternative (except maybe on
Atom, as mentioned in another reply):
http://lists.apple.com/archives/perfoptimization-dev/2007/Nov/msg00005.html
At the time of that post, Apple's version of GCC still generated
trampolines (hence the remark). They switched that to the above
pattern afterwards.
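
For comparison, here is a rough sketch of what the trampoline-based
alternative looks like, in the same AT&T syntax as the quoted code. The
helper name get_pc_thunk_ax and the label Lpic_base are made up for
illustration; GCC's actual helper names vary by version and platform. The
point is only that the calll is paired with a real ret, so calls and
returns stay matched, at the cost of an extra call/return pair and a load
from the stack:

```asm
    calll   get_pc_thunk_ax        ## hypothetical helper; pushes the address of Lpic_base
Lpic_base:                         ## hypothetical label; %eax holds this address on return
    movl    %eax, -24(%ebp)        ## spill the PIC base, as in the quoted code

get_pc_thunk_ax:
    movl    (%esp), %eax           ## copy the return address into %eax
    ret                            ## matches the calll above
```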
Jonas