[LLVMdev] Is PIC code defeating the branch predictor?

Tue Jan 4 00:37:56 PST 2011

On Jan 3, 2011, at 11:30 PM, Jakob Stoklund Olesen wrote:

> I noticed that we generate code like this for i386 PIC:
> 
> 	calll	L0$pb
> L0$pb:
> 	popl	%eax
> 	movl	%eax, -24(%ebp)         ## 4-byte Spill
> 
> I worry that this defeats the return address prediction for returns in the function because calls and returns no longer are matched.

Yes, this will defeat the processor's return address stack predictor: the call pushes a predicted return address that no ret ever consumes, so the next real return is predicted against the wrong entry.  That said, I suspect it's not much of an issue on "desktop" processors.  The reissue of the pop is an Atom-specific problem, so elsewhere you only need to worry about the branch misprediction on the next return.  Assuming these sequences aren't too frequent, the more elaborate tournament predictors in more powerful processors may be able to compensate for it.

Still, the alternative sequence you propose looks like an improvement on any processor with a multiple-issue pipeline (unless ret does a lot more work than I think it does), though it doesn't fix the reissued pop problem on Atom.
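
To make the call/return mismatch concrete, here is a toy Python model of a return address stack. It is only a sketch under stated assumptions: the predictor is modeled as an unbounded LIFO that pushes on every call and pops on every ret, which real hardware only approximates. The alternative sequence is not quoted in this message, so the second trace assumes a hypothetical helper that is called and returns normally; the function name and the address labels are made up for illustration.

```python
# Toy model of a return address stack (RAS) predictor.
# Assumption: the RAS pushes on every call and pops on every ret, with
# unlimited depth. Real predictors are small, fixed-size, and vary by CPU.

def count_ret_mispredictions(trace):
    """Count returns whose predicted target differs from the real one.

    trace is a list of (op, addr) tuples:
      ("call", ret_addr)  pushes ret_addr on the real stack and on the RAS
      ("pop", None)       pops the real stack only; the RAS is not touched
      ("ret", None)       pops both and compares the two addresses
    """
    real_stack = []
    ras = []
    mispredictions = 0
    for op, addr in trace:
        if op == "call":
            real_stack.append(addr)
            ras.append(addr)
        elif op == "pop":
            real_stack.pop()
        elif op == "ret":
            actual = real_stack.pop()
            predicted = ras.pop() if ras else None
            if predicted != actual:
                mispredictions += 1
    return mispredictions


# The quoted PIC idiom: call to the next label, then pop the pushed address.
call_pop_trace = [
    ("call", "after_f"),  # caller enters the function
    ("call", "L0$pb"),    # calll L0$pb
    ("pop", None),        # popl %eax (no matching ret)
    ("ret", None),        # function returns to the caller
]

# Hypothetical alternative: a helper that is entered by call and left by ret.
matched_trace = [
    ("call", "after_f"),       # caller enters the function
    ("call", "after_helper"),  # call the assumed get-PC helper
    ("ret", None),             # helper returns
    ("ret", None),             # function returns to the caller
]

print(count_ret_mispredictions(call_pop_trace))
print(count_ret_mispredictions(matched_trace))
```

In this model the call/pop trace mispredicts the function's own return, because the stale L0$pb entry is still on top of the predictor's stack, while the matched trace predicts both returns correctly.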

--Owen