Reordering two functions can slow down lld by 1.06 times
Mehdi Amini via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 21 08:14:16 PDT 2016
The attachment to this PR explains why this could happen:
https://llvm.org/bugs/show_bug.cgi?id=5615
(This could well be a different case here, but it may be related, or hint toward a similar type of problem).
HTH.
—
Mehdi
> On Oct 21, 2016, at 7:30 AM, Rafael Espíndola via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>
> This is sufficiently crazy that I decided to create an easy reproducer.
>
> I uploaded it to https://drive.google.com/open?id=0B7iRtublysV6WmZPZzh5LUpSZUU
>
> I also tested it on an i7-3840QM, where the problem reproduces exactly,
> and on an AMD Opteron(tm) Processor 6380, where the two binaries have
> exactly the same performance.
>
> Craig, all that I was able to find about branch prediction aliasing
> problems was a suggestion in the Intel optimization manual to align
> branch targets, but it looks like that is not the problem here. Any idea
> if there is anything that can be done to avoid this problem?
>
> Thanks,
> Rafael
>
>
> On 20 October 2016 at 17:09, Rafael Espíndola
> <rafael.espindola at gmail.com> wrote:
>> I spent most of the day reducing an oddity I noticed while
>> benchmarking a small patch.
>>
>> It turns out that just reordering two adjacent functions can have a
>> massive impact on performance. The two binaries are in
>>
>> https://drive.google.com/open?id=0B7iRtublysV6VW5VVW1na2N1RGM
>>
>> https://drive.google.com/open?id=0B7iRtublysV6MUJoeGVCRHpXVUU
>>
>> And the total diff of the objdump is attached.
>>
>> When linking xul with one of the binaries I get
>>
>> 98,298,725 branch-misses # 2.24% of all branches
>> 7.206486289 seconds time elapsed
>>
>> With the other I get
>>
>> 139,849,372 branch-misses # 3.18% of all branches
>> 7.645573494 seconds time elapsed
>>
>> Adding enough padding before the function gets the performance back,
>> which suggests an aliasing problem in the branch predictor.
>>
>> The CPU is an E5-2697 (Ivy Bridge). Is anyone familiar with its branch
>> predictor and how to avoid hitting these problems?
>>
>> Cheers,
>> Rafael
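
As a quick sanity check on the figures quoted above, the following Python sketch recomputes the ratios from the two perf runs. The variable names are illustrative and do not come from the thread. Attributing the whole elapsed-time difference to the extra branch misses is an assumption, so the per-miss figure is at best a rough upper bound.

```python
# Figures copied verbatim from the two perf runs quoted in the thread.
fast_misses, fast_seconds = 98_298_725, 7.206486289
slow_misses, slow_seconds = 139_849_372, 7.645573494

# Relative slowdown in wall-clock time and growth in branch misses.
time_ratio = slow_seconds / fast_seconds
miss_ratio = slow_misses / fast_misses

# Absolute differences between the two runs.
extra_misses = slow_misses - fast_misses
extra_seconds = slow_seconds - fast_seconds

# Assumption (not from the thread): every extra nanosecond is caused by
# an extra branch miss. Other effects may contribute, so treat this as
# an upper bound on the average cost per additional miss.
ns_per_extra_miss = extra_seconds / extra_misses * 1e9

print(f"elapsed-time ratio:  {time_ratio:.2f}")
print(f"branch-miss ratio:   {miss_ratio:.2f}")
print(f"extra branch misses: {extra_misses:,}")
print(f"ns per extra miss:   {ns_per_extra_miss:.1f}")
```

The elapsed-time ratio comes out to about 1.06, which matches the subject line, while the number of branch misses grows by roughly 1.42 times.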