[PATCH] D46662: [X86] condition branches folding for three-way conditional codes
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Mar 31 07:09:31 PDT 2019
lebedev.ri added a comment.
In D46662#1442345 <https://reviews.llvm.org/D46662#1442345>, @andreadb wrote:
> In D46662#1293043 <https://reviews.llvm.org/D46662#1293043>, @lebedev.ri wrote:
>
> > In D46662#1246810 <https://reviews.llvm.org/D46662#1246810>, @andreadb wrote:
> >
> > > In D46662#1246781 <https://reviews.llvm.org/D46662#1246781>, @xur wrote:
> > >
> > > > Hi Andrea,
> > > >
> > > > Thanks for running this test, and the explanation. Can you run the tests
> > > > on Bulldozer/Ryzen? I don't have access to these platforms. If I need to do
> > > > this in subtarget way, it would be good to know the performance there.
> > >
> > >
> > > CC'ing @lebedev.ri and @GGanesh.
> > > They should be able to help you with running those tests on Bulldozer/Ryzen. Unfortunately, I don't have access to those machines.
> >
> >
> > I *think* this should be fine on bdver2, as per https://www.agner.org/optimize/microarchitecture.pdf:
> >
> > 19.15 Branches and loops
> > The branch prediction mechanism is described on page 34. There is no longer any
> > restriction on the number of branches per 16 bytes of code that can be predicted efficiently.
> > The misprediction penalty is quite high because of a long pipeline.
> >
> >
> >
> >
> > In D46662#1246550 <https://reviews.llvm.org/D46662#1246550>, @andreadb wrote:
> >
> > > ...
> > > Bench: 4evencases.cc
> > > ...
> > > Bench: 15evencases.cc
> > > ...
> > > I wouldn't be surprised if instead this patch improves the performance of code on other big AMD cores like Bulldozer/ryzen.
> >
> >
> > Are these benchmarks available from somewhere? Can i run them somehow?
>
>
> Sorry Roman,
> I completely missed that comment.
>
> Those two benchmarks were attached by Xur to this code review.
> You should be able to see the attachments if you expand the “Show Older Changes” section (there is a link at the top of this review).
> One of his posts has got 3 attachments. Two of these files are the benchmarks to run.
Aha! Not sure how I did not find those. Thank you!
> I hope it helps.
>
>>
>>
>>> -Andrea
>>
>> Roman
>
>
>
> In D46662#1293043 <https://reviews.llvm.org/D46662#1293043>, @lebedev.ri wrote:
>
>> In D46662#1246810 <https://reviews.llvm.org/D46662#1246810>, @andreadb wrote:
>>
>> > In D46662#1246781 <https://reviews.llvm.org/D46662#1246781>, @xur wrote:
>> >
>> > > Hi Andrea,
>> > >
>> > > Thanks for running this test, and the explanation. Can you run the tests
>> > > on Bulldozer/Ryzen? I don't have access to these platforms. If I need to do
>> > > this in subtarget way, it would be good to know the performance there.
>> >
>> >
>> > CC'ing @lebedev.ri and @GGanesh.
>> > They should be able to help you with running those tests on Bulldozer/Ryzen. Unfortunately, I don't have access to those machines.
>>
>>
>> I *think* this should be fine on bdver2, as per https://www.agner.org/optimize/microarchitecture.pdf:
>>
>> 19.15 Branches and loops
>> The branch prediction mechanism is described on page 34. There is no longer any
>> restriction on the number of branches per 16 bytes of code that can be predicted efficiently.
>> The misprediction penalty is quite high because of a long pipeline.
>>
Measurements (n=25) say that `15 cases` improves (avg: -0.18%, median: -0.35%),
and `4 cases` is roughly neutral, with a marginal improvement on average only (avg: -0.03%, median: **+**0.07%).
I will submit a patch.
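
For anyone who wants to reproduce this kind of summary, here is a minimal sketch of how an average and a median relative change can be computed from paired runs. It is an assumption that the figures above were derived this way; the function names and the timings in the example are hypothetical placeholders, not the actual measurements. Negative values mean the patched build was faster.

```python
from statistics import mean, median


def relative_change_percent(baseline, patched):
    """Per-run relative change in percent (negative = patched is faster).

    Assumes the i-th baseline run is paired with the i-th patched run.
    """
    if len(baseline) != len(patched):
        raise ValueError("baseline and patched must have the same length")
    return [(p - b) / b * 100.0 for b, p in zip(baseline, patched)]


def summarize(baseline, patched):
    """Return the sample count, average and median of the relative changes."""
    changes = relative_change_percent(baseline, patched)
    return {"n": len(changes), "avg": mean(changes), "median": median(changes)}


if __name__ == "__main__":
    # Placeholder timings (arbitrary units), NOT the real benchmark data.
    baseline = [100.0, 100.0, 100.0, 100.0, 100.0]
    patched = [101.0, 101.0, 101.0, 95.0, 97.0]
    print(summarize(baseline, patched))
```

With the placeholder data the average is negative while the median is positive, which is the same disagreement seen for `4 cases`: a few large wins can pull the average down even when the typical run is marginally slower.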
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D46662/new/
https://reviews.llvm.org/D46662