[PATCH] AArch64: Use CMP;CCMP sequences for and/or/setcc trees.

Wed Jun 3 11:27:04 PDT 2015

Hi Matthias,

Thanks for the clarification - I'll take a further look.

Cheers,

James

On Wed, 3 Jun 2015 at 17:21 Matthias Braun <matze at braunis.de> wrote:

> I reported my measurements in one of the early comments of the review
> thread: I only had slight improvements between 0.3 and 1.7% and two 0.5%
> regressions.
>
> However this commit does not introduce more select instructions. The idea
> behind it is that for this code:
>
> long foo(long a, long b, long v1, long v2) {
>   if (a >= v1 && a < v2)
>     return b;
>   return 0;
> }
>
> we used to generate:
>
> cmp x0, x2
> cset w8, ge
> cmp x0, x3
> cset w9, lt
> tst w8, w9
> csel x0, x1, xzr, ne
> ret
>
> In another commit I changed the generic backend to use two selects instead:
>
> cmp x0, x2
> csel x8, x1, xzr, ge
> cmp x0, x3
> csel x0, x8, xzr, lt
> ret
>
> In this commit I turned off the two-select variant (by overriding
> shouldNormalizeToSelect()) so the improved cmp/ccmp matching can generate:
>
> cmp x0, x2
> ccmp x0, x3, #0, ge
> csel x0, x1, xzr, lt
> ret
>
> Either way, the number of selects does not increase; if anything it should
> decrease, because we no longer normalize towards the two-select sequence.
> The only thing I can think of that may lead to worse code is that the third
> sequence requires the cmp/ccmp to be scheduled close to each other so as
> not to lose the flags. This could increase register pressure, because the
> operand computations now happen before the cmp/ccmp pair, while for the
> two-select version you could schedule the computations in between the two
> selects.
>
> Anyway I think this would need a more in-depth analysis to really
> understand what is going on in your benchmark.
>
> - Matthias
>
> On Jun 3, 2015, at 6:59 AM, James Molloy <james at jamesmolloy.co.uk> wrote:
>
> Hi Matthias,
>
> This actually caused a 10% regression in one of our tests on Cortex-A57
> (but a 4% improvement on Cortex-A53). I think this is to do with selects
> being expensive on heavily out-of-order architectures. Did you notice any
> regressions on Typhoon/Cyclone?
>
> It might be useful to have a heuristic determining if this is beneficial
> or not - if conversion in CGP already has this
> "isPredictableSelectExpensive()" hook - perhaps a similar one might be
> useful here?
>
> Cheers,
>
> James
>
> On Mon, 1 Jun 2015 at 23:39 Phabricator <reviews at reviews.llvm.org> wrote:
>
>> REPOSITORY
>>   rL LLVM
>>
>> http://reviews.llvm.org/D8232
>>
>> Files:
>>   llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
>>   llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h
>>   llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td
>>   llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td
>>   llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll
>>
>> EMAIL PREFERENCES
>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
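For readers following the three instruction sequences quoted in the thread, here is a minimal C sketch that models them and checks that they compute the same result as foo(). It is not LLVM code: the Flags struct and every helper name (cmp64, ccmp64, cond_ge, cond_lt, foo_cset, foo_two_csel, foo_ccmp, count_mismatches) are made up for illustration. The only architectural facts assumed are that cmp sets NZCV from a subtraction, that "ge" means N == V and "lt" means N != V, and that ccmp performs the comparison when its condition holds and otherwise loads the immediate NZCV value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the AArch64 NZCV condition flags. */
typedef struct { bool n, z, c, v; } Flags;

/* cmp a, b: set the flags from the 64-bit subtraction a - b. */
static Flags cmp64(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t r = ua - ub;
    Flags f;
    f.n = (r >> 63) & 1;                          /* result is negative  */
    f.z = (r == 0);                               /* result is zero      */
    f.c = (ua >= ub);                             /* no unsigned borrow  */
    f.v = (((ua ^ ub) & (ua ^ r)) >> 63) & 1;     /* signed overflow     */
    return f;
}

static bool cond_ge(Flags f) { return f.n == f.v; }
static bool cond_lt(Flags f) { return f.n != f.v; }

/* ccmp a, b, #nzcv, cond: compare if cond holds, else load the immediate. */
static Flags ccmp64(bool cond, int64_t a, int64_t b, unsigned nzcv) {
    if (cond)
        return cmp64(a, b);
    Flags f = { (nzcv >> 3) & 1, (nzcv >> 2) & 1, (nzcv >> 1) & 1, nzcv & 1 };
    return f;
}

/* Reference semantics: the C function from the thread. */
static int64_t foo_ref(int64_t a, int64_t b, int64_t v1, int64_t v2) {
    return (a >= v1 && a < v2) ? b : 0;
}

/* First sequence: cmp; cset; cmp; cset; tst; csel. */
static int64_t foo_cset(int64_t a, int64_t b, int64_t v1, int64_t v2) {
    unsigned w8 = cond_ge(cmp64(a, v1));
    unsigned w9 = cond_lt(cmp64(a, v2));
    return (w8 & w9) != 0 ? b : 0;                /* tst + csel ..., ne  */
}

/* Second sequence: cmp; csel; cmp; csel. */
static int64_t foo_two_csel(int64_t a, int64_t b, int64_t v1, int64_t v2) {
    int64_t x8 = cond_ge(cmp64(a, v1)) ? b : 0;
    return cond_lt(cmp64(a, v2)) ? x8 : 0;
}

/* Third sequence: cmp; ccmp ..., #0, ge; csel ..., lt. */
static int64_t foo_ccmp(int64_t a, int64_t b, int64_t v1, int64_t v2) {
    Flags f = cmp64(a, v1);
    f = ccmp64(cond_ge(f), a, v2, 0x0);           /* #0 makes "lt" false */
    return cond_lt(f) ? b : 0;
}

/* Count disagreements with foo_ref over a few boundary values. */
static int count_mismatches(void) {
    const int64_t vals[] = { INT64_MIN, -3, -1, 0, 1, 2, 7, INT64_MAX };
    const int n = (int)(sizeof vals / sizeof vals[0]);
    int bad = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++) {
                int64_t want = foo_ref(vals[i], 42, vals[j], vals[k]);
                bad += foo_cset(vals[i], 42, vals[j], vals[k]) != want;
                bad += foo_two_csel(vals[i], 42, vals[j], vals[k]) != want;
                bad += foo_ccmp(vals[i], 42, vals[j], vals[k]) != want;
            }
    return bad;
}

/* Convenience wrapper: aborts if any modelled sequence disagrees. */
static void self_check(void) { assert(count_mismatches() == 0); }
```

Calling self_check() from a test harness exercises all three models. The point of the #0 immediate in the ccmp is visible in foo_ccmp: when the first comparison fails "ge", the flags are forced to N == V, so the final "lt" select cannot fire, which is exactly the short-circuit behaviour of the && in the source.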