[llvm-dev] Question about canonicalizing cmp+select
Yuan Lin via llvm-dev
llvm-dev at lists.llvm.org
Tue Jul 3 18:25:15 PDT 2018
Hi,
"*While its true select convertSelectOfCosntantsToMath() creates add+sext
i1, there's a transform in DAGCombiner::visitADDLike that does "add (sext
i1), X -> sub X, (zext i1)". Why didn't that fire for your target?*"
That's a good question! We never explicitly marked sext-on-i1 as a custom
op. And I think the default behavior is "legal" (Am I right on that?) So
the following transform never gets executed because it thinks sext-on-i1 as
legal op.
// add (sext i1), X -> sub X, (zext i1)
if (N0.getOpcode() == ISD::SIGN_EXTEND &&
N0.getOperand(0).getValueType() == MVT::i1 &&
!TLI.isOperationLegal(ISD::SIGN_EXTEND, MVT::i1)) {
SDValue ZExt = DAG.getNode(ISD::ZERO_EXTEND, DL, VT, N0.getOperand(0));
return DAG.getNode(ISD::SUB, DL, VT, N1, ZExt);
}
I will try to change the isel behavior sext-on-i1, and see if that fixes
the issue.
On Tue, Jul 3, 2018 at 3:47 PM Craig Topper <craig.topper at gmail.com> wrote:
> While its true select convertSelectOfCosntantsToMath() creates add+sext
> i1, there's a transform in DAGCombiner::visitADDLike that does "add (sext
> i1), X -> sub X, (zext i1)". Why didn't that fire for your target?
>
> ~Craig
>
>
> On Tue, Jul 3, 2018 at 3:44 PM Sanjay Patel <spatel at rotateright.com>
> wrote:
>
>> I linked the wrong patch review. Here's the patch that was actually
>> committed:
>> https://reviews.llvm.org/D48508
>> https://reviews.llvm.org/rL335433
>>
>> On Tue, Jul 3, 2018 at 4:39 PM, Sanjay Patel <spatel at rotateright.com>
>> wrote:
>>
>>> [adding back llvm-dev and cc'ing Craig]
>>>
>>> I think you are asking if we are missing a fold (or your target is
>>> missing enabling another hook) to transform the sext+add into shift+or? I
>>> think the answer is 'yes'. We probably should add that fold. This seems
>>> like a similar case as the recent: https://reviews.llvm.org/D48466
>>>
>>> Note that on x86, the sext+add becomes zext+sub:
>>> t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch
>>> t24: i16 = zero_extend t20
>>> t17: i16 = sub Constant:i16<5>, t24
>>>
>>> Would that transform help your target?
>>>
>>> On Tue, Jul 3, 2018 at 3:55 PM, Yuan Lin <yualin at google.com> wrote:
>>>
>>>> Hi, Roman and Sanjay,
>>>>
>>>> Thank you for your reply! We currently do run DAGCombiner, but
>>>> didn't implement this specific transformation. I just tried turning on
>>>> convertSelectOfCosntantsToMath() in our ISelLowering, but that doesn't
>>>> quite work because it generated a sign_extend op from i1 to i16, which our
>>>> backend currently doesn't support.
>>>>
>>>> Does the DAGCombiner already has this transformation implemented?
>>>>
>>>> Thanks,
>>>> --Yuan
>>>>
>>>> On Tue, Jul 3, 2018 at 11:22 AM Sanjay Patel <spatel at rotateright.com>
>>>> wrote:
>>>>
>>>>> Do you run DAGCombiner? And are you overriding
>>>>> TLI.convertSelectOfConstantsToMath(VT) for your target?
>>>>>
>>>>> For the stated example (true val and false val constants in the select
>>>>> differ by 1), that should already be converted.
>>>>>
>>>>>
>>>>> On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev <lebedev.ri at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>> > Hi, Sanjay/all,
>>>>>> >
>>>>>> > I noticed in rL331486 that some compare-select optimizations are
>>>>>> disabled
>>>>>> > in favor of providing canonicalized cmp+select to the backend.
>>>>>> >
>>>>>> > I am currently working on a private backend target, and the
>>>>>> target has a
>>>>>> > small code size limit. With this change, some of the apps went
>>>>>> over the
>>>>>> > codesize limit. As an example,
>>>>>> >
>>>>>> > C code:
>>>>>> > b = (a > -1) ? 4 : 5;
>>>>>> >
>>>>>> > ll code:
>>>>>> > Before rL331486:
>>>>>> > %0 = lshr i16 %a.0.a.0., 15
>>>>>> > %1 = or i16 %0, 4
>>>>>> >
>>>>>> > After rL331486:
>>>>>> > %cmp = icmp sgt i16 %a.0.a.0., -1
>>>>>> > %cond = select i1 %cmp, i16 4, i16 5
>>>>>> >
>>>>>> > With the various encoding restrictions of my particular target,
>>>>>> the
>>>>>> > cmp/select generated slightly larger code size. However, because
>>>>>> the apps
>>>>>> > were very close to the code size limit, this slight change pushed
>>>>>> them over
>>>>>> > the limit.
>>>>>> >
>>>>>> > If I still prefer to have this optimization performed, then is my
>>>>>> best
>>>>>> > strategy moving forward to implement this optimization as a
>>>>>> peephole opt in
>>>>>> > my backend?
>>>>>> I personally think it should probably be a DAGCombine transform,
>>>>>> driven by a Target Transform Info hook (e.g. like hasAndNot()).
>>>>>>
>>>>>> > Thanks,
>>>>>> > --Yuan
>>>>>> Roman.
>>>>>>
>>>>>> > _______________________________________________
>>>>>> > LLVM Developers mailing list
>>>>>> > llvm-dev at lists.llvm.org
>>>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/66e2d5dd/attachment.html>
More information about the llvm-dev
mailing list