[LLVMdev] Illegal optimization in LLVM 2.8 during SelectionDAG? (Re: comparison pattern trouble - might be a bug in LLVM 2.8?)
Bill Wendling
wendling at apple.com
Fri Oct 1 15:20:12 PDT 2010
On Oct 1, 2010, at 3:50 AM, Heikki Kultala wrote:
> On 1 Oct 2010, at 13:35, Bill Wendling wrote:
>
>> On Sep 30, 2010, at 2:13 AM, Heikki Kultala wrote:
>>
>>> Bill Wendling wrote:
>>>> On Sep 29, 2010, at 12:36 AM, Heikki Kultala wrote:
>>>>
>>>>> On 29 Sep 2010, at 06:25, Heikki Kultala wrote:
>>>>>
>>>>>> Our architecture has 1-bit boolean predicate registers.
>>>>>>
>>>>>> I've defined a comparison:
>>>>>>
>>>>>> def NErrb : InstTCE<(outs I1Regs:$op3), (ins I32Regs:$op1,I32Regs:$op2), "", [(set I1Regs:$op3, (setne I32Regs:$op1, I32Regs:$op2))]>;
>>>>>>
>>>>>> But then I end up having the following bug:
>>>>>>
>>>>>> Code
>>>>>>
>>>>>> %0 = zext i8 %data to i32
>>>>>> %1 = zext i16 %crc to i32
>>>>>> %2 = xor i32 %1, %0
>>>>>> %3 = and i32 %2, 1
>>>>>> %4 = icmp eq i32 %3, 0
>>>>>>
>>>>>> which compares the lowest bits of the 2 variables
>>>>>> ends up being compiled as
>>>>>>
>>>>>> %reg16384<def> = LDWi <fi#-2>, 0; mem:LD4[FixedStack-2] I32Regs:%reg16384
>>>>>> %reg16385<def> = LDWi <fi#-1>, 0; mem:LD4[FixedStack-1] I32Regs:%reg16385
>>>>>> %reg16386<def> = COPY %reg16384; I32Regs:%reg16386,16384
>>>>>> %reg16390<def> = NErrb %reg16384, %reg16385; I1Regs:%reg16390 I32Regs:%reg16384,16385
>>>>>>
>>>>>> which just compares ALL BITS of the variables.
>>>>> I also have a pattern:
>>>>>
>>>>> def XORrrb : InstTCE<(outs I1Regs:$op3), (ins I32Regs:$op1,I32Regs:$op2), "", [(set I1Regs:$op3, (trunc (xor I32Regs:$op1, I32Regs:$op2)))]>;
>>>>>
>>>>> Which can do the whole 3-operation code sequence correctly with one operation.
>>>>>
>>>>> With LLVM 2.7 this correct operation is selected; with LLVM 2.8 the wrong operation (which compares all bits) is chosen.
>>>>>
>>>>> So this looks like a bug in LLVM 2.8 isel?
>>>>>
>>>> Hi Heikki,
>>>>
>>>> We need a better example of what's going on. What's the original code? Also, I don't have access to your back-end's code so it's hard to tell just from these snippets what's going on. For instance, it's not clear whether it's the instruction selector that's at fault or if your .td files have a bug in them somewhere.
>>>
>>> The original code is:
>>
>> [snip]
>>
>>> where the interesting lines are lines 12-13:
>>>
>>> x16 = (e_u8)(((data) ^ ((e_u8)crc))&1);
>>> if (x16 == 1)
>>>
>>> The code which goes into isel is:
>>>
>>> bb.nph:
>>> %0 = zext i8 %data to i32
>>> %1 = zext i16 %crc to i32
>>> %2 = xor i32 %1, %0
>>> %3 = and i32 %2, 1
>>> %4 = icmp eq i32 %3, 0
>>> br i1 %4, label %bb.nph._crit_edge, label %5
>>>
>>> Inside SelectionDAG this becomes:
>>>
>>> Legalized selection DAG:
>>
>> [snip]
>>
>>> 0x248d280: <multiple use>
>>> 0x248d980: <multiple use>
>>> 0x25bb7f0: i32 = xor 0x248d280, 0x248d980 [ORD=3] [ID=15]
>>>
>>> 0x25bbbf0: i1 = truncate 0x25bb7f0 [ID=18]
>>
>> This truncate is weird to me. If anything, it should be an "and" instruction. I have a feeling that your back-end is telling instruction selection and the type legalizer that it's okay to replace the normal "and" with this truncate call, which leads to your troubles later on.
>>
>> I would suggest running this same code through another back-end to see if there's anything different going on. For instance, both ARM and X86 keep the "and" around...
>>
>> -bw
>
> Neither x86 nor ARM has 1-bit registers, so they cannot convert the and to trunc.
>
> The trunc should be legal when the machine has 1-bit registers.
>
True. But I think you're using the truncation incorrectly. Take a look at your back-end and see how it's handling the "and" instruction. You may want to keep the "and" around, then perform the "truncate" on that "and." Don't be too worried that this will generate a lot of code. The DAG combiner is fairly good at getting rid of extraneous things.
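To make the difference between the two patterns concrete, here is a minimal sketch in plain C. It is not taken from the TCE back-end; the function names are made up for illustration, and each one only models the semantics of the corresponding DAG pattern quoted above. It assumes that truncating an i32 to i1 keeps bit 0, which is how LLVM defines trunc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the IR from the report:
 *   icmp eq (and (xor crc, data), 1), 0
 * True when the lowest bits of the two values are equal. */
static bool low_bits_equal(uint32_t crc, uint32_t data) {
    return ((crc ^ data) & 1u) == 0;
}

/* Models the XORrrb pattern: (trunc (xor a, b)) to i1.
 * Truncation to one bit keeps only bit 0 of the xor,
 * so this is true when the lowest bits differ. */
static bool xor_trunc_i1(uint32_t a, uint32_t b) {
    return (bool)((a ^ b) & 1u);
}

/* Models the NErrb pattern: (setne a, b) over all 32 bits. */
static bool setne_i32(uint32_t a, uint32_t b) {
    return a != b;
}

/* Hypothetical self-check: 2 and 0 share the same low bit
 * but differ in bit 1, so the two patterns disagree. */
static void check_examples(void) {
    assert(low_bits_equal(2u, 0u));   /* low bits match           */
    assert(!xor_trunc_i1(2u, 0u));    /* bit 0 of the xor is 0    */
    assert(setne_i32(2u, 0u));        /* the full values differ   */
}
```

With inputs such as 2 and 0 the xor/trunc form and the all-bits setne give different answers, which matches the symptom described: the two only agree when the operands are already known to differ in nothing but bit 0.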
I hope this helps. :-)
-bw