[llvm-dev] rotl: undocumented LLVM instruction?
Ryan Taylor via llvm-dev
llvm-dev at lists.llvm.org
Thu Nov 3 14:27:18 PDT 2016
Change the DAGCombine.
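
For reference, the combine in question (visible in the quoted -debug output
below) rewrites xor (shl 1, x), -1 as rotl -2, x: a mask with only bit x
cleared is the constant ~1 rotated left by x. What follows is a minimal
standalone sketch, written in Python rather than LLVM code, that checks this
identity for in-range shift amounts. The names rotl64, not_shl_one,
bclr_original and bclr_combined are hypothetical and exist only for
illustration; i64 is modeled as an unsigned 64-bit integer.

```python
MASK64 = (1 << 64) - 1  # assumption: model i64 as an unsigned 64-bit value


def rotl64(value, amount):
    """Rotate a 64-bit value left by `amount` bits (hypothetical helper, not an LLVM API)."""
    amount %= 64
    value &= MASK64
    return ((value << amount) | (value >> (64 - amount))) & MASK64


def not_shl_one(amount):
    """The original pattern: xor (shl 1, amount), -1. Expects 0 <= amount < 64."""
    return ((1 << amount) ^ MASK64) & MASK64


def bclr_original(a, b):
    """and a, (xor (shl 1, b - 1), -1), as in the bclr64 test. Expects 1 <= b <= 64."""
    return a & not_shl_one(b - 1)


def bclr_combined(a, b):
    """and a, (rotl -2, b - 1), as in the 3.9 DAG. Expects 1 <= b <= 64."""
    return a & rotl64(-2 & MASK64, b - 1)


if __name__ == "__main__":
    # Both forms should produce the same mask for every in-range shift amount.
    for n in range(64):
        assert not_shl_one(n) == rotl64(-2 & MASK64, n)
    print("xor(shl(1, n), -1) == rotl(-2, n) for n in 0..63")
```

Both forms compute the same mask, so the difference lies only in which nodes
instruction selection gets to see. That is consistent with a bclr pattern
written against and/xor/shl no longer matching once the rotl has been formed.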
On Nov 3, 2016 17:24, "Phil Tomson" <phil.a.tomson at gmail.com> wrote:
> I could try setting ISD::ROTL to Expand... however, we do have a rol op
> and we'd like the ISD::ROTL to map to it. If I set it to Expand it's not
> going to do that, right?
>
> I think in this case we're just getting the ISD::ROTL a bit too soon in
> the process and that's causing us to miss other optimization opportunities
> later on.
>
> Phil
>
> On Thu, Nov 3, 2016 at 2:20 PM, Ryan Taylor <ryta1203 at gmail.com> wrote:
>
>> Setting ISD::ROTL to Expand doesn't work? (via setOperationAction)
>>
>> You could also do a Custom hook if that's what you're looking for.
>>
>> On Thu, Nov 3, 2016 at 5:12 PM, Phil Tomson <phil.a.tomson at gmail.com>
>> wrote:
>>
>>> ... or perhaps to rephrase:
>>>
>>> In 3.9 it seems to be doing a smaller combine much sooner, whereas in
>>> 3.6 it deferred that till later in the instruction selection pattern
>>> matching - the latter was giving us better results because it seems to
>>> match a larger pattern than the former did in the earlier stage.
>>>
>>> Phil
>>>
>>> On Thu, Nov 3, 2016 at 2:07 PM, Phil Tomson <phil.a.tomson at gmail.com>
>>> wrote:
>>>
>>>> Is there any way to get it to delay this optimization where it goes
>>>> from this:
>>>>
>>>> Initial selection DAG: BB#0 'bclr64:entry'
>>>> SelectionDAG has 14 nodes:
>>>> t0: ch = EntryToken
>>>> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
>>>> t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
>>>> t6: i64 = sub t4, Constant:i64<1>
>>>> t7: i64 = shl Constant:i64<1>, t6
>>>> t9: i64 = xor t7, Constant:i64<-1>
>>>> t10: i64 = and t2, t9
>>>> t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
>>>> t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
>>>>
>>>>
>>>>
>>>> Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
>>>>
>>>> Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
>>>>
>>>> Combining: t11: i64 = Register %R1
>>>>
>>>> Combining: t10: i64 = and t2, t9
>>>>
>>>> Combining: t9: i64 = xor t7, Constant:i64<-1>
>>>> ... into: t15: i64 = rotl Constant:i64<-2>, t6
>>>>
>>>> ...to this:
>>>>
>>>> Optimized lowered selection DAG: BB#0 'bclr64:entry'
>>>> SelectionDAG has 13 nodes:
>>>> t0: ch = EntryToken
>>>> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
>>>> t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
>>>> t17: i64 = add t4, Constant:i64<-1>
>>>> t15: i64 = rotl Constant:i64<-2>, t17
>>>> t10: i64 = and t2, t15
>>>> t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
>>>> t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
>>>>
>>>>
>>>> That combine of the xor (feeding the and) into a rotl ends up giving us
>>>> suboptimal results compared with 3.6.
>>>>
>>>> For example, in 3.6 the generated code is simply:
>>>>
>>>> bclr64: # @bclr64
>>>> # BB#0: # %entry
>>>> addI r1, r1, -1, 64
>>>> bclr r1, r0, r1, 64
>>>> jabs r511
>>>>
>>>> Whereas with 3.9 the generated code is:
>>>>
>>>> bclr64: # @bclr64
>>>> # BB#0: # %entry
>>>> addI r1, r1, -1, 64
>>>> movimm r2, -2, 64
>>>> rol r1, r2, r1, 64
>>>> bitop1 r1, r0, r1, AND, 64
>>>> jabs r511
>>>>
>>>>
>>>> ... it seems to be negatively impacting some of our larger benchmarks
>>>> as well, which used to contain several bclr (bit clear) instructions but
>>>> now contain far fewer.
>>>>
>>>> Phil
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Nov 2, 2016 at 4:10 PM, Ryan Taylor <ryta1203 at gmail.com> wrote:
>>>>
>>>>> I believe some of the ISD opcodes (such as ISD::ROTL) were introduced
>>>>> to allow for DAG optimizations, under the assumption that some of the
>>>>> major architectures directly support these types of instructions.
>>>>>
>>>>> -Ryan
>>>>>
>>>>> On Wed, Nov 2, 2016 at 6:24 PM, Phil Tomson via llvm-dev <
>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>
>>>>>> We've recently moved our project from LLVM 3.6 to LLVM 3.9. I
>>>>>> noticed one of our code generation tests is breaking in 3.9.
>>>>>>
>>>>>> The test is:
>>>>>>
>>>>>> ; RUN: llc < %s -march=xstg | FileCheck %s
>>>>>>
>>>>>> define i64 @bclr64(i64 %a, i64 %b) nounwind readnone {
>>>>>> entry:
>>>>>> ; CHECK: bclr r1, r0, r1, 64
>>>>>> %sub = sub i64 %b, 1
>>>>>> %shl = shl i64 1, %sub
>>>>>> %xor = xor i64 %shl, -1
>>>>>> %and = and i64 %a, %xor
>>>>>> ret i64 %and
>>>>>> }
>>>>>>
>>>>>> I ran llc with -debug to get a better idea of what's going on and
>>>>>> found:
>>>>>>
>>>>>> Initial selection DAG: BB#0 'bclr64:entry'
>>>>>> SelectionDAG has 14 nodes:
>>>>>> t0: ch = EntryToken
>>>>>> t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
>>>>>> t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
>>>>>> t6: i64 = sub t4, Constant:i64<1>
>>>>>> t7: i64 = shl Constant:i64<1>, t6
>>>>>> t9: i64 = xor t7, Constant:i64<-1>
>>>>>> t10: i64 = and t2, t9
>>>>>> t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
>>>>>> t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
>>>>>>
>>>>>>
>>>>>>
>>>>>> Combining: t13: ch = XSTGISD::Ret t12, Register:i64 %R1, t12:1
>>>>>>
>>>>>> Combining: t12: ch,glue = CopyToReg t0, Register:i64 %R1, t10
>>>>>>
>>>>>> Combining: t11: i64 = Register %R1
>>>>>>
>>>>>> Combining: t10: i64 = and t2, t9
>>>>>>
>>>>>> Combining: t9: i64 = xor t7, Constant:i64<-1>
>>>>>> ... into: t15: i64 = rotl Constant:i64<-2>, t6
>>>>>>
>>>>>> Combining: t10: i64 = and t2, t15
>>>>>>
>>>>>> Combining: t15: i64 = rotl Constant:i64<-2>, t6
>>>>>>
>>>>>> Combining: t14: i64 = Constant<-2>
>>>>>>
>>>>>> Combining: t6: i64 = sub t4, Constant:i64<1>
>>>>>> ... into: t17: i64 = add t4, Constant:i64<-1>
>>>>>>
>>>>>> Combining: t15: i64 = rotl Constant:i64<-2>, t17
>>>>>>
>>>>>>
>>>>>>
>>>>>> These rotl instructions weren't showing up when I ran llc 3.6, and
>>>>>> they completely change the generated code at the end, which means the
>>>>>> test fails (and the code is less optimal than it was in 3.6).
>>>>>>
>>>>>> I've been looking in the LLVM language docs (3.9 version) and I don't
>>>>>> see any documentation on 'rotl'. What does it do? Why isn't it in the docs?
>>>>>>
>>>>>> Phil
>>>>>>