[PATCH] D25347: [VirtRegRewriter] Eliminate COPYs before re-writing by renaming.

Geoff Berry via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 18 12:46:34 PST 2016


I've put up https://reviews.llvm.org/D26867 for review which mostly 
solves the WZR/XZR spilling issue.  Please take a look when you get a 
chance.

Thanks,

-Geoff


On 11/15/2016 8:19 PM, Quentin Colombet wrote:
> In the meantime, I pushed the base version (r287070) given this is 
> general goodness.
>
>> On Nov 15, 2016, at 12:51 PM, Geoff Berry <gberry at codeaurora.org 
>> <mailto:gberry at codeaurora.org>> wrote:
>>
>> I think I have a change building on Quentin's patch that catches this 
>> case with only a few extra changes.  I should have a rough draft 
>> ready for feedback in a day or two.
>>
>>
>> On 11/15/2016 3:46 PM, Matthias Braun wrote:
>>> Ah right,
>>>
>>> that is indeed a different issue. I had that issue on my TODO list 
>>> as well (but somewhere deep down). Without actually trying it, I 
>>> hope we can do that by having a smarter spill creation callback 
>>> which matches a spill(copy(zero)) pattern.
>>>
>>> - Matthias
>>>
>>>> On Nov 15, 2016, at 12:13 PM, Geoff Berry <gberry at codeaurora.org 
>>>> <mailto:gberry at codeaurora.org>> wrote:
>>>>
>>>> Hi Matthias,
>>>>
>>>> Those changes don't catch the particular cases I'm looking at.  The 
>>>> problem is that the xzr/wzr copies don't become candidates for 
>>>> coalescing until the middle of register allocation when some live 
>>>> ranges have been split/spilled.  Here's an example test case:
>>>>
>>>> declare i32 @bar()
>>>> declare i32 @baz()
>>>> ; Check that the spill of the zero value gets stored directly instead
>>>> ; of being copied from wzr and then stored.
>>>> define i32 @test_zr_spill_copyprop(i1 %c) {
>>>> ; CHECK-LABEL: test_zr_spill_copyprop:
>>>> entry:
>>>>   br i1 %c, label %if.else, label %if.then
>>>>
>>>> if.else:
>>>> ; CHECK: bl bar
>>>> ; CHECK-NEXT: str w0, [sp, #[[SLOT:[0-9]+]]]
>>>>   %call1 = tail call i32 @bar()
>>>>   br label %if.end
>>>>
>>>> if.then:
>>>> ; CHECK: bl baz
>>>> ; CHECK-NEXT: str wzr, [sp, #[[SLOT]]]
>>>>   %call2 = tail call i32 @baz()
>>>>   br label %if.end
>>>>
>>>> if.end:
>>>>   %x.0 = phi i32 [ 0, %if.then ], [ %call1, %if.else ]
>>>>   call void asm sideeffect "", 
>>>> "~{x0},~{x1},~{x2},~{x3},~{x4},~{x5},~{x6},~{x7},~{x8},~{x9},~{x10},~{x11},~{x12},~{x13},~{x14},~{x15},~{x16},~{x17},~{x18},~{x19},~{x20},~{x21},~{x22},~{x23},~{x24},~{x25},~{x26},~{x27},~{x28},~{fp},~{lr},~{sp}"() 
>>>> nounwind
>>>>   %x.1 = add i32 %x.0, 1
>>>>   ret i32 %x.1
>>>> }
>>>>
>>>>
>>>> On 11/15/2016 2:58 PM, Matthias Braun wrote:
>>>>> I would xzr/wzr problems expect to be handled by the changes in 
>>>>> https://reviews.llvm.org/D26106 as soon as 
>>>>> https://reviews.llvm.org/D26111 is committed.
>>>>>
>>>>> - Matthias
>>>>>
>>>>>> On Nov 15, 2016, at 11:01 AM, Geoff Berry <gberry at codeaurora.org 
>>>>>> <mailto:gberry at codeaurora.org>> wrote:
>>>>>>
>>>>>> Hi Quentin,
>>>>>>
>>>>>> Thanks for taking a look.  I would greatly prefer a one line 
>>>>>> change :)  My original motivation for looking into this was a 
>>>>>> case involving XZR/WZR copys on aarch64, so I would like to catch 
>>>>>> those cases as well.  I'll take a look into adding constant reg 
>>>>>> support to the register allocator unless you already know how 
>>>>>> you'd like to add that functionality.  Once we have that I can 
>>>>>> compare the two approaches to make sure there aren't more cases 
>>>>>> being missed.
>>>>>>
>>>>>> On 11/14/2016 9:37 PM, Quentin Colombet wrote:
>>>>>>> Hi Geoff,
>>>>>>>
>>>>>>> I had a closer look at the problem and in particular checked why the “hint fix-up” thing was not catching those cases.
>>>>>>> The reason why it didn’t catch them is because we were doing the fix-up only for live-range that regalloc split not all missed hints.
>>>>>>> Fixing that seems to solve your motivating cases (but the wzr case because this is a reserved reg and we don’t touch it, though I could improve the code to handle constant reg… let me know if you want me to have a look).
>>>>>>>
>>>>>>> Could you try the attached patch and tell me how it works for you?
>>>>>>> Assuming it solves the cases you wanted to catch, I’ll push it and we can drop that patch.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> -Quentin
>>>>>>>
>>>>>>>
>>>>>>>> On Nov 10, 2016, at 4:37 PM, Quentin Colombet<qcolombet at apple.com>  wrote:
>>>>>>>>
>>>>>>>> qcolombet added inline comments.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:158
>>>>>>>> +// Map from physical regunit to list of def SlotIndexes.
>>>>>>>> +using PhysDefsMap = DenseMap<unsigned, SmallVector<SlotIndex, 4>>;
>>>>>>>> +
>>>>>>>> ----------------
>>>>>>>> For consistency with the rest of LLVM source code I would use typedef.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:158
>>>>>>>> +// Map from physical regunit to list of def SlotIndexes.
>>>>>>>> +using PhysDefsMap = DenseMap<unsigned, SmallVector<SlotIndex, 4>>;
>>>>>>>> +
>>>>>>>> ----------------
>>>>>>>> qcolombet wrote:
>>>>>>>>> For consistency with the rest of LLVM source code I would use typedef.
>>>>>>>> Having read the code now, I believe it would be more efficient to map PhysReg to LI.
>>>>>>>> It would greatly simplify add/removeMappedPhysReg.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:221
>>>>>>>>    AU.addRequired<VirtRegMap>();
>>>>>>>> +  AU.addRequired<MachineBlockFrequencyInfo>();
>>>>>>>> +  AU.addPreserved<MachineBlockFrequencyInfo>();
>>>>>>>> ----------------
>>>>>>>> We could gate that dependency on DisableRenameCopys.
>>>>>>>> That being said, if we move all that code in its own pass, that's not going to be a problem :).
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:296
>>>>>>>> +      bool FoundDef = false;
>>>>>>>> +      for (auto DI = DefIndexes.begin(); DI != DefIndexes.end(); ++DI)
>>>>>>>> +        if (*DI == DefIndex) {
>>>>>>>> ----------------
>>>>>>>> Seems inefficient to use a vector here.
>>>>>>>> At the very least it should be sorted.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:336
>>>>>>>> +                                    const TargetRegisterInfo &TRI,
>>>>>>>> +                                    const VirtRegMap &VRM) {
>>>>>>>> +
>>>>>>>> ----------------
>>>>>>>> Looks to me like this is reimplementing functionalities already available in the LiveRegMatrix class.
>>>>>>>> Reuse that instead.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:428
>>>>>>>> +    // Check the cost/benefit of renaming.
>>>>>>>> +    BlockFrequency DefFreq = MBFI->getBlockFreq(MI->getParent());
>>>>>>>> +    BlockFrequency Benefit(DefFreq);
>>>>>>>> ----------------
>>>>>>>> Wasn't the goal to propagate useless copies, i.e., that should always be beneficial, right?
>>>>>>>>
>>>>>>>> If you want to do more, then please add test cases covering that. Indeed, as far as I can tell, the changes in the test cases only cover the simple case of propagating useless copies.
>>>>>>>>
>>>>>>>>
>>>>>>>> ================
>>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:491
>>>>>>>> +    // reg so kill flags computation will be correct.
>>>>>>>> +    assert(LI.getNumValNums() == 1);
>>>>>>>> +    SlotIndex SrcDefPrevSlot = LI.getValNumInfo(0)->def.getPrevSlot();
>>>>>>>> ----------------
>>>>>>>> Add && "Msg" in the assert.
>>>>>>>>
>>>>>>>> Also explain in the comment before the assert what would it take to extend that to multiple VNs.
>>>>>>>>
>>>>>>>>
>>>>>>>> https://reviews.llvm.org/D25347
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Geoff Berry
>>>>>> Employee of Qualcomm Datacenter Technologies, Inc.
>>>>>>   Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>>>>>
>>>>
>>>> -- 
>>>> Geoff Berry
>>>> Employee of Qualcomm Datacenter Technologies, Inc.
>>>>   Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>>>
>>
>> -- 
>> Geoff Berry
>> Employee of Qualcomm Datacenter Technologies, Inc.
>>   Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>

-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
  Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161118/c18aa4be/attachment.html>


More information about the llvm-commits mailing list