[PATCH] D25347: [VirtRegRewriter] Eliminate COPYs before re-writing by renaming.
Quentin Colombet via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 15 17:19:16 PST 2016
In the meantime, I pushed the base version (r287070) given this is general goodness.
> On Nov 15, 2016, at 12:51 PM, Geoff Berry <gberry at codeaurora.org> wrote:
>
> I think I have a change building on Quentin's patch that catches this case with only a few extra changes. I should have a rough draft ready for feedback in a day or two.
>
> On 11/15/2016 3:46 PM, Matthias Braun wrote:
>> Ah right,
>>
>> that is indeed a different issue. I had that issue on my TODO list as well (but somewhere deep down). Without actually trying it, I hope we can do that by having a smarter spill creation callback which matches a spill(copy(zero)) pattern.
>>
>> - Matthias
>>
>>> On Nov 15, 2016, at 12:13 PM, Geoff Berry <gberry at codeaurora.org <mailto:gberry at codeaurora.org>> wrote:
>>>
>>> Hi Matthias,
>>>
>>> Those changes don't catch the particular cases I'm looking at. The problem is that the xzr/wzr copies don't become candidates for coalescing until the middle of register allocation when some live ranges have been split/spilled. Here's an example test case:
>>> declare i32 @bar()
>>> declare i32 @baz()
>>> ; Check that the spill of the zero value gets stored directly instead
>>> ; of being copied from wzr and then stored.
>>> define i32 @test_zr_spill_copyprop(i1 %c) {
>>> ; CHECK-LABEL: test_zr_spill_copyprop:
>>> entry:
>>> br i1 %c, label %if.else, label %if.then
>>>
>>> if.else:
>>> ; CHECK: bl bar
>>> ; CHECK-NEXT: str w0, [sp, #[[SLOT:[0-9]+]]]
>>> %call1 = tail call i32 @bar()
>>> br label %if.end
>>>
>>> if.then:
>>> ; CHECK: bl baz
>>> ; CHECK-NEXT: str wzr, [sp, #[[SLOT]]]
>>> %call2 = tail call i32 @baz()
>>> br label %if.end
>>>
>>> if.end:
>>> %x.0 = phi i32 [ 0, %if.then ], [ %call1, %if.else ]
>>> call void asm sideeffect "", "~{x0},~{x1},~{x2},~{x3},~{x4},~{x5},~{x6},~{x7},~{x8},~{x9},~{x10},~{x11},~{x12},~{x13},~{x14},~{x15},~{x16},~{x17},~{x18},~{x19},~{x20},~{x21},~{x22},~{x23},~{x24},~{x25},~{x26},~{x27},~{x28},~{fp},~{lr},~{sp}"() nounwind
>>> %x.1 = add i32 %x.0, 1
>>> ret i32 %x.1
>>> }
>>>
>>>
>>> On 11/15/2016 2:58 PM, Matthias Braun wrote:
>>>> I would xzr/wzr problems expect to be handled by the changes in https://reviews.llvm.org/D26106 <https://reviews.llvm.org/D26106> as soon as https://reviews.llvm.org/D26111 <https://reviews.llvm.org/D26111> is committed.
>>>>
>>>> - Matthias
>>>>
>>>>> On Nov 15, 2016, at 11:01 AM, Geoff Berry <gberry at codeaurora.org <mailto:gberry at codeaurora.org>> wrote:
>>>>>
>>>>> Hi Quentin,
>>>>>
>>>>> Thanks for taking a look. I would greatly prefer a one line change :) My original motivation for looking into this was a case involving XZR/WZR copys on aarch64, so I would like to catch those cases as well. I'll take a look into adding constant reg support to the register allocator unless you already know how you'd like to add that functionality. Once we have that I can compare the two approaches to make sure there aren't more cases being missed.
>>>>> On 11/14/2016 9:37 PM, Quentin Colombet wrote:
>>>>>> Hi Geoff,
>>>>>>
>>>>>> I had a closer look at the problem and in particular checked why the “hint fix-up” thing was not catching those cases.
>>>>>> The reason why it didn’t catch them is because we were doing the fix-up only for live-range that regalloc split not all missed hints.
>>>>>> Fixing that seems to solve your motivating cases (but the wzr case because this is a reserved reg and we don’t touch it, though I could improve the code to handle constant reg… let me know if you want me to have a look).
>>>>>>
>>>>>> Could you try the attached patch and tell me how it works for you?
>>>>>> Assuming it solves the cases you wanted to catch, I’ll push it and we can drop that patch.
>>>>>>
>>>>>> Cheers,
>>>>>> -Quentin
>>>>>>
>>>>>>
>>>>>>> On Nov 10, 2016, at 4:37 PM, Quentin Colombet <qcolombet at apple.com> <mailto:qcolombet at apple.com> wrote:
>>>>>>>
>>>>>>> qcolombet added inline comments.
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:158
>>>>>>> +// Map from physical regunit to list of def SlotIndexes.
>>>>>>> +using PhysDefsMap = DenseMap<unsigned, SmallVector<SlotIndex, 4>>;
>>>>>>> +
>>>>>>> ----------------
>>>>>>> For consistency with the rest of LLVM source code I would use typedef.
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:158
>>>>>>> +// Map from physical regunit to list of def SlotIndexes.
>>>>>>> +using PhysDefsMap = DenseMap<unsigned, SmallVector<SlotIndex, 4>>;
>>>>>>> +
>>>>>>> ----------------
>>>>>>> qcolombet wrote:
>>>>>>>> For consistency with the rest of LLVM source code I would use typedef.
>>>>>>> Having read the code now, I believe it would be more efficient to map PhysReg to LI.
>>>>>>> It would greatly simplify add/removeMappedPhysReg.
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:221
>>>>>>> AU.addRequired<VirtRegMap>();
>>>>>>> + AU.addRequired<MachineBlockFrequencyInfo>();
>>>>>>> + AU.addPreserved<MachineBlockFrequencyInfo>();
>>>>>>> ----------------
>>>>>>> We could gate that dependency on DisableRenameCopys.
>>>>>>> That being said, if we move all that code in its own pass, that's not going to be a problem :).
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:296
>>>>>>> + bool FoundDef = false;
>>>>>>> + for (auto DI = DefIndexes.begin(); DI != DefIndexes.end(); ++DI)
>>>>>>> + if (*DI == DefIndex) {
>>>>>>> ----------------
>>>>>>> Seems inefficient to use a vector here.
>>>>>>> At the very least it should be sorted.
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:336
>>>>>>> + const TargetRegisterInfo &TRI,
>>>>>>> + const VirtRegMap &VRM) {
>>>>>>> +
>>>>>>> ----------------
>>>>>>> Looks to me like this is reimplementing functionalities already available in the LiveRegMatrix class.
>>>>>>> Reuse that instead.
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:428
>>>>>>> + // Check the cost/benefit of renaming.
>>>>>>> + BlockFrequency DefFreq = MBFI->getBlockFreq(MI->getParent());
>>>>>>> + BlockFrequency Benefit(DefFreq);
>>>>>>> ----------------
>>>>>>> Wasn't the goal to propagate useless copies, i.e., that should always be beneficial, right?
>>>>>>>
>>>>>>> If you want to do more, then please add test cases covering that. Indeed, as far as I can tell, the changes in the test cases only cover the simple case of propagating useless copies.
>>>>>>>
>>>>>>>
>>>>>>> ================
>>>>>>> Comment at: lib/CodeGen/VirtRegMap.cpp:491
>>>>>>> + // reg so kill flags computation will be correct.
>>>>>>> + assert(LI.getNumValNums() == 1);
>>>>>>> + SlotIndex SrcDefPrevSlot = LI.getValNumInfo(0)->def.getPrevSlot();
>>>>>>> ----------------
>>>>>>> Add && "Msg" in the assert.
>>>>>>>
>>>>>>> Also explain in the comment before the assert what would it take to extend that to multiple VNs.
>>>>>>>
>>>>>>>
>>>>>>> https://reviews.llvm.org/D25347 <https://reviews.llvm.org/D25347>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Geoff Berry
>>>>> Employee of Qualcomm Datacenter Technologies, Inc.
>>>>> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>>>>
>>>
>>> --
>>> Geoff Berry
>>> Employee of Qualcomm Datacenter Technologies, Inc.
>>> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
>>
>
> --
> Geoff Berry
> Employee of Qualcomm Datacenter Technologies, Inc.
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161115/9b153eb4/attachment.html>
More information about the llvm-commits
mailing list