[llvm-dev] [MachineCopyPropagation] Issue with register forwarding/allocation/verifier in out-of-tree target

Matthias Braun via llvm-dev llvm-dev at lists.llvm.org
Wed Sep 27 10:36:40 PDT 2017


> On Sep 26, 2017, at 8:24 PM, Geoff Berry via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> On 9/26/2017 6:47 PM, Matthias Braun wrote:
>>> On Sep 26, 2017, at 3:33 PM, Geoff Berry <gberry at codeaurora.org <mailto:gberry at codeaurora.org> <mailto:gberry at codeaurora.org <mailto:gberry at codeaurora.org>>> wrote:
>>> 
>>> 
>>> 
>>> On 9/26/2017 6:11 PM, Matthias Braun wrote:
>>>>> On Sep 26, 2017, at 2:39 PM, Geoff Berry via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> Mikael reported a machine verification failure in his out-of-tree target with the MachineCopyPropagation changes to forward registers (which is currently reverted).  The verification in question is:
>>>>> 
>>>>> *** Bad machine code: Multiple connected components in live interval ***
>>>>> - function:    utils_la_suite_matmul_ref
>>>>> - interval:    %vreg77 [192r,208B:0)[208B,260r:1)[312r,364r:2)[380r,464B:3)  0 at 192r 1 at 208B-phi 2 at 312r 3 at 380r
>>>>> 0: valnos 0 1 3
>>>>> 1: valnos 2
>>>>> 
>>>>> In this particular case, I believe that it is the greedy allocator that is creating the multiple components in the %vreg77 live interval.  If you look at the attached debug dump file, just after the greedy allocator runs, the segment of %vreg77 from the def at 312B to the use at 380B seems to be separable from the other segments.  The reason the above verification failure is not hit at that point seems to be related to the FIXME in the following snippet from ConnectedVNInfoEqClasses::Classify():
>>>> That dump seems to be well before greedy runs, isn't it?
>>> 
>>> I'm not sure what you mean.  The attached log contains -print-before-all -print-after-all and -debug output starting with the coalescer pass. The verification failure is right after the first pass of MachineCopyPropagation which runs after the greedy allocator.
>> The copy propagation seemed to be working on vregs. This was extra confusing as D30751 seems to be currently reverted from trunk so I couldn't find references to that code.
> 
> Sorry, I should have mentioned that as well.  This verification error is the last problem keeping me from re-enabling the copy forwarding patch (I can send you my latest rebased version, but I don't think it is relevant to this problem.  See below).
> 
>>> 
>>>> At a first glance the odd thing there is that the operand of fladd_a32_a32_a32 is rewritten from vreg77 to vreg76, but the vreg77 operand of the BUNDLE is not. Maybe you can find out why that is?
>>> 
>>> Sorry, I should have pointed this out before: because the loop over instructions in MachineCopyPropagation is only visiting the BUNDLE instructions themselves (i.e. it does not visit the instructions inside the BUNDLE) and we don't forward to implicit uses (which all of the BUNDLE operands are marked as), we won't currently forward a use to a bundled instruction.  I believe handling bundles more aggressively can be added as a follow-on enhancement unless we think not doing has an inherent problem.
>> I would expect you know the code in D30751 and can take a look into why only 1 of the instructions is rewritten?
>> From all I've seen so far the verification code seems to behave as expected.
> 
> I don't think the fact that BUNDLEd instructions aren't re-written has anything to do with the verification problem.  Let me try to simplify what I think is going on.  Just after greedy regalloc, we end up with some code like this:
> 
> ...
> %vreg1<def> = ...
> ...
> ... = %vreg1
> ...
> %vreg1<def> = %vreg1
> ...
> 
> verifyLiveInterval() accepts this code as valid since it sees the second def as part of the same live interval component because ConnectedVNInfoEqClasses::Classify() sees this second def as a "two-addr" redefinition, even though the def and source operands are not tied.
> 
> MachineCopyProp (pre-rewrite) runs next and turns this code into:
> ...
> %vreg1<def> = ...
> ...
> ... = %vreg1
> ...
> %vreg1<def> = *%vreg2*
> ...
> 
> verifyLiveInterval() now rejects this code since it sees these two def live ranges as being separate components.  My claim is that these two code snippets are equivalent as far as the number of live range components is concerned.  Therefore verifyLiveInterval() should have rejected the code just after regalloc greedy (as the FIXME in ConnectedVNInfoEqClasses::Classify hints at), which means the source of this particular problem is in regalloc greedy or before (and not in MachineCopyProp).
Ah I see. And I would agree with your interpretation.

- Only tied use/def operands or a subregister def without the undef flag should result in connected liveranges.
- Be careful and measure the compile time impact when switching the implementation in ConnectedVNInfoEqClasses; unfortunately there is currently no way to detect this situation just by looking at the VNInfo.
- I guess the RAGreedy result is indeed wrong then. If I'm reading this correctly, it basically looks like this (when simplified to the problem at hand):

BB2:
     vreg76 = COPY vreg77<kill>

     vreg77 = COPY vreg76
     vreg77 = someop vreg77<kill>
     CondJmp BB2

the `vreg77=COPY` should have used a different vreg. My guess would be that the liverange splitting code makes similar assumptions as the ConnectedVNInfoEqClasses.

- Matthias
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170927/13b86639/attachment.html>


More information about the llvm-dev mailing list