[llvm-dev] [MachineCopyPropagation] Issue with register forwarding/allocation/verifier in out-of-tree target

Geoff Berry via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 28 11:34:00 PDT 2017



On 9/27/2017 1:36 PM, Matthias Braun wrote:
> 
>> On Sep 26, 2017, at 8:24 PM, Geoff Berry via llvm-dev 
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>> On 9/26/2017 6:47 PM, Matthias Braun wrote:
>>>> On Sep 26, 2017, at 3:33 PM, Geoff Berry <gberry at codeaurora.org 
>>>> <mailto:gberry at codeaurora.org><mailto:gberry at codeaurora.org>> wrote:
>>>>
>>>>
>>>>
>>>> On 9/26/2017 6:11 PM, Matthias Braun wrote:
>>>>>> On Sep 26, 2017, at 2:39 PM, Geoff Berry via llvm-dev 
>>>>>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> 
>>>>>> <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Mikael reported a machine verification failure in his out-of-tree 
>>>>>> target with the MachineCopyPropagation changes to forward 
>>>>>> registers (which is currently reverted).  The verification in 
>>>>>> question is:
>>>>>>
>>>>>> *** Bad machine code: Multiple connected components in live 
>>>>>> interval ***
>>>>>> - function:    utils_la_suite_matmul_ref
>>>>>> - interval:    %vreg77 
>>>>>> [192r,208B:0)[208B,260r:1)[312r,364r:2)[380r,464B:3)  0 at 192r 
>>>>>> 1 at 208B-phi 2 at 312r 3 at 380r
>>>>>> 0: valnos 0 1 3
>>>>>> 1: valnos 2
>>>>>>
>>>>>> In this particular case, I believe that it is the greedy allocator 
>>>>>> that is creating the multiple components in the %vreg77 live 
>>>>>> interval.  If you look at the attached debug dump file, just after 
>>>>>> the greedy allocator runs, the segment of %vreg77 from the def at 
>>>>>> 312B to the use at 380B seems to be separable from the other 
>>>>>> segments.  The reason the above verification failure is not hit at 
>>>>>> that point seems to be related to the FIXME in the following 
>>>>>> snippet from ConnectedVNInfoEqClasses::Classify():
>>>>> That dump seems to be well before greedy runs, isn't it?
>>>>
>>>> I'm not sure what you mean.  The attached log contains 
>>>> -print-before-all -print-after-all and -debug output starting with 
>>>> the coalescer pass. The verification failure is right after the 
>>>> first pass of MachineCopyPropagation which runs after the greedy 
>>>> allocator.
>>> The copy propagation seemed to be working on vregs. This was extra 
>>> confusing as D30751 seems to be currently reverted from trunk so I 
>>> couldn't find references to that code.
>>
>> Sorry, I should have mentioned that as well.  This verification error 
>> is the last problem keeping me from re-enabling the copy forwarding 
>> patch (I can send you my latest rebased version, but I don't think it 
>> is relevant to this problem.  See below).
>>
>>>>
>>>>> At a first glance the odd thing there is that the operand of 
>>>>> fladd_a32_a32_a32 is rewritten from vreg77 to vreg76, but the 
>>>>> vreg77 operand of the BUNDLE is not. Maybe you can find out why 
>>>>> that is?
>>>>
>>>> Sorry, I should have pointed this out before: because the loop over 
>>>> instructions in MachineCopyPropagation is only visiting the BUNDLE 
>>>> instructions themselves (i.e. it does not visit the instructions 
>>>> inside the BUNDLE) and we don't forward to implicit uses (which all 
>>>> of the BUNDLE operands are marked as), we won't currently forward a 
>>>> use to a bundled instruction.  I believe handling bundles more 
>>>> aggressively can be added as a follow-on enhancement unless we think 
>>>> not doing has an inherent problem.
>>> I would expect you know the code in D30751 and can take a look into 
>>> why only 1 of the instructions is rewritten?
>>> From all I've seen so far the verification code seems to behave as 
>>> expected.
>>
>> I don't think the fact that BUNDLEd instructions aren't re-written has 
>> anything to do with the verification problem.  Let me try to simplify 
>> what I think is going on.  Just after greedy regalloc, we end up with 
>> some code like this:
>>
>> ...
>> %vreg1<def> = ...
>> ...
>> ... = %vreg1
>> ...
>> %vreg1<def> = %vreg1
>> ...
>>
>> verifyLiveInterval() accepts this code as valid since it sees the 
>> second def as part of the same live interval component because 
>> ConnectedVNInfoEqClasses::Classify() sees this second def as a 
>> "two-addr" redefinition, even though the def and source operands are 
>> not tied.
>>
>> MachineCopyProp (pre-rewrite) runs next and turns this code into:
>> ...
>> %vreg1<def> = ...
>> ...
>> ... = %vreg1
>> ...
>> %vreg1<def> = *%vreg2*
>> ...
>>
>> verifyLiveInterval() now rejects this code since it sees these two def 
>> live ranges as being separate components.  My claim is that these two 
>> code snippets are equivalent as far as the number of live range 
>> components is concerned.  Therefore verifyLiveInterval() should have 
>> rejected the code just after regalloc greedy (as the FIXME in 
>> ConnectedVNInfoEqClasses::Classify hints at), which means the source 
>> of this particular problem is in regalloc greedy or before (and not in 
>> MachineCopyProp).
> Ah I see. And I would agree with your interpretation.
> 
> - Only tied use/def operands or a subregister def without the undef flag 
> should result in connected liveranges.
> - Be careful and measure the compile time impact when switching the 
> implementation in ConnectedVNInfoEqClasses; unfortunately there is 
> currently no way to detect this situation just by looking at the VNInfo.
> - I guess the RAGreedy result is indeed wrong then. If I'm reading this 
> correctly, it basically looks like this (when simplified to the problem 
> at hand):
> 
> BB2:
>       vreg76 = COPY vreg77<kill>
> 
>       vreg77 = COPY vreg76
>       vreg77 = someop vreg77<kill>
>       CondJmp BB2
> 
> the `vreg77=COPY` should have used a different vreg. My guess would be 
> that the liverange splitting code makes similar assumptions as the 
> ConnectedVNInfoEqClasses.

I think I'm going to try to work around this issue for now (with a big 
FIXME comment) by not copy forwarding in these cases so I can get my 
original patch re-enabled.  Then we can look into fixing the above 
issue, though I don't think I'll be able to look into it for some time.

> - Matthias

-- 
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
  Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code 
Aurora Forum, a Linux Foundation Collaborative Project.


More information about the llvm-dev mailing list