[llvm-dev] [MachineCopyPropagation] Issue with register forwarding/allocation/verifier in out-of-tree target
Geoff Berry via llvm-dev
llvm-dev at lists.llvm.org
Thu Sep 28 11:34:00 PDT 2017
On 9/27/2017 1:36 PM, Matthias Braun wrote:
>
>> On Sep 26, 2017, at 8:24 PM, Geoff Berry via llvm-dev
>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>> On 9/26/2017 6:47 PM, Matthias Braun wrote:
>>>> On Sep 26, 2017, at 3:33 PM, Geoff Berry <gberry at codeaurora.org
>>>> <mailto:gberry at codeaurora.org><mailto:gberry at codeaurora.org>> wrote:
>>>>
>>>>
>>>>
>>>> On 9/26/2017 6:11 PM, Matthias Braun wrote:
>>>>>> On Sep 26, 2017, at 2:39 PM, Geoff Berry via llvm-dev
>>>>>> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
>>>>>> <mailto:llvm-dev at lists.llvm.org>> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Mikael reported a machine verification failure in his out-of-tree
>>>>>> target with the MachineCopyPropagation changes to forward
>>>>>> registers (which is currently reverted). The verification in
>>>>>> question is:
>>>>>>
>>>>>> *** Bad machine code: Multiple connected components in live
>>>>>> interval ***
>>>>>> - function: utils_la_suite_matmul_ref
>>>>>> - interval: %vreg77
>>>>>> [192r,208B:0)[208B,260r:1)[312r,364r:2)[380r,464B:3) 0 at 192r
>>>>>> 1 at 208B-phi 2 at 312r 3 at 380r
>>>>>> 0: valnos 0 1 3
>>>>>> 1: valnos 2
>>>>>>
>>>>>> In this particular case, I believe that it is the greedy allocator
>>>>>> that is creating the multiple components in the %vreg77 live
>>>>>> interval. If you look at the attached debug dump file, just after
>>>>>> the greedy allocator runs, the segment of %vreg77 from the def at
>>>>>> 312B to the use at 380B seems to be separable from the other
>>>>>> segments. The reason the above verification failure is not hit at
>>>>>> that point seems to be related to the FIXME in the following
>>>>>> snippet from ConnectedVNInfoEqClasses::Classify():
>>>>> That dump seems to be well before greedy runs, isn't it?
>>>>
>>>> I'm not sure what you mean. The attached log contains
>>>> -print-before-all -print-after-all and -debug output starting with
>>>> the coalescer pass. The verification failure is right after the
>>>> first pass of MachineCopyPropagation which runs after the greedy
>>>> allocator.
>>> The copy propagation seemed to be working on vregs. This was extra
>>> confusing as D30751 seems to be currently reverted from trunk so I
>>> couldn't find references to that code.
>>
>> Sorry, I should have mentioned that as well. This verification error
>> is the last problem keeping me from re-enabling the copy forwarding
>> patch (I can send you my latest rebased version, but I don't think it
>> is relevant to this problem. See below).
>>
>>>>
>>>>> At a first glance the odd thing there is that the operand of
>>>>> fladd_a32_a32_a32 is rewritten from vreg77 to vreg76, but the
>>>>> vreg77 operand of the BUNDLE is not. Maybe you can find out why
>>>>> that is?
>>>>
>>>> Sorry, I should have pointed this out before: because the loop over
>>>> instructions in MachineCopyPropagation is only visiting the BUNDLE
>>>> instructions themselves (i.e. it does not visit the instructions
>>>> inside the BUNDLE) and we don't forward to implicit uses (which all
>>>> of the BUNDLE operands are marked as), we won't currently forward a
>>>> use to a bundled instruction. I believe handling bundles more
>>>> aggressively can be added as a follow-on enhancement unless we think
>>>> not doing has an inherent problem.
>>> I would expect you know the code in D30751 and can take a look into
>>> why only 1 of the instructions is rewritten?
>>> From all I've seen so far the verification code seems to behave as
>>> expected.
>>
>> I don't think the fact that BUNDLEd instructions aren't re-written has
>> anything to do with the verification problem. Let me try to simplify
>> what I think is going on. Just after greedy regalloc, we end up with
>> some code like this:
>>
>> ...
>> %vreg1<def> = ...
>> ...
>> ... = %vreg1
>> ...
>> %vreg1<def> = %vreg1
>> ...
>>
>> verifyLiveInterval() accepts this code as valid since it sees the
>> second def as part of the same live interval component because
>> ConnectedVNInfoEqClasses::Classify() sees this second def as a
>> "two-addr" redefinition, even though the def and source operands are
>> not tied.
>>
>> MachineCopyProp (pre-rewrite) runs next and turns this code into:
>> ...
>> %vreg1<def> = ...
>> ...
>> ... = %vreg1
>> ...
>> %vreg1<def> = *%vreg2*
>> ...
>>
>> verifyLiveInterval() now rejects this code since it sees these two def
>> live ranges as being separate components. My claim is that these two
>> code snippets are equivalent as far as the number of live range
>> components is concerned. Therefore verifyLiveInterval() should have
>> rejected the code just after regalloc greedy (as the FIXME in
>> ConnectedVNInfoEqClasses::Classify hints at), which means the source
>> of this particular problem is in regalloc greedy or before (and not in
>> MachineCopyProp).
> Ah I see. And I would agree with your interpretation.
>
> - Only tied use/def operands or a subregister def without the undef flag
> should result in connected liveranges.
> - Be careful and measure the compile time impact when switching the
> implementation in ConnectedVNInfoEqClasses; unfortunately there is
> currently no way to detect this situation just by looking at the VNInfo.
> - I guess the RAGreedy result is indeed wrong then. If I'm reading this
> correctly, it basically looks like this (when simplified to the problem
> at hand):
>
> BB2:
> vreg76 = COPY vreg77<kill>
>
> vreg77 = COPY vreg76
> vreg77 = someop vreg77<kill>
> CondJmp BB2
>
> the `vreg77=COPY` should have used a different vreg. My guess would be
> that the liverange splitting code makes similar assumptions as the
> ConnectedVNInfoEqClasses.
I think I'm going to try to work around this issue for now (with a big
FIXME comment) by not copy forwarding in these cases so I can get my
original patch re-enabled. Then we can look into fixing the above
issue, though I don't think I'll be able to look into it for some time.
> - Matthias
--
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
More information about the llvm-dev
mailing list