[llvm-dev] [MachineCopyPropagation] Issue with register forwarding/allocation/verifier in out-of-tree target
Geoff Berry via llvm-dev
llvm-dev at lists.llvm.org
Tue Sep 26 20:24:41 PDT 2017
On 9/26/2017 6:47 PM, Matthias Braun wrote:
>
>> On Sep 26, 2017, at 3:33 PM, Geoff Berry <gberry at codeaurora.org>
>> wrote:
>>
>>
>>
>> On 9/26/2017 6:11 PM, Matthias Braun wrote:
>>>> On Sep 26, 2017, at 2:39 PM, Geoff Berry via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Mikael reported a machine verification failure in his out-of-tree
>>>> target with the MachineCopyPropagation changes to forward registers
>>>> (which is currently reverted). The verification in question is:
>>>>
>>>> *** Bad machine code: Multiple connected components in live interval ***
>>>> - function: utils_la_suite_matmul_ref
>>>> - interval: %vreg77
>>>> [192r,208B:0)[208B,260r:1)[312r,364r:2)[380r,464B:3) 0@192r
>>>> 1@208B-phi 2@312r 3@380r
>>>> 0: valnos 0 1 3
>>>> 1: valnos 2
>>>>
>>>> In this particular case, I believe that it is the greedy allocator
>>>> that is creating the multiple components in the %vreg77 live
>>>> interval. If you look at the attached debug dump file, just after
>>>> the greedy allocator runs, the segment of %vreg77 from the def at
>>>> 312B to the use at 380B seems to be separable from the other
>>>> segments. The reason the above verification failure is not hit at
>>>> that point seems to be related to the FIXME in the following snippet
>>>> from ConnectedVNInfoEqClasses::Classify():
>>> That dump seems to be well before greedy runs, isn't it?
>>
>> I'm not sure what you mean. The attached log contains
>> -print-before-all -print-after-all and -debug output starting with the
>> coalescer pass. The verification failure is right after the first pass
>> of MachineCopyPropagation which runs after the greedy allocator.
> The copy propagation seemed to be working on vregs. This was extra
> confusing as D30751 seems to be currently reverted from trunk so I
> couldn't find references to that code.
Sorry, I should have mentioned that as well. This verification error is
the last problem keeping me from re-enabling the copy forwarding patch
(I can send you my latest rebased version, but I don't think it is
relevant to this problem. See below).
>>
>>> At a first glance the odd thing there is that the operand of
>>> fladd_a32_a32_a32 is rewritten from vreg77 to vreg76, but the vreg77
>>> operand of the BUNDLE is not. Maybe you can find out why that is?
>>
>> Sorry, I should have pointed this out before: because the loop over
>> instructions in MachineCopyPropagation is only visiting the BUNDLE
>> instructions themselves (i.e. it does not visit the instructions
>> inside the BUNDLE) and we don't forward to implicit uses (which all of
>> the BUNDLE operands are marked as), we won't currently forward a use
>> to a bundled instruction. I believe handling bundles more
>> aggressively can be added as a follow-on enhancement unless we think
>> not doing so has an inherent problem.
> I would expect you know the code in D30751 and can take a look into why
> only 1 of the instructions is rewritten?
> From all I've seen so far the verification code seems to behave as
> expected.
I don't think the fact that BUNDLEd instructions aren't re-written has
anything to do with the verification problem. Let me try to simplify
what I think is going on. Just after greedy regalloc, we end up with
some code like this:
...
%vreg1<def> = ...
...
... = %vreg1
...
%vreg1<def> = %vreg1
...
verifyLiveInterval() accepts this code as valid: it treats the second
def as part of the same live interval component because
ConnectedVNInfoEqClasses::Classify() classifies that def as a
"two-addr" redefinition, even though the def and source operands are not
tied.
MachineCopyProp (pre-rewrite) runs next and turns this code into:
...
%vreg1<def> = ...
...
... = %vreg1
...
%vreg1<def> = *%vreg2*
...
verifyLiveInterval() now rejects this code since it sees these two def
live ranges as being separate components. My claim is that these two
code snippets are equivalent as far as the number of live range
components is concerned. Therefore verifyLiveInterval() should have
rejected the code just after regalloc greedy (as the FIXME in
ConnectedVNInfoEqClasses::Classify hints at), which means the source of
this particular problem is in regalloc greedy or before (and not in
MachineCopyProp).
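
To make the claim concrete, here is a toy C++ sketch of the behavior
described above. It is not LLVM code: Inst, EqClasses and
countComponents are hypothetical names, and the model assumes a single
virtual register in one straight-line block with no live-ins or
live-outs. A def is joined to the previous value whenever that value is
still live at the def, which in this model just means the defining
instruction also reads the register (no tied-operand check).

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical toy model: one virtual register, one straight-line block.
struct Inst {
  bool DefsReg; // instruction writes the register
  bool UsesReg; // instruction reads the register
};

// Minimal union-find over value numbers (one value number per def).
struct EqClasses {
  std::vector<unsigned> Parent;
  explicit EqClasses(unsigned N) : Parent(N) {
    std::iota(Parent.begin(), Parent.end(), 0u);
  }
  unsigned find(unsigned X) const {
    while (Parent[X] != X)
      X = Parent[X];
    return X;
  }
  void join(unsigned A, unsigned B) { Parent[find(A)] = find(B); }
};

// Count connected components of the register's values. A new value is
// joined with the previous one if the previous value is live at the def.
// In this simplified model that is the case exactly when the defining
// instruction also reads the register, tied operands or not.
unsigned countComponents(const std::vector<Inst> &Block) {
  unsigned NumValues = 0;
  for (const Inst &I : Block)
    if (I.DefsReg)
      ++NumValues;

  EqClasses EC(NumValues);
  unsigned NextValue = 0;
  unsigned CurValue = 0;
  bool HaveValue = false;
  for (const Inst &I : Block) {
    if (!I.DefsReg)
      continue;
    unsigned NewValue = NextValue++;
    if (HaveValue && I.UsesReg)
      EC.join(NewValue, CurValue); // treated as a "two-addr" redefinition
    CurValue = NewValue;
    HaveValue = true;
  }

  unsigned Count = 0;
  for (unsigned V = 0; V < NumValues; ++V)
    if (EC.find(V) == V)
      ++Count;
  return Count;
}
```

Modeling the first snippet as {def, use, def+use}, countComponents
returns 1. Modeling the second as {def, use, def only}, it returns 2.
The only change is the source operand of the final copy, which is why I
think the two snippets should be classified the same way.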
> - Matthias
>
--
Geoff Berry
Employee of Qualcomm Datacenter Technologies, Inc.
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.