[PATCH] D41835: [MachineCopyPropagation] Extend pass to do COPY source forwarding

Nemanja Ivanovic via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 2 13:18:30 PST 2018


Forgive me if I am misinterpreting what the suggestion is, but while I
agree that a pass with the potential to break things should be opt-in
rather than opt-out, I'm curious about a couple of things:

1. The granularity of the opt-in/opt-out. Seems that as Escha pointed out,
opting out for every instruction that could receive a result of a copy can
be prohibitive. But equivalently, opting in for every instruction can be
quite daunting as well. Would it make more sense for the pass itself to be
opt-in only and then instructions themselves to be opt-out? That way
targets that can just use the pass as-is can just go ahead and opt-in (and
fix any instructions that need to opt out) and targets for which it would
be prohibitive to do this per-instruction opt-out would simply just not
opt-in to the pass.

2. Seems like the issue that Escha describes can be handled in the pass by
ensuring that the source is forwarded only if it is a register that is in
the register class for the user of the copy. Isn't everything required for
this available from MCInstrDesc and TargetRegisterClass? Or maybe I'm
misinterpreting what the problem actually is...

On Fri, Feb 2, 2018 at 4:10 PM, <gberry at codeaurora.org> wrote:

> In the case of AMDGPU there were just one or two opcode base classes in
> the td files that needed to have the hasExtraSrcRegAllocReq attribute
> added, so it wasn’t a big deal.
>
> I don’t think flipping the default value of this opcode property makes
> sense, as that would involve changing all of the opcodes (or their base
> classes) in all of the in-tree targets (other than AMDGPU).
>
> How about a target property that says “all opcodes are
> hasExtraSrcRegAllocReq”?
>
>
>
> --
>
> Geoff Berry
>
> Employee of Qualcomm Datacenter Technologies, Inc.
>
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
> Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code
> Aurora Forum, a Linux Foundation Collaborative Project.
>
>
>
> *From:* qcolombet at apple.com [mailto:qcolombet at apple.com]
> *Sent:* Friday, February 2, 2018 3:55 PM
> *To:* Geoff Berry <gberry at codeaurora.org>
> *Cc:* reviews+D41835+public+9c1dec7fb6e75ce0 at reviews.llvm.org; Geoff
> Berry via Phabricator <reviews at reviews.llvm.org>; javed.absar at arm.com;
> Matthias Braun <matze at braunis.de>; Jonas Paulsson <
> paulsson at linux.vnet.ibm.com>; tstellar at redhat.com; Matt Arsenault <
> Matthew.Arsenault at amd.com>; junbuml at codeaurora.org;
> marina.yatsina at intel.com; wei.ding2 at amd.com; kannan.narayanan at amd.com;
> nhaehnle at gmail.com; Nemanja Ivanovic <nemanja.i.ibm at gmail.com>;
> llvm-commits <llvm-commits at lists.llvm.org>; tpr.llvm at botech.co.uk;
> escha at apple.com
>
> *Subject:* Re: [PATCH] D41835: [MachineCopyPropagation] Extend pass to do
> COPY source forwarding
>
>
>
>
>
>
>
> On Feb 2, 2018, at 12:21 PM, escha at apple.com wrote:
>
>
>
> I mean, in that case we are likely to have to mark every single opcode
> (all 12,000 or so) with this requirement. At that point we might as well
> just opt out of the pass, I think? At least, it feels like a gross hack
> that papers over the fact that LLVM has changed how register classes work
> such that our entire approach is no longer valid.
>
>
>
> Also, it seems very weird to make this constraint-violating behavior
> *opt-out*. Maybe it should be opt-in, i.e. put
> doesNotHaveExtraSrcRegAllocReq on all instructions it’s okay for?
>
>
>
> That sounds like a better approach to me.
>
> After talking with escha, I agree that TableGen is not necessarily
> expressive enough to model all the constraints that need to be met and I
> would err on the safe side of being opt-in instead of opt-out.
>
>
>
>
>
> —escha
>
>
>
> On Feb 2, 2018, at 12:17 PM, gberry at codeaurora.org wrote:
>
>
>
> Hi escha,
>
>
>
> This sounds very similar to the issues this change caused on AMDGPU.  The
> way we worked around them for those effected targets was to mark the
> relevant opcodes with hasExtraSrcRegAllocReq, which will prevent the
> MachineOperands from being marked as isRenamable, which will in turn
> prevent MachineCopyPropagation from changing them.  Is this a reasonable
> fix for your targets?
>
>
>
> --
>
> Geoff Berry
>
> Employee of Qualcomm Datacenter Technologies, Inc.
>
> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
> Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code
> Aurora Forum, a Linux Foundation Collaborative Project.
>
>
>
> *From:* fglaser at apple.com [mailto:fglaser at apple.com <fglaser at apple.com>] *On
> Behalf Of *escha at apple.com
> *Sent:* Friday, February 2, 2018 2:56 PM
> *To:* reviews+D41835+public+9c1dec7fb6e75ce0 at reviews.llvm.org; Geoff
> Berry via Phabricator <reviews at reviews.llvm.org>
> *Cc:* gberry at codeaurora.org; qcolombet at apple.com; javed.absar at arm.com;
> matze at braunis.de; paulsson at linux.vnet.ibm.com; tstellar at redhat.com;
> Matthew.Arsenault at amd.com; junbuml at codeaurora.org;
> marina.yatsina at intel.com; wei.ding2 at amd.com; kannan.narayanan at amd.com;
> nhaehnle at gmail.com; nemanja.i.ibm at gmail.com; llvm-commits at lists.llvm.org;
> tpr.llvm at botech.co.uk
> *Subject:* Re: [PATCH] D41835: [MachineCopyPropagation] Extend pass to do
> COPY source forwarding
>
>
>
> FYI, this completely broke our (out-of-tree) backends in more ways than i
> can count. In particular, it seems to break the ability for us to enforce
> constraints on register classes at all: we heavily rely on this for a wide
> variety of instructions whose behavior cannot be fully expressed in
> tablegen, or really, anywhere that we need to say “X argument of Y
> instruction should be constrained to class Z”.
>
>
>
> Here is one particular example that illustrates this (but what breaks is
> potentially far more than this).
>
>
>
> Suppose you have a machine in which instructions can have at most 2
> arguments in the category {immediates, special registers}. We limit
> immediates to 2 in instruction selection, and then run a small pass after
> instruction selection that identifies any instructions with more than 2
> {immediates, special registers} and marks the special register arguments as
> “no SR”. For example:
>
>
>
> %6:gpr32 = COPY $sr12
>
>
>
>
> becomes
>
>
>
> %6:gpr32nosr = COPY $sr12
>
>
>
> reasonably, this should mean the COPY cannot be folded into the
> instruction that uses it, because the argument is a “gpr32nosr”, which
> prohibits special registers.
>
>
>
> however, with this patch…
>
>
>
> # *** IR Dump After Greedy Register Allocator ***:
>
> %6:gpr32nosr = COPY $sr12
>
> %12.sub2:gpr32tup3 = /* some instruction */ 0, 0, %6, 0, 4294967312, 0, 0
>
> # *** IR Dump After Virtual Register Rewriter ***:
>
> renamable $r2 = COPY $sr12
>
> renamable $r2 = /* some instruction */ 0, 0, %6, 0, 4294967312, 0, 0
>
> […]
>
> # *** IR Dump After Machine Copy Propagation Pass ***:
>
> renamable $r2 = /* some instruction */ 0, 0, $sr12, 0, 4294967312, 0, 0
>
>
>
> in the second step the information that “%6” cannot be an SR is lost due
> to register allocation, and then copy propagation happily plows through and
> violates the constraint we set earlier.
>
>
>
> this is the simplest example, but this entire patch makes me extremely
> nervous because we rely on register classes heavily in our backend and if
> the compiler is now free to violate our constraints, all bets are off.
>
>
>
> —escha
>
>
>
>
> On Feb 1, 2018, at 10:56 AM, Geoff Berry via Phabricator via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>
>
> This revision was automatically updated to reflect the committed changes.
> Closed by commit rL323991: [MachineCopyPropagation] Extend pass to do COPY
> source forwarding (authored by gberry, committed by ).
>
> Changed prior to commit:
>  https://reviews.llvm.org/D41835?vs=131323&id=132429#toc
>
> Repository:
>  rL LLVM
>
> https://reviews.llvm.org/D41835
>
> Files:
>  llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp
>  llvm/trunk/lib/CodeGen/TargetPassConfig.cpp
>  llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll
>  llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll
>  llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll
>  llvm/trunk/test/CodeGen/AArch64/cmpxchg-idioms.ll
>  llvm/trunk/test/CodeGen/AArch64/copyprop.mir
>  llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll
>  llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll
>  llvm/trunk/test/CodeGen/AArch64/ldst-opt.ll
>  llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll
>  llvm/trunk/test/CodeGen/AArch64/neg-imm.ll
>  llvm/trunk/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll
>  llvm/trunk/test/CodeGen/AMDGPU/fix-vgpr-copies.mir
>  llvm/trunk/test/CodeGen/AMDGPU/multilevel-break.ll
>  llvm/trunk/test/CodeGen/AMDGPU/ret.ll
>  llvm/trunk/test/CodeGen/ARM/atomic-op.ll
>  llvm/trunk/test/CodeGen/ARM/intrinsics-overflow.ll
>  llvm/trunk/test/CodeGen/ARM/select-imm.ll
>  llvm/trunk/test/CodeGen/ARM/swifterror.ll
>  llvm/trunk/test/CodeGen/Mips/llvm-ir/ashr.ll
>  llvm/trunk/test/CodeGen/Mips/llvm-ir/lshr.ll
>  llvm/trunk/test/CodeGen/Mips/llvm-ir/shl.ll
>  llvm/trunk/test/CodeGen/Mips/llvm-ir/sub.ll
>  llvm/trunk/test/CodeGen/PowerPC/MCSE-caller-preserved-reg.ll
>  llvm/trunk/test/CodeGen/PowerPC/fma-mutate.ll
>  llvm/trunk/test/CodeGen/PowerPC/gpr-vsr-spill.ll
>  llvm/trunk/test/CodeGen/PowerPC/licm-remat.ll
>  llvm/trunk/test/CodeGen/PowerPC/opt-li-add-to-addi.ll
>  llvm/trunk/test/CodeGen/PowerPC/tail-dup-layout.ll
>  llvm/trunk/test/CodeGen/SPARC/32abi.ll
>  llvm/trunk/test/CodeGen/SPARC/atomics.ll
>  llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll
>  llvm/trunk/test/CodeGen/Thumb/pr35836.ll
>  llvm/trunk/test/CodeGen/Thumb/thumb-shrink-wrapping.ll
>  llvm/trunk/test/CodeGen/X86/2006-03-01-InstrSchedBug.ll
>  llvm/trunk/test/CodeGen/X86/arg-copy-elide.ll
>  llvm/trunk/test/CodeGen/X86/avx-load-store.ll
>  llvm/trunk/test/CodeGen/X86/avx512-bugfix-25270.ll
>  llvm/trunk/test/CodeGen/X86/avx512-calling-conv.ll
>  llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll
>  llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-fast-isel.ll
>  llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll
>  llvm/trunk/test/CodeGen/X86/buildvec-insertvec.ll
>  llvm/trunk/test/CodeGen/X86/combine-fcopysign.ll
>  llvm/trunk/test/CodeGen/X86/combine-shl.ll
>  llvm/trunk/test/CodeGen/X86/complex-fastmath.ll
>  llvm/trunk/test/CodeGen/X86/divide-by-constant.ll
>  llvm/trunk/test/CodeGen/X86/fmaxnum.ll
>  llvm/trunk/test/CodeGen/X86/fmf-flags.ll
>  llvm/trunk/test/CodeGen/X86/fminnum.ll
>  llvm/trunk/test/CodeGen/X86/fp128-i128.ll
>  llvm/trunk/test/CodeGen/X86/h-registers-1.ll
>  llvm/trunk/test/CodeGen/X86/haddsub-2.ll
>  llvm/trunk/test/CodeGen/X86/haddsub-3.ll
>  llvm/trunk/test/CodeGen/X86/haddsub-undef.ll
>  llvm/trunk/test/CodeGen/X86/half.ll
>  llvm/trunk/test/CodeGen/X86/horizontal-reduce-smax.ll
>  llvm/trunk/test/CodeGen/X86/horizontal-reduce-smin.ll
>  llvm/trunk/test/CodeGen/X86/horizontal-reduce-umax.ll
>  llvm/trunk/test/CodeGen/X86/horizontal-reduce-umin.ll
>  llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll
>  llvm/trunk/test/CodeGen/X86/ipra-local-linkage.ll
>  llvm/trunk/test/CodeGen/X86/localescape.ll
>  llvm/trunk/test/CodeGen/X86/machine-cp.ll
>  llvm/trunk/test/CodeGen/X86/mul-i1024.ll
>  llvm/trunk/test/CodeGen/X86/mul-i256.ll
>  llvm/trunk/test/CodeGen/X86/mul-i512.ll
>  llvm/trunk/test/CodeGen/X86/mul128.ll
>  llvm/trunk/test/CodeGen/X86/mulvi32.ll
>  llvm/trunk/test/CodeGen/X86/pmul.ll
>  llvm/trunk/test/CodeGen/X86/powi.ll
>  llvm/trunk/test/CodeGen/X86/pr11334.ll
>  llvm/trunk/test/CodeGen/X86/pr29112.ll
>  llvm/trunk/test/CodeGen/X86/pr34080-2.ll
>  llvm/trunk/test/CodeGen/X86/psubus.ll
>  llvm/trunk/test/CodeGen/X86/retpoline-external.ll
>  llvm/trunk/test/CodeGen/X86/retpoline.ll
>  llvm/trunk/test/CodeGen/X86/sad.ll
>  llvm/trunk/test/CodeGen/X86/safestack.ll
>  llvm/trunk/test/CodeGen/X86/safestack_inline.ll
>  llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll
>  llvm/trunk/test/CodeGen/X86/select.ll
>  llvm/trunk/test/CodeGen/X86/shrink-wrap-chkstk.ll
>  llvm/trunk/test/CodeGen/X86/slow-pmulld.ll
>  llvm/trunk/test/CodeGen/X86/sqrt-fastmath.ll
>  llvm/trunk/test/CodeGen/X86/sse-scalar-fp-arith.ll
>  llvm/trunk/test/CodeGen/X86/sse1.ll
>  llvm/trunk/test/CodeGen/X86/sse3-avx-addsub-2.ll
>  llvm/trunk/test/CodeGen/X86/statepoint-live-in.ll
>  llvm/trunk/test/CodeGen/X86/statepoint-stack-usage.ll
>  llvm/trunk/test/CodeGen/X86/vec_fp_to_int.ll
>  llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll
>  llvm/trunk/test/CodeGen/X86/vec_minmax_sint.ll
>  llvm/trunk/test/CodeGen/X86/vec_shift4.ll
>  llvm/trunk/test/CodeGen/X86/vector-blend.ll
>  llvm/trunk/test/CodeGen/X86/vector-idiv-sdiv-128.ll
>  llvm/trunk/test/CodeGen/X86/vector-idiv-udiv-128.ll
>  llvm/trunk/test/CodeGen/X86/vector-mul.ll
>  llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll
>  llvm/trunk/test/CodeGen/X86/vector-sext.ll
>  llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll
>  llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll
>  llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll
>  llvm/trunk/test/CodeGen/X86/vector-shuffle-combining.ll
>  llvm/trunk/test/CodeGen/X86/vector-trunc-math.ll
>  llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll
>  llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll
>  llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll
>  llvm/trunk/test/CodeGen/X86/vector-zext.ll
>  llvm/trunk/test/CodeGen/X86/vselect-minmax.ll
>  llvm/trunk/test/CodeGen/X86/widen_conv-3.ll
>  llvm/trunk/test/CodeGen/X86/widen_conv-4.ll
>  llvm/trunk/test/CodeGen/X86/win64_frame.ll
>  llvm/trunk/test/CodeGen/X86/x86-interleaved-access.ll
>  llvm/trunk/test/CodeGen/X86/x86-shrink-wrap-unwind.ll
>  llvm/trunk/test/CodeGen/X86/x86-shrink-wrapping.ll
>  llvm/trunk/test/DebugInfo/COFF/fpo-shrink-wrap.ll
>  llvm/trunk/test/DebugInfo/X86/spill-nospill.ll
>
> <D41835.132429.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180202/30bff702/attachment.html>


More information about the llvm-commits mailing list