<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Menlo;
        panose-1:0 0 0 0 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal>Hi escha, <o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>This sounds very similar to the issues this change caused on AMDGPU.  The way we worked around them for those effected targets was to mark the relevant opcodes with hasExtraSrcRegAllocReq, which will prevent the MachineOperands from being marked as isRenamable, which will in turn prevent MachineCopyPropagation from changing them.  Is this a reasonable fix for your targets?<o:p></o:p></p><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>-- <o:p></o:p></p><p class=MsoNormal>Geoff Berry<o:p></o:p></p><p class=MsoNormal>Employee of Qualcomm Datacenter Technologies, Inc.<o:p></o:p></p><p class=MsoNormal> Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div style='border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in'><p class=MsoNormal><b>From:</b> fglaser@apple.com [mailto:fglaser@apple.com] <b>On Behalf Of </b>escha@apple.com<br><b>Sent:</b> Friday, February 2, 2018 2:56 PM<br><b>To:</b> reviews+D41835+public+9c1dec7fb6e75ce0@reviews.llvm.org; Geoff Berry via Phabricator <reviews@reviews.llvm.org><br><b>Cc:</b> gberry@codeaurora.org; qcolombet@apple.com; javed.absar@arm.com; matze@braunis.de; paulsson@linux.vnet.ibm.com; tstellar@redhat.com; Matthew.Arsenault@amd.com; junbuml@codeaurora.org; marina.yatsina@intel.com; wei.ding2@amd.com; kannan.narayanan@amd.com; nhaehnle@gmail.com; nemanja.i.ibm@gmail.com; llvm-commits@lists.llvm.org; tpr.llvm@botech.co.uk<br><b>Subject:</b> Re: [PATCH] D41835: [MachineCopyPropagation] Extend pass to do COPY source forwarding<o:p></o:p></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>FYI, this completely broke our (out-of-tree) backends in more ways than i can count. In particular, it seems to break the ability for us to enforce constraints on register classes at all: we heavily rely on this for a wide variety of instructions whose behavior cannot be fully expressed in tablegen, or really, anywhere that we need to say “X argument of Y instruction should be constrained to class Z”.<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Here is one particular example that illustrates this (but what breaks is potentially far more than this).<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Suppose you have a machine in which instructions can have at most 2 arguments in the category {immediates, special registers}. We limit immediates to 2 in instruction selection, and then run a small pass after instruction selection that identifies any instructions with more than 2 {immediates, special registers} and marks the special register arguments as “no SR”. For example:<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><div><p class=MsoNormal><span style='font-size:8.5pt;font-family:"Menlo",serif;background:white'>%6:gpr32 = COPY $sr12</span><span style='font-size:8.5pt;font-family:"Menlo",serif'><o:p></o:p></span></p></div></div><div><p class=MsoNormal><span style='font-size:8.5pt;font-family:"Menlo",serif;background:white'><br><br></span><span style='font-size:8.5pt;font-family:"Menlo",serif'><o:p></o:p></span></p></div><div><p class=MsoNormal>becomes<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal><span style='font-size:8.5pt;font-family:"Menlo",serif;background:white'>%6:gpr32nosr = COPY $sr12</span><o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>reasonably, this should mean the COPY cannot be folded into the instruction that uses it, because the argument is a “gpr32nosr”, which prohibits special registers.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>however, with this patch…<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'># *** IR Dump After Greedy Register Allocator ***:<o:p></o:p></span></p></div></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'>%6:gpr32nosr = COPY $sr12<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'>%12.sub2:gpr32tup3 = /* some instruction */ 0, 0, %6, 0, 4294967312, 0, 0<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'># *** IR Dump After Virtual Register Rewriter ***:<o:p></o:p></span></p></div><div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'>renamable $r2 = COPY $sr12<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'>renamable $r2 = /* some instruction */ 0, 0, %6, 0, 4294967312, 0, 0<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'>[…]<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'># *** IR Dump After Machine Copy Propagation Pass ***:<o:p></o:p></span></p></div><div><p class=MsoNormal style='background:white'><span style='font-size:8.5pt;font-family:"Menlo",serif'>renamable $r2 = /* some instruction */ 0, 0, $sr12, 0, 4294967312, 0, 0<o:p></o:p></span></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>in the second step the information that “%6” cannot be an SR is lost due to register allocation, and then copy propagation happily plows through and violates the constraint we set earlier.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>this is the simplest example, but this entire patch makes me extremely nervous because we rely on register classes heavily in our backend and if the compiler is now free to violate our constraints, all bets are off.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>—escha<o:p></o:p></p><div><div><div><p class=MsoNormal><br><br><o:p></o:p></p><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><p class=MsoNormal>On Feb 1, 2018, at 10:56 AM, Geoff Berry via Phabricator via llvm-commits <<a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>This revision was automatically updated to reflect the committed changes.<br>Closed by commit rL323991: [MachineCopyPropagation] Extend pass to do COPY source forwarding (authored by gberry, committed by ).<br><br>Changed prior to commit:<br> <a href="https://reviews.llvm.org/D41835?vs=131323&id=132429#toc">https://reviews.llvm.org/D41835?vs=131323&id=132429#toc</a><br><br>Repository:<br> rL LLVM<br><br><a href="https://reviews.llvm.org/D41835">https://reviews.llvm.org/D41835</a><br><br>Files:<br> llvm/trunk/lib/CodeGen/MachineCopyPropagation.cpp<br> llvm/trunk/lib/CodeGen/TargetPassConfig.cpp<br> llvm/trunk/test/CodeGen/AArch64/aarch64-fold-lslfast.ll<br> llvm/trunk/test/CodeGen/AArch64/arm64-AdvSIMD-Scalar.ll<br> llvm/trunk/test/CodeGen/AArch64/arm64-zero-cycle-regmov.ll<br> llvm/trunk/test/CodeGen/AArch64/cmpxchg-idioms.ll<br> llvm/trunk/test/CodeGen/AArch64/copyprop.mir<br> llvm/trunk/test/CodeGen/AArch64/f16-instructions.ll<br> llvm/trunk/test/CodeGen/AArch64/flags-multiuse.ll<br> llvm/trunk/test/CodeGen/AArch64/ldst-opt.ll<br> llvm/trunk/test/CodeGen/AArch64/merge-store-dependency.ll<br> llvm/trunk/test/CodeGen/AArch64/neg-imm.ll<br> llvm/trunk/test/CodeGen/AMDGPU/callee-special-input-sgprs.ll<br> llvm/trunk/test/CodeGen/AMDGPU/fix-vgpr-copies.mir<br> llvm/trunk/test/CodeGen/AMDGPU/multilevel-break.ll<br> llvm/trunk/test/CodeGen/AMDGPU/ret.ll<br> llvm/trunk/test/CodeGen/ARM/atomic-op.ll<br> llvm/trunk/test/CodeGen/ARM/intrinsics-overflow.ll<br> llvm/trunk/test/CodeGen/ARM/select-imm.ll<br> llvm/trunk/test/CodeGen/ARM/swifterror.ll<br> llvm/trunk/test/CodeGen/Mips/llvm-ir/ashr.ll<br> llvm/trunk/test/CodeGen/Mips/llvm-ir/lshr.ll<br> llvm/trunk/test/CodeGen/Mips/llvm-ir/shl.ll<br> llvm/trunk/test/CodeGen/Mips/llvm-ir/sub.ll<br> llvm/trunk/test/CodeGen/PowerPC/MCSE-caller-preserved-reg.ll<br> llvm/trunk/test/CodeGen/PowerPC/fma-mutate.ll<br> llvm/trunk/test/CodeGen/PowerPC/gpr-vsr-spill.ll<br> llvm/trunk/test/CodeGen/PowerPC/licm-remat.ll<br> llvm/trunk/test/CodeGen/PowerPC/opt-li-add-to-addi.ll<br> llvm/trunk/test/CodeGen/PowerPC/tail-dup-layout.ll<br> llvm/trunk/test/CodeGen/SPARC/32abi.ll<br> llvm/trunk/test/CodeGen/SPARC/atomics.ll<br> llvm/trunk/test/CodeGen/SystemZ/vec-sub-01.ll<br> llvm/trunk/test/CodeGen/Thumb/pr35836.ll<br> llvm/trunk/test/CodeGen/Thumb/thumb-shrink-wrapping.ll<br> llvm/trunk/test/CodeGen/X86/2006-03-01-InstrSchedBug.ll<br> llvm/trunk/test/CodeGen/X86/arg-copy-elide.ll<br> llvm/trunk/test/CodeGen/X86/avx-load-store.ll<br> llvm/trunk/test/CodeGen/X86/avx512-bugfix-25270.ll<br> llvm/trunk/test/CodeGen/X86/avx512-calling-conv.ll<br> llvm/trunk/test/CodeGen/X86/avx512-regcall-NoMask.ll<br> llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-fast-isel.ll<br> llvm/trunk/test/CodeGen/X86/avx512bw-intrinsics-upgrade.ll<br> llvm/trunk/test/CodeGen/X86/buildvec-insertvec.ll<br> llvm/trunk/test/CodeGen/X86/combine-fcopysign.ll<br> llvm/trunk/test/CodeGen/X86/combine-shl.ll<br> llvm/trunk/test/CodeGen/X86/complex-fastmath.ll<br> llvm/trunk/test/CodeGen/X86/divide-by-constant.ll<br> llvm/trunk/test/CodeGen/X86/fmaxnum.ll<br> llvm/trunk/test/CodeGen/X86/fmf-flags.ll<br> llvm/trunk/test/CodeGen/X86/fminnum.ll<br> llvm/trunk/test/CodeGen/X86/fp128-i128.ll<br> llvm/trunk/test/CodeGen/X86/h-registers-1.ll<br> llvm/trunk/test/CodeGen/X86/haddsub-2.ll<br> llvm/trunk/test/CodeGen/X86/haddsub-3.ll<br> llvm/trunk/test/CodeGen/X86/haddsub-undef.ll<br> llvm/trunk/test/CodeGen/X86/half.ll<br> llvm/trunk/test/CodeGen/X86/horizontal-reduce-smax.ll<br> llvm/trunk/test/CodeGen/X86/horizontal-reduce-smin.ll<br> llvm/trunk/test/CodeGen/X86/horizontal-reduce-umax.ll<br> llvm/trunk/test/CodeGen/X86/horizontal-reduce-umin.ll<br> llvm/trunk/test/CodeGen/X86/inline-asm-fpstack.ll<br> llvm/trunk/test/CodeGen/X86/ipra-local-linkage.ll<br> llvm/trunk/test/CodeGen/X86/localescape.ll<br> llvm/trunk/test/CodeGen/X86/machine-cp.ll<br> llvm/trunk/test/CodeGen/X86/mul-i1024.ll<br> llvm/trunk/test/CodeGen/X86/mul-i256.ll<br> llvm/trunk/test/CodeGen/X86/mul-i512.ll<br> llvm/trunk/test/CodeGen/X86/mul128.ll<br> llvm/trunk/test/CodeGen/X86/mulvi32.ll<br> llvm/trunk/test/CodeGen/X86/pmul.ll<br> llvm/trunk/test/CodeGen/X86/powi.ll<br> llvm/trunk/test/CodeGen/X86/pr11334.ll<br> llvm/trunk/test/CodeGen/X86/pr29112.ll<br> llvm/trunk/test/CodeGen/X86/pr34080-2.ll<br> llvm/trunk/test/CodeGen/X86/psubus.ll<br> llvm/trunk/test/CodeGen/X86/retpoline-external.ll<br> llvm/trunk/test/CodeGen/X86/retpoline.ll<br> llvm/trunk/test/CodeGen/X86/sad.ll<br> llvm/trunk/test/CodeGen/X86/safestack.ll<br> llvm/trunk/test/CodeGen/X86/safestack_inline.ll<br> llvm/trunk/test/CodeGen/X86/scalar_widen_div.ll<br> llvm/trunk/test/CodeGen/X86/select.ll<br> llvm/trunk/test/CodeGen/X86/shrink-wrap-chkstk.ll<br> llvm/trunk/test/CodeGen/X86/slow-pmulld.ll<br> llvm/trunk/test/CodeGen/X86/sqrt-fastmath.ll<br> llvm/trunk/test/CodeGen/X86/sse-scalar-fp-arith.ll<br> llvm/trunk/test/CodeGen/X86/sse1.ll<br> llvm/trunk/test/CodeGen/X86/sse3-avx-addsub-2.ll<br> llvm/trunk/test/CodeGen/X86/statepoint-live-in.ll<br> llvm/trunk/test/CodeGen/X86/statepoint-stack-usage.ll<br> llvm/trunk/test/CodeGen/X86/vec_fp_to_int.ll<br> llvm/trunk/test/CodeGen/X86/vec_int_to_fp.ll<br> llvm/trunk/test/CodeGen/X86/vec_minmax_sint.ll<br> llvm/trunk/test/CodeGen/X86/vec_shift4.ll<br> llvm/trunk/test/CodeGen/X86/vector-blend.ll<br> llvm/trunk/test/CodeGen/X86/vector-idiv-sdiv-128.ll<br> llvm/trunk/test/CodeGen/X86/vector-idiv-udiv-128.ll<br> llvm/trunk/test/CodeGen/X86/vector-mul.ll<br> llvm/trunk/test/CodeGen/X86/vector-rotate-128.ll<br> llvm/trunk/test/CodeGen/X86/vector-sext.ll<br> llvm/trunk/test/CodeGen/X86/vector-shift-ashr-128.ll<br> llvm/trunk/test/CodeGen/X86/vector-shift-lshr-128.ll<br> llvm/trunk/test/CodeGen/X86/vector-shift-shl-128.ll<br> llvm/trunk/test/CodeGen/X86/vector-shuffle-combining.ll<br> llvm/trunk/test/CodeGen/X86/vector-trunc-math.ll<br> llvm/trunk/test/CodeGen/X86/vector-trunc-packus.ll<br> llvm/trunk/test/CodeGen/X86/vector-trunc-ssat.ll<br> llvm/trunk/test/CodeGen/X86/vector-trunc-usat.ll<br> llvm/trunk/test/CodeGen/X86/vector-zext.ll<br> llvm/trunk/test/CodeGen/X86/vselect-minmax.ll<br> llvm/trunk/test/CodeGen/X86/widen_conv-3.ll<br> llvm/trunk/test/CodeGen/X86/widen_conv-4.ll<br> llvm/trunk/test/CodeGen/X86/win64_frame.ll<br> llvm/trunk/test/CodeGen/X86/x86-interleaved-access.ll<br> llvm/trunk/test/CodeGen/X86/x86-shrink-wrap-unwind.ll<br> llvm/trunk/test/CodeGen/X86/x86-shrink-wrapping.ll<br> llvm/trunk/test/DebugInfo/COFF/fpo-shrink-wrap.ll<br> llvm/trunk/test/DebugInfo/X86/spill-nospill.ll<br><br><D41835.132429.patch>_______________________________________________<br>llvm-commits mailing list<br><a href="mailto:llvm-commits@lists.llvm.org">llvm-commits@lists.llvm.org</a><br><a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits</a><o:p></o:p></p></div></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></div></div></body></html>