[PATCH] D124852: [AMDGPU] Enable copying SCC

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 20 09:59:54 PST 2022


foad added inline comments.
Herald added a subscriber: kosarev.


================
Comment at: llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:109
 
-  if (!IsClone && !IsCloned)
+  if (true)
     for (SDNode *User : Node->uses()) {
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > FIXME - I'm not comfortable committing this without a better understanding of what's going on. I think the problem might be that we generate a DAG like this:
> > > ```
> > >   t0: ch = EntryToken
> > >   t1: i32 = S_MOV_B32 TargetConstant:i32<0>
> > >       t27: i32 = V_MOV_B32_e32 TargetConstant:i32<0>
> > >     t3: i32,ch = DS_READ_B32_gfx9<Mem:(load (s32) from `i32 addrspace(3)* null`, align 16, addrspace 3)> t27, TargetConstant:i16<0>, TargetConstant:i1<0>, t0
> > >   t21: i1 = S_CMP_LG_U32 t3, t1
> > >         t7: i32 = S_MOV_B32 TargetConstant:i32<1>
> > >       t15: i32,i1 = S_SUB_CO_PSEUDO t7, t1, t21
> > >       t25: ch,glue = CopyToReg t0, Register:i1 $scc, t21
> > >     t23: i32 = S_CSELECT_B32 t15, t1, t25:1
> > >   t11: ch,glue = CopyToReg t0, Register:i32 $vgpr0, t23
> > >   t12: ch = SI_RETURN Register:i32 $vgpr0, t11, t11:1
> > > ```
> > > Note that t21 is defined as a physreg by S_CMP_LG_U32, but used as a virtreg by S_SUB_CO_PSEUDO. Maybe this does not happen on other targets, so EmitCopyFromReg has not previously had to support it?
> > I don't understand this part or why it's necessary. What happens if you remove this?
> > What happens if you remove this?
> 
> It fails in test/CodeGen/AMDGPU/setcc-multiple-use.ll:
> ```
> *** Final schedule ***
> SU(4): t27: i32 = V_MOV_B32_e32 TargetConstant:i32<0>
> SU(3): t3: i32,ch = DS_READ_B32_gfx9<Mem:(load (s32) from `i32 addrspace(3)* null`, align 16, addrspace 3)> t27, TargetConstant:i16<0>, TargetConstant:i1<0>, t0
> SU(5): t1: i32 = S_MOV_B32 TargetConstant:i32<0>
> SU(2): t21: i1 = S_CMP_LG_U32 t3, t1
> SU(7): t7: i32 = S_MOV_B32 TargetConstant:i32<1>
> SU(6): t15: i32,i1 = S_SUB_CO_PSEUDO t7, t1, t21
> SU(8): t21: i1 = S_CMP_LG_U32 t3, t1
> SU(1): t23: i32 = S_CSELECT_B32 t15, t1, t25:1
>     t25: ch,glue = CopyToReg t0, Register:i1 $scc, t21
> SU(0): t12: ch = SI_RETURN Register:i32 $vgpr0, t11, t11:1
>     t11: ch,glue = CopyToReg t0, Register:i32 $vgpr0, t23
> 
> llc: /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/Register.h:78: static unsigned int llvm::Register::virtReg2Index(llvm::Register): Assertion `isVirtualRegister(Reg) && "Not a virtual register"' failed.
> PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
> Stack dump:
> 0.	Program arguments: /home/jayfoad2/llvm-debug/bin/llc -march=amdgcn -mcpu=gfx1030 -o - /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll -debug -print-after-all
> 1.	Running pass 'CallGraph Pass Manager' on module '/home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll'.
> 2.	Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@f'
>  #0 0x0000000008211f1a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:11
>  #1 0x00000000082120eb PrintStackTraceSignalHandler(void*) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
>  #2 0x00000000082106ab llvm::sys::RunSignalHandlers() /home/jayfoad2/git/llvm-project/llvm/lib/Support/Signals.cpp:102:5
>  #3 0x0000000008212861 SignalHandler(int) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
>  #4 0x00007f58ede623c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x143c0)
>  #5 0x00007f58ed8f503b raise /build/glibc-sMfBJT/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
>  #6 0x00007f58ed8d4859 abort /build/glibc-sMfBJT/glibc-2.31/stdlib/abort.c:81:7
>  #7 0x00007f58ed8d4729 get_sysdep_segment_value /build/glibc-sMfBJT/glibc-2.31/intl/loadmsgcat.c:509:8
>  #8 0x00007f58ed8d4729 _nl_load_domain /build/glibc-sMfBJT/glibc-2.31/intl/loadmsgcat.c:970:34
>  #9 0x00007f58ed8e6006 (/lib/x86_64-linux-gnu/libc.so.6+0x34006)
> #10 0x0000000004a63efb llvm::Register::virtReg2Index(llvm::Register) /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/Register.h:79:12
> #11 0x0000000004a63dfd llvm::VirtReg2IndexFunctor::operator()(llvm::Register) const /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/TargetRegisterInfo.h:1264:5
> #12 0x0000000004a63c72 llvm::IndexedMap<std::pair<llvm::PointerUnion<llvm::TargetRegisterClass const*, llvm::RegisterBank const*>, llvm::MachineOperand*>, llvm::VirtReg2IndexFunctor>::operator[](llvm::Register) const /home/jayfoad2/git/llvm-project/llvm/include/llvm/ADT/IndexedMap.h:52:7
> #13 0x0000000004a63965 llvm::MachineRegisterInfo::getRegClass(llvm::Register) const /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/MachineRegisterInfo.h:643:5
> #14 0x0000000006f17405 llvm::MachineRegisterInfo::constrainRegClass(llvm::Register, llvm::TargetRegisterClass const*, unsigned int) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:86:60
> #15 0x0000000007e235e7 llvm::InstrEmitter::AddRegisterOperand(llvm::MachineInstrBuilder&, llvm::SDValue, unsigned int, llvm::MCInstrDesc const*, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&, bool, bool, bool) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:320:34
> #16 0x0000000007e23b3d llvm::InstrEmitter::AddOperand(llvm::MachineInstrBuilder&, llvm::SDValue, unsigned int, llvm::MCInstrDesc const*, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&, bool, bool, bool) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:374:3
> #17 0x0000000007e27b8b llvm::InstrEmitter::EmitMachineNode(llvm::SDNode*, bool, bool, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:1045:49
> #18 0x0000000007e20a80 llvm::InstrEmitter::EmitNode(llvm::SDNode*, bool, bool, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h:143:7
> #19 0x0000000007e4211c llvm::ScheduleDAGSDNodes::EmitSchedule(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&)::$_1::operator()(llvm::SDNode*, bool, bool, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&) const /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp:868:13
> #20 0x0000000007e4157b llvm::ScheduleDAGSDNodes::EmitSchedule(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp:942:10
> #21 0x0000000007f60480 llvm::SelectionDAGISel::CodeGenAndEmitDAG() /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1010:42
> #22 0x0000000007f5e92f llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, bool&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:732:1
> #23 0x0000000007f5e387 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:0:7
> #24 0x0000000007f5b5a3 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:512:3
> #25 0x0000000004e2162a AMDGPUDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) /home/jayfoad2/git/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:134:3
> #26 0x0000000006e48ae7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:8
> #27 0x0000000007550fdc llvm::FPPassManager::runOnFunction(llvm::Function&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:23
> #28 0x00000000066b1ceb (anonymous namespace)::CGPassManager::RunPassOnSCC(llvm::Pass*, llvm::CallGraphSCC&, llvm::CallGraph&, bool&, bool&) /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:179:20
> #29 0x00000000066b15e8 (anonymous namespace)::CGPassManager::RunAllPassesOnSCC(llvm::CallGraphSCC&, llvm::CallGraph&, bool&) /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:476:10
> #30 0x00000000066b0f08 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:542:18
> #31 0x00000000075519a4 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:23
> #32 0x00000000075514c8 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:16
> #33 0x0000000007556681 llvm::legacy::PassManager::run(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1672:3
> #34 0x00000000046048c6 compileModule(char**, llvm::LLVMContext&) /home/jayfoad2/git/llvm-project/llvm/tools/llc/llc.cpp:735:41
> #35 0x0000000004602b47 main /home/jayfoad2/git/llvm-project/llvm/tools/llc/llc.cpp:419:13
> #36 0x00007f58ed8d60b3 __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:342:3
> #37 0x000000000460229e _start (/home/jayfoad2/llvm-debug/bin/llc+0x460229e)
> Aborted (core dumped)
> ```
> The problem seems to be that the S_CMP_LG_U32 node gets cloned. Then it is emitted, and InstrEmitter::EmitMachineNode calls EmitCopyFromReg to copy the result from scc to a virtreg. But because it is a clone, EmitCopyFromReg skips the loop over all uses of the node and leaves MatchReg set to true (wrongly), so it does not bother to insert a copy after all:
> ```
>   // If all uses are reading from the src physical register and copying the
>   // register is either impossible or very expensive, then don't create a copy.
>   if (MatchReg && SrcRC->getCopyCost() < 0) {
> ```
> Then later, when emitting the S_SUB_CO_PSEUDO, we try to constrain the result of t21 to the operand's regclass, and fail because it is a physreg but we expected a virtreg.
> 
> I did try an alternative fix to EmitCopyFromReg, but it caused lots of code quality regressions including on X86 and ARM:
> ```
> diff --git a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
> index 1f4256662031..efc0b0efee9e 100644
> --- a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
> +++ b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
> @@ -97,17 +97,18 @@ EmitCopyFromReg(SDNode *Node, unsigned ResNo, bool IsClone, bool IsCloned,
>    }
>  
>    // If the node is only used by a CopyToReg and the dest reg is a vreg, use
>    // the CopyToReg'd destination register instead of creating a new vreg.
> -  bool MatchReg = true;
> +  bool MatchReg = false;
>    const TargetRegisterClass *UseRC = nullptr;
>    MVT VT = Node->getSimpleValueType(ResNo);
>  
>    // Stick to the preferred register classes for legal types.
>    if (TLI->isTypeLegal(VT))
>      UseRC = TLI->getRegClassFor(VT, Node->isDivergent());
>  
> -  if (!IsClone && !IsCloned)
> +  if (!IsClone && !IsCloned) {
> +    MatchReg = true;
>      for (SDNode *User : Node->uses()) {
>        bool Match = true;
>        if (User->getOpcode() == ISD::CopyToReg &&
>            User->getOperand(2).getNode() == Node &&
> @@ -150,8 +151,9 @@ EmitCopyFromReg(SDNode *Node, unsigned ResNo, bool IsClone, bool IsCloned,
>        MatchReg &= Match;
>        if (VRBase)
>          break;
>      }
> +  }
>  
>    const TargetRegisterClass *SrcRC = nullptr, *DstRC = nullptr;
>    SrcRC = TRI->getMinimalPhysRegClass(SrcReg, VT);
>  
> ```
I've split this part into its own patch: D140417


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124852/new/

https://reviews.llvm.org/D124852



More information about the llvm-commits mailing list