[PATCH] D124852: [AMDGPU] Enable copying SCC
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue May 3 06:45:18 PDT 2022
foad added inline comments.
================
Comment at: llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:109
- if (!IsClone && !IsCloned)
+ if (true)
for (SDNode *User : Node->uses()) {
----------------
FIXME - I'm not comfortable committing this without a better understanding of what's going on. I think the problem might be that we generate a DAG like this:
```
t0: ch = EntryToken
t1: i32 = S_MOV_B32 TargetConstant:i32<0>
t27: i32 = V_MOV_B32_e32 TargetConstant:i32<0>
t3: i32,ch = DS_READ_B32_gfx9<Mem:(load (s32) from `i32 addrspace(3)* null`, align 16, addrspace 3)> t27, TargetConstant:i16<0>, TargetConstant:i1<0>, t0
t21: i1 = S_CMP_LG_U32 t3, t1
t7: i32 = S_MOV_B32 TargetConstant:i32<1>
t15: i32,i1 = S_SUB_CO_PSEUDO t7, t1, t21
t25: ch,glue = CopyToReg t0, Register:i1 $scc, t21
t23: i32 = S_CSELECT_B32 t15, t1, t25:1
t11: ch,glue = CopyToReg t0, Register:i32 $vgpr0, t23
t12: ch = SI_RETURN Register:i32 $vgpr0, t11, t11:1
```
Note that t21 is defined as a physreg by S_CMP_LG_U32, but used as a virtreg by S_SUB_CO_PSEUDO. Maybe this does not happen on other targets, so EmitCopyFromReg has not previously had to support it?
================
Comment at: llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:109
- if (!IsClone && !IsCloned)
+ if (true)
for (SDNode *User : Node->uses()) {
----------------
arsenm wrote:
> foad wrote:
> > FIXME - I'm not comfortable committing this without a better understanding of what's going on. I think the problem might be that we generate a DAG like this:
> > ```
> > t0: ch = EntryToken
> > t1: i32 = S_MOV_B32 TargetConstant:i32<0>
> > t27: i32 = V_MOV_B32_e32 TargetConstant:i32<0>
> > t3: i32,ch = DS_READ_B32_gfx9<Mem:(load (s32) from `i32 addrspace(3)* null`, align 16, addrspace 3)> t27, TargetConstant:i16<0>, TargetConstant:i1<0>, t0
> > t21: i1 = S_CMP_LG_U32 t3, t1
> > t7: i32 = S_MOV_B32 TargetConstant:i32<1>
> > t15: i32,i1 = S_SUB_CO_PSEUDO t7, t1, t21
> > t25: ch,glue = CopyToReg t0, Register:i1 $scc, t21
> > t23: i32 = S_CSELECT_B32 t15, t1, t25:1
> > t11: ch,glue = CopyToReg t0, Register:i32 $vgpr0, t23
> > t12: ch = SI_RETURN Register:i32 $vgpr0, t11, t11:1
> > ```
> > Note that t21 is defined as a physreg by S_CMP_LG_U32, but used as a virtreg by S_SUB_CO_PSEUDO. Maybe this does not happen on other targets, so EmitCopyFromReg has not previously had to support it?
> I don't understand this part or why it's necessary. What happens if you remove this?
> What happens if you remove this?
It fails in test/CodeGen/AMDGPU/setcc-multiple-use.ll:
```
*** Final schedule ***
SU(4): t27: i32 = V_MOV_B32_e32 TargetConstant:i32<0>
SU(3): t3: i32,ch = DS_READ_B32_gfx9<Mem:(load (s32) from `i32 addrspace(3)* null`, align 16, addrspace 3)> t27, TargetConstant:i16<0>, TargetConstant:i1<0>, t0
SU(5): t1: i32 = S_MOV_B32 TargetConstant:i32<0>
SU(2): t21: i1 = S_CMP_LG_U32 t3, t1
SU(7): t7: i32 = S_MOV_B32 TargetConstant:i32<1>
SU(6): t15: i32,i1 = S_SUB_CO_PSEUDO t7, t1, t21
SU(8): t21: i1 = S_CMP_LG_U32 t3, t1
SU(1): t23: i32 = S_CSELECT_B32 t15, t1, t25:1
t25: ch,glue = CopyToReg t0, Register:i1 $scc, t21
SU(0): t12: ch = SI_RETURN Register:i32 $vgpr0, t11, t11:1
t11: ch,glue = CopyToReg t0, Register:i32 $vgpr0, t23
llc: /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/Register.h:78: static unsigned int llvm::Register::virtReg2Index(llvm::Register): Assertion `isVirtualRegister(Reg) && "Not a virtual register"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/jayfoad2/llvm-debug/bin/llc -march=amdgcn -mcpu=gfx1030 -o - /home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll -debug -print-after-all
1. Running pass 'CallGraph Pass Manager' on module '/home/jayfoad2/git/llvm-project/llvm/test/CodeGen/AMDGPU/setcc-multiple-use.ll'.
2. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@f'
#0 0x0000000008211f1a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:11
#1 0x00000000082120eb PrintStackTraceSignalHandler(void*) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
#2 0x00000000082106ab llvm::sys::RunSignalHandlers() /home/jayfoad2/git/llvm-project/llvm/lib/Support/Signals.cpp:102:5
#3 0x0000000008212861 SignalHandler(int) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
#4 0x00007f58ede623c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x143c0)
#5 0x00007f58ed8f503b raise /build/glibc-sMfBJT/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#6 0x00007f58ed8d4859 abort /build/glibc-sMfBJT/glibc-2.31/stdlib/abort.c:81:7
#7 0x00007f58ed8d4729 get_sysdep_segment_value /build/glibc-sMfBJT/glibc-2.31/intl/loadmsgcat.c:509:8
#8 0x00007f58ed8d4729 _nl_load_domain /build/glibc-sMfBJT/glibc-2.31/intl/loadmsgcat.c:970:34
#9 0x00007f58ed8e6006 (/lib/x86_64-linux-gnu/libc.so.6+0x34006)
#10 0x0000000004a63efb llvm::Register::virtReg2Index(llvm::Register) /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/Register.h:79:12
#11 0x0000000004a63dfd llvm::VirtReg2IndexFunctor::operator()(llvm::Register) const /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/TargetRegisterInfo.h:1264:5
#12 0x0000000004a63c72 llvm::IndexedMap<std::pair<llvm::PointerUnion<llvm::TargetRegisterClass const*, llvm::RegisterBank const*>, llvm::MachineOperand*>, llvm::VirtReg2IndexFunctor>::operator[](llvm::Register) const /home/jayfoad2/git/llvm-project/llvm/include/llvm/ADT/IndexedMap.h:52:7
#13 0x0000000004a63965 llvm::MachineRegisterInfo::getRegClass(llvm::Register) const /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/MachineRegisterInfo.h:643:5
#14 0x0000000006f17405 llvm::MachineRegisterInfo::constrainRegClass(llvm::Register, llvm::TargetRegisterClass const*, unsigned int) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:86:60
#15 0x0000000007e235e7 llvm::InstrEmitter::AddRegisterOperand(llvm::MachineInstrBuilder&, llvm::SDValue, unsigned int, llvm::MCInstrDesc const*, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&, bool, bool, bool) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:320:34
#16 0x0000000007e23b3d llvm::InstrEmitter::AddOperand(llvm::MachineInstrBuilder&, llvm::SDValue, unsigned int, llvm::MCInstrDesc const*, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&, bool, bool, bool) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:374:3
#17 0x0000000007e27b8b llvm::InstrEmitter::EmitMachineNode(llvm::SDNode*, bool, bool, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp:1045:49
#18 0x0000000007e20a80 llvm::InstrEmitter::EmitNode(llvm::SDNode*, bool, bool, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h:143:7
#19 0x0000000007e4211c llvm::ScheduleDAGSDNodes::EmitSchedule(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&)::$_1::operator()(llvm::SDNode*, bool, bool, llvm::DenseMap<llvm::SDValue, llvm::Register, llvm::DenseMapInfo<llvm::SDValue, void>, llvm::detail::DenseMapPair<llvm::SDValue, llvm::Register> >&) const /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp:868:13
#20 0x0000000007e4157b llvm::ScheduleDAGSDNodes::EmitSchedule(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp:942:10
#21 0x0000000007f60480 llvm::SelectionDAGISel::CodeGenAndEmitDAG() /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1010:42
#22 0x0000000007f5e92f llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void>, false, true>, bool&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:732:1
#23 0x0000000007f5e387 llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:0:7
#24 0x0000000007f5b5a3 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:512:3
#25 0x0000000004e2162a AMDGPUDAGToDAGISel::runOnMachineFunction(llvm::MachineFunction&) /home/jayfoad2/git/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:134:3
#26 0x0000000006e48ae7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:8
#27 0x0000000007550fdc llvm::FPPassManager::runOnFunction(llvm::Function&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1430:23
#28 0x00000000066b1ceb (anonymous namespace)::CGPassManager::RunPassOnSCC(llvm::Pass*, llvm::CallGraphSCC&, llvm::CallGraph&, bool&, bool&) /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:179:20
#29 0x00000000066b15e8 (anonymous namespace)::CGPassManager::RunAllPassesOnSCC(llvm::CallGraphSCC&, llvm::CallGraph&, bool&) /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:476:10
#30 0x00000000066b0f08 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/CallGraphSCCPass.cpp:542:18
#31 0x00000000075519a4 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1545:23
#32 0x00000000075514c8 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:535:16
#33 0x0000000007556681 llvm::legacy::PassManager::run(llvm::Module&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1672:3
#34 0x00000000046048c6 compileModule(char**, llvm::LLVMContext&) /home/jayfoad2/git/llvm-project/llvm/tools/llc/llc.cpp:735:41
#35 0x0000000004602b47 main /home/jayfoad2/git/llvm-project/llvm/tools/llc/llc.cpp:419:13
#36 0x00007f58ed8d60b3 __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:342:3
#37 0x000000000460229e _start (/home/jayfoad2/llvm-debug/bin/llc+0x460229e)
Aborted (core dumped)
```
The problem seems to be that the S_CMP_LG_U32 node gets cloned. Then it is emitted, and InstrEmitter::EmitMachineNode calls EmitCopyFromReg to copy the result from scc to a virtreg. But because it is a clone, EmitCopyFromReg skips the loop over all uses of the node and leaves MatchReg set to true (wrongly), so it does not bother to insert a copy after all:
```
// If all uses are reading from the src physical register and copying the
// register is either impossible or very expensive, then don't create a copy.
if (MatchReg && SrcRC->getCopyCost() < 0) {
```
Then later, when emitting the S_SUB_CO_PSEUDO, we try to constrain the result of t21 to the operand's regclass, and fail because it is a physreg but we expected a virtreg.
I did try an alternative fix to EmitCopyFromReg, but it caused lots of code quality regressions including on X86 and ARM:
```
diff --git a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
index 1f4256662031..efc0b0efee9e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
@@ -97,17 +97,18 @@ EmitCopyFromReg(SDNode *Node, unsigned ResNo, bool IsClone, bool IsCloned,
}
// If the node is only used by a CopyToReg and the dest reg is a vreg, use
// the CopyToReg'd destination register instead of creating a new vreg.
- bool MatchReg = true;
+ bool MatchReg = false;
const TargetRegisterClass *UseRC = nullptr;
MVT VT = Node->getSimpleValueType(ResNo);
// Stick to the preferred register classes for legal types.
if (TLI->isTypeLegal(VT))
UseRC = TLI->getRegClassFor(VT, Node->isDivergent());
- if (!IsClone && !IsCloned)
+ if (!IsClone && !IsCloned) {
+ MatchReg = true;
for (SDNode *User : Node->uses()) {
bool Match = true;
if (User->getOpcode() == ISD::CopyToReg &&
User->getOperand(2).getNode() == Node &&
@@ -150,8 +151,9 @@ EmitCopyFromReg(SDNode *Node, unsigned ResNo, bool IsClone, bool IsCloned,
MatchReg &= Match;
if (VRBase)
break;
}
+ }
const TargetRegisterClass *SrcRC = nullptr, *DstRC = nullptr;
SrcRC = TRI->getMinimalPhysRegClass(SrcReg, VT);
```
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124852/new/
https://reviews.llvm.org/D124852
More information about the llvm-commits
mailing list