[llvm] b9ba053 - [AMDGPU] Don't S_MOV_B32 into $scc
Diana Picus via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 14 01:25:39 PDT 2023
Author: Diana Picus
Date: 2023-04-14T10:24:43+02:00
New Revision: b9ba05360e585729458363a53f6cba71d1a8ebb6
URL: https://github.com/llvm/llvm-project/commit/b9ba05360e585729458363a53f6cba71d1a8ebb6
DIFF: https://github.com/llvm/llvm-project/commit/b9ba05360e585729458363a53f6cba71d1a8ebb6.diff
LOG: [AMDGPU] Don't S_MOV_B32 into $scc
The peephole optimizer tries to replace
```
%n:sgpr_32 = S_MOV_B32 x
$scc = COPY %n
```
with an `S_MOV_B32` directly into `$scc`.
This crashes because `S_MOV_B32` cannot take `$scc` as its destination.
We currently generate code like this from GlobalISel when lowering a
G_BRCOND with a constant condition. We should probably look into
removing this kind of branch altogether, but until then we should at
least not crash.
This patch fixes the issue by making sure we don't apply the peephole
optimization when trying to move into a physical register that
doesn't belong to the register class required by the destination
operand of the new opcode.
Differential Revision: https://reviews.llvm.org/D148117
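The guard described in the log can be sketched outside of LLVM as follows. This is only an illustrative model, not the code from the patch: `RegClass`, `Reg`, and `canFoldIntoDst` are hypothetical stand-ins for the real register-class query on the new opcode's destination operand, and the contents of the register class are assumed for the example.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a register class: a set of physical register names.
struct RegClass {
  std::set<std::string> Regs;
  bool contains(const std::string &Name) const { return Regs.count(Name) != 0; }
};

// Hypothetical stand-in for a register operand. Virtual registers have not
// been assigned a physical register yet, so they are not checked here.
struct Reg {
  std::string Name;
  bool Physical;
};

// Returns true if `Dst = COPY %n` may be rewritten to `Dst = <mov> imm`.
// The fold is rejected only when Dst is a physical register that is not a
// member of the class the move's destination operand requires.
bool canFoldIntoDst(const Reg &Dst, const RegClass &MovDstClass) {
  if (Dst.Physical && !MovDstClass.contains(Dst.Name))
    return false;
  return true;
}
```

Under these assumptions, a virtual destination and `$sgpr1` are accepted while `$scc` is rejected, which mirrors the three cases in the added MIR test (`fold_simm_virtual`, `fold_simm_physical`, `dont_fold_simm_scc`).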
Added:
llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir
Modified:
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Removed:
llvm/test/CodeGen/AMDGPU/fold_16bit_imm.mir
################################################################################
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 7ffcd1ba260d..cee9c28aa06e 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -3090,7 +3090,12 @@ bool SIInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
assert(UseMI.getOperand(1).getReg().isVirtual());
}
- UseMI.setDesc(get(NewOpc));
+ const MCInstrDesc &NewMCID = get(NewOpc);
+ if (DstReg.isPhysical() &&
+ !RI.getRegClass(NewMCID.operands()[0].RegClass)->contains(DstReg))
+ return false;
+
+ UseMI.setDesc(NewMCID);
UseMI.getOperand(1).ChangeToImmediate(Imm.getSExtValue());
UseMI.addImplicitDefUseOperands(*UseMI.getParent()->getParent());
return true;
diff --git a/llvm/test/CodeGen/AMDGPU/fold_16bit_imm.mir b/llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir
similarity index 68%
rename from llvm/test/CodeGen/AMDGPU/fold_16bit_imm.mir
rename to llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir
index 97e8d0d25a8e..099aaa449b1c 100644
--- a/llvm/test/CodeGen/AMDGPU/fold_16bit_imm.mir
+++ b/llvm/test/CodeGen/AMDGPU/peephole-fold-imm.mir
@@ -1,6 +1,51 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=amdgcn--amdhsa -mcpu=gfx908 -verify-machineinstrs -run-pass peephole-opt -o - %s | FileCheck -check-prefix=GCN %s
+---
+name: fold_simm_virtual
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: fold_simm_virtual
+ ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+ ; GCN-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG
+ %0:sreg_32 = S_MOV_B32 0
+ %1:sreg_32 = COPY killed %0
+ SI_RETURN_TO_EPILOG
+
+...
+
+---
+name: fold_simm_physical
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: fold_simm_physical
+ ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+ ; GCN-NEXT: $sgpr1 = S_MOV_B32 0
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG
+ %0:sreg_32 = S_MOV_B32 0
+ $sgpr1 = COPY killed %0
+ SI_RETURN_TO_EPILOG
+
+...
+
+---
+name: dont_fold_simm_scc
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: dont_fold_simm_scc
+ ; GCN: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
+ ; GCN-NEXT: $scc = COPY killed [[S_MOV_B32_]]
+ ; GCN-NEXT: SI_RETURN_TO_EPILOG
+ %0:sreg_32 = S_MOV_B32 0
+ $scc = COPY killed %0
+ SI_RETURN_TO_EPILOG
+
+...
+
---
name: fold_simm_16_sub_to_lo
body: |