[llvm] [AMDGPU] Fix SIFoldOperandsImpl::tryFoldZeroHighBits when src1 is a non-register operand. (PR #133761)

Valery Pykhtin via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 31 10:43:09 PDT 2025


https://github.com/vpykhtin created https://github.com/llvm/llvm-project/pull/133761

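tryFoldZeroHighBits calls getReg() on src1 (operand 2) without first checking that it is a register. The assertion fires when a constant has been propagated into a `V_AND 0xFFFF, reg` instruction, so that src1 is an immediate rather than a register.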

Fixes failures like:

```
llc: /github/llvm-project/llvm/include/llvm/CodeGen/MachineOperand.h:366: llvm::Register llvm::MachineOperand::getReg() const: Assertion `isReg() && "This is not a register operand!"' failed.
Stack dump:
0.      Program arguments: /github/llvm-project/build/Debug/bin/llc -mtriple=amdgcn -mcpu=gfx1101 -verify-machineinstrs -run-pass si-fold-operands /github/llvm-project/llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir -o -
1.      Running pass 'Function Pass Manager' on module '/github/llvm-project/llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir'.
2.      Running pass 'SI Fold Operands' on function '@test_tryFoldZeroHighBits_skips_nonreg'
...
#12 0x00007f5a55005cfc llvm::MachineOperand::getReg() const /work/vpykhtin/github/llvm-project/llvm/include/llvm/CodeGen/MachineOperand.h:0:5
#13 0x00007f5a555c6bf5 (anonymous namespace)::SIFoldOperandsImpl::tryFoldZeroHighBits(llvm::MachineInstr&) const /github/llvm-project/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:1459:36
#14 0x00007f5a555c63ad (anonymous namespace)::SIFoldOperandsImpl::run(llvm::MachineFunction&) /github/llvm-project/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp:2455:11
#15 0x00007f5a555c6780 (anonymous namespace)::SIFoldOperandsLegacy::runOnMachineFunction
```
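The guard this patch adds can be illustrated outside of LLVM with a minimal sketch. `Operand` and `zeroHighBitsCandidate` below are hypothetical stand-ins, not LLVM types or functions; the real code obtains the src0 constant through `getImmOrMaterializedImm`, which is simplified here to a plain immediate check.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <variant>

// Hypothetical stand-in for a machine operand: either a register number or
// an immediate. This is NOT llvm::MachineOperand, only a model of it.
struct Operand {
  struct Reg { unsigned Id; };
  struct Imm { int64_t Val; };
  std::variant<Reg, Imm> V;

  bool isReg() const { return std::holds_alternative<Reg>(V); }
  bool isImm() const { return std::holds_alternative<Imm>(V); }

  // Mirrors the failing assertion: asking a non-register for its register.
  unsigned getReg() const {
    assert(isReg() && "This is not a register operand!");
    return std::get<Reg>(V).Id;
  }
  int64_t getImm() const {
    assert(isImm() && "This is not an immediate operand!");
    return std::get<Imm>(V).Val;
  }
};

// Hypothetical helper with the same shape as the patched condition: bail out
// unless src0 is the 0xffff mask AND src1 is a register. Only then is it
// safe to call getReg() on src1.
std::optional<unsigned> zeroHighBitsCandidate(const Operand &Src0,
                                              const Operand &Src1) {
  if (!Src0.isImm() || Src0.getImm() != 0xffff || !Src1.isReg())
    return std::nullopt;
  return Src1.getReg();
}
```

With this model, `zeroHighBitsCandidate(Operand{Operand::Imm{0xffff}}, Operand{Operand::Imm{0}})` returns an empty optional instead of tripping the assertion, which corresponds to the `V_AND_B32_e64 65535, 0` case in the new test.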

From 885dcd85bfab2e1f149a52d32837525e0f3f03b9 Mon Sep 17 00:00:00 2001
From: Valery Pykhtin <valery.pykhtin at amd.com>
Date: Mon, 31 Mar 2025 17:35:32 +0000
Subject: [PATCH] [AMDGPU] Fix SIFoldOperandsImpl::tryFoldZeroHighBits when met
 non-reg src1 operand.

---
 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp         |  2 +-
 .../AMDGPU/fold-zero-high-bits-skips-non-reg.mir  | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir

diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index cc15dd7cb495c..46bd5d8044c45 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -1453,7 +1453,7 @@ bool SIFoldOperandsImpl::tryFoldZeroHighBits(MachineInstr &MI) const {
     return false;
 
   std::optional<int64_t> Src0Imm = getImmOrMaterializedImm(MI.getOperand(1));
-  if (!Src0Imm || *Src0Imm != 0xffff)
+  if (!Src0Imm || *Src0Imm != 0xffff || !MI.getOperand(2).isReg())
     return false;
 
   Register Src1 = MI.getOperand(2).getReg();
diff --git a/llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir b/llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir
new file mode 100644
index 0000000000000..f0b0d1b7948dd
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir
@@ -0,0 +1,15 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1101 -verify-machineinstrs -run-pass si-fold-operands %s -o - | FileCheck %s
+---
+name: test_tryFoldZeroHighBits_skips_nonreg
+tracksRegLiveness: true
+body: |
+  bb.0:
+    ; CHECK-LABEL: name: test_tryFoldZeroHighBits_skips_nonreg
+    ; CHECK: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    ; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_MOV_B32_e32_]], %subreg.sub0, [[V_MOV_B32_e32_]], %subreg.sub1
+    ; CHECK-NEXT: [[V_AND_B32_e64_:%[0-9]+]]:vgpr_32 = V_AND_B32_e64 65535, 0, implicit $exec
+  %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+  %1:vreg_64 = REG_SEQUENCE %0, %subreg.sub0, %0, %subreg.sub1
+  %2:vgpr_32 = V_AND_B32_e64 65535, %1.sub0, implicit $exec
+


