[llvm] [X86][APX] Fix segfault in foldMemoryOperandImpl for two-address NDD fold (PR #190562)

Sun Apr 5 16:46:40 PDT 2026

llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: Michael (MichaelShires)

<details>
<summary>Changes</summary>

The NoNDDM code path in foldMemoryOperandImpl assumed that NewMI->getOperand(1) is always a register. When IsTwoAddr is true, fuseTwoAddrInst replaces operands 0-4 with the memory address components, so getOperand(1) is an immediate (the scale amount), not a register. Calling setReg() on it causes a segfault in removeOperandFromUseList.

Skip the NoNDDM COPY block when IsTwoAddr is true, since the two-address fold already handles the dest==src1 constraint correctly.

I believe the issue was introduced by #189222: the 'NoNDDM' block calls 'NewMI->getOperand(1).setReg()', but after 'fuseTwoAddrInst' operand 1 is an immediate, not a register.
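
To illustrate the failure mode, here is a minimal, self-contained C++ sketch. It does not use LLVM's real classes: `Operand`, `Inst`, `makeThreeAddrFold`, `makeTwoAddrFold`, and `retargetSrcIfAllowed` are hypothetical stand-ins, and the operand layouts and register numbers are assumptions based on the description above rather than a copy of the code in X86InstrInfo.cpp. It only models why the register rewrite must be skipped when the fold went through the two-address path.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a machine operand: a register or an immediate.
struct Operand {
  enum Kind { Register, Immediate } kind;
  int64_t value; // register number, or immediate value
  bool isReg() const { return kind == Register; }
};

// Toy stand-in for an instruction: only the operand list matters here.
struct Inst {
  std::vector<Operand> operands;
};

// Assumed layout after a three-address fold: dest, src1, then the five
// address components (base, scale, index, displacement, segment).
Inst makeThreeAddrFold() {
  Inst mi;
  mi.operands = {{Operand::Register, 10},  // dest
                 {Operand::Register, 11},  // src1
                 {Operand::Register, 7},   // base
                 {Operand::Immediate, 1},  // scale
                 {Operand::Register, 0},   // index (0 = none)
                 {Operand::Immediate, -8}, // displacement
                 {Operand::Register, 0}};  // segment (0 = none)
  return mi;
}

// Assumed layout after a two-address fold: operands 0-4 are the address
// components, so operand 1 is the scale immediate.
Inst makeTwoAddrFold() {
  Inst mi;
  mi.operands = {{Operand::Register, 7},   // base
                 {Operand::Immediate, 1},  // scale
                 {Operand::Register, 0},   // index (0 = none)
                 {Operand::Immediate, -8}, // displacement
                 {Operand::Register, 0},   // segment (0 = none)
                 {Operand::Register, 11}}; // remaining source register
  return mi;
}

// Models the guarded block: rewrite the register in operand 1 only when the
// fold did not go through the two-address path. Returns true if it rewrote.
bool retargetSrcIfAllowed(Inst &newMI, bool noNDDM, bool isTwoAddr,
                          int64_t newReg) {
  if (!(noNDDM && !isTwoAddr))
    return false;
  Operand &src = newMI.operands[1];
  assert(src.isReg() && "operand 1 must be a register on this path");
  src.value = newReg;
  return true;
}
```

In this model, calling `retargetSrcIfAllowed` on the result of `makeTwoAddrFold()` with `isTwoAddr` set to true returns early and leaves the scale immediate untouched. Without the `!isTwoAddr` check, the assertion would fire, which is the toy analogue of the segfault described above.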

This passes all APX regression tests, and a regression test is included in the commit. Fixes issue #190557.

This is my first time submitting a PR to the LLVM project, so please let me know if I need to fix anything! Tagging @phoebewang and @RKSimon as potential reviewers.

---
Full diff: https://github.com/llvm/llvm-project/pull/190562.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+1-1) 
- (added) llvm/test/CodeGen/X86/apx/ndd-fold-twoaddr-crash.ll (+71) 


``````````diff

diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index 3b48a333147b5..5b6858f59e6d6 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -7592,7 +7592,7 @@ MachineInstr *X86InstrInfo::foldMemoryOperandImpl(
         NewMI->getOperand(0).setSubReg(X86::sub_32bit);
     }
 
-    if (NoNDDM) {
+    if (NoNDDM && !IsTwoAddr) {
       Register SrcReg = MI.getOperand(1).getReg();
       if (MI.killsRegister(SrcReg, /*TRI=*/nullptr))
         return NewMI;
diff --git a/llvm/test/CodeGen/X86/apx/ndd-fold-twoaddr-crash.ll b/llvm/test/CodeGen/X86/apx/ndd-fold-twoaddr-crash.ll
new file mode 100644
index 0000000000000..77eaea404a4a1
--- /dev/null
+++ b/llvm/test/CodeGen/X86/apx/ndd-fold-twoaddr-crash.ll
@@ -0,0 +1,71 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=x86_64-unknown -mattr=+ndd -verify-machineinstrs | FileCheck %s
+;
+; Verify that folding NDD instructions to non-NDD with memory operands does not
+; crash when the fold goes through the two-address path (IsTwoAddr=true).
+; This is a regression test for a segfault in foldMemoryOperandImpl where
+; NewMI->getOperand(1).setReg() was called on a memory operand instead of a
+; register operand after fuseTwoAddrInst.
+
+define void @ndd_fold_twoaddr(ptr %0, i64 %1, ptr %2, i64 %3, i1 %4) {
+; CHECK-LABEL: ndd_fold_twoaddr:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    pushq %rbp
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    pushq %r15
+; CHECK-NEXT:    .cfi_def_cfa_offset 24
+; CHECK-NEXT:    pushq %r14
+; CHECK-NEXT:    .cfi_def_cfa_offset 32
+; CHECK-NEXT:    pushq %r13
+; CHECK-NEXT:    .cfi_def_cfa_offset 40
+; CHECK-NEXT:    pushq %r12
+; CHECK-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-NEXT:    pushq %rbx
+; CHECK-NEXT:    .cfi_def_cfa_offset 56
+; CHECK-NEXT:    subq $24, %rsp
+; CHECK-NEXT:    .cfi_def_cfa_offset 80
+; CHECK-NEXT:    .cfi_offset %rbx, -56
+; CHECK-NEXT:    .cfi_offset %r12, -48
+; CHECK-NEXT:    .cfi_offset %r13, -40
+; CHECK-NEXT:    .cfi_offset %r14, -32
+; CHECK-NEXT:    .cfi_offset %r15, -24
+; CHECK-NEXT:    .cfi_offset %rbp, -16
+; CHECK-NEXT:    movl %r8d, %ebx
+; CHECK-NEXT:    movq %rcx, %r14
+; CHECK-NEXT:    movq %rdx, %r15
+; CHECK-NEXT:    movq %rsi, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill
+; CHECK-NEXT:    movq %rdi, %r13
+; CHECK-NEXT:    movq $0, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Folded Spill
+; CHECK-NEXT:    movq %rcx, %r12
+; CHECK-NEXT:    movq %rcx, %rbp
+; CHECK-NEXT:    .p2align 4
+; CHECK-NEXT:  .LBB0_2: # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    movq %rbp, %rdi
+; CHECK-NEXT:    movq %r13, %rsi
+; CHECK-NEXT:    callq *%r15
+; CHECK-NEXT:    addq %r14, %rbp
+; CHECK-NEXT:    testb $1, %bl
+; CHECK-NEXT:    jne .LBB0_2
+; CHECK-NEXT:  # %bb.1: # in Loop: Header=BB0_2 Depth=1
+; CHECK-NEXT:    movq {{[-0-9]+}}(%r{{[sb]}}p), %rax # 8-byte Reload
+; CHECK-NEXT:    addq %rax, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Folded Reload
+; CHECK-NEXT:    addq %rax, %r12
+; CHECK-NEXT:    movq %r12, %rbp
+; CHECK-NEXT:    jmp .LBB0_2
+  br label %6
+
+6:
+  %7 = phi ptr [ %10, %9 ], [ null, %5 ]
+  %8 = icmp ugt ptr %7, null
+  br label %11
+
+9:
+  %10 = getelementptr i8, ptr %7, i64 %1
+  br label %6
+
+11:
+  %12 = phi ptr [ %13, %11 ], [ %7, %6 ]
+  %13 = getelementptr i8, ptr %12, i64 %3
+  %14 = tail call i32 %2(ptr %13, ptr %0)
+  br i1 %4, label %11, label %9
+}

``````````

</details>


https://github.com/llvm/llvm-project/pull/190562