[llvm] [SDAG] Harden assumption in getMemsetStringVal (PR #126207)

Fri Feb 7 01:38:50 PST 2025

llvmbot wrote:



@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-arm

Author: Cullen Rhodes (c-rhodes)

<details>
<summary>Changes</summary>

In 5235973ee03aca4148ecabe5eff64da2af1e034e, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973ee03aca4148ecabe5eff64da2af1e034e to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
29200611055f49a0d37243caa5f8bba1df9d57a6, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: https://github.com/llvm/llvm-project/issues/20521 [1]

---
Full diff: https://github.com/llvm/llvm-project/pull/126207.diff


3 Files Affected:

- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+1-9) 
- (modified) llvm/test/CodeGen/AArch64/memcpy-f128.ll (+1-1) 
- (modified) llvm/test/CodeGen/ARM/memcpy-inline.ll (+27) 


``````````diff

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 16c3b295426c648..445edf1dea41d24 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8015,16 +8015,8 @@ static SDValue getMemsetStringVal(EVT VT, const SDLoc &dl, SelectionDAG &DAG,
   if (Slice.Array == nullptr) {
     if (VT.isInteger())
       return DAG.getConstant(0, dl, VT);
-    if (VT == MVT::f32 || VT == MVT::f64 || VT == MVT::f128)
+    if (VT.isFloatingPoint())
       return DAG.getConstantFP(0.0, dl, VT);
-    if (VT.isVector()) {
-      unsigned NumElts = VT.getVectorNumElements();
-      MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64;
-      return DAG.getNode(ISD::BITCAST, dl, VT,
-                         DAG.getConstant(0, dl,
-                                         EVT::getVectorVT(*DAG.getContext(),
-                                                          EltVT, NumElts)));
-    }
     llvm_unreachable("Expected type!");
   }
 
diff --git a/llvm/test/CodeGen/AArch64/memcpy-f128.ll b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
index 5b354dd23e01dfa..229c0034b6f636c 100644
--- a/llvm/test/CodeGen/AArch64/memcpy-f128.ll
+++ b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=-neon | FileCheck %s
 
 %structA = type { i128 }
 @stubA = internal unnamed_addr constant %structA zeroinitializer, align 8
diff --git a/llvm/test/CodeGen/ARM/memcpy-inline.ll b/llvm/test/CodeGen/ARM/memcpy-inline.ll
index 596a58afe46e5bb..89db22d1f0ed53c 100644
--- a/llvm/test/CodeGen/ARM/memcpy-inline.ll
+++ b/llvm/test/CodeGen/ARM/memcpy-inline.ll
@@ -12,6 +12,7 @@
 @.str4 = private unnamed_addr constant [18 x i8] c"DHRYSTONE PROGR  \00", align 1
 @.str5 = private unnamed_addr constant [7 x i8] c"DHRYST\00", align 1
 @.str6 = private unnamed_addr constant [14 x i8] c"/tmp/rmXXXXXX\00", align 1
+ at empty = private unnamed_addr constant [31 x i8] zeroinitializer, align 1
 @spool.splbuf = internal global [512 x i8] zeroinitializer, align 16
 
 define i32 @t0() {
@@ -282,5 +283,31 @@ entry:
   ret void
 }
 
+define void @copy_from_zero_constant(ptr nocapture %C) nounwind {
+; CHECK-LABEL: copy_from_zero_constant:
+; CHECK:       @ %bb.0: @ %entry
+; CHECK-NEXT:    vmov.i32 q8, #0x0
+; CHECK-NEXT:    movs r1, #15
+; CHECK-NEXT:    vst1.8 {d16, d17}, [r0], r1
+; CHECK-NEXT:    vst1.8 {d16, d17}, [r0]
+; CHECK-NEXT:    bx lr
+;
+; CHECK-T1-LABEL: copy_from_zero_constant:
+; CHECK-T1:       @ %bb.0: @ %entry
+; CHECK-T1-NEXT:    .save {r7, lr}
+; CHECK-T1-NEXT:    push {r7, lr}
+; CHECK-T1-NEXT:    ldr r1, .LCPI8_0
+; CHECK-T1-NEXT:    movs r2, #31
+; CHECK-T1-NEXT:    bl __aeabi_memcpy
+; CHECK-T1-NEXT:    pop {r7, pc}
+; CHECK-T1-NEXT:    .p2align 2
+; CHECK-T1-NEXT:  @ %bb.1:
+; CHECK-T1-NEXT:  .LCPI8_0:
+; CHECK-T1-NEXT:    .long .Lempty
+entry:
+  tail call void @llvm.memcpy.p0.p0.i64(ptr %C, ptr @empty, i64 31, i1 false)
+  ret void
+}
+
 declare void @llvm.memcpy.p0.p0.i32(ptr nocapture, ptr nocapture, i32, i1) nounwind
 declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture, i64, i1) nounwind

``````````

</details>


https://github.com/llvm/llvm-project/pull/126207