[llvm] [ValueTracking] Fix Overflow with i1 Constant GEPs (PR #125470)

Mon Feb 3 01:28:10 PST 2025

llvmbot wrote:

@llvm/pr-subscribers-backend-amdgpu

Author: Pierre van Houtryve (Pierre-vh)

<details>
<summary>Changes</summary>

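The LoadStoreVectorizer can cause ValueTracking to crash on GEPs with a constant i1 index. ValueTracking creates 1-bit APInts for the index and for the scaling factor (the type size in bytes) and then tries to multiply them, but a type size larger than 1 byte does not fit in 1 bit.

This changes the minimum width of those APInts to 8 bits, zero-extending the constant index to that width, to avoid the issue.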
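
As a rough illustration of the arithmetic described above, here is a minimal standalone sketch that models fixed-width integers with masked `uint64_t` values. It does not use LLVM's `APInt`; the helper names (`truncToWidth`, `scaledOffsetNarrow`, `scaledOffsetWide`) are hypothetical, and the narrow model assumes out-of-range values silently wrap, whereas the real code may assert instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: keep only the low `bits` bits of `v`.
uint64_t truncToWidth(uint64_t v, unsigned bits) {
  assert(bits >= 1 && bits <= 64 && "width must be in [1, 64]");
  return bits == 64 ? v : (v & ((uint64_t{1} << bits) - 1));
}

// Models the pre-patch arithmetic: both operands and the product live in
// `indexBitWidth` bits. In this model, values that do not fit simply wrap.
uint64_t scaledOffsetNarrow(uint64_t indexConst, unsigned indexBitWidth,
                            uint64_t typeSizeInBytes) {
  uint64_t index = truncToWidth(indexConst, indexBitWidth);
  uint64_t scale = truncToWidth(typeSizeInBytes, indexBitWidth);
  return truncToWidth(index * scale, indexBitWidth);
}

// Models the patched arithmetic: the index is zero-extended to at least
// 8 bits before it is multiplied by the type size.
uint64_t scaledOffsetWide(uint64_t indexConst, unsigned indexBitWidth,
                          uint64_t typeSizeInBytes) {
  unsigned calcBitWidth = std::max(indexBitWidth, 8u);
  // Masking at the original width and storing in a wider value acts as zext.
  uint64_t index = truncToWidth(indexConst, indexBitWidth);
  uint64_t scale = truncToWidth(typeSizeInBytes, calcBitWidth);
  return truncToWidth(index * scale, calcBitWidth);
}
```

For the case in the new test (a constant index of 1 in an i1, with a 4-byte i32 element), the narrow model loses the scaling factor entirely and yields 0, while the widened model yields 4.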

Fixes SWDEV-507697

---
Full diff: https://github.com/llvm/llvm-project/pull/125470.diff


2 Files Affected:

- (modified) llvm/lib/Analysis/ValueTracking.cpp (+5-2) 
- (added) llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/knownbits-gep-i1.ll (+19) 


``````````diff

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 6b61a3546e8b7c..b76afbf0a7249b 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1477,8 +1477,11 @@ static void computeKnownBitsFromOperator(const Operator *I,
         // that this is a multiple of the minimum size.
         ScalingFactor.Zero.setLowBits(llvm::countr_zero(TypeSizeInBytes));
       } else if (IndexBits.isConstant()) {
-        APInt IndexConst = IndexBits.getConstant();
-        APInt ScalingFactor(IndexBitWidth, TypeSizeInBytes);
+        // i1 is a valid GEP index, ensure we have enough space to do the
+        // computation in that case.
+        unsigned CalcBitWidth = std::max(IndexBitWidth, 8u);
+        APInt IndexConst = IndexBits.getConstant().zext(CalcBitWidth);
+        APInt ScalingFactor(CalcBitWidth, TypeSizeInBytes);
         IndexConst *= ScalingFactor;
         AccConstIndices += IndexConst.sextOrTrunc(BitWidth);
         continue;
diff --git a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/knownbits-gep-i1.ll b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/knownbits-gep-i1.ll
new file mode 100644
index 00000000000000..a2dc00fbb700b3
--- /dev/null
+++ b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/knownbits-gep-i1.ll
@@ -0,0 +1,19 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -passes=load-store-vectorizer -S -o - %s | FileCheck %s
+
+define amdgpu_kernel void @simple_users_scores() {
+; CHECK-LABEL: define amdgpu_kernel void @simple_users_scores(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[SIMPLEUSER:%.*]] = alloca [4 x i64], i32 0, align 4, addrspace(5)
+; CHECK-NEXT:    [[G:%.*]] = getelementptr i32, ptr addrspace(5) [[SIMPLEUSER]], i1 true
+; CHECK-NEXT:    store <2 x i32> zeroinitializer, ptr addrspace(5) [[G]], align 4
+; CHECK-NEXT:    ret void
+;
+entry:
+  %simpleuser = alloca [4 x i64], i32 0, align 4, addrspace(5)
+  store i32 0, ptr addrspace(5) %simpleuser, align 4
+  %G = getelementptr i32, ptr addrspace(5) %simpleuser, i1 true
+  store i32 0, ptr addrspace(5) %G, align 4
+  ret void
+}

``````````

</details>


https://github.com/llvm/llvm-project/pull/125470