[llvm] [AMDGPU] Fix a potential wrong return value indicating whether a pass modifies a function (PR #88197)

Thu Apr 11 07:18:11 PDT 2024

https://github.com/shiltian updated https://github.com/llvm/llvm-project/pull/88197

From 0aaf265b6ad151f7c05b1b879c1ff32f74c93a60 Mon Sep 17 00:00:00 2001
From: Shilei Tian <i at tianshilei.me>
Date: Thu, 11 Apr 2024 10:17:58 -0400
Subject: [PATCH] [AMDGPU] Fix a potential wrong return value indicating
 whether a pass modifies a function

When an alloca is too big for vectorization, the function may already have
been modified in a previous iteration of the `for` loop, so returning `false`
would wrongly report that the pass made no changes. Return `Changed` instead.
---
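
Note: the early-exit problem described above can be reproduced outside LLVM.
The following is a minimal sketch, not the pass itself. `FakeAlloca`,
`runBuggy`, and `runFixed` are hypothetical names, and the budget bookkeeping
is a simplified assumption about how the loop spends its vectorization budget.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an alloca; only its cost matters in this sketch.
struct FakeAlloca {
  unsigned Cost;
  bool Promoted;
};

// Mirrors the old behavior: the early exit reports "no change" even when an
// earlier iteration already promoted something.
bool runBuggy(std::vector<FakeAlloca> &Allocas, unsigned Budget) {
  bool Changed = false;
  for (FakeAlloca &AI : Allocas) {
    if (AI.Cost > Budget)
      return false; // wrong if Changed is already true
    AI.Promoted = true; // assumed: promotion always succeeds in this sketch
    Changed = true;
    Budget -= AI.Cost;
  }
  return Changed;
}

// Mirrors the patched behavior: the early exit reports what actually happened.
bool runFixed(std::vector<FakeAlloca> &Allocas, unsigned Budget) {
  bool Changed = false;
  for (FakeAlloca &AI : Allocas) {
    if (AI.Cost > Budget)
      return Changed;
    AI.Promoted = true;
    Changed = true;
    Budget -= AI.Cost;
  }
  return Changed;
}
```

With costs {4, 3} and a budget of 6, the first element is promoted and the
second triggers the early exit. `runBuggy` then returns false despite the
modification, while `runFixed` returns true.
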
 llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp    |  2 +-
 llvm/test/CodeGen/AMDGPU/half-alloca-promotion.ll | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/AMDGPU/half-alloca-promotion.ll

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index 6f3cdf54dceec7..c0846b123d1870 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -336,7 +336,7 @@ bool AMDGPUPromoteAllocaImpl::run(Function &F, bool PromoteToLDS) {
     if (AllocaCost > VectorizationBudget) {
       LLVM_DEBUG(dbgs() << "  Alloca too big for vectorization: " << *AI
                         << "\n");
-      return false;
+      return Changed;
     }
 
     if (tryPromoteAllocaToVector(*AI)) {
diff --git a/llvm/test/CodeGen/AMDGPU/half-alloca-promotion.ll b/llvm/test/CodeGen/AMDGPU/half-alloca-promotion.ll
new file mode 100644
index 00000000000000..cfec49f3652fbd
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/half-alloca-promotion.ll
@@ -0,0 +1,11 @@
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes="amdgpu-promote-alloca-to-vector" -o - %s
+; No CHECK lines are needed: with expensive checks enabled, this test case
+; crashed before the fix. Verifying the pass's output is out of scope here.
+
+define fastcc void @foo() {
+entry:
+  %det = alloca [4 x i32], align 16, addrspace(5)
+  %trkltPosTmpYZ = alloca [2 x float], align 4, addrspace(5)
+  %trkltCovTmp = alloca [3 x float], align 4, addrspace(5)
+  ret void
+}