[llvm] [AMDGPU] Promote nested GEP allocas to vectors (PR #141199)
Carl Ritson via llvm-commits
llvm-commits at lists.llvm.org
Mon May 26 00:53:38 PDT 2025
================
@@ -0,0 +1,55 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=amdgpu-promote-alloca < %s | FileCheck %s
+target triple = "amdgcn-amd-amdhsa"
+define amdgpu_ps void @scalar_alloca_ptr_with_vector_gep_of_gep(i32 %j) #0 {
+; CHECK-LABEL: define amdgpu_ps void @scalar_alloca_ptr_with_vector_gep_of_gep(
+; CHECK-SAME: i32 [[J:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[SORTEDFRAGMENTS:%.*]] = freeze <20 x i32> poison
+; CHECK-NEXT: [[TMP0:%.*]] = mul i32 [[J]], 2
+; CHECK-NEXT: [[TMP1:%.*]] = mul i32 [[J]], 2
+; CHECK-NEXT: [[TMP2:%.*]] = add i32 1, [[TMP1]]
+; CHECK-NEXT: [[TMP3:%.*]] = extractelement <20 x i32> [[SORTEDFRAGMENTS]], i32 [[TMP2]]
+; CHECK-NEXT: ret void
+;
+entry:
+ %SortedFragments = alloca [10 x <2 x i32>], align 8, addrspace(5)
+ %0 = getelementptr [10 x <2 x i32>], ptr addrspace(5) %SortedFragments, i32 0, i32 %j
+ %1 = getelementptr i8, ptr addrspace(5) %0, i32 4
+ %2 = load i32, ptr addrspace(5) %1, align 4
+ ret void
----------------
perlfu wrote:
"dead code" in this case refers to the fact that this alloca has no stores, no valid data.
This means the result of the load will always be poison, so the load and all the instructions feeding it could be removed if the compiler were taught to infer that.
Have a look at the other existing promote-alloca tests for ideas on how to put data into the alloca.
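
As a rough illustration of that suggestion (a sketch only, not taken from the PR or from an existing test), the function could store into the alloca before loading from it and then write the loaded value somewhere observable. The extra arguments %v and %out are hypothetical names introduced here, and the #0 attribute reference is dropped so the snippet stands alone. Whether amdgpu-promote-alloca still promotes this exact form has not been checked, and the CHECK lines are left out because they would have to be regenerated with utils/update_test_checks.py.

```llvm
; Hypothetical variant of the test: the alloca is written before it is read,
; so the load no longer yields poison and the result is kept alive by a store.
define amdgpu_ps void @scalar_alloca_ptr_with_vector_gep_of_gep(i32 %j, <2 x i32> %v, ptr addrspace(1) %out) {
entry:
  %SortedFragments = alloca [10 x <2 x i32>], align 8, addrspace(5)
  %0 = getelementptr [10 x <2 x i32>], ptr addrspace(5) %SortedFragments, i32 0, i32 %j
  ; Put defined data into element %j of the alloca (assumed argument %v).
  store <2 x i32> %v, ptr addrspace(5) %0, align 8
  ; Same nested GEP as the original test: byte offset 4 is the second i32 of element %j.
  %1 = getelementptr i8, ptr addrspace(5) %0, i32 4
  %2 = load i32, ptr addrspace(5) %1, align 4
  ; Make the loaded value observable (assumed argument %out).
  store i32 %2, ptr addrspace(1) %out, align 4
  ret void
}
```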
https://github.com/llvm/llvm-project/pull/141199