[llvm] 1fb415f - [AMDGPU][FIX] Proper load-store-vectorizer result with opaque pointers

Fri Apr 15 11:44:24 PDT 2022

Author: Johannes Doerfert
Date: 2022-04-15T13:42:46-05:00
New Revision: 1fb415fee98efff99188507da2eedaf780ff2bab

URL: https://github.com/llvm/llvm-project/commit/1fb415fee98efff99188507da2eedaf780ff2bab
DIFF: https://github.com/llvm/llvm-project/commit/1fb415fee98efff99188507da2eedaf780ff2bab.diff

LOG: [AMDGPU][FIX] Proper load-store-vectorizer result with opaque pointers

The original code relied on the fact that we needed a bitcast
instruction (for non constant base objects). With opaque pointers there
might not be a bitcast. Always check if reordering is required instead.
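
The selection described above can be sketched outside of LLVM as follows.
This is only an illustration: `Value`, `IsInstruction`, `pickReorderStart`
and `NewLoad` are hypothetical stand-ins (not LLVM API), and `NewLoad`
is assumed to play the role of `LI`, the newly created vector load.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Value; IsInstruction models whether
// dyn_cast<Instruction>() would succeed. Not the real LLVM class.
struct Value {
  bool IsInstruction;
};

// Sketch of the choice made in the patch: start reordering at the bitcast
// only if it is an instruction that differs from the first load's pointer
// operand; otherwise start at the new vector load itself.
const Value *pickReorderStart(const Value *Bitcast, const Value *L0PtrOperand,
                              const Value *NewLoad) {
  assert(Bitcast && NewLoad && "expected non-null inputs");
  const Value *BCInst = Bitcast->IsInstruction ? Bitcast : nullptr;
  return (BCInst && BCInst != L0PtrOperand) ? BCInst : NewLoad;
}
```

With typed pointers the first case (a separately inserted bitcast) is the
common one; with opaque pointers the "bitcast" is typically the first
load's pointer operand, so reordering starts at the vector load.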

Fixes: https://github.com/llvm/llvm-project/issues/54896

Differential Revision: https://reviews.llvm.org/D123694

Added: 
    llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/opaque_ptr.ll

Modified: 
    llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp

Removed: 
    


################################################################################
diff --git a/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp b/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
index fd3706eb17d2a..2c90e8e6f250e 100644
--- a/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
@@ -1297,10 +1297,16 @@ bool Vectorizer::vectorizeLoadChain(
     CV->replaceAllUsesWith(V);
   }
 
-  // Bitcast might not be an Instruction, if the value being loaded is a
-  // constant. In that case, no need to reorder anything.
-  if (Instruction *BitcastInst = dyn_cast<Instruction>(Bitcast))
-    reorder(BitcastInst);
+  // Since we might have opaque pointers we might end up using the pointer
+  // operand of the first load (wrt. memory loaded) for the vector load. Since
+  // this first load might not be the first in the block we potentially need to
+  // reorder the pointer operand (and its operands). If we have a bitcast though
+  // it might be before the load and should be the reorder start instruction.
+  // "Might" because for opaque pointers the "bitcast" is just the first load's
+  // pointer operand, as opposed to something we inserted at the right position
+  // ourselves.
+  Instruction *BCInst = dyn_cast<Instruction>(Bitcast);
+  reorder((BCInst && BCInst != L0->getPointerOperand()) ? BCInst : LI);
 
   eraseInstructions(Chain);
 

diff --git a/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/opaque_ptr.ll b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/opaque_ptr.ll
new file mode 100644
index 0000000000000..87d43e7d4ab3c
--- /dev/null
+++ b/llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/opaque_ptr.ll
@@ -0,0 +1,24 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -mtriple=amdgcn-amd-amdhsa -basic-aa -load-store-vectorizer -S -o - %s | FileCheck %s
+
+; Vectorize and emit valid code (Issue #54896).
+
+%S = type { i64, i64 }
+@S = external global %S
+
+define i64 @order() {
+; CHECK-LABEL: @order(
+; CHECK-NEXT:    [[IDX0:%.*]] = getelementptr inbounds [[S:%.*]], ptr @S, i32 0, i32 0
+; CHECK-NEXT:    [[TMP1:%.*]] = load <2 x i64>, ptr [[IDX0]], align 8
+; CHECK-NEXT:    [[L01:%.*]] = extractelement <2 x i64> [[TMP1]], i32 0
+; CHECK-NEXT:    [[L12:%.*]] = extractelement <2 x i64> [[TMP1]], i32 1
+; CHECK-NEXT:    [[ADD:%.*]] = add i64 [[L01]], [[L12]]
+; CHECK-NEXT:    ret i64 [[ADD]]
+;
+  %idx1 = getelementptr inbounds %S, ptr @S, i32 0, i32 1
+  %l1 = load i64, i64* %idx1, align 8
+  %idx0 = getelementptr inbounds %S, ptr @S, i32 0, i32 0
+  %l0 = load i64, i64* %idx0, align 8
+  %add = add i64 %l0, %l1
+  ret i64 %add
+}