[llvm] 3fd1cc2 - [SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 12 07:53:51 PDT 2022
Author: Florian Hahn
Date: 2022-09-12T15:53:31+01:00
New Revision: 3fd1cc2574364ce438ac34da56c848ad5903d774
URL: https://github.com/llvm/llvm-project/commit/3fd1cc2574364ce438ac34da56c848ad5903d774
DIFF: https://github.com/llvm/llvm-project/commit/3fd1cc2574364ce438ac34da56c848ad5903d774.diff
LOG: [SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.
Adding the pre-header to CSEBlocks ensures instructions are CSE'd even
after hoisting.
This was originally discovered by @atrick a while ago.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D133649
Added:
Modified:
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
llvm/test/Transforms/SLPVectorizer/X86/cse.ll
Removed:
################################################################################
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 391dcea52d44..99ef5cf42153 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -9032,6 +9032,7 @@ void BoUpSLP::optimizeGatherSequence() {
// We can hoist this instruction. Move it to the pre-header.
I->moveBefore(PreHeader->getTerminator());
+ CSEBlocks.insert(PreHeader);
}
// Make a list of all reachable blocks in our CSE queue.
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/cse.ll b/llvm/test/Transforms/SLPVectorizer/X86/cse.ll
index 77c8f8c26067..5f8a1cfbad86 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/cse.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/cse.ll
@@ -353,19 +353,17 @@ define void @cse_for_hoisted_instructions_in_preheader(i32* %dst, i32 %a, i1 %c)
; CHECK-NEXT: entry:
; CHECK-NEXT: [[TMP0:%.*]] = insertelement <2 x i32> poison, i32 [[A:%.*]], i32 0
; CHECK-NEXT: [[TMP1:%.*]] = insertelement <2 x i32> [[TMP0]], i32 [[A]], i32 1
-; CHECK-NEXT: [[TMP2:%.*]] = insertelement <2 x i32> poison, i32 [[A]], i32 0
-; CHECK-NEXT: [[TMP3:%.*]] = insertelement <2 x i32> [[TMP2]], i32 [[A]], i32 1
; CHECK-NEXT: br label [[LOOP:%.*]]
; CHECK: loop:
-; CHECK-NEXT: [[TMP4:%.*]] = or <2 x i32> <i32 22, i32 22>, [[TMP1]]
+; CHECK-NEXT: [[TMP2:%.*]] = or <2 x i32> <i32 22, i32 22>, [[TMP1]]
; CHECK-NEXT: [[GEP_0:%.*]] = getelementptr inbounds i32, i32* [[DST:%.*]], i64 0
-; CHECK-NEXT: [[TMP5:%.*]] = or <2 x i32> [[TMP4]], <i32 3, i32 3>
-; CHECK-NEXT: [[TMP6:%.*]] = bitcast i32* [[GEP_0]] to <2 x i32>*
-; CHECK-NEXT: store <2 x i32> [[TMP5]], <2 x i32>* [[TMP6]], align 4
-; CHECK-NEXT: [[TMP7:%.*]] = or <2 x i32> [[TMP3]], <i32 3, i32 3>
+; CHECK-NEXT: [[TMP3:%.*]] = or <2 x i32> [[TMP2]], <i32 3, i32 3>
+; CHECK-NEXT: [[TMP4:%.*]] = bitcast i32* [[GEP_0]] to <2 x i32>*
+; CHECK-NEXT: store <2 x i32> [[TMP3]], <2 x i32>* [[TMP4]], align 4
+; CHECK-NEXT: [[TMP5:%.*]] = or <2 x i32> [[TMP1]], <i32 3, i32 3>
; CHECK-NEXT: [[GEP_2:%.*]] = getelementptr inbounds i32, i32* [[DST]], i64 10
-; CHECK-NEXT: [[TMP8:%.*]] = bitcast i32* [[GEP_2]] to <2 x i32>*
-; CHECK-NEXT: store <2 x i32> [[TMP7]], <2 x i32>* [[TMP8]], align 4
+; CHECK-NEXT: [[TMP6:%.*]] = bitcast i32* [[GEP_2]] to <2 x i32>*
+; CHECK-NEXT: store <2 x i32> [[TMP5]], <2 x i32>* [[TMP6]], align 4
; CHECK-NEXT: br i1 [[C:%.*]], label [[LOOP]], label [[EXIT:%.*]]
; CHECK: exit:
; CHECK-NEXT: ret void
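
The effect of the one-line change can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the SLPVectorizer implementation: Inst, Block, hoistInvariants, runCSE and preheaderSizeAfterCSE are hypothetical names invented for illustration, CSE is simplified to removing identical instructions within each block listed in CSEBlocks, and the AddPreHeader flag stands in for the presence or absence of the added CSEBlocks.insert(PreHeader) call. The toy loop holds two identical two-element gather sequences, loosely mirroring the cse.ll test.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy IR: a named result, an opcode and operand names.
struct Inst {
  std::string Name, Opcode;
  std::vector<std::string> Ops;
  bool LoopInvariant;
};

struct Block {
  std::string Name;
  std::vector<Inst> Insts;
};

// Move loop-invariant instructions into the pre-header. With AddPreHeader
// set, also queue the pre-header for CSE (the step the commit adds).
void hoistInvariants(Block &PreHeader, Block &Loop,
                     std::set<Block *> &CSEBlocks, bool AddPreHeader) {
  std::vector<Inst> Remaining;
  for (const Inst &I : Loop.Insts) {
    if (!I.LoopInvariant) {
      Remaining.push_back(I);
      continue;
    }
    PreHeader.Insts.push_back(I);
    if (AddPreHeader)
      CSEBlocks.insert(&PreHeader);
  }
  Loop.Insts = Remaining;
}

// Simplified CSE: within each queued block, drop an instruction whose
// opcode and operands match an earlier one, then rewrite uses everywhere.
void runCSE(const std::set<Block *> &CSEBlocks,
            const std::vector<Block *> &AllBlocks) {
  using Key = std::pair<std::string, std::vector<std::string>>;
  std::map<std::string, std::string> Replaced;
  auto Rewrite = [&](Inst &I) {
    for (std::string &Op : I.Ops) {
      auto It = Replaced.find(Op);
      if (It != Replaced.end())
        Op = It->second;
    }
  };
  for (Block *BB : CSEBlocks) {
    std::map<Key, std::string> Seen;
    std::vector<Inst> Kept;
    for (Inst I : BB->Insts) {
      Rewrite(I);
      auto Res = Seen.insert({Key{I.Opcode, I.Ops}, I.Name});
      if (Res.second)
        Kept.push_back(I);
      else
        Replaced[I.Name] = Res.first->second;
    }
    BB->Insts = Kept;
  }
  for (Block *BB : AllBlocks)
    for (Inst &I : BB->Insts)
      Rewrite(I);
}

// Build the toy loop, hoist, run CSE, and report how many instructions
// are left in the pre-header.
std::size_t preheaderSizeAfterCSE(bool AddPreHeader) {
  Block PreHeader{"entry", {}};
  Block Loop{"loop",
             {{"t0", "insertelement", {"poison", "a", "0"}, true},
              {"t1", "insertelement", {"t0", "a", "1"}, true},
              {"t2", "insertelement", {"poison", "a", "0"}, true},
              {"t3", "insertelement", {"t2", "a", "1"}, true},
              {"t4", "or", {"t1", "22"}, false},
              {"t5", "or", {"t3", "3"}, false}}};
  std::set<Block *> CSEBlocks{&Loop}; // only the loop block is queued at first
  hoistInvariants(PreHeader, Loop, CSEBlocks, AddPreHeader);
  runCSE(CSEBlocks, {&PreHeader, &Loop});
  return PreHeader.Insts.size();
}
```

In this model, preheaderSizeAfterCSE(false) leaves all four hoisted insertelement instructions in the pre-header, because the block they were moved into is never visited by CSE. preheaderSizeAfterCSE(true) leaves two, which corresponds to the two CHECK lines removed from the entry block in the test update.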