[llvm] 3fd1cc2 - [SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.

Florian Hahn via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 12 07:53:51 PDT 2022


Author: Florian Hahn
Date: 2022-09-12T15:53:31+01:00
New Revision: 3fd1cc2574364ce438ac34da56c848ad5903d774

URL: https://github.com/llvm/llvm-project/commit/3fd1cc2574364ce438ac34da56c848ad5903d774
DIFF: https://github.com/llvm/llvm-project/commit/3fd1cc2574364ce438ac34da56c848ad5903d774.diff

LOG: [SLP] Add Preheader to CSE blocks after hoisting CSE-able instrs.

Adding the pre-header to CSEBlocks ensures instructions are CSE'd even
after hoisting.

This was original discovered by @atrick a while ago.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D133649

Added: 
    

Modified: 
    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
    llvm/test/Transforms/SLPVectorizer/X86/cse.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 391dcea52d44..99ef5cf42153 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -9032,6 +9032,7 @@ void BoUpSLP::optimizeGatherSequence() {
 
     // We can hoist this instruction. Move it to the pre-header.
     I->moveBefore(PreHeader->getTerminator());
+    CSEBlocks.insert(PreHeader);
   }
 
   // Make a list of all reachable blocks in our CSE queue.

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/cse.ll b/llvm/test/Transforms/SLPVectorizer/X86/cse.ll
index 77c8f8c26067..5f8a1cfbad86 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/cse.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/cse.ll
@@ -353,19 +353,17 @@ define void @cse_for_hoisted_instructions_in_preheader(i32* %dst, i32 %a, i1 %c)
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[TMP0:%.*]] = insertelement <2 x i32> poison, i32 [[A:%.*]], i32 0
 ; CHECK-NEXT:    [[TMP1:%.*]] = insertelement <2 x i32> [[TMP0]], i32 [[A]], i32 1
-; CHECK-NEXT:    [[TMP2:%.*]] = insertelement <2 x i32> poison, i32 [[A]], i32 0
-; CHECK-NEXT:    [[TMP3:%.*]] = insertelement <2 x i32> [[TMP2]], i32 [[A]], i32 1
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
-; CHECK-NEXT:    [[TMP4:%.*]] = or <2 x i32> <i32 22, i32 22>, [[TMP1]]
+; CHECK-NEXT:    [[TMP2:%.*]] = or <2 x i32> <i32 22, i32 22>, [[TMP1]]
 ; CHECK-NEXT:    [[GEP_0:%.*]] = getelementptr inbounds i32, i32* [[DST:%.*]], i64 0
-; CHECK-NEXT:    [[TMP5:%.*]] = or <2 x i32> [[TMP4]], <i32 3, i32 3>
-; CHECK-NEXT:    [[TMP6:%.*]] = bitcast i32* [[GEP_0]] to <2 x i32>*
-; CHECK-NEXT:    store <2 x i32> [[TMP5]], <2 x i32>* [[TMP6]], align 4
-; CHECK-NEXT:    [[TMP7:%.*]] = or <2 x i32> [[TMP3]], <i32 3, i32 3>
+; CHECK-NEXT:    [[TMP3:%.*]] = or <2 x i32> [[TMP2]], <i32 3, i32 3>
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast i32* [[GEP_0]] to <2 x i32>*
+; CHECK-NEXT:    store <2 x i32> [[TMP3]], <2 x i32>* [[TMP4]], align 4
+; CHECK-NEXT:    [[TMP5:%.*]] = or <2 x i32> [[TMP1]], <i32 3, i32 3>
 ; CHECK-NEXT:    [[GEP_2:%.*]] = getelementptr inbounds i32, i32* [[DST]], i64 10
-; CHECK-NEXT:    [[TMP8:%.*]] = bitcast i32* [[GEP_2]] to <2 x i32>*
-; CHECK-NEXT:    store <2 x i32> [[TMP7]], <2 x i32>* [[TMP8]], align 4
+; CHECK-NEXT:    [[TMP6:%.*]] = bitcast i32* [[GEP_2]] to <2 x i32>*
+; CHECK-NEXT:    store <2 x i32> [[TMP5]], <2 x i32>* [[TMP6]], align 4
 ; CHECK-NEXT:    br i1 [[C:%.*]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
 ; CHECK-NEXT:    ret void


        


More information about the llvm-commits mailing list