[PATCH] D59710: [SLP] remove lower limit for forming reduction patterns
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 1 08:36:07 PDT 2019
RKSimon added inline comments.
================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/horizontal-list.ll:43
; THRESHOLD-NEXT: [[CONV:%.*]] = sitofp i32 [[MUL]] to float
-; THRESHOLD-NEXT: [[TMP1:%.*]] = load <2 x float>, <2 x float>* bitcast ([20 x float]* @arr to <2 x float>*), align 16
-; THRESHOLD-NEXT: [[TMP2:%.*]] = load <2 x float>, <2 x float>* bitcast ([20 x float]* @arr1 to <2 x float>*), align 16
-; THRESHOLD-NEXT: [[TMP3:%.*]] = fmul fast <2 x float> [[TMP2]], [[TMP1]]
-; THRESHOLD-NEXT: [[TMP4:%.*]] = extractelement <2 x float> [[TMP3]], i32 0
-; THRESHOLD-NEXT: [[ADD:%.*]] = fadd fast float [[TMP4]], [[CONV]]
-; THRESHOLD-NEXT: [[TMP5:%.*]] = extractelement <2 x float> [[TMP3]], i32 1
-; THRESHOLD-NEXT: [[ADD_1:%.*]] = fadd fast float [[TMP5]], [[ADD]]
-; THRESHOLD-NEXT: [[TMP6:%.*]] = load <2 x float>, <2 x float>* bitcast (float* getelementptr inbounds ([20 x float], [20 x float]* @arr, i64 0, i64 2) to <2 x float>*), align 8
-; THRESHOLD-NEXT: [[TMP7:%.*]] = load <2 x float>, <2 x float>* bitcast (float* getelementptr inbounds ([20 x float], [20 x float]* @arr1, i64 0, i64 2) to <2 x float>*), align 8
-; THRESHOLD-NEXT: [[TMP8:%.*]] = fmul fast <2 x float> [[TMP7]], [[TMP6]]
-; THRESHOLD-NEXT: [[TMP9:%.*]] = extractelement <2 x float> [[TMP8]], i32 0
-; THRESHOLD-NEXT: [[ADD_2:%.*]] = fadd fast float [[TMP9]], [[ADD_1]]
-; THRESHOLD-NEXT: [[TMP10:%.*]] = extractelement <2 x float> [[TMP8]], i32 1
-; THRESHOLD-NEXT: [[ADD_3:%.*]] = fadd fast float [[TMP10]], [[ADD_2]]
+; THRESHOLD-NEXT: [[TMP1:%.*]] = load float, float* getelementptr inbounds ([20 x float], [20 x float]* @arr, i64 0, i64 0), align 16
+; THRESHOLD-NEXT: [[TMP2:%.*]] = load float, float* getelementptr inbounds ([20 x float], [20 x float]* @arr1, i64 0, i64 0), align 16
----------------
ABataev wrote:
> What about this one? This also looks like a regression
Sanjay and I have checked with godbolt/llvm-mca and this looks like a definite win (checked on bdver2, haswell and btver2). Top is scalar, middle is trunk and bottom is patched IR:
bdver2: https://godbolt.org/z/jwCPgI
haswell: https://godbolt.org/z/R-h8o_
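For readers without the test file open, here is a rough C sketch of the kind of scalar source that produces a chain like the one in the quoted CHECK lines: an int-to-float conversion seeding four multiply-adds over `arr` and `arr1`. Only the array names and the shape of the chain come from the diff above; the function name `reduce4`, its parameter, and the array contents are assumptions made for illustration, and the `fast` flags in the IR would need something like `-ffast-math` to reproduce.

```c
/* Globals named after @arr and @arr1 in the test; the initial values are
   made up for illustration. */
float arr[20]  = {1.0f, 2.0f, 3.0f, 4.0f};
float arr1[20] = {5.0f, 6.0f, 7.0f, 8.0f};

/* Hypothetical function: a 4-element multiply-add chain seeded with an
   int-to-float conversion, mirroring CONV, ADD, ADD_1, ADD_2, ADD_3. */
float reduce4(int mul) {
    float conv  = (float)mul;               /* sitofp i32 -> float */
    float add   = arr[0] * arr1[0] + conv;  /* ADD   */
    float add_1 = arr[1] * arr1[1] + add;   /* ADD_1 */
    float add_2 = arr[2] * arr1[2] + add_1; /* ADD_2 */
    float add_3 = arr[3] * arr1[3] + add_2; /* ADD_3 */
    return add_3;
}
```

Trunk vectorizes this as two <2 x float> multiplies with an extractelement per lane, while the patched output starts from scalar loads of element 0, so the comparison is about how the chain is grouped.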
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D59710/new/
https://reviews.llvm.org/D59710