[PATCH] D57779: [SLP] Add support for throttling.

Dinar Temirbulatov via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue May 4 19:00:59 PDT 2021


dtemirbulatov added inline comments.


================
Comment at: llvm/test/Transforms/SLPVectorizer/X86/powof2div.ll:85-90
+; AVX1-NEXT:    [[TMP9:%.*]] = insertelement <4 x i32> poison, i32 [[DIV]], i32 0
+; AVX1-NEXT:    [[TMP10:%.*]] = insertelement <4 x i32> [[TMP9]], i32 [[DIV6]], i32 1
+; AVX1-NEXT:    [[TMP11:%.*]] = insertelement <4 x i32> [[TMP10]], i32 [[DIV11]], i32 2
+; AVX1-NEXT:    [[TMP12:%.*]] = insertelement <4 x i32> [[TMP11]], i32 [[DIV16]], i32 3
+; AVX1-NEXT:    [[TMP13:%.*]] = bitcast i32* [[A]] to <4 x i32>*
+; AVX1-NEXT:    store <4 x i32> [[TMP12]], <4 x i32>* [[TMP13]], align 4
----------------
ABataev wrote:
> Still looks like it does not respect mintreesize
hmm, this is not the case here, the tree height is 5 here, divide node cost is 20 and after deleting this not node, extracting from "add" node costs 4 and inserting after scalar divide cost 4 and the final tree cost is -4. llvm-mca for -mattr=+avx shows 1305 cycles before and 1609 cycles after. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57779/new/

https://reviews.llvm.org/D57779



More information about the llvm-commits mailing list