[PATCH] D101460: [SLP]Try to vectorize tiny trees with shuffled gathers of extractelements.
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 28 09:47:34 PDT 2021
RKSimon added inline comments.
================
Comment at: llvm/test/Transforms/SLPVectorizer/AArch64/accelerate-vector-functions-inseltpoison.ll:43
+; NOACCELERATE-NEXT: [[TMP7:%.*]] = tail call fast float @llvm.sin.f32(float [[VECEXT_3]])
+; NOACCELERATE-NEXT: [[VECINS_3:%.*]] = insertelement <4 x float> [[VECINS_2]], float [[TMP7]], i32 3
; NOACCELERATE-NEXT: ret <4 x float> [[VECINS_3]]
----------------
Why do so many of these libm vectorizations end up as one v2f32 call plus two scalar f32 calls? I'd expect either two v2f32 calls or a single v4f32 call.
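
As a hand-written illustration (not taken from the patch or its test output), the two expected shapes could look roughly like this in IR. The function names `@sin_v4f32` and `@sin_2xv2f32` are hypothetical, and the sketch assumes the four lanes arrive as a single `<4 x float>` argument, which may not match how the actual test loads them:

```llvm
; Shape 1 (assumed): a single v4f32 intrinsic call covering all four lanes.
define <4 x float> @sin_v4f32(<4 x float> %a) {
  %r = tail call fast <4 x float> @llvm.sin.v4f32(<4 x float> %a)
  ret <4 x float> %r
}

; Shape 2 (assumed): two v2f32 calls, split and recombined with shuffles.
define <4 x float> @sin_2xv2f32(<4 x float> %a) {
  ; Extract the low and high halves of the input vector.
  %lo = shufflevector <4 x float> %a, <4 x float> poison, <2 x i32> <i32 0, i32 1>
  %hi = shufflevector <4 x float> %a, <4 x float> poison, <2 x i32> <i32 2, i32 3>
  %sin.lo = tail call fast <2 x float> @llvm.sin.v2f32(<2 x float> %lo)
  %sin.hi = tail call fast <2 x float> @llvm.sin.v2f32(<2 x float> %hi)
  ; Concatenate the two halves back into a <4 x float>.
  %r = shufflevector <2 x float> %sin.lo, <2 x float> %sin.hi, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
  ret <4 x float> %r
}

declare <4 x float> @llvm.sin.v4f32(<4 x float>)
declare <2 x float> @llvm.sin.v2f32(<2 x float>)
```

By contrast, the quoted CHECK lines end with scalar `@llvm.sin.f32` calls whose results are inserted back one lane at a time via `insertelement`.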
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D101460/new/
https://reviews.llvm.org/D101460