[PATCH] D101460: [SLP]Try to vectorize tiny trees with shuffled gathers of extractelements.

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 28 09:47:34 PDT 2021


RKSimon added inline comments.


================
Comment at: llvm/test/Transforms/SLPVectorizer/AArch64/accelerate-vector-functions-inseltpoison.ll:43
+; NOACCELERATE-NEXT:    [[TMP7:%.*]] = tail call fast float @llvm.sin.f32(float [[VECEXT_3]])
+; NOACCELERATE-NEXT:    [[VECINS_3:%.*]] = insertelement <4 x float> [[VECINS_2]], float [[TMP7]], i32 3
 ; NOACCELERATE-NEXT:    ret <4 x float> [[VECINS_3]]
----------------
why do many of these libm vectorizations result in a v2f32 and 2 * f32 scalar calls? I'd expect either 2 x v2f32 or a v4f32.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101460/new/

https://reviews.llvm.org/D101460



More information about the llvm-commits mailing list