[PATCH] D146839: [TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions
Sander de Smalen via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 28 14:13:35 PDT 2023
sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.
LGTM, thanks for addressing all comments.
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll:175
+ ; NEON: [[TMP5:%.*]] = call <2 x double> @_ZGVnN2vv_atan2(<2 x double> [[TMP4:%.*]], <2 x double> [[TMP4:%.*]])
+ ; SVE: [[TMP5:%.*]] = call <vscale x 2 x double> @_ZGVsMxvv_atan2(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
; CHECK: ret void
----------------
With tail-folding enabled (`-sve-tail-folding=all`), the loop no longer vectorizes because the call to atan2 gets an Invalid cost. This is unexpected: the SVE vector function is predicated (it takes a mask operand), so it could handle the tail-folded loop.
Nothing needs fixing in this patch; I'm just pointing out that follow-on work is needed to benefit from the masked vector-function variants.
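For readers unfamiliar with the mangled names in the CHECK lines above, the following is a minimal sketch of how they can be read, assuming the field layout of the AArch64 vector function ABI (`_ZGV`, ISA, mask, vector length, parameter kinds, scalar name). The `decode` helper and its dictionary keys are hypothetical names used for illustration only; they are not part of LLVM or of this patch, and the sketch handles only the `n`/`s` ISA letters seen here.

```python
import re

# Assumed field layout: _ZGV <isa> <mask> <vlen> <params> _ <scalar name>
ISA = {"n": "Advanced SIMD (NEON)", "s": "SVE"}

_PATTERN = re.compile(
    r"^_ZGV(?P<isa>[ns])(?P<mask>[NM])(?P<vlen>x|\d+)(?P<params>[^_]+)_(?P<name>.+)$"
)

def decode(mangled):
    """Split a vector-function name into its fields (illustrative helper)."""
    m = _PATTERN.match(mangled)
    if m is None:
        raise ValueError(f"not a recognised vector-function name: {mangled}")
    vlen = m["vlen"]
    return {
        "isa": ISA[m["isa"]],
        "masked": m["mask"] == "M",  # 'M' = takes a mask, 'N' = unmasked
        "lanes": "scalable" if vlen == "x" else int(vlen),
        "params": m["params"],       # one letter per parameter, 'v' = vector
        "scalar": m["name"],
    }

print(decode("_ZGVnN2vv_atan2"))
print(decode("_ZGVsMxvv_atan2"))
```

Under these assumptions, `_ZGVsMxvv_atan2` decodes as a masked, scalable-length SVE variant of `atan2` with two vector parameters, which matches the extra `<vscale x 2 x i1>` operand in the SVE CHECK line, while `_ZGVnN2vv_atan2` is the unmasked two-lane NEON variant.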
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D146839/new/
https://reviews.llvm.org/D146839