[PATCH] D146839: [TLI][AArch64] Extend SLEEF vectorized functions mapping with VLA functions

Tue Mar 28 14:13:35 PDT 2023

sdesmalen accepted this revision.
sdesmalen added a comment.
This revision is now accepted and ready to land.

LGTM, thanks for addressing all comments.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sleef-calls-aarch64.ll:175
+  ; NEON:     [[TMP5:%.*]] = call <2 x double> @_ZGVnN2vv_atan2(<2 x double> [[TMP4:%.*]], <2 x double> [[TMP4:%.*]])
+  ; SVE:      [[TMP5:%.*]] = call <vscale x 2 x double> @_ZGVsMxvv_atan2(<vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x double> [[TMP4:%.*]], <vscale x 2 x i1> {{.*}})
   ; CHECK:    ret void
----------------
With tail-folding enabled (`-sve-tail-folding=all`), the loop no longer vectorizes because the call to atan2 gets an Invalid cost. This is unexpected: the SVE vector function is predicated (it takes a mask operand), so it should be able to handle the tail-folded loop.

Nothing needs fixing in this patch; I'm just pointing out that follow-on work is needed to benefit from the masked vector-function variants.
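
For readers less familiar with tail-folding, here is a rough C sketch of why a masked variant should be enough for a tail-folded loop. It is only an illustration, not what the vectorizer emits: `VF`, `masked_call`, `tail_folded_apply` and `demo_op` are hypothetical names, the lane count is fixed at 2 instead of being scalable, and `demo_op` is a libm-free stand-in for atan2.

```c
#include <stdbool.h>
#include <stddef.h>

#define VF 2 /* fixed lane count for illustration; real SVE vectors are scalable */

typedef double (*binary_fn)(double, double);

/* Hypothetical emulation of a masked vector variant such as _ZGVsMxvv_atan2:
   only lanes whose mask bit is set are computed and written. */
void masked_call(binary_fn fn, const double *a, const double *b,
                 const bool *mask, double *out) {
    for (size_t lane = 0; lane < VF; ++lane)
        if (mask[lane])
            out[lane] = fn(a[lane], b[lane]);
}

/* Tail-folded loop: there is no scalar remainder loop. The final vector
   iteration runs with a partial mask instead. Returns the number of
   vector iterations executed. */
size_t tail_folded_apply(binary_fn fn, const double *a, const double *b,
                         double *out, size_t n) {
    size_t vector_iterations = 0;
    for (size_t i = 0; i < n; i += VF) {
        bool mask[VF];
        for (size_t lane = 0; lane < VF; ++lane)
            mask[lane] = (i + lane) < n; /* inactive lanes lie past the end */
        masked_call(fn, a + i, b + i, mask, out + i);
        ++vector_iterations;
    }
    return vector_iterations;
}

/* Stand-in scalar function (an assumption for this sketch, not atan2). */
double demo_op(double y, double x) { return y - 2.0 * x; }
```

With `n = 3` the loop runs two vector iterations and the second one has only lane 0 active, so nothing past the end of the arrays is read or written. Passing `atan2` from `<math.h>` instead of `demo_op` gives the shape of the loop in the test above.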

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146839/new/

https://reviews.llvm.org/D146839