[PATCH] D111657: [SVE][CodeGen] Enable reciprocal estimates for scalable fdiv/fsqrt
Paul Walker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 21 03:25:19 PDT 2021
paulwalker-arm added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-reciprocal.ll:24
+; CHECK-NEXT: frecps z1.s, z1.s, z2.s
+; CHECK-NEXT: fmul z1.s, p0/m, z1.s, z2.s
+; CHECK-NEXT: fmul z0.s, p0/m, z0.s, z1.s
----------------
david-arm wrote:
> It's interesting that the fmul here is the predicated form, whereas for fdiv_recip_4f32 it's the unpredicated form. This has nothing to do with your patch though, but perhaps worth investigating in the future?
@david-arm The predicate is generated for unpacked types so that inactive lanes can never trigger floating-point exceptions, so this is expected behaviour.
@kmclaughlin This does raise an interesting point though: is it safe to use the reciprocal instructions for unpacked types? The answer depends on whether these instructions can generate exceptions.
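To make the concern about inactive lanes concrete, here is a minimal Python model. It is not LLVM or SVE code: the four-lane layout, the software-tracked exception flags, and the names `fmul_lane` and `fmul` are all assumptions made purely for illustration. It only sketches why a predicated multiply on an unpacked type avoids spurious exceptions from lanes that hold arbitrary data.

```python
import math

def fmul_lane(a, b, flags):
    """Multiply one lane and record modelled IEEE-style exception flags.

    The flags are tracked in software here; this does not read any
    hardware status register.
    """
    r = a * b
    if math.isnan(r) and not (math.isnan(a) or math.isnan(b)):
        flags.add("invalid")    # e.g. inf * 0
    elif math.isinf(r) and not (math.isinf(a) or math.isinf(b)):
        flags.add("overflow")   # finite * finite exceeded the range
    return r

def fmul(pred, xs, ys, flags):
    """Merging predicated multiply: inactive lanes keep xs[i], uncomputed."""
    return [fmul_lane(x, y, flags) if p else x
            for p, x, y in zip(pred, xs, ys)]

# Hypothetical unpacked vector: only even lanes carry real elements,
# odd lanes hold whatever happened to be in the register.
pred = [True, False, True, False]
xs = [2.0, math.inf, 4.0, 1e308]
ys = [0.5, 0.0, 0.25, 1e308]

unpredicated_flags = set()
fmul([True] * 4, xs, ys, unpredicated_flags)   # touches the garbage lanes

predicated_flags = set()
result = fmul(pred, xs, ys, predicated_flags)  # skips the garbage lanes
```

In this model the all-true multiply records exceptions caused solely by the unused lanes, while the predicated one records none. Whether the reciprocal estimate and step instructions behave like the unpredicated case is exactly the open question above.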
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D111657/new/
https://reviews.llvm.org/D111657