[PATCH] D111657: [SVE][CodeGen] Enable reciprocal estimates for scalable fdiv/fsqrt

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 21 03:25:19 PDT 2021


paulwalker-arm added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-reciprocal.ll:24
+; CHECK-NEXT:    frecps z1.s, z1.s, z2.s
+; CHECK-NEXT:    fmul z1.s, p0/m, z1.s, z2.s
+; CHECK-NEXT:    fmul z0.s, p0/m, z0.s, z1.s
----------------
david-arm wrote:
> It's interesting that the fmul here is the predicated form, whereas for fdiv_recip_4f32 it's the unpredicated form. This has nothing to do with your patch though, but perhaps worth investigating in the future?
@david-arm The predicate is generated for unpacked types so that the inactive lanes can never trigger floating-point exceptions, so this is expected behaviour.
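
To make the concern concrete, here is a toy Python model, not LLVM or hardware behaviour. It assumes an unpacked 2 x f32 value occupies the even 32-bit lanes, with the odd lanes holding arbitrary bits. The `fmul` helper, the lane contents and the `ptrue_d` mask are all made up for illustration, and Python doubles stand in for f32. It shows an unpredicated multiply raising Invalid Operation from a lane that holds no real data, while the predicated form skips that lane.

```python
import math

def fmul(a, b, active=None):
    """Lane-wise multiply. Returns (result, invalid_raised).

    Hypothetical helper: inactive lanes keep the value from `a`
    (merging predication) and cannot raise anything.
    """
    if active is None:
        active = [True] * len(a)  # unpredicated: every lane is computed
    out = list(a)
    invalid = False
    for i, on in enumerate(active):
        if not on:
            continue
        out[i] = a[i] * b[i]
        # Invalid Operation: a NaN result from non-NaN inputs (0 * inf).
        if math.isnan(out[i]) and not (math.isnan(a[i]) or math.isnan(b[i])):
            invalid = True
    return out, invalid

# Assumed layout: real elements in lanes 0 and 2, junk in lanes 1 and 3.
a = [1.5, math.inf, 2.0, math.inf]
b = [2.0, 0.0, 4.0, 0.0]
ptrue_d = [True, False, True, False]  # models a .d predicate seen at .s granularity

_, raised_unpredicated = fmul(a, b)
result, raised_predicated = fmul(a, b, ptrue_d)
```

In this model only the unpredicated call sets the flag. An unpredicated estimate or step instruction would touch the junk lanes in the same way.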

@kmclaughlin This does raise an interesting point though. Is it safe to use the reciprocal instructions for unpacked types? The answer depends on whether these instructions can generate exceptions.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111657/new/

https://reviews.llvm.org/D111657


