[PATCH] D111657: [SVE][CodeGen] Enable reciprocal estimates for scalable fdiv/fsqrt

David Sherwood via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 13 01:51:21 PDT 2021


david-arm accepted this revision.
david-arm added a comment.
This revision is now accepted and ready to land.

LGTM! I think the patch looks good to go as is, but if you do manage to work out why we're adding the compares and selects for `@fsqrt_4f32` and remove them that would be awesome. :)



================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-reciprocal.ll:24
+; CHECK-NEXT:    frecps z1.s, z1.s, z2.s
+; CHECK-NEXT:    fmul z1.s, p0/m, z1.s, z2.s
+; CHECK-NEXT:    fmul z0.s, p0/m, z0.s, z1.s
----------------
It's interesting that the fmul here is the predicated form, whereas for fdiv_recip_4f32 it's the unpredicated form. This has nothing to do with your patch though, but perhaps worth investigating in the future?


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-reciprocal.ll:55
+
+define <vscale x 2 x double> @fdiv_f64(<vscale x 2 x double> %a, <vscale x 2 x double> %b) {
+; CHECK-LABEL: fdiv_f64:
----------------
nit: Just for clarity is it worth renaming these to `@fdiv_2f64` and `@fdiv_recip_2f64` to be consistent with the f32 versions?


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-reciprocal.ll:136
+; CHECK-NEXT:    fmul z1.s, z0.s, z1.s
+; CHECK-NEXT:    fcmeq p0.s, p0/z, z0.s, #0.0
+; CHECK-NEXT:    sel z0.s, p0, z0.s, z1.s
----------------
Again, I don't think this is caused by your patch, but it's probably worth investigating why we're selecting between the original input and the estimate based on a zero input. It feels inconsistent with `@fsqrt_4f32` where we don't seem to worry about the input.


================
Comment at: llvm/test/CodeGen/AArch64/sve-fp-reciprocal.ll:143
+
+define <vscale x 2 x double> @fsqrt_f64(<vscale x 2 x double> %a, <vscale x 2 x double> %b) {
+; CHECK-LABEL: fsqrt_f64:
----------------
nit: Again, maybe for consistency it's better to use the name `@fsqrt_2f64` here and below?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111657/new/

https://reviews.llvm.org/D111657



More information about the llvm-commits mailing list