[PATCH] D111657: [SVE][CodeGen] Enable reciprocal estimates for scalable fdiv/fsqrt

Thu Oct 14 09:20:03 PDT 2021

paulwalker-arm added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:8150
+       (ST->hasSVE() &&
+        (VT == MVT::nxv2f32 || VT == MVT::nxv4f32 || VT == MVT::nxv2f64))) {
     if (ExtraSteps == TargetLoweringBase::ReciprocalEstimate::Unspecified)
----------------
Out of interest, is there a reason we ignore f16 vectors here?

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1899-1923
+  def : Pat<(nxv2f32 (AArch64frecps (nxv2f32 ZPR:$Zs1), (nxv2f32 ZPR:$Zs2))),
+            (FRECPS_ZZZ_S ZPR:$Zs1, ZPR:$Zs2)>;
+  def : Pat<(nxv4f32 (AArch64frecps (nxv4f32 ZPR:$Zs1), (nxv4f32 ZPR:$Zs2))),
+            (FRECPS_ZZZ_S ZPR:$Zs1, ZPR:$Zs2)>;
+  def : Pat<(nxv2f64 (AArch64frecps (nxv2f64 ZPR:$Zs1), (nxv2f64 ZPR:$Zs2))),
+            (FRECPS_ZZZ_D ZPR:$Zs1, ZPR:$Zs2)>;
+  def : Pat<(nxv2f32 (AArch64frsqrts (nxv2f32 ZPR:$Zs1), (nxv2f32 ZPR:$Zs2))),
----------------
Are these required?  The patterns should already exist within the instruction definition classes.  All that's needed is C++ code to lower the intrinsics to these `AArch64ISD` nodes, which is something we've done for other operations to avoid duplicate patterns.
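
For context on what the `AArch64frecps`/`AArch64frsqrts` nodes represent: FRECPS computes `2.0 - a * b` and FRSQRTS computes `(3.0 - a * b) / 2.0`, which are the Newton-Raphson step factors applied to an initial estimate. Below is a minimal scalar sketch of that refinement, not code from the patch. The function names are hypothetical, and the starting estimate is supplied by the caller rather than coming from FRECPE/FRSQRTE.

```cpp
#include <cmath>

// Hypothetical scalar model of FRECPS: returns 2.0 - a * b.
static float frecps_model(float a, float b) { return 2.0f - a * b; }

// Hypothetical scalar model of FRSQRTS: returns (3.0 - a * b) / 2.0.
static float frsqrts_model(float a, float b) { return (3.0f - a * b) / 2.0f; }

// Refine an estimate x of 1/d using x' = x * (2 - d * x).
// 'steps' plays the role of the extra refinement steps.
static float refine_recip(float d, float x, int steps) {
  for (int i = 0; i < steps; ++i)
    x *= frecps_model(d, x);
  return x;
}

// Refine an estimate x of 1/sqrt(d) using x' = x * (3 - (d * x) * x) / 2.
static float refine_rsqrt(float d, float x, int steps) {
  for (int i = 0; i < steps; ++i)
    x *= frsqrts_model(d * x, x);
  return x;
}
```

Each step roughly doubles the number of correct bits, so the number of steps needed depends on the element type's precision and on the accuracy of the initial estimate.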

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111657/new/

https://reviews.llvm.org/D111657