[PATCH] D118617: [AArch64][SVE] Remove false register dependency for unary FP convert operations

Peter Waller via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 1 04:20:24 PST 2022


peterwaller-arm added a comment.

Thanks for the patch, I think this is looking reasonable to my knowledge. I've spotted the cause of the issues you've mentioned.



================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1677
   def : Pat<(nxv2f32 (AArch64fcvte_mt (nxv2i1 PPR:$Pg), (nxv2f16 ZPR:$Zs), (nxv2f32 ZPR:$Zd))),
             (FCVT_ZPmZ_HtoS ZPR:$Zd, PPR:$Pg, ZPR:$Zs)>;
 
----------------
I believe this pattern should have the same tweak as discussed below -- `SVEAllActive` and `FCVT_ZPmZ_HtoS` => `FCVT_ZPmZ_HtoS_UNDEF`. 


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:1682
+  def : Pat<(nxv2f16 (AArch64fcvtr_mt (nxv2i1 (SVEAllActive):$Pg), (nxv2f32 ZPR:$Zs), (i64 timm0_1), (nxv2f16 ZPR:$Zd))),
             (FCVT_ZPmZ_StoH ZPR:$Zd, PPR:$Pg, ZPR:$Zs)>;
 
----------------
The change you're making is adding the condition that `$Pg` must be `(SVEAllActive)`.

This has the effect of:
* making these patterns more specific than the patterns you are introducing in `sve_fp_2op_p_zd{,r}`, so 👍, they still match when they are needed for example in `llvm/test/CodeGen/AArch64/sve-fcvt.ll`, to avoid introducing sign extension.
* if I'm not mistaken, `(SVEAllActive):$Pg` is the case where the `_UNDEF` pseudos are applicable. UNDEF in this case meaning that since the predicate is all active, it's fine if the inactive lanes are undef (since they're unused), and pseudo expansion can do its thing (and materialize a movprfx).
  * Therefore, for these patterns you have modified, you want to introduce UNDEF `FCVT_ZPmZ_StoH` => `FCVT_ZPmZ_StoH_UNDEF`. This fixes the `scvtf_stod_movprfx` case.
* One thing to beware is that you're introducing the constraint that the `$Pg` is all true. This implies there are some cases where this may have matched before but won't anymore. I believe this to be OK since floating point conversion is necessarily unpredicated in the LangRef instructions, which will cause the predicate to be all active. But we should check if it's possible to specify a predicate through the aarch64.sve.* intrinsics (or ACLE).
  * Please can you and other reviewers consider for a moment if this matters. My initial pass on this says 'no' but more time or knowledge may indicate 'yes'.

To fix the unsigned cases below (line 1705), they also need the `SVEAllActive` constraint on `$Pg` and `_UNDEF` adding.


 




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118617/new/

https://reviews.llvm.org/D118617



More information about the llvm-commits mailing list