[PATCH] D117689: [AArch64][SVE] Fold vselect into predicated fmul

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 25 07:54:41 PST 2022


paulwalker-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:501-518
+  multiclass vselect_to_pred_inst<SDPatternOperator Op, string Inst, ValueType Ty, ValueType PredTy> {
+    def : Pat<(vselect (PredTy PPR:$Pg), (Ty ZPR:$Zn), (Ty (Op (PredTy (AArch64ptrue 31)), (Ty ZPR:$Zn), (Ty ZPR:$Zm)))),
+              (!cast<Instruction>(Inst) PPR:$Pg, ZPR:$Zn, ZPR:$Zm)>;
+}
+
+  defm : vselect_to_pred_inst<AArch64fmul_p, "FMUL_ZPmZ_H", nxv8f16, nxv8i1>;
+  defm : vselect_to_pred_inst<AArch64fmul_p, "FMUL_ZPmZ_S", nxv4f32, nxv4i1>;
----------------
MattDevereau wrote:
> paulwalker-arm wrote:
> > MattDevereau wrote:
> > > paulwalker-arm wrote:
> > > > paulwalker-arm wrote:
> > > > > peterwaller-arm wrote:
> > > > > > You might be able to condense it to something like this suggestion, by introducing another multiclass in InstrFormats `SVE_VSelect_3_Op_Pat`, analogous to `SVE_3_Op_Pat_SelZero`. What you have looks reasonable to me, you might want a second opinion from someone else as to whether my proposal here makes sense.
> > > > > If possible I prefer the patterns to be added directly to the class that defines the instruction (i.e. sve_fp_2op_p_zds in this case)?
> > > > Just to extend on this.  Perhaps you can create a PatFrags that contains both the intrinsic and vselect DAG patterns? That way you won't need to change the sve_fp_2op_p_zds class, you'll just pass in a different op in place of int_aarch64_sve_fadd for example.
> > > I don't understand what you mean by the intrinsic. I've not used the `int_aarch64_sve_fadd` naming so far so I'm not sure which part that refers to, and the only DAG nodes I saw when working on this was AArch64fadd_p. Are you suggesting I use the PatFrags to add the match to `defm FADD_ZPmZ` on line 454 when it uses `sve_fp_2op_p_zds`?
> > The `sve_fp_2op_p_zds` multiclass is used to define the FADD_ZPmZ instruction.  It takes an operator that represents the instruction at the DAG level.  Currently this is set to `int_aarch64_sve_fadd`.  With this patch you're effectively saying there's another way to represent this instruction within the DAG (i.e. `(vselect (p) (a) (fadd (a) (b)))`) and so I'm thinking you can create a `PatFrags` that contains bother operators which is them passed into `sve_fp_2op_p_zds` in place of the existing `int_aarch64_sve_fadd`.  This way you shouldn't need to create any extra isel patterns.
> Me and Peter had a dive into this, and we assumed you were getting at something like that.
> We came up with this PatFrags:
> ```
> def myfirstpatfrag : PatFrags<(ops node:$Pg, node:$Op1, node:$Op2), [
>   (int_aarch64_sve_fmul node:$Pg, node:$Op1, node:$Op2),
>   (vselect node:$Pg, node:$Op1, 
>     (AArch64fmul_p (nxv2i1 (AArch64ptrue 31)), node:$Op1, node:$Op2)
>   ),
> ]>;
> ```
> but we haven't managed to find a way to generalise the nxv2i1 part so far.
You should be able to drop the `(nxv2i1 )` part as the predicate type should be inferable.  I just gave this a quick try and it works but you also need to tighten up the definition of the predicated binops (i.e. SDT_AArch64Arith, which is used to define AArch64fmul_p)

I extended it to include `SDTCVecEltisVT<1,i1>, SDTCisSameNumEltsAs<0,1>,....`, which looks to do the trick.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117689/new/

https://reviews.llvm.org/D117689



More information about the llvm-commits mailing list