[PATCH] D72799: [SVE] Add SVE2 patterns for unpredicated multiply instructions

Thu Jan 16 09:44:33 PST 2020

sdesmalen requested changes to this revision.
sdesmalen added a comment.
This revision now requires changes to proceed.

Thanks for efforts on this patch @dancgr!

As you can see, I have requested some changes to the patch to avoid diverging too much while we try to upstream the ACLE.
We also need to make sure we can represent the C/C++ intrinsics with the LLVM IR intrinsics, which is a reason to have predicated smulh/umulh.

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1186

-def int_aarch64_sve_mul        : AdvSIMD_Pred2VectorArg_Intrinsic;
-def int_aarch64_sve_smulh      : AdvSIMD_Pred2VectorArg_Intrinsic;
-def int_aarch64_sve_umulh      : AdvSIMD_Pred2VectorArg_Intrinsic;
+def int_aarch64_sve_smulh      : AdvSIMD_2VectorArg_Intrinsic;
+def int_aarch64_sve_umulh      : AdvSIMD_2VectorArg_Intrinsic;
----------------
`smulh` needs to be predicated `AdvSIMD_Pred2VectorArg_Intrinsic`, so that we can implement the predicated ACLE intrinsic with this LLVM IR intrinsic.
You can choose to optimize this predicated form (with ptrue predicate mask) into an unpredicated form using a ISel pattern.

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1187
+def int_aarch64_sve_smulh      : AdvSIMD_2VectorArg_Intrinsic;
+def int_aarch64_sve_umulh      : AdvSIMD_2VectorArg_Intrinsic;
+def int_aarch64_sve_pmul       : AdvSIMD_2VectorArg_Intrinsic;
----------------
Same for `umulh`.

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1192
+
+def int_aarch64_sve_mul_z      : AdvSIMD_Pred2VectorArg_Intrinsic;
+def int_aarch64_sve_smulh_z    : AdvSIMD_Pred2VectorArg_Intrinsic;
----------------
dancgr wrote:
> efriedma wrote:
> > This name seems strange.  Why "z"?
> I just followed the name similar to the one for logical predicated instructions. I'm not sure if that is the correct way to name predicated intrinsics. I marked the place with a comment.
Adding `_z` here implies that the false lanes will be zeroed, which isn't the case, as this is used with merging predication.
These intrinsics are to implement the ACLE, which does the zeroing explicitly with a select. These intrinsics are merging, unless explicitly specified.

================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:1555

 def int_aarch64_sve_and_z   : AdvSIMD_Pred2VectorArg_Intrinsic;
 def int_aarch64_sve_bic_z   : AdvSIMD_Pred2VectorArg_Intrinsic;
----------------
dancgr wrote:
> efriedma wrote:
> > dancgr wrote:
> > > @efriedma Here!
> > I think here the "_z" is meant to indicate that the predicate operand is "<Pg>/Z".
> Ok, I will update it to _m to follow the same pattern.
The reason that these intrinsics have _z in the name is because they implement an operation with zeroing predication (which maps directly to predicate AND that zeroes the false lanes).

Please don't change these suffixes, as this will give us trouble while we're in the process of implementing the ACLE upstream. We would need to update our Clang implementation to target intrinsics with slightly different names.

================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:2636
+
+multiclass sve2_int_mul2<bits<3> opc, string asm, SDPatternOperator op> {
+  def _B : sve2_int_mul<0b00, opc, asm, ZPR8>;
----------------
nit: `sve2_int_mul2` -> `sve2_int_mul_single` ?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72799/new/

https://reviews.llvm.org/D72799