[PATCH] D69956: [AArch64][SVE] Integer reduction instructions pattern/intrinsics.

Fri Nov 29 07:02:08 PST 2019

sdesmalen added a comment.

Thanks for making these changes @dancgr! The patch is starting to look good, just a few more suggestions.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10517
+
+  if (VT == MVT::i8 || VT == MVT::i16 || VT == MVT::i32 || VT == MVT::i64) {
+    auto bitSize = VT.getSizeInBits();
----------------
Given the function name is `LowerSVEIntReduction`, is it better to assert that `DataVT.getVectorElementType().isScalarInteger()` ?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10520
+
+    EVT OutputVT = EVT::getVectorVT(Ctx, VT, 128 / bitSize);
+
----------------
This code only works on legal vectors because of the `128/bitSize`, so you'll need to add a
```if (!TLI.isTypeLegal(DataVT))
   return SDValue```
for when e.g. `<vscale x 2 x i32>` is passed in as a type.

nit: Rather than using `128` directly, can we create something like AArch64::NeonBitsPerVector?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10525
+    SDValue Zero = DAG.getConstant(0, dl, MVT::i64);
+
+    SDValue Result = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl, VT, Reduce, Zero);
----------------
nit: these empty spaces between the lines don't improve readability.

================
Comment at: llvm/lib/Target/AArch64/AArch64InstrInfo.td:2125
 defm LDRB : LoadUI<0b00, 1, 0b01, FPR8Op, uimm12s1, "ldr",
-                   [(set FPR8Op:$Rt,
+                   [(set (untyped FPR8Op:$Rt),
                          (load (am_indexed8 GPR64sp:$Rn, uimm12s1:$offset)))]>;
----------------
Are these changes still necessary?

================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:5825
+
+  def : SVE_2_Op_Pat<v1i8, op, nxv16i1, nxv16i8, !cast<Instruction>(NAME # _B)>;
+  def : SVE_2_Op_Pat<v1i16, op, nxv8i1, nxv8i16, !cast<Instruction>(NAME # _H)>;
----------------
dancgr wrote:
> huntergr wrote:
> > So this is where we implemented the INSERT_SUBREG pattern, like this:
> > 
> > ```
> >   def : Pat<(v16i8 (op (nxv16i1 PPR3bAny:$Pg), (nxv16i8 ZPR8:$Zn))),
> >             (INSERT_SUBREG (v16i8 (IMPLICIT_DEF)), (!cast<Instruction>(NAME#_B) PPR3bAny:$Pg, ZPR8:$Zn), bsub)>;
> > ```
> > 
> > Similarly for the other types and reduce multiclasses.
> > 
> > While we prefer to use `SVE_2_Op_Pat` and similar helpers where possible, sometimes we have to use more involved patterns in this file. Doing so avoids the need to add new legal types just to support the reductions.
> I will update the solution for this. I will be putting that in a re-usable pattern and simplify the lowering.
Thanks, that looks quite neat!

================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:296

+class SVE_2_Op_Pat_Reduce<ValueType vtd, SDPatternOperator op, ValueType vt1,
+                   ValueType vt2, Instruction inst, SubRegIndex sub>
----------------
nit: Given that the other reduction uses the normal pattern, can we rename this to `SVE_2_Op_Pat_Reduce_To_Neon` or something?

================
Comment at: llvm/test/CodeGen/AArch64/sve-int-reduce-pred.ll:33
+}
+
+define i64 @uaddv_i8(<vscale x 16 x i1> %pg, <vscale x 16 x i8> %a) {
----------------
For the ACLE `saddv_i64` is also needed, can you add this case as well? (it will just map directly to a `uaddv` instruction directly, so that should be a simple change where you call `LowerSVEIntReduction`

================
Comment at: llvm/test/CodeGen/AArch64/sve-int-reduce-pred.ll:121
+  %out = call i8 @llvm.aarch64.sve.umaxv.nxv16i8(<vscale x 16 x i1> %pg,
+                                                       <vscale x 16 x i8> %a)
+  ret i8 %out
----------------
nit: strange indentation here.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69956/new/

https://reviews.llvm.org/D69956