[PATCH] D99699: [AArch64][SVE] Lowering sve.dot to DOT node

Thu Apr 1 04:24:56 PDT 2021

junparser added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:2152
+
+  if (N->getOpcode() != AArch64ISD::DUP && N->getOpcode() != ISD::SPLAT_VECTOR)
+    return false;
----------------
joechrisellis wrote:
> A cursory glance at `ISD::isConstantSplatVectorAllZeros` suggests that the SPLAT_VECTOR case is entirely captured earlier on, I think. I reckon you can get away with just:
> 
> ```
> if (N->getOpcode() != AArch64ISD::DUP)
>   return false;
> ```
> 
> 
the ISD::isConstantSplatVectorAllZeros does not check ConstantFPSDNode for SPLAT_VECTOR, that's I added it here.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:2152
+
+  if (N->getOpcode() != AArch64ISD::DUP && N->getOpcode() != ISD::SPLAT_VECTOR)
+    return false;
----------------
junparser wrote:
> joechrisellis wrote:
> > A cursory glance at `ISD::isConstantSplatVectorAllZeros` suggests that the SPLAT_VECTOR case is entirely captured earlier on, I think. I reckon you can get away with just:
> > 
> > ```
> > if (N->getOpcode() != AArch64ISD::DUP)
> >   return false;
> > ```
> > 
> > 
> the ISD::isConstantSplatVectorAllZeros does not check ConstantFPSDNode for SPLAT_VECTOR, that's I added it here.
I will add another patch to check check ConstantFPSDNode for SPLAT_VECTOR in ISD::isConstantSplatVectorAllZeros  and then remove the code here? is that ok?

================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
----------------
david-arm wrote:
> paulwalker-arm wrote:
> > joechrisellis wrote:
> > > The new `CHECK:` lines introduced look simple enough to write out by hand, so I would prefer to not use `utils/update_llc_checks.py` in this case because it has made the diff a lot noisier (e.g. all of the tests have been changed syntactically in this patch, but AFAICT none of them have changed functionally).
> > > 
> > > In this case, I think it should be sufficient to write your new tests like this:
> > > 
> > > ```
> > > define <vscale x 2 x i64> @test_sdot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
> > > ; CHECK-LABEL: test_sdot_i64_zero:
> > > ; CHECK:         sdot z0.d, z1.h, z2.h
> > > ; CHECK-NEXT:    ret
> > > entry:
> > >   %vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.sdot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
> > >   %ret = add <vscale x 2 x i64> %vdot1.i, %a
> > >   ret <vscale x 2 x i64> %ret
> > > }
> > > 
> > > define <vscale x 4 x i32> @test_sdot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
> > > ; CHECK-LABEL: test_sdot_i32_zero:
> > > ; CHECK:         sdot z0.s, z1.b, z2.b
> > > ; CHECK-NEXT:    ret
> > > entry:
> > >   %vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.sdot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
> > >   %ret = add <vscale x 4 x i32> %vdot1.i, %a
> > >   ret <vscale x 4 x i32> %ret
> > > }
> > > ```
> > > 
> > > ... and:
> > > 
> > > ```
> > > define <vscale x 2 x i64> @test_udot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
> > > ; CHECK-LABEL: test_udot_i64_zero:
> > > ; CHECK:         udot z0.d, z1.h, z2.h
> > > ; CHECK-NEXT:    ret
> > > entry:
> > >   %vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
> > >   %ret = add <vscale x 2 x i64> %vdot1.i, %a
> > >   ret <vscale x 2 x i64> %ret
> > > }
> > > 
> > > define <vscale x 4 x i32> @test_udot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
> > > ; CHECK-LABEL: test_udot_i32_zero:
> > > ; CHECK:         udot z0.s, z1.b, z2.b
> > > ; CHECK-NEXT:    ret
> > > entry:
> > >   %vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.udot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
> > >   %ret = add <vscale x 4 x i32> %vdot1.i, %a
> > >   ret <vscale x 4 x i32> %ret
> > > }
> > > ```
> > > 
> > > ... and then just revert all other changes in the file. :smile:
> > I think not using update_llc_checks.py was a mistake when landing the original file.  Although using it now makes the patch look bigger it will make it easier to update in the future, which is a good thing.
> I think the LLVM community prefers using utils/update_llc_test_checks.py where possible, in particular for small functions like those here.
I also prefer to use update_llc_checks.py here.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99699/new/

https://reviews.llvm.org/D99699