[PATCH] D99699: [AArch64][SVE] Lowering sve.dot to DOT node

Thu Apr 1 04:33:28 PDT 2021

dmgreen added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:2152
+
+  if (N->getOpcode() != AArch64ISD::DUP && N->getOpcode() != ISD::SPLAT_VECTOR)
+    return false;
----------------
junparser wrote:
> junparser wrote:
> > joechrisellis wrote:
> > > A cursory glance at `ISD::isConstantSplatVectorAllZeros` suggests that the SPLAT_VECTOR case is entirely captured earlier on, I think. I reckon you can get away with just:
> > > 
> > > ```
> > > if (N->getOpcode() != AArch64ISD::DUP)
> > >   return false;
> > > ```
> > > 
> > > 
> > the ISD::isConstantSplatVectorAllZeros does not check ConstantFPSDNode for SPLAT_VECTOR, that's I added it here.
> I will add another patch to check check ConstantFPSDNode for SPLAT_VECTOR in ISD::isConstantSplatVectorAllZeros  and then remove the code here? is that ok?
I had not realized that isConstantSplatVectorAllZeros didn't already handle SPLAT_VECTOR with FP constants. Adding it there sounds like a good idea, either here or in a separate patch.

================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsics-int-arith.ll:1
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s
----------------
junparser wrote:
> david-arm wrote:
> > paulwalker-arm wrote:
> > > joechrisellis wrote:
> > > > The new `CHECK:` lines introduced look simple enough to write out by hand, so I would prefer to not use `utils/update_llc_checks.py` in this case because it has made the diff a lot noisier (e.g. all of the tests have been changed syntactically in this patch, but AFAICT none of them have changed functionally).
> > > > 
> > > > In this case, I think it should be sufficient to write your new tests like this:
> > > > 
> > > > ```
> > > > define <vscale x 2 x i64> @test_sdot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
> > > > ; CHECK-LABEL: test_sdot_i64_zero:
> > > > ; CHECK:         sdot z0.d, z1.h, z2.h
> > > > ; CHECK-NEXT:    ret
> > > > entry:
> > > >   %vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.sdot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
> > > >   %ret = add <vscale x 2 x i64> %vdot1.i, %a
> > > >   ret <vscale x 2 x i64> %ret
> > > > }
> > > > 
> > > > define <vscale x 4 x i32> @test_sdot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
> > > > ; CHECK-LABEL: test_sdot_i32_zero:
> > > > ; CHECK:         sdot z0.s, z1.b, z2.b
> > > > ; CHECK-NEXT:    ret
> > > > entry:
> > > >   %vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.sdot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
> > > >   %ret = add <vscale x 4 x i32> %vdot1.i, %a
> > > >   ret <vscale x 4 x i32> %ret
> > > > }
> > > > ```
> > > > 
> > > > ... and:
> > > > 
> > > > ```
> > > > define <vscale x 2 x i64> @test_udot_i64_zero(<vscale x 2 x i64> %a, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c) {
> > > > ; CHECK-LABEL: test_udot_i64_zero:
> > > > ; CHECK:         udot z0.d, z1.h, z2.h
> > > > ; CHECK-NEXT:    ret
> > > > entry:
> > > >   %vdot1.i = call <vscale x 2 x i64> @llvm.aarch64.sve.udot.nxv2i64(<vscale x 2 x i64> zeroinitializer, <vscale x 8 x i16> %b, <vscale x 8 x i16> %c)
> > > >   %ret = add <vscale x 2 x i64> %vdot1.i, %a
> > > >   ret <vscale x 2 x i64> %ret
> > > > }
> > > > 
> > > > define <vscale x 4 x i32> @test_udot_i32_zero(<vscale x 4 x i32> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c) {
> > > > ; CHECK-LABEL: test_udot_i32_zero:
> > > > ; CHECK:         udot z0.s, z1.b, z2.b
> > > > ; CHECK-NEXT:    ret
> > > > entry:
> > > >   %vdot1.i = call <vscale x 4 x i32> @llvm.aarch64.sve.udot.nxv4i32(<vscale x 4 x i32> zeroinitializer, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c)
> > > >   %ret = add <vscale x 4 x i32> %vdot1.i, %a
> > > >   ret <vscale x 4 x i32> %ret
> > > > }
> > > > ```
> > > > 
> > > > ... and then just revert all other changes in the file. :smile:
> > > I think not using update_llc_checks.py was a mistake when landing the original file.  Although using it now makes the patch look bigger it will make it easier to update in the future, which is a good thing.
> > I think the LLVM community prefers using utils/update_llc_test_checks.py where possible, in particular for small functions like those here.
> I also prefer to use update_llc_checks.py here.
Yeah, using the update script has benefits in terms of maintainability and consistency between test. As well as not missing things in the generated assembly.

Feel free regenerate the existing test and commit that separately, so just the changes from the patch are shown here.  I often to so far as to commit the new tests with the current trunk codegen, so that in the review it is obvious what is changing in the codegen.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99699/new/

https://reviews.llvm.org/D99699