[PATCH] D88577: [AArch64] Generate dot for v16i8 sum reduction to i32
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 1 09:55:28 PDT 2020
dmgreen added a comment.
> EDIT: SLP vectorizer generates the pattern in this case.
Ah I see.
https://llvm.org/docs/Phabricator.html#phabricator-request-review-web has some details on creating diffs with extra context.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10970
+ MachineSDNode *ABD =
+ DAG.getMachineNode(AArch64::UDOTv16i8, SDLoc(Op0), Zeros.getValueType(),
+ Zeros, Ones, Op0.getOperand(0));
----------------
mivnay wrote:
> dmgreen wrote:
> > We ideally shouldn't be just producing a machine node here. Can you add a AArch64ISD::UDOT node?
> >
> > We should be doing the same for SDOT too.
> I have added the support for SDOT.
>
> > Can you add a AArch64ISD::UDOT node?
>
> Can this be done in a later patch?
It's best not create machine nodes directly in the DAG...
Maybe create a int_aarch64_neon_sdot intrinsic node instead? That will avoid the need for a new node, and go through the existing tablegen patterns.
It would be nice to be able to fold add(x, UDOT(0, y)) -> UDOT(x, y), which is where having a node will really become useful. That needn't be done here though.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10949-10951
+ if (!ST->hasDotProd() || N->getValueType(0) != MVT::i32) {
+ return SDValue();
+ }
----------------
Any single statement blocks do not need { } brackets.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10964-10965
+
+ if (Op0VT.isScalableVector() || Op0VT.getVectorElementType() != MVT::i8 ||
+ Op0VT.getVectorNumElements() != 16) {
+ return SDValue();
----------------
This second isScalable check isn't needed, and maybe move the num elements == 16 check up to the VT check, by checking for v16i32 directly?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10972
+ SDValue Zeros = DAG.getConstant(
+ 0, SDLoc(Op0), EVT::getVectorVT(*DAG.getContext(), MVT::i32, 4));
+
----------------
EVT::getVectorVT(*DAG.getContext(), MVT::i32, 4) -> MVT::v4i32
I wasn't aware that you could use getConstant directly on vector types like this. Neat.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:10981
+ N->getValueType(0), SDValue(ABD, 0));
+ DAG.ReplaceAllUsesOfValueWith(SDValue(N, 0), FinalABD);
+ return FinalABD;
----------------
Why is this is calling ReplaceAllUsesOfValueWith? Is returning the value not enough?
================
Comment at: llvm/test/CodeGen/AArch64/neon-dot-product.ll:261
+; CHECK-LABEL: test_udot_v16i8_2:
+; CHECK: udot {{v[0-9]+}}.4s, {{v[0-9]+}}.16b, {{v[0-9]+}}.16b
+ %0 = bitcast i8* %a1 to <16 x i8>*
----------------
Can you update the tests to show all the instructions? The 1's and 0's and addv's are important here too.
I would just use the update_llc_test_checks script.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88577/new/
https://reviews.llvm.org/D88577
More information about the llvm-commits
mailing list