[PATCH] D89162: [SVE] Lower fixed length VECREDUCE_SEQ_FADD operation
Paul Walker via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 23 07:50:12 PDT 2020
paulwalker-arm added a comment.
In D89162#2347814 <https://reviews.llvm.org/D89162#2347814>, @cameron.mcinally wrote:
> The new tests would be broken without the legalisation changes, so I'm assuming that those are enough coverage. Maybe I'm missing something though...
Are you sure? I took your patch for a test drive and removed all but the `TLI.getOperationAction` related change from Legalize*.{cpp, h} and the tests passed.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16293
+ SDValue Rdx = DAG.getNode(AArch64ISD::FADDA_PRED, DL,
+ getPackedSVEVectorVT(ResVT),
+ Pg, AccOp, VecOp);
----------------
I think this can be ContainerVT?
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:16297
+ return DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, ResVT, Rdx,
+ DAG.getConstant(0, DL, MVT::i64));
+}
----------------
You use this in a couple of places, so might be worth extracting into `Zero = ` ?
================
Comment at: llvm/test/CodeGen/AArch64/sve-fixed-length-fp-reduce.ll:65
+
+; NOTE: NEON expansions used to ensure ordering.
+ %op = load <32 x half>, <32 x half>* %a
----------------
What does this mean?
I was going to say you're missing VBITS_EQ_256 lines here but then noticed the output. What's you're feeling on this? I see two routes:
Stick with the expanded output for this patch, in which case just having `VBITS_EQ_256-COUNT-33: fadd` is good enough.
For better code generation we're just missing the ability to split VECREDUCE_SEQ_FADD operations, which I don't think should be that hard.
`VECREDUCE_SEQ_FADD A, X_512 => VECREDUCE_SEQ_FADD (VECREDUCE_SEQ_FADD X, X_LO), X_HI` ?
I'm happy with either approach.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D89162/new/
https://reviews.llvm.org/D89162
More information about the llvm-commits
mailing list