[PATCH] D130442: [RISCV] Peephole optimization to fold merge.vvm and unmasked intrinsics.
Philip Reames via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 28 13:29:22 PDT 2022
reames added a comment.
In D130442#3685894 <https://reviews.llvm.org/D130442#3685894>, @craig.topper wrote:
> In D130442#3685850 <https://reviews.llvm.org/D130442#3685850>, @reames wrote:
>
>> Have you looked at doing this as a dag combine instead of a post-isel peephole? I believe we should have SDNodes with masks for all of these, doing the local peephole in a combine and letting it iterate seems more straight forward. (I'm probably missing something, right?)
>
> Most of the _VL nodes don't have a passthru operand right now. Nor do they have a policy operand. Important since this patch is focusing on the tail undisturbed case. That would need to change before this could be a DAG combine.
Not a strongly held opinion but... maybe we should just do that?
================
Comment at: llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:2559
+ // Skip if True has merge operand.
+ if (RISCVII::hasMergeOp(TII->get(TrueOpc).TSFlags))
+ continue;
----------------
craig.topper wrote:
> reames wrote:
> > I believe you have a missing check here. Don't we need the passthrough for True to be False? Otherwise, it seems like the vmerge might be merging two unrelated values?
> >
> > We could potentially phrase this as an eager demanded lane style push back of the mask which wouldn't require that check, but that doesn't seem consistent with your code below.
> This is only handling the case for True being an unmasked, tail agnostic instruction. That's why it calls `lookupMaskedIntrinsicByUnmaskedTA` below.
Right, but that's insufficient.
Converting:
vadd v1, v2, v3
vmul v4, v5, v6
vmerge v7, v1, v4, v0
To:
vadd v7 v2, v3, v0
vmul v4, v5, v6
Doesn't seem like what we want?
There's a variant here which would be legal, but it doesn't seem to be the transform implemented here. That would be:
vadd v7 v2, v3, v0
v0 = not v0
vmul v7, v5, v6, v0
================
Comment at: llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp:2608
+ // Try to transform Result to unmasked intrinsic.
+ doPeepholeMaskedRVV(Result);
+ MadeChange = true;
----------------
craig.topper wrote:
> reames wrote:
> > This doesn't make any sense. The only way this could succeed is if the mask was all ones; we shouldn't have a merge with an all ones mask to begin with right?
> A vp.merge intrinsic can be emitted by the BSC vectorizer to represent a tail undisturbed merge. This happens in reduction loops where the last iteration has less elements than the previous iteration. So we need to merge the upper elements from a previous iteration under a tail undisturbed policy. The mask on the vp.merge would be all 1s.
Wouldn't an insert_sub_vector be a better representation here? Or is your point that we had an insert_sub_vector, and that got lowered to the vmerge? I'm guessing yes, that's your point because thinking that through seems to make sense.
I guess we don't have a dedicated insert_sub_reg analogous instruction. VADD could work - in analogy to scalar move idioms - but I don't have a strong reason to prefer that over vmerge if that's what we're currently using.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D130442/new/
https://reviews.llvm.org/D130442
More information about the llvm-commits
mailing list