[PATCH] D108115: [llvm][sve] Lowering for VLS masked truncating stores
David Truby via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 7 08:42:21 PDT 2021
DavidTruby added inline comments.
================
Comment at: llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll:1462
; CHECK-NEXT: vqshrnb.s32 q2, q2, #15
+; CHECK-NEXT: vmovlb.s16 q2, q2
; CHECK-NEXT: vpst
----------------
dmgreen wrote:
> SjoerdMeijer wrote:
> > DavidTruby wrote:
> > > These changes aren't correct, I have pinged @dmgreen and @SjoerdMeijer for review to see if they know what's happening here. My suspicion is that there's a target specific combine in Thumb2 for masked truncating stores that my general combine is blocking; I guess the easy fix here is to add an exemption for Thumb2 in the TLI hook but I'm not sure if that's the right thing to do?
> > Sorry for the delay! Yeah, this extra codegen is not ideal (I think we agree this is a perf regression, not a correctness, but that aside).
> >
> > If we could avoid this that would be best. As I understood, it would involve adding this target hook to the ARM backend:
> >
> > virtual bool canCombineTruncStore(EVT ValVT, EVT MemVT,
> > bool LegalOperations) const override {
> > ...
> > }
> >
> > Would you mind adding this to ARM backend? I have no preference if you include that here (I think that would make sense), or do that separately and create a dependent patch. But I think it would good if we can see the evidence this regression disappears before committing this.
> We don't want canCombineTruncStore. I think you need to add demanded bits for the truncating masked store. As in https://github.com/llvm/llvm-project/blob/b50a60c234433545fc1c9b39f193373f560ea869/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L18115
Thanks for the pointer, adding this seems to have fixed the regression
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D108115/new/
https://reviews.llvm.org/D108115
More information about the llvm-commits
mailing list