[PATCH] D108115: [llvm][sve] Lowering for VLS masked truncating stores

Tue Sep 7 08:42:21 PDT 2021

DavidTruby added inline comments.

================
Comment at: llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll:1462
 ; CHECK-NEXT:    vqshrnb.s32 q2, q2, #15
+; CHECK-NEXT:    vmovlb.s16 q2, q2
 ; CHECK-NEXT:    vpst
----------------
dmgreen wrote:
> SjoerdMeijer wrote:
> > DavidTruby wrote:
> > > These changes aren't correct, I have pinged @dmgreen and @SjoerdMeijer for review to see if they know what's happening here. My suspicion is that there's a target specific combine in Thumb2 for masked truncating stores that my general combine is blocking; I guess the easy fix here is to add an exemption for Thumb2 in the TLI hook but I'm not sure if that's the right thing to do?
> > Sorry for the delay! Yeah, this extra codegen is not ideal (I think we agree this is a perf regression, not a correctness, but that aside).
> > 
> > If we could avoid this that would be best. As I understood, it would involve adding this target hook to the ARM backend:
> > 
> >   virtual bool canCombineTruncStore(EVT ValVT, EVT MemVT,
> >                                     bool LegalOperations) const override {
> >     ...
> >   } 
> > 
> > Would you mind adding this to ARM backend? I have no preference if you include that here (I think that would make sense), or do that separately and create a dependent patch. But I think it would good if we can see the evidence this regression disappears before committing this.
> We don't want canCombineTruncStore. I think you need to add demanded bits for the truncating masked store. As in https://github.com/llvm/llvm-project/blob/b50a60c234433545fc1c9b39f193373f560ea869/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L18115
Thanks for the pointer, adding this seems to have fixed the regression

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108115/new/

https://reviews.llvm.org/D108115