[PATCH] D135386: [RISCV] Use mask agnostic policy for isel patterns where the merge operand is IMPLICIT_DEF.

Thu Oct 6 13:38:36 PDT 2022

craig.topper added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td:1957
                                             VLOpFrag)),
             (!cast<Instruction>("PseudoVSLIDEDOWN_VI_"#vti.LMul.MX)
                 (vti.Vector (IMPLICIT_DEF)), vti.RegClass:$rs1, uimm5:$rs2,
----------------
reames wrote:
> Slide down has special wording around the handling of the mask.  Could you move this to it's own patch?  I think this needs careful consideration on it's own.
> 
> (Same for slide up)
> Slide down has special wording around the handling of the mask.  Could you move this to it's own patch?  I think this needs careful consideration on it's own.
> 
> (Same for slide up)

Can you paste the special wording you are referring to?

This patch doesn't touch slideup because slideup always reads the destination register for the elements less than OFFSET. So it doesn't have an equivalent of this pattern.

================
Comment at: llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll:14
 ; CHECK-NEXT:    flh ft0, %lo(.LCPI0_0)(a1)
-; CHECK-NEXT:    vsetvli zero, a0, e16, mf4, ta, mu
+; CHECK-NEXT:    vsetvli zero, a0, e16, mf4, ta, ma
 ; CHECK-NEXT:    vfabs.v v9, v8, v0.t
----------------
reames wrote:
> Hm, this example and several others in the diff appear to be a regression in practice.  
> 
> It looks like we're missing a couple of cases in InsertVSETVLI.  The general pattern is promoting a MA use to MU if doing so would allow avoiding an extra vsetvli.  I see two forms in this test case.
> 
> There's a general profitability question though.  Does switching from MA to MU ever cost enough execution wise to be worth the extra vsetvli?  I can't think of any cases, but maybe?
Definitely in this case, using MU can't have any performance effect. Using mu creates a false dependency on v9 for the vfcvt.x.f.v. But v9 was already used to create the mask, so last writer of v9 already finished executing.

Though there's nothing forcing the vfcvt.x.f.v to have the same destination as the vfabs.v. That's just lucky register allocation.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135386/new/

https://reviews.llvm.org/D135386