[PATCH] D97475: [RISCV] Support EXTRACT_SUBVECTOR on vector masks

Mon Mar 1 01:11:57 PST 2021

frasercrmck marked 3 inline comments as done.
frasercrmck added inline comments.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:541

+        setOperationAction(ISD::SETCC, VT, Custom);
+
----------------
craig.topper wrote:
> This is because LegalizeVectorOps keys from the result type, but LegalizeDAG keys from the operand type? And EXTRACT_SUBVECTOR lowering is called from LegalizeDAG?
Yeah, that's it. I wish it was all more transparent.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:2633
+  if (SubVecVT != Op.getSimpleValueType())
+    Slidedown = DAG.getBitcast(Op.getSimpleValueType(), Slidedown);
+
----------------
craig.topper wrote:
> You can blindly call getBitcast if you want; it contains a check to skip the getNode if the types match.
Yeah that might be a good idea, thanks.
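
For anyone reading along, the early-out described above can be sketched with a toy model. This is not the LLVM API: ToyValue, ToyDAG and their members are hypothetical stand-ins, and the only behaviour assumed is the one stated in the comment (a bitcast to the type the value already has returns the input without creating a node).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a DAG value: a type name plus a node id.
struct ToyValue {
  std::string Type;
  int NodeId;
};

// Hypothetical stand-in for the DAG builder. It only counts created nodes.
struct ToyDAG {
  int NextId = 0;
  int NodesCreated = 0;

  // Create a fresh node of the given type.
  ToyValue makeValue(const std::string &Type) {
    ++NodesCreated;
    return ToyValue{Type, NextId++};
  }

  // Bitcast helper with the early-out: if the type already matches,
  // hand back the input and create nothing.
  ToyValue getBitcast(const std::string &VT, const ToyValue &V) {
    if (V.Type == VT)
      return V;
    return makeValue(VT);
  }
};
```

With a check like this inside the helper, a caller-side guard such as `if (SubVecVT != Op.getSimpleValueType())` does no extra work, so the call can be made unconditionally.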

================
Comment at: llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll:399
+; CHECK-NEXT:    vsetvli a0, zero, e8,mf4,ta,mu
+; CHECK-NEXT:    vand.vi v25, v25, 1
+; CHECK-NEXT:    vmsne.vi v0, v25, 0
----------------
craig.topper wrote:
> frasercrmck wrote:
> > craig.topper wrote:
> > > This vand.vi isn't necessary. But I'm not sure of the best way to remove it in the general case of truncate. For this specific transform we could just emit the compare directly instead of going through truncate?
> > It should be possible to detect, as nothing should be affecting the sign bits from the "original" zext. But it doesn't feel right to me to do that in the lowering of truncate. Is it not possible to hook into the demanded bits functions?
> > 
> > I'll go with directly using the compare for now. Perhaps something to revisit.
> X86 calls ComputeNumSignBits from LowerTruncateVecI1 for a similar issue. But that might be because it's using SHL rather than AND.
> 
> Eventually we should be able to rely on SimplifyDemandedBits and/or computeKnownBits to optimize this in a DAG combine after lowering. We'd need to fix computeKnownBits/SimplifyDemandedBits to not bail out on scalable vectors. Then we need to add VSELECT_VL and VMV_V_X_VL to computeKnownBitsForTargetNode, and AND_VL to SimplifyDemandedBits.
Well, at least we're not going to run out of work any time soon.
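
As a scalar illustration of why the vand.vi is redundant here but not in general, the sketch below compares the two lowerings per element. The function names are hypothetical and this models only the arithmetic, not the actual RVV lowering: it assumes each element came from a zero-extended i1, so it is either 0 or 1.

```cpp
#include <cstdint>

// Generic truncate-to-i1: mask the low bit, then compare against zero
// (the vand.vi + vmsne.vi pair in the test output).
constexpr bool truncViaAndThenCompare(std::uint8_t X) {
  return (X & 1) != 0;
}

// Direct compare against zero (vmsne.vi only).
constexpr bool directCompare(std::uint8_t X) {
  return X != 0;
}

// The two forms agree exactly when the bits above bit 0 are known zero.
constexpr bool formsAgree(std::uint8_t X) {
  return truncViaAndThenCompare(X) == directCompare(X);
}
```

For inputs limited to 0 and 1 the two forms agree, which is why emitting the compare directly works for this transform. For an arbitrary input such as 2 they differ, which is why removing the AND in the general truncate case needs known-bits information.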

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97475/new/

https://reviews.llvm.org/D97475