[PATCH] D97475: [RISCV] Support EXTRACT_SUBVECTOR on vector masks

Fri Feb 26 13:50:12 PST 2021

craig.topper accepted this revision.
craig.topper added a comment.
This revision is now accepted and ready to land.

LGTM. I'll leave it up to you if you want to make that getBitcast change.

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:541

+        setOperationAction(ISD::SETCC, VT, Custom);
+
----------------
Is this because LegalizeVectorOps keys off the result type while LegalizeDAG keys off the operand type, and EXTRACT_SUBVECTOR lowering is called from LegalizeDAG?

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:2633
+  if (SubVecVT != Op.getSimpleValueType())
+    Slidedown = DAG.getBitcast(Op.getSimpleValueType(), Slidedown);
+
----------------
You can call getBitcast unconditionally if you want; it already contains a check that skips the getNode call when the types match.
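
A minimal sketch of why the explicit type guard is optional, written as a self-contained toy model rather than real LLVM code. ToyValue, ToyDAG and nodesCreatedByBitcast are hypothetical names made up for illustration; only the shape of the check (return the input unchanged when the types already match) is taken from the comment above.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyValue {
  std::string Type; // stands in for the value type
  int Id;
};

struct ToyDAG {
  int NodesCreated = 0;

  // Hypothetical node factory: always builds a new "bitcast" node.
  ToyValue getNode(const std::string &Ty, const ToyValue &V) {
    ++NodesCreated;
    return ToyValue{Ty, V.Id + 1000};
  }

  // Same shape as the check described above: on a type match the input is
  // returned unchanged and no node is created.
  ToyValue getBitcast(const std::string &Ty, const ToyValue &V) {
    if (V.Type == Ty)
      return V;
    return getNode(Ty, V);
  }
};

// Counts how many nodes an unconditional getBitcast call creates.
int nodesCreatedByBitcast(const std::string &From, const std::string &To) {
  ToyDAG DAG;
  ToyValue V{From, 1};
  DAG.getBitcast(To, V);
  return DAG.NodesCreated;
}
```

Under this model the caller-side `if (SubVecVT != Op.getSimpleValueType())` only duplicates a test the callee already performs, so dropping it would be a readability change rather than a behavioral one.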

================
Comment at: llvm/test/CodeGen/RISCV/rvv/extract-subvector.ll:399
+; CHECK-NEXT:    vsetvli a0, zero, e8,mf4,ta,mu
+; CHECK-NEXT:    vand.vi v25, v25, 1
+; CHECK-NEXT:    vmsne.vi v0, v25, 0
----------------
frasercrmck wrote:
> craig.topper wrote:
> > This vand.vi isn't necessary. But I'm not sure the best way to remove it in the general case of truncate. For this specific transform we could just emit the compare directly instead of going through truncate?
> It should be possible to detect as nothing should be affecting the sign bits from the "original" zext. But it doesn't feel right to me to do that in the lowering of truncate. Is it not possible to hook into the demanded bits functions?
> 
> I'll go with directly using the compare for now. Perhaps something to revisit.
X86 calls ComputeNumSignBits from LowerTruncateVecI1 for a similar issue, but that might be because it's using SHL rather than AND.

Eventually we should be able to rely on SimplifyDemandedBits and/or computeKnownBits to optimize this in a DAG combine after lowering. We'd first need to fix computeKnownBits/SimplifyDemandedBits to not bail out on scalable vectors. Then we'd need to add VSELECT_VL and VMV_V_X_VL to computeKnownBitsForTargetNode, and AND_VL to SimplifyDemandedBits.
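
To make the known-bits argument concrete, here is a scalar toy model, assuming 8-bit elements and a source value that was zero-extended from i1. ToyKnownBits and every function below are hypothetical and are not the LLVM KnownBits API; the sketch only shows the reasoning a future combine would rely on, not how it would be implemented.

```cpp
#include <cassert>
#include <cstdint>

// Toy 8-bit known-bits record; NOT llvm::KnownBits.
struct ToyKnownBits {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1
};

// Known bits of an i1 zero-extended to i8: bits 7..1 are known zero.
ToyKnownBits knownBitsOfZExtFromI1() { return ToyKnownBits{0xFE, 0x00}; }

// An AND with Mask is a no-op when every bit it clears is already known zero.
bool andIsRedundant(ToyKnownBits KB, uint8_t Mask) {
  uint8_t Cleared = static_cast<uint8_t>(~Mask);
  uint8_t MaybeSet = static_cast<uint8_t>(~KB.Zero);
  return (Cleared & MaybeSet) == 0;
}

// Scalar model of the two sequences: and + compare versus compare only.
bool truncWithAnd(uint8_t X) { return (X & 1) != 0; }
bool truncCompareOnly(uint8_t X) { return X != 0; }
```

The two sequences agree for inputs 0 and 1, which is all a zero-extended i1 can produce, but they differ for an input such as 2. That is why the AND cannot be dropped from truncate lowering in general without some form of known-bits information.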

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D97475