[PATCH] D130659: [RISCV] Update lowerFROUND to use masked instructions.

Thu Jul 28 09:08:54 PDT 2022

craig.topper added a comment.

In D130659#3685199 <https://reviews.llvm.org/D130659#3685199>, @reames wrote:

> LGTM to me too.
>
> Side note - this seems reasonably likely to be profitable for this specific case since all of the instructions are very likely to be scheduled together, but have you thought about the general applicability of a transform like this?  Having the mask register class be so constrained could make this unprofitable if the instructions were intermixed with other computation using a different mask.
>
> The "general transform" I'm noting here is that we have select mask, f(x), x -> f(x, mask).  This principle applies to any code sequence, and is not at all specific to floating point.

The beginning of a general patch for that was posted at https://reviews.llvm.org/D130442. The first version folds tail undisturbed vmerge.
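
For readers following along, the transform quoted above (select mask, f(x), x -> f(x, mask)) can be illustrated with a small standalone model. This is only a sketch: `Vec`, `Mask`, `opThenSelect`, and `maskedOp` are hypothetical names invented for the illustration, not LLVM or RISC-V APIs, and it assumes f has no observable side effects on the lanes where the mask is false.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy model only: a "vector register" is a std::vector<double> and a mask is
// a std::vector<bool>. These names are hypothetical, not LLVM APIs.
using Vec = std::vector<double>;
using Mask = std::vector<bool>;

// Unmasked operation followed by a separate select: select(mask, f(x), x).
template <typename F>
Vec opThenSelect(const Mask &mask, const Vec &x, F f) {
  assert(mask.size() == x.size());
  Vec fx(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    fx[i] = f(x[i]); // f is computed for every lane
  Vec out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i)
    out[i] = mask[i] ? fx[i] : x[i]; // the separate select/merge step
  return out;
}

// Masked operation with x as the passthru operand: f(x, mask).
template <typename F>
Vec maskedOp(const Mask &mask, const Vec &x, F f) {
  assert(mask.size() == x.size());
  Vec out = x; // inactive lanes keep the passthru value
  for (std::size_t i = 0; i < x.size(); ++i)
    if (mask[i])
      out[i] = f(x[i]); // f is computed only for active lanes
  return out;
}
```

Under the stated assumption, both functions return the same vector for any mask, which is what makes the fold legal in this model. Whether it is profitable is the separate question raised in the quoted comment.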

================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:1919
+      DAG.getConstantFP(MaxVal, DL, ContainerVT.getVectorElementType());
+  SDValue MaxValSplat = DAG.getNode(RISCVISD::VFMV_V_F_VL, DL, ContainerVT,
+                                    DAG.getUNDEF(ContainerVT), MaxValNode, VL);
----------------
reames wrote:
> Why is the explicit splat here needed?  getConstantFP seems to generate a splat internally, and the old code relied on that behavior.  Why change it to create the scalar constant (explicitly) and then splat (explicitly)?
The old code used getConstantFP with either a fixed vector type or a scalable vector type. For a fixed vector, it would create a build_vector that would later be converted by lowerBUILD_VECTOR to VFMV_V_F_VL with a VL based on the fixed vector type. For a scalable vector, getConstantFP would create a SPLAT_VECTOR that would be treated as a VLMAX vfmv.v.f during isel.

Since we are converting fixed length vectors to scalable in order to use _VL nodes, we need to match how BUILD_VECTOR would be converted. Using getConstantFP with the ContainerVT would create a VLMAX splat instead of using the VL from the fixed vector we started with. It would still work, since the .vx pattern match in isel ignores the VL on the splat, but if the splat weren't folded into a .vx instruction, it would create a vsetvli toggle.
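
To make the vsetvli concern concrete, here is a rough standalone model rather than anything from the patch. It assumes that each change of VL between consecutive instructions stands in for one vsetvli; the real insertion logic is more involved. `Op`, `sequenceWithSplatVL`, and `countVLChanges` are hypothetical names, and the three-instruction sequence is only an illustrative stand-in for the case where the splat is not folded away.

```cpp
#include <cstddef>
#include <vector>

// Toy model, not LLVM code: an instruction is described only by its mnemonic
// and the VL it executes with.
struct Op {
  const char *name;
  std::size_t vl;
};

// Hypothetical sequence: an abs, an unfolded splat, and a compare. Only the
// splat's VL varies; the other instructions use the fixed vector's VL.
std::vector<Op> sequenceWithSplatVL(std::size_t splatVL, std::size_t fixedVL) {
  return {{"vfabs.v", fixedVL}, {"vfmv.v.f", splatVL}, {"vmflt.vv", fixedVL}};
}

// Count how often consecutive instructions disagree on VL. In this model,
// each disagreement costs one extra vsetvli.
std::size_t countVLChanges(const std::vector<Op> &ops) {
  std::size_t changes = 0;
  for (std::size_t i = 1; i < ops.size(); ++i)
    if (ops[i].vl != ops[i - 1].vl)
      ++changes;
  return changes;
}
```

With a fixed vector of 4 elements in a container whose VLMAX is assumed to be 16, a VLMAX splat in the middle of the sequence gives two VL changes in this model, while a splat using the fixed vector's VL gives none.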

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130659/new/

https://reviews.llvm.org/D130659