[PATCH] D79187: [DAGCombiner] fold (fp_round (int_to_fp X)) -> (int_to_fp X)

Fri May 1 04:26:03 PDT 2020

lenary updated this revision to Diff 261453.
lenary added a comment.

Address @efriedma's feedback and guidance:

- Use `hasOperation` to ensure legality. This causes the optimisation not to happen for the testcase on on RV64*F, in this example, I think because `i32` is not a legal type on that architecture. I'll follow-up this patch if I can get that working.

I have also refactored the check, and added an even greater commentary of why I
believe the transform to be correct, which I think means it applies in more
cases. This is based off the observation/assumption that `fp_round` always
removes precision of the final float, so that precision loss can always be
"moved" into the int_to_fp operation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79187/new/

https://reviews.llvm.org/D79187

Files:
  llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  llvm/lib/Target/RISCV/RISCVInstrInfoF.td
  llvm/test/CodeGen/RISCV/fp-convert-indirect.ll


Index: llvm/test/CodeGen/RISCV/fp-convert-indirect.ll
===================================================================

--- llvm/test/CodeGen/RISCV/fp-convert-indirect.ll
+++ llvm/test/CodeGen/RISCV/fp-convert-indirect.ll
@@ -11,19 +11,15 @@
 ;; These testcases check that we merge sequences of `fcvt.d.wu; fcvt.s.d` into
 ;; `fcvt.s.wu`.
 ;;
-;; TODO: Unfortunately, though this only uses 32-bit FP instructions, we cannot
-;; do this optimisation without the D extension as we need 64-bit FP values to
-;; be legal to get the right operands to match.
+;; These folds are actually implemented in the DAGCombiner, because otherwise
+;; without the D extension the intermediate `double` will be legalised away and
+;; the conversions will be turned into libcalls.
 
 define float @fcvt_s_w_via_d(i32 %a) nounwind {
 ; RV32IF-LABEL: fcvt_s_w_via_d:
 ; RV32IF:       # %bb.0:
-; RV32IF-NEXT:    addi sp, sp, -16
-; RV32IF-NEXT:    sw ra, 12(sp)
-; RV32IF-NEXT:    call __floatsidf
-; RV32IF-NEXT:    call __truncdfsf2
-; RV32IF-NEXT:    lw ra, 12(sp)
-; RV32IF-NEXT:    addi sp, sp, 16
+; RV32IF-NEXT:    fcvt.s.w ft0, a0
+; RV32IF-NEXT:    fmv.x.w a0, ft0
 ; RV32IF-NEXT:    ret
 ;
 ; RV32IFD-LABEL: fcvt_s_w_via_d:
@@ -56,12 +52,8 @@
 define float @fcvt_s_wu_via_d(i32 %a) nounwind {
 ; RV32IF-LABEL: fcvt_s_wu_via_d:
 ; RV32IF:       # %bb.0:
-; RV32IF-NEXT:    addi sp, sp, -16
-; RV32IF-NEXT:    sw ra, 12(sp)
-; RV32IF-NEXT:    call __floatunsidf
-; RV32IF-NEXT:    call __truncdfsf2
-; RV32IF-NEXT:    lw ra, 12(sp)
-; RV32IF-NEXT:    addi sp, sp, 16
+; RV32IF-NEXT:    fcvt.s.wu ft0, a0
+; RV32IF-NEXT:    fmv.x.w a0, ft0
 ; RV32IF-NEXT:    ret
 ;
 ; RV32IFD-LABEL: fcvt_s_wu_via_d:
Index: llvm/lib/Target/RISCV/RISCVInstrInfoF.td
===================================================================
--- llvm/lib/Target/RISCV/RISCVInstrInfoF.td
+++ llvm/lib/Target/RISCV/RISCVInstrInfoF.td
@@ -392,12 +392,6 @@
 // [u]int->float. Match GCC and default to using dynamic rounding mode.
 def : Pat<(sint_to_fp GPR:$rs1), (FCVT_S_W $rs1, 0b111)>;
 def : Pat<(uint_to_fp GPR:$rs1), (FCVT_S_WU $rs1, 0b111)>;
-
-// [u]int->double->float
-def : Pat<(fpround (f64 (sint_to_fp GPR:$rs1))),
-          (FCVT_S_W GPR:$rs1, 0b111)>;
-def : Pat<(fpround (f64 (uint_to_fp GPR:$rs1))),
-          (FCVT_S_WU GPR:$rs1, 0b111)>;
 } // Predicates = [HasStdExtF, IsRV32]
 
 let Predicates = [HasStdExtF, IsRV64] in {
Index: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -13462,6 +13462,31 @@
   if (N0.getOpcode() == ISD::FP_EXTEND && VT == N0.getOperand(0).getValueType())
     return N0.getOperand(0);
 
+  // fold (fp_round ({u,s}int_to_fp x)) -> ({u,s}int_to_fp x)
+  // but only when the ({u,s}int_to_fp x) remains precise
+  if (N0.getOpcode() == ISD::SINT_TO_FP || N0.getOpcode() == ISD::UINT_TO_FP) {
+    SDValue IntNode = N0->getOperand(0);
+    EVT IntTy = IntNode.getValueType();
+
+    // The intuition behind this check is that the original DAG took an integer,
+    // and converted it to the resulting float, but via a more precise float.
+    // After the transform, the integer is converted directly to the resulting
+    // float, without the intermediate precise float.
+    //
+    // Because the intermediate float is rounded to the resulting float, we know
+    // the resulting float is less precise than the intermediate float.
+    // Therefore, the relative precision of the int to either float does not
+    // matter, as in all cases, the `fp_round` will be removing precision,
+    // regardless of whether the original `int_to_fp` loses precision or not.
+    //
+    // The only thing we do need to check is that we can perform the int_to_fp
+    // with this integer operand.
+    if (hasOperation(N0.getOpcode(), IntTy)) {
+      SDLoc DL(N);
+      return DAG.getNode(N0.getOpcode(), DL, VT, IntNode);
+    }
+  }
+
   // fold (fp_round (fp_round x)) -> (fp_round x)
   if (N0.getOpcode() == ISD::FP_ROUND) {
     const bool NIsTrunc = N->getConstantOperandVal(1) == 1;


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D79187.261453.patch
Type: text/x-patch
Size: 4163 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200501/6aa5fa08/attachment.bin>