[PATCH] D110579: [AMDGPU] Add two new intrinsics to control fp_trunc rounding mode

Fri Jan 14 01:51:08 PST 2022

foad added a comment.

It seems like people are mostly happy with the design now, so I am being a bit more picky with my review comments!

================
Comment at: llvm/docs/LangRef.rst:23858
+The '``llvm.fptrunc.round``' intrinsic truncates
+:ref:`floating-point <t_floating>` ``value`` to type ``ty2``.
+
----------------
Add "... with a specified rounding mode". Apart from that I think most of the Overview/Arguments/Semantics text could be copied verbatim from the definition of the fptrunc-to instruction?

================
Comment at: llvm/docs/LangRef.rst:23868
+The second argument specifies the rounding mode as described in the constrained
+intrinsics section.
+
----------------
Need to disallow "dynamic" if that's what we decided.

================
Comment at: llvm/docs/LangRef.rst:23874
+The result produced is a floating point value truncated to be smaller in size
+than the operand.
----------------
Probably need to add the text from fptrunc-to: "This instruction is assumed to execute in the default floating-point environment" (because of the stuff about exceptions) and then say *except* for the rounding mode.

================
Comment at: llvm/include/llvm/IR/Intrinsics.td:905
+
+// Trunc a floating point number with a specific rounding mode
+def int_fptrunc_round : DefaultAttrsIntrinsic<[ llvm_anyfloat_ty ],
----------------
"Truncate"

================
Comment at: llvm/include/llvm/IR/Intrinsics.td:908
+                                              [ llvm_anyfloat_ty, llvm_metadata_ty ],
+                                              [ IntrNoMem ]>;
+
----------------
Add IntrWillReturn.

================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2257
+
+    SmallVector<llvm::SrcOp, 4> VRegs;
+    VRegs.push_back(getOrCreateVReg(*CI.getArgOperand(0)));
----------------
VRegs isn't used for anything so remove it.

================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2269
+    ArrayRef<Register> ResultRegs;
+    if (!CI.getType()->isVoidTy())
+      ResultRegs = getOrCreateVRegs(CI);
----------------
We know it's not void, and has exactly one result, so simplify this.

================
Comment at: llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp:2273
+    MachineInstrBuilder MIB =
+        MIRBuilder.buildIntrinsic(ID, ResultRegs, !CI.doesNotAccessMemory());
+
----------------
We know it does not access memory.

================
Comment at: llvm/lib/IR/Verifier.cpp:4750
+        convertStrToRoundingMode(cast<MDString>(MD)->getString());
+    Assert(RoundMode.hasValue(), "invalid rounding mode argument", Call);
+    break;
----------------
Check that the rounding mode is not "dynamic"?

================
Comment at: llvm/lib/IR/Verifier.cpp:4748
+
+    Optional<RoundingMode> RoundMode =
+        convertStrToRoundingMode(cast<MDString>(MD)->getString());
----------------
MD could be nullptr here, so you need to handle that.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp:698
   }
+  case AMDGPUISD::FPTRUNC_ROUND_UPWARD: {
+    SDLoc DL(N);
----------------
Can this be done with patterns in the tablegen files instead of C++ code? Then they should work for both SelectionDAG and GlobalISel.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:3252
     return selectG_SBFX_UBFX(I);
+  case AMDGPU::G_FPTRUNC_ROUND_UPWARD:
+  case AMDGPU::G_FPTRUNC_ROUND_DOWNWARD: {
----------------
Can this be done with patterns in the tablegen files instead of C++ code?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4932
+  else
+    llvm_unreachable("unsupported rounding mode");
+
----------------
Should this be "return false" to indicate a legalization failure in the normal way?

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:4600
+    OpdsMapping[0] = getVGPROpMapping(MI.getOperand(0).getReg(), MRI, *TRI);
+    OpdsMapping[1] = getSGPROpMapping(MI.getOperand(1).getReg(), MRI, *TRI);
+    break;
----------------
sgpr seems wrong. This is specifying the mapping for the result value and the input value (not the rounding mode which is no longer an operand). I think this should just use the same mapping as G_FPTRUNC (line 3640).

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:6825
+    else
+      llvm_unreachable("unsupported rounding mode");
+
----------------
I'm not quite sure but I think you should return SDValue() here to indicate a lowering/legalization failure?

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.h:134

+  SDValue LowerExperimentalFPRound(SDValue Op, unsigned IntrID,
+                                   SelectionDAG &DAG) const;
----------------
No longer needed.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110579/new/

https://reviews.llvm.org/D110579