[PATCH] D110579: [AMDGPU] Add two new intrinsics to control fp_trunc rounding mode

Wed Oct 20 12:12:29 PDT 2021

jpages added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp:4500-4502
+    case Intrinsic::experimental_fptrunc_round_upward:
+    case Intrinsic::experimental_fptrunc_round_downward: {
+      unsigned Bank =
----------------
arsenm wrote:
> The backend should not directly consume generic intrinsics. These need to be routed through a generic opcode which is subject to normal legalization
What is the reasoning behind this requirement?

I guess I would need something like experimental_fptrunc_round_upward -> amdgpu_experimental_fptrunc_round_upward -> codegen?

================
Comment at: llvm/lib/Target/AMDGPU/SIModeRegister.cpp:439
+      // Build a MI to do the conversion
+      BuildMI(MBB, MI, DL, TII->get(AMDGPU::V_CVT_F16_F32_e64), Dest.getReg())
+          .addImm(0)
----------------
foad wrote:
> If you arrange for the pseudo to have exactly the same operands as the real instruction, then you don't need to build or delete instructions, all you need to do is change the opcode, which you can do with MI.setDesc().
I think this could work. I tried to "replace" my pseudo-instructions in Phase1 but it's not a good idea to add/delete instructions in the middle of an iteration on the basic block.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110579/new/

https://reviews.llvm.org/D110579