[llvm] [AMDGPU][True16] Support V_CEIL_F16. (PR #73108)

Mirko Brkušanin via llvm-commits llvm-commits at lists.llvm.org
Thu Nov 23 05:09:20 PST 2023


================
@@ -5250,10 +5250,15 @@ unsigned SIInstrInfo::getVALUOp(const MachineInstr &MI) const {
   case AMDGPU::S_FLOOR_F32: return AMDGPU::V_FLOOR_F32_e64;
   case AMDGPU::S_TRUNC_F32: return AMDGPU::V_TRUNC_F32_e64;
   case AMDGPU::S_RNDNE_F32: return AMDGPU::V_RNDNE_F32_e64;
-  case AMDGPU::S_CEIL_F16: return AMDGPU::V_CEIL_F16_t16_e64;
-  case AMDGPU::S_FLOOR_F16: return AMDGPU::V_FLOOR_F16_t16_e64;
-  case AMDGPU::S_TRUNC_F16: return AMDGPU::V_TRUNC_F16_t16_e64;
-  case AMDGPU::S_RNDNE_F16: return AMDGPU::V_RNDNE_F16_t16_e64;
+  case AMDGPU::S_CEIL_F16:
+    return ST.useRealTrue16Insts() ? AMDGPU::V_CEIL_F16_t16_e64
+                                   : AMDGPU::V_CEIL_F16_fake16_e64;
----------------
mbrkusanin wrote:

Not every instruction is covered, only one per group. The "cmp_f16" test in llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-f16.mir was supposed to cover the s_*_f16 instructions.
For fake16 everything should work, but for t16 the register class is different, so this change alone is not enough. So yes, we should have at least one test for the t16 ones.
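
To make the two paths concrete, here is a rough standalone sketch of the selection in the hunk above. It is not the in-tree code: the `Opcode` enum, the `Subtarget` struct and `getVALUOpSketch` are hypothetical stand-ins so the snippet compiles on its own, and only the opcode names and `useRealTrue16Insts()` are taken from the diff. It shows only the opcode choice; the register class difference on the t16 path is not modelled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the AMDGPU opcodes named in the diff.
enum Opcode : unsigned {
  S_CEIL_F16,
  V_CEIL_F16_t16_e64,
  V_CEIL_F16_fake16_e64,
  INSTRUCTION_LIST_END // used here as the "no VALU equivalent" result
};

// Hypothetical stand-in for the subtarget; only the query from the diff.
struct Subtarget {
  bool RealTrue16;
  bool useRealTrue16Insts() const { return RealTrue16; }
};

// Maps a scalar opcode to its VALU counterpart, choosing the t16 or
// fake16 variant from the subtarget, as the hunk does for S_CEIL_F16.
unsigned getVALUOpSketch(const Subtarget &ST, unsigned Opc) {
  switch (Opc) {
  case S_CEIL_F16:
    return ST.useRealTrue16Insts() ? V_CEIL_F16_t16_e64
                                   : V_CEIL_F16_fake16_e64;
  default:
    return INSTRUCTION_LIST_END;
  }
}
```

A test for the t16 ones would need to exercise the `useRealTrue16Insts()` branch and check the resulting register classes, which this sketch leaves out.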

https://github.com/llvm/llvm-project/pull/73108

