[llvm] AMDGPU: Add tonearest and towardzero roundings for intrinsic llvm.fptrunc.round (PR #104486)

Fri Aug 16 13:10:42 PDT 2024

================
@@ -3,6 +3,15 @@
 ; RUN: llc -global-isel=0 -mtriple=amdgcn -mcpu=gfx1010 < %s | FileCheck -check-prefixes=CHECK,SDAG %s
 ; RUN: llc -global-isel -mtriple=amdgcn -mcpu=gfx1030 < %s | FileCheck -check-prefixes=CHECK,GISEL %s
 
+define amdgpu_gs half @v_fptrunc_round_f32_to_f16_tonearest(float %a) {
+; CHECK-LABEL: v_fptrunc_round_f32_to_f16_tonearest:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    v_cvt_f16_f32_e32 v0, v0
+; CHECK-NEXT:    ; return to shader part epilog
+  %res = call half @llvm.fptrunc.round.f16.f32(float %a, metadata !"round.tonearest")
----------------
changpeng wrote:

For f32 -> bf16, you won't be able to rtz for v_cvt_pk_bf16_f32 (hardware uses round to nearest even all the times).
So if you really want to rtz, you will have to do in the compiler.

https://github.com/llvm/llvm-project/pull/104486