[llvm] [AMDGPU][MC][True16] Support V_RCP/SQRT/RSQ/LOG/EXP_F16. (PR #81131)

Joe Nash via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 9 12:08:55 PST 2024


================
@@ -45,49 +46,64 @@
 # GFX11: v_bfrev_b32_e64 v255, 0xaf123456        ; encoding: [0xff,0x00,0xb8,0xd5,0xff,0x00,0x00,0x00,0x56,0x34,0x12,0xaf]
 0xff,0x00,0xb8,0xd5,0xff,0x00,0x00,0x00,0x56,0x34,0x12,0xaf
 
-# GFX11: v_ceil_f16_e64 v5, v1                   ; encoding: [0x05,0x00,0xdc,0xd5,0x01,0x01,0x00,0x00]
+# GFX11-REAL16: v_ceil_f16_e64 v5.l, v1.l        ; encoding: [0x05,0x00,0xdc,0xd5,0x01,0x01,0x00,0x00]
+# GFX11-FAKE16: v_ceil_f16_e64 v5, v1            ; encoding: [0x05,0x00,0xdc,0xd5,0x01,0x01,0x00,0x00]
 0x05,0x00,0xdc,0xd5,0x01,0x01,0x00,0x00
 
-# GFX11: v_ceil_f16_e64 v5, v255                 ; encoding: [0x05,0x00,0xdc,0xd5,0xff,0x01,0x00,0x00]
+# GFX11-REAL16: v_ceil_f16_e64 v5.l, v255.l      ; encoding: [0x05,0x00,0xdc,0xd5,0xff,0x01,0x00,0x00]
+# GFX11-FAKE16: v_ceil_f16_e64 v5, v255          ; encoding: [0x05,0x00,0xdc,0xd5,0xff,0x01,0x00,0x00]
 0x05,0x00,0xdc,0xd5,0xff,0x01,0x00,0x00
 
-# GFX11: v_ceil_f16_e64 v5, s1                   ; encoding: [0x05,0x00,0xdc,0xd5,0x01,0x00,0x00,0x00]
+# GFX11-REAL16: v_ceil_f16_e64 v5.l, s1          ; encoding: [0x05,0x00,0xdc,0xd5,0x01,0x00,0x00,0x00]
+# GFX11-FAKE16: v_ceil_f16_e64 v5, s1            ; encoding: [0x05,0x00,0xdc,0xd5,0x01,0x00,0x00,0x00]
----------------
Sisyph wrote:

It would work like this (this is how it works downstream):

v_add_nc_i16 v0.l, v1.h, s0 op_sel:[1,1,0]
>v_add_nc_i16 v0.l, v1.h, s0 op_sel:[1,1,0] ; encoding: [0x00,0x18,0x0d,0xd7,0x01,0x01,0x00,0x00]

v_add_nc_i16 v0.l, v1.h, s0 op_sel:[1,0,0]
>v_add_nc_i16 v0.l, v1.h, s0 op_sel:[1,0,0] ; encoding: [0x00,0x08,0x0d,0xd7,0x01,0x01,0x00,0x00]

Suffixes on the VGPRs are required. If you want to select the hi or lo half of an SGPR, you must pass the op_sel:[] operand, and the bits in that operand that correspond to VGPRs must match the suffixes on those VGPRs (there is verification that they match).
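
As a rough illustration of the matching rule only (not the actual AsmParser code), the check could be sketched as below. The function name, the string operand format, and the op_sel bit order [src0, src1, ..., dst] are assumptions, the last one inferred from the two examples above.

```python
import re

# Hypothetical operand spellings: "v<N>.l" / "v<N>.h" for VGPR halves, "s<N>" for SGPRs.
_VGPR = re.compile(r"^v(\d+)\.(l|h)$")
_SGPR = re.compile(r"^s(\d+)$")


def check_op_sel(dst, srcs, op_sel):
    """Return a list of error strings; an empty list means the operands agree.

    Assumes op_sel is ordered [src0, src1, ..., dst], as the examples suggest.
    """
    operands = list(srcs) + [dst]
    if len(op_sel) != len(operands):
        return [f"op_sel has {len(op_sel)} bits, expected {len(operands)}"]

    errors = []
    for name, bit in zip(operands, op_sel):
        vgpr = _VGPR.match(name)
        if vgpr:
            # A VGPR suffix and its op_sel bit must say the same thing.
            want = 1 if vgpr.group(2) == "h" else 0
            if bit != want:
                errors.append(f"{name}: op_sel bit {bit} does not match suffix")
        elif _SGPR.match(name):
            # An SGPR carries no suffix; only the op_sel bit picks hi or lo.
            continue
        else:
            errors.append(f"{name}: expected a suffixed VGPR or an SGPR")
    return errors


# The two examples from this comment both pass; a mismatched VGPR bit does not.
print(check_op_sel("v0.l", ["v1.h", "s0"], [1, 1, 0]))
print(check_op_sel("v0.l", ["v1.h", "s0"], [1, 0, 0]))
print(check_op_sel("v0.l", ["v1.h", "s0"], [0, 1, 0]))
```

In this sketch the SGPR bit is left unconstrained, which is why both op_sel:[1,1,0] and op_sel:[1,0,0] are accepted for the same operand list.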

It is a bit awkward, but not impossible I think, to support suffixes on the SGPRs. For VGPRs there is an actual register number for hi vs lo, and when you parse src0 you set the register number. For SGPRs you would instead want to set the op_sel bit in src0_modifiers.

https://github.com/llvm/llvm-project/pull/81131

