[flang-commits] [flang] [clang] [compiler-rt] [libcxx] [llvm] [lld] [lldb] [clang-tools-extra] [libc] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

Tue Jan 23 07:21:20 PST 2024

Mirko Brkušanin <Mirko.Brkusanin at amd.com>
In-Reply-To: <llvm.org/llvm/llvm-project/pull/78414 at github.com>


================
@@ -305,6 +305,11 @@ class VOP3OpSel_gfx10<bits<10> op, VOPProfile p> : VOP3e_gfx10<op, p> {
 
 class VOP3OpSel_gfx11_gfx12<bits<10> op, VOPProfile p> : VOP3OpSel_gfx10<op, p>;
 
+class VOP3FP8OpSel_gfx11_gfx12<bits<10> op, VOPProfile p> : VOP3e_gfx10<op, p> {
+  let Inst{11} = !if(p.HasSrc0, src0_modifiers{2}, 0);
+  let Inst{12} = !if(p.HasSrc0, src0_modifiers{3}, 0);
----------------
Sisyph wrote:

Thanks! I do think that patch will help a lot. I also think it handles the case where we use dst_op_sel, instead of src1, to store the other bit. If the CVT_F32_FP8 instruction were VOP3P, we would need a special case, but since it is VOP3, we want all the op_sel bits and dst_op_sel to be zero.
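
For readers following the hunk above, here is a minimal C++ sketch of what the two `let Inst{...}` lines do. It is not LLVM code: `packFP8OpSel` is a hypothetical helper, and it assumes only what the hunk shows, namely that bits 2 and 3 of `src0_modifiers` are copied into encoding bits 11 and 12 when the profile has a src0 operand, and that both bits are zero otherwise.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only, not part of LLVM).
// Mirrors the TableGen in the hunk:
//   Inst{11} = !if(p.HasSrc0, src0_modifiers{2}, 0);
//   Inst{12} = !if(p.HasSrc0, src0_modifiers{3}, 0);
// Returns only the contribution of those two bits to the encoding word.
constexpr uint64_t packFP8OpSel(uint32_t src0Modifiers, bool hasSrc0) {
  if (!hasSrc0)
    return 0; // no src0 operand: both encoding bits stay zero

  const uint64_t modBit2 = (src0Modifiers >> 2) & 1u; // src0_modifiers{2}
  const uint64_t modBit3 = (src0Modifiers >> 3) & 1u; // src0_modifiers{3}

  return (modBit2 << 11) | (modBit3 << 12);
}

// Compile-time checks of the mapping.
static_assert(packFP8OpSel(0x4, true) == (1ull << 11), "modifier bit 2 -> Inst{11}");
static_assert(packFP8OpSel(0x8, true) == (1ull << 12), "modifier bit 3 -> Inst{12}");
static_assert(packFP8OpSel(0xC, false) == 0, "no src0 -> both bits zero");
```

With all modifier bits clear the helper returns zero, which matches the default described above where every op_sel bit is expected to be zero.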

https://github.com/llvm/llvm-project/pull/78414