[PATCH] D106023: [AMDGPU] Mark relevant rematerializable VOP2 instructions
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 15 15:20:32 PDT 2021
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/VOP2Instructions.td:651
+let FPDPRounding = 1, isReMaterializable = 1 in {
def V_MADMK_F16 : VOP2_Pseudo <"v_madmk_f16", VOP_MADMK_F16, [], "">;
defm V_LDEXP_F16 : VOP2Inst <"v_ldexp_f16", VOP_F16_F16_I32, AMDGPUldexp>;
----------------
arsenm wrote:
> This preserves high bits on gfx9
The GFX9 manual says VOP1/VOP2 will write zero to unused bits unless SDWA specifies otherwise, and VOP1/VOP2 ops encoded as VOP3 will write zero.
So I assume it does not preserve the high bits.
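
For reference, a toy model of why the high bits matter here, as I understand the concern. This is not LLVM code, and the function names are made up for illustration. It assumes a 16-bit result written into a 32-bit VGPR. If the write zeroes the unused bits, the register value depends only on the result. If it preserves them, the old destination value is an implicit input, which is what would make rematerialization unsafe.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: a 16-bit result landing in a 32-bit register.
// Names are hypothetical and do not correspond to any LLVM API.

// Unused (high) bits are zeroed: the old destination value is ignored,
// so the final register value depends only on the computed result.
constexpr uint32_t writeLo16ZeroHi(uint32_t /*oldDst*/, uint16_t result) {
  return result;
}

// High bits are preserved: the old destination value leaks into the
// final register value, i.e. it acts as a hidden extra operand.
constexpr uint32_t writeLo16PreserveHi(uint32_t oldDst, uint16_t result) {
  return (oldDst & 0xFFFF0000u) | result;
}

// Write the same result over two different old destination values.
inline void demo() {
  // Zeroing write: recomputing elsewhere gives the same register value.
  assert(writeLo16ZeroHi(0xDEAD0000u, 0x3C00) ==
         writeLo16ZeroHi(0x12340000u, 0x3C00));
  // Preserving write: the value differs with the old register contents.
  assert(writeLo16PreserveHi(0xDEAD0000u, 0x3C00) !=
         writeLo16PreserveHi(0x12340000u, 0x3C00));
}
```

Under this model only the zeroing form is a pure function of its explicit inputs, which matches the manual's wording for VOP1/VOP2 on GFX9.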
================
Comment at: llvm/lib/Target/AMDGPU/VOP2Instructions.td:652
def V_MADMK_F16 : VOP2_Pseudo <"v_madmk_f16", VOP_MADMK_F16, [], "">;
defm V_LDEXP_F16 : VOP2Inst <"v_ldexp_f16", VOP_F16_F16_I32, AMDGPUldexp>;
+} // End FPDPRounding = 1, isReMaterializable = 1
----------------
arsenm wrote:
> This one does not (but does on gfx10)
Thanks for catching!
================
Comment at: llvm/lib/Target/AMDGPU/VOP2Instructions.td:765
+let SubtargetPredicate = HasFmaakFmamkF32Insts, isReMaterializable = 1 in {
def V_FMAMK_F32 : VOP2_Pseudo<"v_fmamk_f32", VOP_MADMK_F32, [], "">;
----------------
arsenm wrote:
> This and the other fma flavors preserve the high bits on gfx9
This one is f32. Both f16 fma variants should also zero the high bits because they are VOP2 only.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106023/new/
https://reviews.llvm.org/D106023