[PATCH] D99376: [AMDGPU] Mark additional VOP3 as commutable
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 30 02:08:42 PDT 2021
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:371
+ defm V_MIN3_U32 : VOP3Inst <"v_min3_u32", VOP3_Profile<VOP_I32_I32_I32_I32>, AMDGPUumin3>;
+ defm V_MIN3_F32 : VOP3Inst <"v_min3_f32", VOP3_Profile<VOP_F32_F32_F32_F32>, AMDGPUfmin3>;
+ defm V_MAX3_I32 : VOP3Inst <"v_max3_i32", VOP3_Profile<VOP_I32_I32_I32_I32>, AMDGPUsmax3>;
----------------
You have to be very careful with fp min/max/med: their NaN handling is not commutative, so swapping operands can change the result when an input is NaN. You could commute them when suitable "nnan" or other IEEE-related flags are present, but it's probably not worth it, so I would suggest dropping them from this change.
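To illustrate the concern, here is a minimal Python sketch. It is not the AMDGPU hardware semantics and not code from this patch: `select_min` and `select_min3` are hypothetical names that model one common way a min is written (a compare-and-select). Under that assumption, moving a NaN between operand positions changes the result, while integer inputs are unaffected.

```python
def select_min(a, b):
    # Hypothetical compare-and-select model of min (an assumption, not the
    # hardware definition). Any comparison involving NaN is false, so when
    # either operand is NaN this returns b.
    return a if a < b else b


def select_min3(a, b, c):
    # Hypothetical three-operand min built from the two-operand model above.
    return select_min(select_min(a, b), c)


nan = float("nan")

# Same three values, NaN moved from the first operand to the last.
first = select_min3(nan, 1.0, 2.0)
last = select_min3(1.0, 2.0, nan)
print(first, last)

# Integer operands have no NaN, so operand order does not matter.
print(select_min3(3, 1, 2), select_min3(2, 1, 3))
```

In this model the first call yields 1.0 and the second yields NaN. An operation whose result depends on operand order in this way cannot safely be marked commutable unless NaNs are known to be absent, which is what a flag like "nnan" would assert.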
================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:632
+ defm V_MED3_U16 : VOP3Inst <"v_med3_u16", VOP3_Profile<VOP_I16_I16_I16_I16, VOP3_OPSEL>, AMDGPUumed3>;
+ defm V_MED3_F16 : VOP3Inst <"v_med3_f16", VOP3_Profile<VOP_F16_F16_F16_F16, VOP3_OPSEL>, AMDGPUfmed3>;
+ defm V_MIN3_I16 : VOP3Inst <"v_min3_i16", VOP3_Profile<VOP_I16_I16_I16_I16, VOP3_OPSEL>, AMDGPUsmin3>;
----------------
Likewise, drop f16 min/max/med.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99376/new/
https://reviews.llvm.org/D99376