[PATCH] D99376: [AMDGPU] Mark additional VOP3 as commutable

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 30 02:08:42 PDT 2021


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:371
+  defm V_MIN3_U32 : VOP3Inst <"v_min3_u32", VOP3_Profile<VOP_I32_I32_I32_I32>, AMDGPUumin3>;
+  defm V_MIN3_F32 : VOP3Inst <"v_min3_f32", VOP3_Profile<VOP_F32_F32_F32_F32>, AMDGPUfmin3>;
+  defm V_MAX3_I32 : VOP3Inst <"v_max3_i32", VOP3_Profile<VOP_I32_I32_I32_I32>, AMDGPUsmax3>;
----------------
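You have to be very careful with fp min/max/med because their NaN handling is not commutative: swapping operands can change the result when one of them is a NaN. You could still commute them when a suitable "nnan" or other IEEE-related flag is present, but that is probably not worth the extra complexity, so I would suggest dropping them from this patch.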
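
To illustrate the general hazard, here is a minimal C++ sketch. It is not the AMDGPU hardware semantics: selMin and selMin3 are hypothetical helpers that model a minimum as a plain compare-and-select, which is only one of several possible NaN behaviours. It shows how the position of a NaN operand can change the result of a three-input minimum.

```cpp
#include <cmath>
#include <limits>

// Hypothetical compare-and-select minimum (an assumption for illustration,
// not the v_min3_f32 definition): returns a only when "a < b" is true.
// Every ordered comparison involving NaN is false, so b is returned
// whenever either operand is NaN.
inline float selMin(float a, float b) { return a < b ? a : b; }

// Hypothetical three-input minimum built from two selMin steps.
inline float selMin3(float a, float b, float c) {
  return selMin(selMin(a, b), c);
}
```

With x a quiet NaN, selMin3(x, 1, 2) yields 1, selMin3(1, x, 2) yields 2, and selMin3(1, 2, x) yields NaN. Under this model, permuting the operands is only safe when NaNs are known to be absent, which is what a flag like "nnan" would guarantee.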


================
Comment at: llvm/lib/Target/AMDGPU/VOP3Instructions.td:632
+  defm V_MED3_U16 : VOP3Inst <"v_med3_u16", VOP3_Profile<VOP_I16_I16_I16_I16, VOP3_OPSEL>, AMDGPUumed3>;
+  defm V_MED3_F16 : VOP3Inst <"v_med3_f16", VOP3_Profile<VOP_F16_F16_F16_F16, VOP3_OPSEL>, AMDGPUfmed3>;
+  defm V_MIN3_I16 : VOP3Inst <"v_min3_i16", VOP3_Profile<VOP_I16_I16_I16_I16, VOP3_OPSEL>, AMDGPUsmin3>;
----------------
Likewise, drop the f16 min/max/med instructions, for the same NaN-handling reason.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99376/new/

https://reviews.llvm.org/D99376


