[PATCH] D154517: AMDGPU: Always use v_rcp_f16 and v_rsq_f16

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Oct 13 05:18:02 PDT 2023


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4152-4155
+  // For f16 require arcp only.
+  // For f32 require afn+arcp.
+  if (!AllowInaccurateRcp && (ResTy != LLT::scalar(16) ||
+                              !MI.getFlag(MachineInstr::FmArcp)))
----------------
Comments don't match the code. https://github.com/llvm/llvm-project/pull/68982 fixes the comments.
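
To make the mismatch concrete, here is a minimal standalone sketch of the guard's boolean logic. It is not the LLVM code: `takesFastRcpPath` and its parameter names are hypothetical stand-ins. It assumes the guard's body bails out of the fast lowering, as the `return SDValue()` in the SIISelLowering.cpp hunk does. How `AllowInaccurateRcp` is computed is not shown in the quoted hunk, so it is treated as an opaque input.

```cpp
// Hypothetical model of the quoted condition:
//   if (!AllowInaccurateRcp && (ResTy != f16 || !arcp)) bail out;
// Returns true when the guard does NOT bail out, i.e. the fast path is taken.
constexpr bool takesFastRcpPath(bool allowInaccurateRcp, bool isF16,
                                bool hasArcp) {
  const bool bailsOut = !allowInaccurateRcp && (!isF16 || !hasArcp);
  return !bailsOut;
}

// f16: arcp alone is sufficient, and so is AllowInaccurateRcp alone.
static_assert(takesFastRcpPath(false, true, true), "f16 with arcp only");
static_assert(takesFastRcpPath(true, true, false), "f16 with inaccurate rcp only");
static_assert(!takesFastRcpPath(false, true, false), "f16 with neither");

// Non-f16: only AllowInaccurateRcp matters; arcp does not affect the result.
static_assert(!takesFastRcpPath(false, false, true), "f32 with arcp only");
static_assert(takesFastRcpPath(true, false, false), "f32 with inaccurate rcp only");
static_assert(takesFastRcpPath(true, false, true), "f32 with both");
```

Under this model, arcp never enters the decision for non-f16 types, while the comment "For f32 require afn+arcp" suggests it is required. For f16 the guard accepts either condition on its own, which is slightly broader than "require arcp only".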


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9172-9175
+  // For f16 require arcp only.
+  // For f32 require afn+arcp.
+  if (!AllowInaccurateRcp && (VT != MVT::f16 || !Flags.hasAllowReciprocal()))
+    return SDValue();
----------------
Same.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154517/new/

https://reviews.llvm.org/D154517


