[PATCH] D154517: AMDGPU: Always use v_rcp_f16 and v_rsq_f16
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 13 05:18:02 PDT 2023
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4152-4155
+ // For f16 require arcp only.
+ // For f32 require afn+arcp.
+ if (!AllowInaccurateRcp && (ResTy != LLT::scalar(16) ||
+ !MI.getFlag(MachineInstr::FmArcp)))
----------------
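The comments don't match the code: as written, the early exit is avoided only when AllowInaccurateRcp is set, or when the type is f16 and arcp is present, so arcp is never consulted for f32. https://github.com/llvm/llvm-project/pull/68982 fixes the comments.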
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9172-9175
+ // For f16 require arcp only.
+ // For f32 require afn+arcp.
+ if (!AllowInaccurateRcp && (VT != MVT::f16 || !Flags.hasAllowReciprocal()))
+ return SDValue();
----------------
Same.
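
To make the mismatch concrete, here is a minimal standalone sketch of the guard quoted in both hunks. It is not LLVM code: skipsFastRcp and its boolean parameters are hypothetical stand-ins for the quoted condition, and AllowInaccurateRcp is treated as an opaque input because the quoted hunks do not show how it is computed.

```cpp
#include <cassert>

// Hypothetical model of the quoted guard. Returns true when the early exit
// is taken, i.e. the fast reciprocal lowering is skipped.
//   allowInaccurateRcp : stands in for AllowInaccurateRcp (opaque here)
//   isF16              : stands in for ResTy == LLT::scalar(16) / VT == MVT::f16
//   hasArcp            : stands in for the arcp fast-math flag
constexpr bool skipsFastRcp(bool allowInaccurateRcp, bool isF16, bool hasArcp) {
  return !allowInaccurateRcp && (!isF16 || !hasArcp);
}

// Walks the cases that matter for the review comment.
void checkExamples() {
  assert(!skipsFastRcp(false, true, true));   // f16 with arcp: not skipped
  assert(skipsFastRcp(false, true, false));   // f16 without arcp: skipped
  assert(skipsFastRcp(false, false, true));   // f32 with arcp alone: skipped
  assert(!skipsFastRcp(true, false, false));  // AllowInaccurateRcp alone suffices
}
```

Under this model arcp only matters for f16, and AllowInaccurateRcp on its own is enough for either type, which is why "f16 requires arcp only" and "f32 requires afn+arcp" do not describe the condition.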
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D154517/new/
https://reviews.llvm.org/D154517