[PATCH] D153472: AMDGPU: Correctly expand f64 sqrt intrinsic
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 25 04:37:06 PDT 2023
Pierre-vh accepted this revision as: Pierre-vh.
Pierre-vh added a comment.
This revision is now accepted and ready to land.
LGTM with some nits
================
Comment at: llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1183-1184
std::optional<unsigned> Flags = std::nullopt);
+ /// Build and insert a \p Res = G_IS_FPCLASS \p Pred\p Src, \p Mask
+ MachineInstrBuilder buildIsFPClass(const DstOp &Res, const SrcOp &Src,
----------------
================
Comment at: llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1186
+ MachineInstrBuilder buildIsFPClass(const DstOp &Res, const SrcOp &Src,
+ unsigned Mask) {
+ return buildInstr(TargetOpcode::G_IS_FPCLASS, {Res},
----------------
Nit: can't mask just be `int64_t` directly?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp:4844
+ // For double type, the SQRT and RSQ instructions don't have required
+ // precision, we apply Goldschmidt's algorithm to improve the result:
+ //
----------------
Add that they fail OCL conformance + many users avoid using it because of that, just for some context.
================
Comment at: llvm/test/Analysis/CostModel/AMDGPU/arith-fp.ll:55
; ALL-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %V16F32 = call <16 x float> @llvm.sqrt.v16f32(<16 x float> undef)
-; ALL-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
-; ALL-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
-; ALL-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V4F64 = call <4 x double> @llvm.sqrt.v4f64(<4 x double> undef)
-; ALL-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %V8F64 = call <8 x double> @llvm.sqrt.v8f64(<8 x double> undef)
+; ALL-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %F64 = call double @llvm.sqrt.f64(double undef)
+; ALL-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %V2F64 = call <2 x double> @llvm.sqrt.v2f64(<2 x double> undef)
----------------
Why did this change?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D153472/new/
https://reviews.llvm.org/D153472
More information about the llvm-commits
mailing list