[PATCH] D153472: AMDGPU: Correctly expand f64 sqrt intrinsic
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 23 02:03:05 PDT 2023
Pierre-vh added a comment.
> I am tempted to do this in an IR expansion instead. In the IR
> we could take advantage of computeKnownFPClass to avoid
> the 0-or-inf argument check.
Wouldn't this be a good fit for CGP? It would also avoid repeating the logic for both GISel and DAGISel.
Is there a drawback to doing it in IR?
================
Comment at: llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll:44
-; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GCN-NEXT: v_sqrt_f64_e64 v[0:1], -|v[0:1]|
-; GCN-NEXT: s_setpc_b64 s[30:31]
----------------
I think I need some context: why is `v_sqrt_f64` so bad that this expansion is preferred? Is it a matter of accuracy or of semantics?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D153472/new/
https://reviews.llvm.org/D153472