[PATCH] D153472: AMDGPU: Correctly expand f64 sqrt intrinsic

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 23 02:03:05 PDT 2023


Pierre-vh added a comment.

> I am tempted to do this in an IR expansion instead. In the IR
> we could take advantage of computeKnownFPClass to avoid
> the 0-or-inf argument check.

Wouldn't this be a good fit for CGP? It would also avoid repeating the logic for GISel and DAGISel.
Is there a drawback to doing it in IR?
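
To make the 0-or-inf check from the quote concrete, here is a minimal C++ sketch. It assumes the expansion computes sqrt(x) as x * rsqrt(x) plus one refinement step; that shape is an assumption for illustration and is not taken from the patch. `rsqrt_estimate` is a hypothetical stand-in for a hardware reciprocal-square-root estimate. With this shape, +/-0 and +inf produce NaN (0 * inf or inf * 0), so a final select is needed unless the input is known never to be zero or infinity, which is the kind of fact computeKnownFPClass could provide in IR.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a hardware rsqrt estimate (assumption).
static double rsqrt_estimate(double x) { return 1.0 / std::sqrt(x); }

// sqrt(x) as x * rsqrt(x) with one refinement step and no special-case
// handling. Yields NaN for x == +/-0 and x == +inf.
static double sqrt_via_rsqrt_unchecked(double x) {
  double y = rsqrt_estimate(x);
  double g = x * y;               // g ~= sqrt(x)
  double h = 0.5 * y;             // h ~= 1 / (2 * sqrt(x))
  double d = std::fma(-g, g, x);  // residual x - g*g
  return std::fma(d, h, g);       // g + d * h
}

// Same computation followed by the 0-or-inf select: for +/-0 and +inf
// the correct result is the input itself.
static double sqrt_via_rsqrt(double x) {
  double r = sqrt_via_rsqrt_unchecked(x);
  bool zero_or_pos_inf =
      (x == 0.0) || (x == std::numeric_limits<double>::infinity());
  return zero_or_pos_inf ? x : r;
}
```

If the input's FP class were proven to exclude zero and infinity, the unchecked variant alone would be enough under these assumptions.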



================
Comment at: llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll:44
-; GCN-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GCN-NEXT:    v_sqrt_f64_e64 v[0:1], -|v[0:1]|
-; GCN-NEXT:    s_setpc_b64 s[30:31]
----------------
I think I need some context: why is `v_sqrt_f64` so bad that this expansion is preferred? Is it a matter of accuracy or of semantics?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153472/new/

https://reviews.llvm.org/D153472


