[all-commits] [llvm/llvm-project] b73942: AMDGPU/EG, CM: Implement fsqrt using recip(rsqrt(x)...
Jan Vesely via All-commits
all-commits at lists.llvm.org
Mon Feb 10 05:25:15 PST 2020
Branch: refs/heads/release/10.x
Home: https://github.com/llvm/llvm-project
Commit: b73942dbc144c11dc94fd32a7d8025a22e7e1d6b
https://github.com/llvm/llvm-project/commit/b73942dbc144c11dc94fd32a7d8025a22e7e1d6b
Author: Jan Vesely <jan.vesely at rutgers.edu>
Date: 2020-02-10 (Mon, 10 Feb 2020)
Changed paths:
M llvm/lib/Target/AMDGPU/CaymanInstructions.td
M llvm/lib/Target/AMDGPU/EvergreenInstructions.td
M llvm/lib/Target/AMDGPU/R600Instructions.td
M llvm/test/CodeGen/AMDGPU/fsqrt.ll
Log Message:
-----------
AMDGPU/EG,CM: Implement fsqrt using recip(rsqrt(x)) instead of x * rsqrt(x)
The old version might be faster on EG (RECIP_IEEE is Trans only),
but it'd need extra corner case checks.
This gives correct corner case behaviour and saves a register.
Fixes OCL CTS sqrt test (1-thread, scalar) on Turks.
Reviewer: arsenm
Differential Revision: https://reviews.llvm.org/D74017
(cherry picked from commit e6686adf8a743564f0c455c34f04752ab08cf642)
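
The corner cases the log message refers to can be illustrated with a small numeric model. This is only a sketch under assumptions, not the TableGen patterns from the commit: the helpers rsqrt, recip, sqrt_via_mul and sqrt_via_recip are hypothetical names, and they model IEEE-style behaviour (rsqrt(0) = inf, rsqrt(inf) = 0, recip(inf) = 0, recip(0) = inf) in Python rather than the actual Evergreen/Cayman instructions.

```python
import math

def rsqrt(x):
    # Hypothetical model of an IEEE-style reciprocal square root.
    if x == 0.0:
        return math.copysign(math.inf, x)   # rsqrt(+0) = +inf
    if x < 0.0:
        return math.nan                     # negative input has no real root
    return 1.0 / math.sqrt(x)               # rsqrt(+inf) = 0

def recip(x):
    # Hypothetical model of an IEEE-style reciprocal.
    if x == 0.0:
        return math.copysign(math.inf, x)   # recip(+0) = +inf
    return 1.0 / x                          # recip(+inf) = 0

def sqrt_via_mul(x):
    # Old lowering: x * rsqrt(x). For x = 0 or x = inf this is 0 * inf = NaN.
    return x * rsqrt(x)

def sqrt_via_recip(x):
    # New lowering: recip(rsqrt(x)). No multiply, so no 0 * inf case.
    return recip(rsqrt(x))

if __name__ == "__main__":
    for x in (0.0, 4.0, math.inf):
        print(x, sqrt_via_mul(x), sqrt_via_recip(x))
```

In this model both forms agree for ordinary inputs such as 4.0, but only recip(rsqrt(x)) returns 0 for an input of 0 and inf for an input of inf. The x * rsqrt(x) form would need explicit checks for those inputs, which is the extra cost the log message mentions.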