[PATCH] D28508: [NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate.
escha via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 31 15:31:58 PST 2017
escha added a comment.
It really surprises me that it's faster! I would expect the cost of SFU functions like RCP/RSQRT to dwarf the cost of a multiply, especially for double.
Also, do be careful: rcp(rsqrt(x)) and x * rsqrt(x) have different precision under some implementations (fmul is accurate to 0.5 ULP, while rcp and rsqrt may each be accurate to only 2.5 ULP).
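To make the precision concern concrete, here is a minimal host-side sketch in C++, not NVPTX code. It assumes a made-up error model in which an approximate rsqrt or rcp is the correctly rounded result nudged by a whole number of representable floats. The helpers `nudge`, `approx_rsqrt`, `approx_rcp`, `sqrt_via_mul`, `sqrt_via_rcp`, and `ulp_distance` are all hypothetical names for illustration; real hardware error is not an integer number of ULPs and need not be worst-case.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Move v by |ulps| representable floats: away from zero if ulps > 0,
// toward zero if ulps < 0 (v is assumed positive and finite).
static float nudge(float v, int ulps) {
    const float target = ulps > 0 ? INFINITY : 0.0f;
    for (int i = 0; i < std::abs(ulps); ++i) v = std::nextafter(v, target);
    return v;
}

// Hypothetical stand-in for a hardware rsqrt: the correctly rounded
// 1/sqrt(x), perturbed by err_ulps floats (assumed error model).
static float approx_rsqrt(float x, int err_ulps) {
    const float r = static_cast<float>(1.0 / std::sqrt(static_cast<double>(x)));
    return nudge(r, err_ulps);
}

// Hypothetical stand-in for a hardware rcp, same assumed error model.
static float approx_rcp(float x, int err_ulps) {
    const float r = static_cast<float>(1.0 / static_cast<double>(x));
    return nudge(r, err_ulps);
}

// sqrt(x) as x * rsqrt(x): one approximate op plus one correctly rounded fmul.
static float sqrt_via_mul(float x, int err_ulps) {
    return x * approx_rsqrt(x, err_ulps);
}

// sqrt(x) as rcp(rsqrt(x)): two approximate ops. The rcp error is given the
// opposite sign so that both errors push the result the same way.
static float sqrt_via_rcp(float x, int err_ulps) {
    return approx_rcp(approx_rsqrt(x, err_ulps), -err_ulps);
}

// Number of representable floats between two positive, finite floats.
static std::int64_t ulp_distance(float a, float b) {
    std::int32_t ia, ib;
    std::memcpy(&ia, &a, sizeof ia);
    std::memcpy(&ib, &b, sizeof ib);
    return std::llabs(static_cast<std::int64_t>(ia) - static_cast<std::int64_t>(ib));
}
```

Comparing `ulp_distance(sqrt_via_mul(x, 2), std::sqrt(x))` against `ulp_distance(sqrt_via_rcp(x, 2), std::sqrt(x))` over a range of inputs gives a rough feel for how the two formulations accumulate error differently under this assumed model. With `err_ulps` set to 0 both reduce to correctly rounded operations.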