[PATCH] D28508: [NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate.

escha via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 31 15:31:58 PST 2017

escha added a comment.

It really surprises me that it's faster! I would expect the cost of SFU functions like RCP/RSQRT to dwarf the cost of a multiply, especially for double.

Also, be careful: rcp(rsqrt(x)) and x * rsqrt(x) have different precisions under some implementations (because fmul is accurate to 0.5 ULP, while rcp/rsqrt may each be accurate to only 2.5 ULP).
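
To make the precision point concrete: to first order the relative errors of chained operations add, so with the figures above x * rsqrt(x) would pick up roughly 2.5 + 0.5 ULP, while rcp(rsqrt(x)) would pick up roughly 2.5 + 2.5 ULP. The sketch below is a host-side way to measure the two formulations against a double-precision reference. The names approx_rsqrt, approx_rcp, sqrt_via_mul, sqrt_via_rcp and ulp_error are hypothetical, and the stand-ins use ordinary float division, so they are more accurate than a real SFU approximation may be; swap in the target's actual instructions to see hardware numbers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical host-side stand-ins for the hardware approximations.
// Assumption: a real SFU RSQRT/RCP may be less accurate than these.
inline float approx_rsqrt(float x) { return 1.0f / std::sqrt(x); }
inline float approx_rcp(float x) { return 1.0f / x; }

// sqrt(x) computed as x * rsqrt(x): one approximation plus one fmul.
inline float sqrt_via_mul(float x) {
    assert(x > 0.0f);  // x == 0 would give 0 * inf = NaN on this path
    return x * approx_rsqrt(x);
}

// sqrt(x) computed as rcp(rsqrt(x)): two chained approximations.
inline float sqrt_via_rcp(float x) {
    assert(x > 0.0f);
    return approx_rcp(approx_rsqrt(x));
}

// Error of `approx` against the reference `exact`, expressed in float ULPs
// taken at the magnitude of the reference value.
inline double ulp_error(float approx, double exact) {
    float ref = static_cast<float>(exact);
    double ulp = static_cast<double>(std::nextafter(ref, INFINITY)) -
                 static_cast<double>(ref);
    return std::fabs(static_cast<double>(approx) - exact) / ulp;
}
```

Calling ulp_error(sqrt_via_mul(x), std::sqrt(double(x))) and the same for sqrt_via_rcp over a range of inputs gives a per-input comparison of the two paths.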


