[PATCH] D28508: [NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate.

escha via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 31 15:31:58 PST 2017


escha added a comment.

It really surprises me that it's faster! I would expect SFU (special function unit) operations like RCP/RSQRT to dwarf the cost of a multiply, especially for double.

Also, be careful: rcp(rsqrt(x)) and x * rsqrt(x) have different precision under some implementations, because fmul is accurate to 0.5 ULP while rcp and rsqrt may each be off by as much as 2.5 ULP.
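
To make the precision point concrete, here is a rough host-side sketch in C++. It is not from the patch and does not model any real GPU. The helper names (model_rsqrt, model_rcp, sqrt_via_mul, sqrt_via_rcp, rel_error) are hypothetical, and the error model is an assumption: each approximate operation returns the exact result scaled by (1 + rel_err) and then rounded to float, while the multiply is an ordinary correctly rounded float multiply.

```cpp
#include <cassert>
#include <cmath>

// Assumed error model, not a real device: the exact result is scaled by
// (1 + rel_err) and then rounded to float.
static float model_rsqrt(float x, double rel_err) {
    assert(x > 0.0f);  // the sketch only covers positive, finite inputs
    return static_cast<float>((1.0 / std::sqrt(static_cast<double>(x))) * (1.0 + rel_err));
}

static float model_rcp(float x, double rel_err) {
    assert(x > 0.0f);
    return static_cast<float>((1.0 / static_cast<double>(x)) * (1.0 + rel_err));
}

// sqrt(x) as x * rsqrt(x): one approximate op plus one correctly rounded fmul.
float sqrt_via_mul(float x, double rsqrt_err) {
    return x * model_rsqrt(x, rsqrt_err);
}

// sqrt(x) as rcp(rsqrt(x)): two approximate ops in sequence.
float sqrt_via_rcp(float x, double rsqrt_err, double rcp_err) {
    return model_rcp(model_rsqrt(x, rsqrt_err), rcp_err);
}

// Relative error of `got` against a double-precision reference square root.
double rel_error(float got, float x) {
    const double ref = std::sqrt(static_cast<double>(x));
    return std::fabs(static_cast<double>(got) - ref) / ref;
}
```

For example, take e = 2.5 * std::ldexp(1.0, -23) (an assumed figure: 2.5 float ULP measured at the bottom of a binade) and compare rel_error(sqrt_via_mul(x, e), x) with rel_error(sqrt_via_rcp(x, e, -e), x). The opposite signs are the bad case for the rcp form, because the reciprocal flips the sign of the rsqrt error so the two approximation errors add. Under this model the rcp form ends up with the larger relative error, while the multiply form only adds the rounding of a single fmul on top of the rsqrt error.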


Repository:
  rL LLVM

https://reviews.llvm.org/D28508




