[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare
Changpeng Fang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 3 10:02:01 PST 2020
cfang marked 3 inline comments as done.
cfang added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:611
//
-// 1/x -> rcp(x) when fast unsafe rcp is legal or fpmath >= 2.5ULP with
-// denormals flushed.
+// 1/x -> rcp(x) when fdiv is allowed to be re-associated or rcp is accurate.
//
----------------
arsenm wrote:
> This has nothing to do with reassociation
Division re-association rewrites a/b as a * rcp(b); one special case is 1.0/b => 1.0 * rcp(b) = rcp(b).
This is how 1.0/x -> rcp(x) is associated with "re-association".
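
To make the argument above concrete, here is a minimal standalone sketch (not code from the patch). The helper names `rcp`, `divViaRcp`, and `oneOverXFoldsToRcp` are hypothetical, and `rcp` is modeled as a correctly rounded `1.0f / X`, whereas a hardware reciprocal may be less accurate.

```cpp
#include <cassert>

// Hypothetical stand-in for the hardware reciprocal. It is modeled here as a
// correctly rounded division; a real rcp instruction may be less accurate.
static float rcp(float X) { return 1.0f / X; }

// General re-associated form: a/b rewritten as a * rcp(b).
static float divViaRcp(float A, float B) { return A * rcp(B); }

// Special case with numerator 1.0: multiplying by 1.0 is exact, so
// 1.0 * rcp(b) gives the same value as rcp(b) (for non-NaN results).
static bool oneOverXFoldsToRcp(float B) {
  return divViaRcp(1.0f, B) == rcp(B);
}

// Small self-check of the two forms on simple inputs.
static void checkExamples() {
  assert(oneOverXFoldsToRcp(3.0f));
  assert(divViaRcp(6.0f, 2.0f) == 3.0f);
}
```

The sketch only illustrates the algebra: the general rewrite introduces a multiply, while the 1.0 numerator case collapses to a single rcp.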
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:614
+// a/b -> a*rcp(b) when fdiv is allowed to be re-associated.
+static Value *lowerUsingRcp (Value *Num, Value *Den, bool CanReassociateFDiv,
+ bool RcpIsAccurate, IRBuilder<> Builder,
----------------
arsenm wrote:
> This should not be referred to as lowering
I am thinking of a different name. Do you have a meaningful name for the function in mind?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:624
if (const ConstantFP *CLHS = dyn_cast<ConstantFP>(Num)) {
- if (FastUnsafeRcpLegal || Ty->isFloatTy() || Ty->isHalfTy()) {
+ if (CanReassociateFDiv || RcpIsAccurate) {
if (CLHS->isExactlyValue(1.0)) {
----------------
arsenm wrote:
> We aren't fdiv here. We're handling an fdiv, and not splitting it into a multiply and rcp
As explained in a previous comment, 1.0/x -> 1.0*rcp(x) = rcp(x) is a special case of re-association.
As a result, if the options specify re-association, we can do 1.0/x -> rcp(x).
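
The rule described in the two code comments of the diff could be summarized roughly as follows. This is a simplified model under stated assumptions, not the function from the patch: `FDivForm`, `chooseFDivForm`, and `NumIsOne` are hypothetical names, only `CanReassociateFDiv` and `RcpIsAccurate` come from the diff, and any other cases the patch may handle are omitted.

```cpp
// Hypothetical summary of which form an fdiv a/b ends up in.
enum class FDivForm { KeepFDiv, Rcp, MulByRcp };

// NumIsOne: the numerator is the constant 1.0 (hypothetical parameter).
// CanReassociateFDiv, RcpIsAccurate: flag names taken from the diff.
static FDivForm chooseFDivForm(bool NumIsOne, bool CanReassociateFDiv,
                               bool RcpIsAccurate) {
  // 1/x -> rcp(x) when re-association is allowed or rcp is accurate.
  if (NumIsOne && (CanReassociateFDiv || RcpIsAccurate))
    return FDivForm::Rcp;
  // a/b -> a * rcp(b) only when re-association is allowed.
  if (CanReassociateFDiv)
    return FDivForm::MulByRcp;
  // Otherwise leave the fdiv untouched.
  return FDivForm::KeepFDiv;
}
```

In this model, an accurate rcp alone is enough for the 1.0/x case, while the general a/b case needs the re-association permission.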
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73588/new/
https://reviews.llvm.org/D73588