[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare
Changpeng Fang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 30 15:16:43 PST 2020
cfang marked 2 inline comments as done.
cfang added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:714
const bool UseFDivFast = Ty->isFloatTy() && !NeedHighAccuracy &&
- !FastUnsafeRcpLegal;
+ !CanReassociateFDiv;
----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > fdiv.fast doesn't care about the reassociation
> > You are right. This is just an optimization priority issue.
> >
> > If we can reassociate fdiv, x/y -> x * rcp(y) is faster than fdiv.fast so we don't do fdiv.fast.
> The comment and variable name are misleading, as no reassociation is going on here. This needs an explanation.
I am going to write an explanation here.
But I am confused about the fdiv.fast intrinsic:
1.0/x -> fdiv.fast (1.0, x) when denormals are supported. Because I think does not support fdiv.fast.
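
For illustration only, the priority described in the quoted discussion (prefer x * rcp(y) when it is allowed, otherwise consider fdiv.fast) might be modeled roughly as below. This is a simplified sketch, not the code in AMDGPUCodeGenPrepare.cpp: the enum, the function name, and the parameter CanUseRcpMultiply are hypothetical, and the sketch assumes the rcp form is chosen whenever the flag is set.

```cpp
// Hypothetical model of the lowering priority; not the actual LLVM code.
enum class FDivLowering { RcpMultiply, FDivFast, Default };

// All names here are illustrative assumptions.
FDivLowering chooseFDivLowering(bool IsFloatTy, bool NeedHighAccuracy,
                                bool CanUseRcpMultiply) {
  // Assumed: x / y -> x * rcp(y) is preferred when permitted,
  // so fdiv.fast is not used in that case.
  if (CanUseRcpMultiply)
    return FDivLowering::RcpMultiply;
  // Loosely mirrors the condition in the diff:
  // Ty->isFloatTy() && !NeedHighAccuracy && !CanReassociateFDiv
  if (IsFloatTy && !NeedHighAccuracy)
    return FDivLowering::FDivFast;
  // Otherwise leave the fdiv for the default lowering.
  return FDivLowering::Default;
}
```

Read this way, the flag in the UseFDivFast condition only expresses which option wins, which may be why a name mentioning reassociation reads as misleading.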
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73588/new/
https://reviews.llvm.org/D73588