[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 31 08:20:12 PST 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:714
   const bool UseFDivFast = Ty->isFloatTy() && !NeedHighAccuracy &&
-                           !FastUnsafeRcpLegal;
+                           !CanReassociateFDiv;
 
----------------
cfang wrote:
> arsenm wrote:
> > cfang wrote:
> > > arsenm wrote:
> > > > fdiv.fast doesn't care about the reassociation
> > > You are right. This is just the optimization priority issue.
> > > 
> > > If we can reassociate fdiv, x/y -> x * rcp(y) is faster than fdiv.fast so we don't do fdiv.fast.
> > The comment and variable name are misleading, as no reassociation is going on here. This needs an explanation.
> I am going to write an explanation here.
> But I am confused about fdiv.fast intrinsic:
> 1.0/x -> fdiv.fast (1.0,  x) when denormals are supported. Because I think  does not support fdiv.fast.
> 
I'm not sure what the question is. fdiv.fast is used depending on whether the denormal mode needs to be switched or not, and is separate from rcp. If we can use rcp, it's preferable to fdiv.fast.
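
To make the priority concrete, here is a minimal standalone sketch of the selection order discussed in this thread. It is not the code in AMDGPUCodeGenPrepare.cpp: the names FDivLowering, FDivQuery, chooseFDivLowering and CanUseRcp are hypothetical, and the conditions under which rcp is actually usable (including the denormal-mode handling) are assumed to be precomputed rather than modeled.

```cpp
#include <cassert>

// Hypothetical names for illustration only; not the actual pass code.
enum class FDivLowering { Rcp, FDivFast, Default };

struct FDivQuery {
  bool IsFloatTy;        // operand type is 32-bit float
  bool NeedHighAccuracy; // accuracy requirement rules out fdiv.fast
  bool CanUseRcp;        // assumed precomputed; legality rules are not modeled
};

inline FDivLowering chooseFDivLowering(FDivQuery Q) {
  // rcp is preferred whenever it is usable: x / y -> x * rcp(y).
  if (Q.CanUseRcp)
    return FDivLowering::Rcp;
  // Same shape as the UseFDivFast condition in the hunk above, with the
  // rcp check already handled by the early return.
  if (Q.IsFloatTy && !Q.NeedHighAccuracy)
    return FDivLowering::FDivFast;
  // Otherwise leave the fdiv for the default lowering.
  return FDivLowering::Default;
}

// Small usage check: rcp takes priority over fdiv.fast when both would apply.
inline void checkFDivLoweringExamples() {
  assert(chooseFDivLowering({true, false, true}) == FDivLowering::Rcp);
  assert(chooseFDivLowering({true, false, false}) == FDivLowering::FDivFast);
}
```

Written this way, fdiv.fast is selected only because rcp was not chosen first, so the last term of the condition is an ordering constraint and not a statement about reassociation.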


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73588/new/

https://reviews.llvm.org/D73588




