[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

Thu Jan 30 15:16:43 PST 2020

cfang marked 2 inline comments as done.
cfang added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:714
   const bool UseFDivFast = Ty->isFloatTy() && !NeedHighAccuracy &&
-                           !FastUnsafeRcpLegal;
+                           !CanReassociateFDiv;

----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > fdiv.fast doesn't care about the reassociation
> > You are right. This is just the optimization priority issue.
> > 
> > If we can reassociate fdiv, x/y -> x * rcp(y) is faster than fdiv.fast so we don't do fdiv.fast.
> The comment and variable name are misleading, as no reassociation is going on here. This needs an explanation here.
I am going to write an explanation here.
But I am confused about the fdiv.fast intrinsic:
1.0/x -> fdiv.fast (1.0,  x) when denormals are supported. Because I think  does not support fdiv.fast.
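
To make the priority concrete, here is a rough standalone sketch of what I mean. It is not the patch code: `FDivLowering` and `chooseFDivLowering` are made-up names, and I am assuming the rcp path is considered first regardless of type.

```cpp
// Standalone model of the lowering priority discussed in this thread.
// Hypothetical names; this is NOT the AMDGPUCodeGenPrepare implementation.
enum class FDivLowering { MulByRcp, FDivFast, Default };

FDivLowering chooseFDivLowering(bool IsFloatTy, bool NeedHighAccuracy,
                                bool CanReassociateFDiv) {
  // Assumption: when x/y -> x * rcp(y) is permitted, it is preferred
  // because it is faster than fdiv.fast.
  if (CanReassociateFDiv)
    return FDivLowering::MulByRcp;

  // Mirrors the UseFDivFast condition from the hunk above; the
  // !CanReassociateFDiv term is already implied by the early return.
  if (IsFloatTy && !NeedHighAccuracy)
    return FDivLowering::FDivFast;

  // Otherwise leave the fdiv for the regular lowering.
  return FDivLowering::Default;
}
```

So the flag does not change what fdiv.fast itself requires; it only decides which of the two fast paths gets picked first.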

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73588/new/

https://reviews.llvm.org/D73588