[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

Wed Feb 5 15:11:30 PST 2020

cfang marked 3 inline comments as done.
cfang added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:611
 //
-// 1/x -> rcp(x) when fast unsafe rcp is legal or fpmath >= 2.5ULP with
-//                                                denormals flushed.
+// 1/x -> rcp(x) when fdiv is allowed to be re-associated or rcp is accurate.
 //
----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > cfang wrote:
> > > > arsenm wrote:
> > > > > This has nothing to do with reassociation
> > > > Division re-association:  a/b -> a * rcp(b), and one special case is 1.0/b => 1.0*rcp(b) = rcp(b).
> > > > This is how 1.0/x -> rcp(x) associated with "re-association".  
> > > This isn't reassocation. This is just special handling of 1.0/b. Nothing algebraic changes here. There's no multiply introduced here
> > Ok, it seems we have a different understanding here.
> > 
> > I think this is still just a naming issue. Originally the name is something like UnsafeMath? 
> > But I do think arcp also gives the permission to do 1.0/x -> rcp(x) even though no multitply is explicitly generated
> > (1.0 * rcp(x) = rcp(x)). 
> > 
> > Maybe we should go back to use the original name as HasFastUnsafeOptions to clear your confusion?
> It's also the comment and flag being checked. There's no implicit or explicit multiply here, this is just a reciprocal. This pass is not responsible for doing the reassociating allowed by arcp. This should be a check for the approximate function math flag. allowRecriprocal is not relevant here
So we only check unsafe-fast-math or afn here, not arcp?  Thanks.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:716
   const FPMathOperator *FPOp = cast<const FPMathOperator>(&FDiv);
   MDNode *FPMath = FDiv.getMetadata(LLVMContext::MD_fpmath);
+  const float ReqdAccuracy =  FPOp->getFPAccuracy();
----------------
arsenm wrote:
> You don't need this anymore with getFPAccuracy
Do you think we will no longer need fpmath metadata after this point?
in fdiv.fast intrinsic or the  fdiv itself which may not be changed (especially in the vector cases).

================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7422
+  bool AllowInaccurateRcp = DAG.getTarget().Options.UnsafeFPMath ||
+                            Flags.hasAllowReciprocal();
+
----------------
arsenm wrote:
> Again, I thought we decided that arcp does not mean this can be less accurate. This should be checking for approximate function
Check only afn right?  not arcp && afn?  Thanks.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73588/new/

https://reviews.llvm.org/D73588