[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 5 14:43:54 PST 2020


arsenm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:611
 //
-// 1/x -> rcp(x) when fast unsafe rcp is legal or fpmath >= 2.5ULP with
-//                                                denormals flushed.
+// 1/x -> rcp(x) when fdiv is allowed to be re-associated or rcp is accurate.
 //
----------------
cfang wrote:
> arsenm wrote:
> > cfang wrote:
> > > arsenm wrote:
> > > > This has nothing to do with reassociation
> > > Division re-association:  a/b -> a * rcp(b), and one special case is 1.0/b => 1.0*rcp(b) = rcp(b).
> > > This is how 1.0/x -> rcp(x) associated with "re-association".  
> > This isn't reassocation. This is just special handling of 1.0/b. Nothing algebraic changes here. There's no multiply introduced here
> Ok, it seems we have a different understanding here.
> 
> I think this is still just a naming issue. Originally the name is something like UnsafeMath? 
> But I do think arcp also gives the permission to do 1.0/x -> rcp(x) even though no multitply is explicitly generated
> (1.0 * rcp(x) = rcp(x)). 
> 
> Maybe we should go back to use the original name as HasFastUnsafeOptions to clear your confusion?
It's also the comment and flag being checked. There's no implicit or explicit multiply here, this is just a reciprocal. This pass is not responsible for doing the reassociating allowed by arcp. This should be a check for the approximate function math flag. allowRecriprocal is not relevant here


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:716
   const FPMathOperator *FPOp = cast<const FPMathOperator>(&FDiv);
   MDNode *FPMath = FDiv.getMetadata(LLVMContext::MD_fpmath);
+  const float ReqdAccuracy =  FPOp->getFPAccuracy();
----------------
You don't need this anymore with getFPAccuracy


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:7422
+  bool AllowInaccurateRcp = DAG.getTarget().Options.UnsafeFPMath ||
+                            Flags.hasAllowReciprocal();
+
----------------
Again, I thought we decided that arcp does not mean this can be less accurate. This should be checking for approximate function


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73588/new/

https://reviews.llvm.org/D73588





More information about the llvm-commits mailing list