[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

Thu Jan 30 15:22:48 PST 2020


>// Faster 2.5 ULP division that does not support denormals.
>SDValue SITargetLowering::lowerFDIV_FAST(SDValue Op, SelectionDAG &DAG) const {

Also, is it the case that fdiv.fast generates the "Faster 2.5 ULP division that does not support denormals"?
If so, should we still use 2.5 ULP for fdiv.fast and 1.0 ULP for rcp?
________________________________
From: Changpeng Fang via Phabricator <reviews at reviews.llvm.org>
Sent: Thursday, January 30, 2020 3:16 PM
To: Fang, Changpeng <Changpeng.Fang at amd.com>; Arsenault, Matthew <Matthew.Arsenault at amd.com>; Sumner, Brian <Brian.Sumner at amd.com>
Cc: Zhuravlyov, Konstantin <Konstantin.Zhuravlyov at amd.com>; jv356 at scarletmail.rutgers.edu <jv356 at scarletmail.rutgers.edu>; wei.ding2 at amd.com <wei.ding2 at amd.com>; nhaehnle at gmail.com <nhaehnle at gmail.com>; Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; Stuttard, David <David.Stuttard at amd.com>; tpr.ll at botech.co.uk <tpr.ll at botech.co.uk>; Tye, Tony <Tony.Tye at amd.com>; hiraditya at msn.com <hiraditya at msn.com>; Kerbow, Austin <Austin.Kerbow at amd.com>; llvm-commits at lists.llvm.org <llvm-commits at lists.llvm.org>; jun.l at samsung.com <jun.l at samsung.com>
Subject: [PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare


cfang marked 2 inline comments as done.
cfang added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:714
   const bool UseFDivFast = Ty->isFloatTy() && !NeedHighAccuracy &&
-                           !FastUnsafeRcpLegal;
+                           !CanReassociateFDiv;

----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > fdiv.fast doesn't care about the reassociation
> > You are right. This is just the optimization priority issue.
> >
> > If we can reassociate fdiv, x/y -> x * rcp(y) is faster than fdiv.fast so we don't do fdiv.fast.
> The comment and variable name are misleading, as no reassociation is going on here. This needs an explanation here.
I am going to write an explanation here.
But I am confused about fdiv.fast intrinsic:
1.0/x -> fdiv.fast(1.0, x) when denormals are supported, even though I think fdiv.fast does not support denormals.
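
To make the priority described above concrete, here is a minimal sketch of the selection order as I understand it from this thread. All names (FDivLowering, FDivContext, chooseFDivLowering, CanUseRcp) are hypothetical and are not the code in D73588; the sketch only assumes that x * rcp(y) is preferred whenever it is allowed, and that fdiv.fast is the fallback for f32 when high accuracy is not needed.

```cpp
// Hypothetical illustration only; not the actual AMDGPUCodeGenPrepare code.
enum class FDivLowering {
  MulByRcp, // x / y -> x * rcp(y)
  FDivFast, // x / y -> fdiv.fast(x, y), the 2.5 ULP path
  FullFDiv  // leave the fdiv for the regular, accurate lowering
};

// Assumed inputs; in the real pass these would be derived from the
// instruction's type, fast-math flags and accuracy metadata.
struct FDivContext {
  bool IsF32;            // operand type is 32-bit float
  bool NeedHighAccuracy; // caller requires better than 2.5 ULP
  bool CanUseRcp;        // x / y -> x * rcp(y) is permitted
};

inline FDivLowering chooseFDivLowering(const FDivContext &C) {
  // The rcp form is the fastest, so it takes priority when it is allowed.
  if (C.CanUseRcp)
    return FDivLowering::MulByRcp;
  // Otherwise fall back to fdiv.fast for f32 when 2.5 ULP is acceptable.
  if (C.IsF32 && !C.NeedHighAccuracy)
    return FDivLowering::FDivFast;
  // Everything else keeps the full-precision division.
  return FDivLowering::FullFDiv;
}
```

This mirrors the UseFDivFast condition in the diff: fdiv.fast is chosen only for float, without the high-accuracy requirement, and only when the rcp form is not available. How denormal support should feed into these flags is the open question above, so the sketch deliberately does not model it.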

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73588/new/

https://reviews.llvm.org/D73588
