[PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

Arsenault, Matthew via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 31 08:23:01 PST 2020


The comment on lowerFDIV_FAST says 2.5 ULP, so I would just go with that for now.

From: "Arsenault, Matthew" <Matthew.Arsenault at amd.com>
Date: Friday, January 31, 2020 at 11:18
To: "Fang, Changpeng" <Changpeng.Fang at amd.com>, Changpeng Fang via Phabricator <reviews at reviews.llvm.org>, Brian Sumner <Brian.Sumner at amd.com>
Cc: Konstantin Zhuravlyov <Konstantin.Zhuravlyov at amd.com>, "jv356 at scarletmail.rutgers.edu" <jv356 at scarletmail.rutgers.edu>, Wei Ding <wei.ding2 at amd.com>, "nhaehnle at gmail.com" <nhaehnle at gmail.com>, "Liu, Yaxun (Sam)" <Yaxun.Liu at amd.com>, "Stuttard, David" <David.Stuttard at amd.com>, "tpr.ll at botech.co.uk" <tpr.ll at botech.co.uk>, Tony Tye <Tony.Tye at amd.com>, "hiraditya at msn.com" <hiraditya at msn.com>, "Kerbow, Austin" <Austin.Kerbow at amd.com>, "llvm-commits at lists.llvm.org" <llvm-commits at lists.llvm.org>, "jun.l at samsung.com" <jun.l at samsung.com>
Subject: Re: [PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare

I’m not sure what the ulp is for the full fdiv expansions, so I’m not sure

-Matt

From: "Fang, Changpeng" <Changpeng.Fang at amd.com>
Date: Thursday, January 30, 2020 at 18:22
To: Changpeng Fang via Phabricator <reviews at reviews.llvm.org>, "Arsenault, Matthew" <Matthew.Arsenault at amd.com>, Brian Sumner <Brian.Sumner at amd.com>
Cc: Konstantin Zhuravlyov <Konstantin.Zhuravlyov at amd.com>, "jv356 at scarletmail.rutgers.edu" <jv356 at scarletmail.rutgers.edu>, Wei Ding <wei.ding2 at amd.com>, "nhaehnle at gmail.com" <nhaehnle at gmail.com>, "Liu, Yaxun (Sam)" <Yaxun.Liu at amd.com>, "Stuttard, David" <David.Stuttard at amd.com>, "tpr.ll at botech.co.uk" <tpr.ll at botech.co.uk>, Tony Tye <Tony.Tye at amd.com>, "hiraditya at msn.com" <hiraditya at msn.com>, "Kerbow, Austin" <Austin.Kerbow at amd.com>, "llvm-commits at lists.llvm.org" <llvm-commits at lists.llvm.org>, "jun.l at samsung.com" <jun.l at samsung.com>
Subject: Re: [PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare



>// Faster 2.5 ULP division that does not support denormals.
>SDValue SITargetLowering::lowerFDIV_FAST(SDValue Op, SelectionDAG &DAG) const {

Also, is it the case that fdiv.fast generates "Faster 2.5 ULP division that does not support denormals"?
If so, should we still use 2.5 ULP for fdiv.fast, and 1.0 ULP for rcp?
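
To make the ULP figures above concrete, here is a minimal host-side sketch of how a distance in ULPs between two float results can be counted. The names orderedBits, ulpDistance, and rcpMulDistance are hypothetical and do not come from the patch. The sketch uses ordinary CPU float arithmetic, so it does not model the accuracy of the hardware rcp instruction or of fdiv.fast; it only shows what is being counted.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a finite float onto an ordered integer line so that adjacent
// representable floats differ by exactly 1 (hypothetical helper).
static std::int64_t orderedBits(float f) {
  std::int32_t i;
  std::memcpy(&i, &f, sizeof i);
  // Negative floats are sign-magnitude; mirror them onto the negative axis.
  return i < 0 ? static_cast<std::int64_t>(INT32_MIN) - i : i;
}

// Number of representable floats between a and b (0 if equal).
std::int64_t ulpDistance(float a, float b) {
  assert(std::isfinite(a) && std::isfinite(b));
  std::int64_t d = orderedBits(a) - orderedBits(b);
  return d < 0 ? -d : d;
}

// Distance between the host's x / y and the reassociated form x * (1 / y).
// This compares two float results with each other, not with the exact quotient.
std::int64_t rcpMulDistance(float x, float y) {
  float quotient = x / y;
  float viaRcp = x * (1.0f / y);
  return ulpDistance(quotient, viaRcp);
}
```

Calling rcpMulDistance over a range of inputs gives a rough feel for how far the two forms can drift apart on the host; a real accuracy bound for the GPU lowering would have to be measured against the hardware instructions themselves.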
________________________________
From: Changpeng Fang via Phabricator <reviews at reviews.llvm.org>
Sent: Thursday, January 30, 2020 3:16 PM
To: Fang, Changpeng <Changpeng.Fang at amd.com>; Arsenault, Matthew <Matthew.Arsenault at amd.com>; Sumner, Brian <Brian.Sumner at amd.com>
Cc: Zhuravlyov, Konstantin <Konstantin.Zhuravlyov at amd.com>; jv356 at scarletmail.rutgers.edu <jv356 at scarletmail.rutgers.edu>; wei.ding2 at amd.com <wei.ding2 at amd.com>; nhaehnle at gmail.com <nhaehnle at gmail.com>; Liu, Yaxun (Sam) <Yaxun.Liu at amd.com>; Stuttard, David <David.Stuttard at amd.com>; tpr.ll at botech.co.uk <tpr.ll at botech.co.uk>; Tye, Tony <Tony.Tye at amd.com>; hiraditya at msn.com <hiraditya at msn.com>; Kerbow, Austin <Austin.Kerbow at amd.com>; llvm-commits at lists.llvm.org <llvm-commits at lists.llvm.org>; jun.l at samsung.com <jun.l at samsung.com>
Subject: [PATCH] D73588: AMDGPU: Enhancement on FDIV lowering in AMDGPUCodeGenPrepare


cfang marked 2 inline comments as done.
cfang added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:714
   const bool UseFDivFast = Ty->isFloatTy() && !NeedHighAccuracy &&
-                           !FastUnsafeRcpLegal;
+                           !CanReassociateFDiv;

----------------
arsenm wrote:
> cfang wrote:
> > arsenm wrote:
> > > fdiv.fast doesn't care about the reassociation
> > You are right. This is just the optimization priority issue.
> >
> > If we can reassociate fdiv, x/y -> x * rcp(y) is faster than fdiv.fast so we don't do fdiv.fast.
> The comment and variable name are misleading, as no reassociation is going on here. This needs an explanation.
I am going to write an explanation here.
But I am confused about fdiv.fast intrinsic:
1.0/x -> fdiv.fast (1.0,  x) when denormals are supported. Because I think  does not support fdiv.fast.
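
The priority described in this exchange (prefer x * rcp(y) when it is allowed, otherwise fdiv.fast for float without a high-accuracy requirement, otherwise the full expansion) can be sketched as a small standalone function. This is only an illustration of the ordering discussed here, not the code in AMDGPUCodeGenPrepare.cpp: FDivLowering, FDivQuery, CanUseRcp, and chooseFDivLowering are hypothetical names, and the sketch does not model which types the rcp path applies to or how denormal support affects the choice.

```cpp
// Hypothetical labels for the three lowering strategies under discussion.
enum class FDivLowering { RcpMul, FDivFast, FullFDiv };

// Hypothetical summary of the conditions mentioned in the review.
struct FDivQuery {
  bool IsFloatTy;        // operand type is 32-bit float
  bool NeedHighAccuracy; // accuracy requirement is tighter than fdiv.fast gives
  bool CanUseRcp;        // assumption: x / y -> x * rcp(y) is permitted
};

// Pick a strategy in the priority order described in the thread.
constexpr FDivLowering chooseFDivLowering(const FDivQuery &Q) {
  // x * rcp(y) is the cheapest form, so it wins whenever it is allowed.
  if (Q.CanUseRcp)
    return FDivLowering::RcpMul;
  // Otherwise fdiv.fast is used only for float without a high-accuracy need.
  if (Q.IsFloatTy && !Q.NeedHighAccuracy)
    return FDivLowering::FDivFast;
  // Everything else falls back to the full fdiv expansion.
  return FDivLowering::FullFDiv;
}

// The rcp path takes priority even when fdiv.fast would also qualify.
static_assert(chooseFDivLowering({true, false, true}) == FDivLowering::RcpMul,
              "rcp form should be preferred over fdiv.fast");
```

Written this way, the second condition mirrors the UseFDivFast expression in the diff: fdiv.fast is chosen only when the cheaper rcp form has already been ruled out, which is a priority decision rather than a property of fdiv.fast itself.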



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73588/new/

https://reviews.llvm.org/D73588


