[PATCH] D75179: [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 27 06:36:59 PST 2020


foad marked an inline comment as done.
foad added inline comments.


================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:1797
+    if (IntrinsicID == Intrinsic::amdgcn_fract) {
+      // TODO what should amdgcn_fract return for tiny negative arguments?
+      // GLSL defines fract(x) as x - floor(x).
----------------
foad wrote:
> foad wrote:
> > arsenm wrote:
> > > foad wrote:
> > > > arsenm wrote:
> > > > > This should match the instruction behavior (although I guess we can ignore the bug on SI)
> > > > Is there a good public reference for that? The Vega ISA Reference Guide doesn't go into much detail.
> > > This is always a problem, and no. I just go by this comment:
> > > 
> > > ```
> > > // V_FRACT is buggy on SI, so the F32 version is never used and (x-floor(x)) is
> > > // used instead. However, SI doesn't have V_FLOOR_F64, so the most efficient
> > > // way to implement it is using V_FRACT_F64.
> > > // The workaround for the V_FRACT bug is:
> > > //    fract(x) = isnan(x) ? x : min(V_FRACT(x), 0.99999999999999999)
> > > 
> > > // Convert floor(x) to (x - fract(x))
> > > ```
> > > 
> > OK, so it sounds like the (non-buggy) hardware uses the same trick as the OpenCL definition, to avoid ever returning 1.0. I'll try to confirm that on some real hardware.
> I've confirmed this for f16 and f32 types, on some real gfx9 hardware.
... and confirmed for f64 too.
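
For reference, the behaviour discussed above can be sketched in plain C++. This is only a minimal sketch, not the code from the patch: `fractClamped` is a hypothetical helper name, and the clamp to the largest representable value below 1.0 follows the OpenCL-style definition mentioned in the thread. Infinite inputs were not discussed here, so the sketch does not try to model them.

```cpp
#include <cmath>

// Hypothetical helper (not part of D75179): fract(x) = x - floor(x),
// clamped so the result never rounds up to 1.0 for tiny negative x.
template <typename T>
T fractClamped(T x) {
  // NaN inputs are passed through unchanged, as in the quoted workaround.
  if (std::isnan(x))
    return x;
  // Largest representable value of type T that is strictly below 1.0.
  const T belowOne = std::nextafter(T(1), T(0));
  // Note: for infinite x the subtraction yields NaN and fmin returns
  // belowOne; that case is an unverified assumption of this sketch.
  return std::fmin(x - std::floor(x), belowOne);
}
```

Without the clamp, a tiny negative float such as `-1e-10f` gives `x - floor(x)` rounding to exactly `1.0f`; with it, the result stays strictly below 1.0.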


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75179/new/

https://reviews.llvm.org/D75179




