[PATCH] D75179: [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic

Thu Feb 27 06:16:07 PST 2020

foad marked 2 inline comments as done.
foad added inline comments.

================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:1797
+    if (IntrinsicID == Intrinsic::amdgcn_fract) {
+      // TODO what should amdgcn_fract return for tiny negative arguments?
+      // GLSL defines fract(x) as x - floor(x).
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > arsenm wrote:
> > > > This should match the instruction behavior (although I guess we can ignore the bug on SI)
> > > Is there a good public reference for that? The Vega ISA Reference Guide doesn't go into much detail.
> > This is always a problem, and no. I just go by this comment:
> > 
> > ```
> > 
> > // V_FRACT is buggy on SI, so the F32 version is never used and (x-floor(x)) is
> > // used instead. However, SI doesn't have V_FLOOR_F64, so the most efficient
> > // way to implement it is using V_FRACT_F64.
> > // The workaround for the V_FRACT bug is:
> > //    fract(x) = isnan(x) ? x : min(V_FRACT(x), 0.99999999999999999)
> > 
> > // Convert floor(x) to (x - fract(x))
> > ```
> > 
> OK, so it sounds like the (non-buggy) hardware uses the same trick as the OpenCL definition, to avoid ever returning 1.0. I'll try to confirm that on some real hardware.
I've confirmed this behaviour (fract never returns 1.0) for the f16 and f32 types on real gfx9 hardware.
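
For reference, here is a minimal standalone sketch of the clamped definition discussed above, for the f32 case only. It assumes the OpenCL-style formula, fmin(x - floor(x), largest float below 1.0), and passes NaN through as in the quoted workaround comment. The helper name `fractF32` is hypothetical; this is not the code in ConstantFolding.cpp, and infinities are left out because their behaviour is not covered in this thread.

```
#include <cmath>

// Hypothetical helper illustrating the clamped fract definition for f32.
// Only finite inputs and NaN are modelled; infinities are out of scope.
float fractF32(float X) {
  // NaN is passed through unchanged, as in the quoted workaround.
  if (std::isnan(X))
    return X;

  // Largest float strictly less than 1.0.
  const float AlmostOne = std::nextafter(1.0f, 0.0f);

  // For tiny negative X, X - floor(X) rounds up to exactly 1.0f,
  // so the clamp is what keeps the result below 1.0.
  return std::fmin(X - std::floor(X), AlmostOne);
}
```

With this definition a tiny negative input such as -1e-10f yields the largest float below 1.0 rather than 1.0 itself, which is the case the TODO asks about.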

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75179/new/

https://reviews.llvm.org/D75179