[PATCH] D75179: [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic

Wed Feb 26 09:51:02 PST 2020

foad marked 3 inline comments as done.
foad added inline comments.

================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:1797
+    if (IntrinsicID == Intrinsic::amdgcn_fract) {
+      // TODO what should amdgcn_fract return for tiny negative arguments?
+      // GLSL defines fract(x) as x - floor(x).
----------------
arsenm wrote:
> foad wrote:
> > arsenm wrote:
> > > This should match the instruction behavior (although I guess we can ignore the bug on SI)
> > Is there a good public reference for that? The Vega ISA Reference Guide doesn't go into much detail.
> This is always a problem, and no. I just go by this comment:
> 
> ```
> // V_FRACT is buggy on SI, so the F32 version is never used and (x-floor(x)) is
> // used instead. However, SI doesn't have V_FLOOR_F64, so the most efficient
> // way to implement it is using V_FRACT_F64.
> // The workaround for the V_FRACT bug is:
> //    fract(x) = isnan(x) ? x : min(V_FRACT(x), 0.99999999999999999)
> 
> // Convert floor(x) to (x - fract(x))
> ```
> 
OK, so it sounds like the (non-buggy) hardware uses the same trick as the OpenCL definition, to avoid ever returning 1.0. I'll try to confirm that on some real hardware.
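
For illustration, the difference between the two definitions for a tiny negative argument can be sketched as below. This is a standalone sketch, not code from the patch: the names fractGLSL and fractClamped are hypothetical, the clamp constant is assumed to be the largest double below 1.0, and infinities are not considered. Whether the hardware really behaves like the clamped version is still to be confirmed.

```cpp
#include <cmath>

// Hypothetical helper (not LLVM code): GLSL-style fract, x - floor(x).
// For a tiny negative x, floor(x) is -1.0 and x + 1.0 rounds up to 1.0.
double fractGLSL(double X) { return X - std::floor(X); }

// Hypothetical helper (not LLVM code): clamped fract in the style of the
// workaround comment quoted above. NaN is passed through unchanged, and the
// result is clamped so that it never reaches 1.0.
// Assumption: the clamp value is the largest double strictly below 1.0.
double fractClamped(double X) {
  if (std::isnan(X))
    return X;
  const double JustBelowOne = std::nextafter(1.0, 0.0);
  return std::fmin(X - std::floor(X), JustBelowOne);
}
```

With these definitions, fractGLSL(-1e-30) returns exactly 1.0, while fractClamped(-1e-30) returns a value strictly less than 1.0. The two agree for ordinary arguments such as 1.25.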

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75179/new/

https://reviews.llvm.org/D75179