[PATCH] D75179: [AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 27 06:16:07 PST 2020
foad marked 2 inline comments as done.
foad added inline comments.
================
Comment at: llvm/lib/Analysis/ConstantFolding.cpp:1797
+ if (IntrinsicID == Intrinsic::amdgcn_fract) {
+ // TODO what should amdgcn_fract return for tiny negative arguments?
+ // GLSL defines fract(x) as x - floor(x).
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > arsenm wrote:
> > > > This should match the instruction behavior (although I guess we can ignore the bug on SI)
> > > Is there a good public reference for that? The Vega ISA Reference Guide doesn't go into much detail.
> > This is always a problem, and no. I just go by this comment:
> >
> > ```
> > // V_FRACT is buggy on SI, so the F32 version is never used and (x-floor(x)) is
> > // used instead. However, SI doesn't have V_FLOOR_F64, so the most efficient
> > // way to implement it is using V_FRACT_F64.
> > // The workaround for the V_FRACT bug is:
> > // fract(x) = isnan(x) ? x : min(V_FRACT(x), 0.99999999999999999)
> >
> > // Convert floor(x) to (x - fract(x))
> > ```
> OK, so it sounds like the (non-buggy) hardware uses the same trick as the OpenCL definition, to avoid ever returning 1.0. I'll try to confirm that on some real hardware.
I've confirmed this for f16 and f32 types, on some real gfx9 hardware.
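
For reference, the clamping trick discussed above can be sketched for f32 roughly as follows. This is only an illustration, not the code from the patch: the helper name `fractClamped` is hypothetical, the clamp constant is assumed to be the largest float below 1.0 (as in the OpenCL definition), and infinities are deliberately not modelled because the thread does not say what the hardware returns for them.

```cpp
#include <cmath>

// Hypothetical helper (not part of D75179): models fract(x) for f32 as
// min(x - floor(x), largest float below 1.0), passing NaN through unchanged.
// Assumption: infinite inputs are out of scope for this sketch.
float fractClamped(float X) {
  if (std::isnan(X))
    return X; // NaN in, NaN out, as in the quoted workaround.

  // For tiny negative X, X - floor(X) rounds up to exactly 1.0f.
  float Diff = X - std::floor(X);

  // Clamp so the result always stays strictly below 1.0.
  const float MaxBelowOne = std::nextafter(1.0f, 0.0f);
  return std::fmin(Diff, MaxBelowOne);
}
```

With this definition, an input such as -1e-10f yields the largest float below 1.0 rather than 1.0 itself, which is the case the TODO in the patch asks about.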
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D75179/new/
https://reviews.llvm.org/D75179