[llvm-dev] [RFC] Making space for a flush-to-zero flag in FastMathFlags

Craig Topper via llvm-dev llvm-dev at lists.llvm.org
Sun Mar 17 13:46:59 PDT 2019


Can we move HasValueHandle out of the byte it shares with SubclassOptionalData
and into the flags at the bottom of Value by
shrinking NumUserOperands to 27 bits?
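
To make the suggestion concrete, here is a rough sketch of the before and
after layouts. The struct names and the OtherFlags field are hypothetical
stand-ins, and the 28-bit starting width of NumUserOperands is inferred from
the proposal to shrink it to 27; the real llvm::Value declaration may differ.

```cpp
#include <cstdint>

// Hypothetical model of the non-pointer part of Value today: HasValueHandle
// shares a byte with SubclassOptionalData, leaving 7 bits for fast-math flags.
struct ValueHeaderBefore {
  unsigned char SubclassID;
  unsigned char HasValueHandle : 1;
  unsigned char SubclassOptionalData : 7;
  unsigned short SubclassData;
  unsigned NumUserOperands : 28; // assumed current width
  unsigned OtherFlags : 4;       // stand-in for the remaining flag bits
};

// Hypothetical model after the move: SubclassOptionalData owns the whole
// byte, and HasValueHandle takes the bit freed by shrinking NumUserOperands.
struct ValueHeaderAfter {
  unsigned char SubclassID;
  unsigned char SubclassOptionalData; // now 8 bits
  unsigned short SubclassData;
  unsigned NumUserOperands : 27;
  unsigned HasValueHandle : 1;
  unsigned OtherFlags : 4;
};

// Bit-field layout is implementation-defined, but on the usual ABIs both
// versions occupy the same number of bytes.
static_assert(sizeof(ValueHeaderBefore) == sizeof(ValueHeaderAfter),
              "moving HasValueHandle should not grow the header");
```

The cost is that the maximum operand count drops from 2^28 - 1 to 2^27 - 1.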

~Craig


On Sat, Mar 16, 2019 at 12:51 PM Sanjoy Das via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hi,
>
> I need to add a flush-denormals-to-zero (FTZ) flag to FastMathFlags,
> but we've already used up the 7 bits available in
> Value::SubclassOptionalData (the "backing storage" for
> FPMathOperator::getFastMathFlags()).  These are the possibilities I
> can think of:
>
> 1. Increase the size of FPMathOperator.  This gives us some additional
> bits for FTZ and other fastmath flags we'd want to add in the future.
> Obvious downside is that it increases LLVM's memory footprint.
>
> 2. Steal some low bits from pointers already present in Value and
> expose them as part of SubclassOptionalData.  We can at least steal 3
> bits from the first two words in Value which are both pointers.  The
> LSB of the first pointer needs to be 0, otherwise we could steal 4
> bits.
>
> 3. Allow only specific combinations in FastMathFlags.  In practice, I
> don't think folks are equally interested in all the 2^N combinations
> present in FastMathFlags, so we could compromise and allow only the
> most "typical" 2^7 combinations (e.g. we could merge nonan and noinf into a
> single bit, under the assumption that users want to enable or disable
> them as a unit).  I'm unsure if establishing the most typical 2^7
> combinations will be straightforward though.
>
> 4. Function level attributes.  Instead of wasting precious
> instruction-level space, we could move all FP math attributes onto the
> containing function.  I'm not sure if this will work for all frontends
> and it also raises annoying tradeoffs around inlining and other
> inter-procedural passes.
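
As an illustration of option (3), the sketch below packs eight logical flags
into seven stored bits by letting one stored bit stand for both nonan and
noinf. The flag names mirror the existing FastMathFlags members, but the bit
positions, the FlushToZero name, and the helper functions are all hypothetical
and not part of LLVM.

```cpp
#include <cstdint>

// Hypothetical logical flag set: the seven existing flags plus FTZ.
enum LogicalFlag : unsigned {
  AllowReassoc    = 1u << 0,
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6,
  FlushToZero     = 1u << 7, // hypothetical new flag
};

// A combination survives the round trip only if NoNaNs and NoInfs agree.
inline bool isRepresentable(unsigned Flags) {
  return ((Flags & NoNaNs) != 0) == ((Flags & NoInfs) != 0);
}

// Drops the NoInfs bit and shifts the five higher flags down by one, so the
// result always fits in 7 bits.
inline std::uint8_t encode(unsigned Flags) {
  unsigned Low = Flags & 0x3;          // AllowReassoc and the merged NaN/Inf bit
  unsigned High = (Flags >> 3) & 0x1F; // the five flags above NoInfs
  return static_cast<std::uint8_t>(Low | (High << 2));
}

// Reverses encode(), re-deriving NoInfs from the merged bit.
inline unsigned decode(std::uint8_t Stored) {
  unsigned Low = Stored & 0x3;
  unsigned High = (Stored >> 2) & 0x1F;
  unsigned Flags = Low | (High << 3);
  if (Flags & NoNaNs)
    Flags |= NoInfs;
  return Flags;
}
```

Any combination that sets only one of NoNaNs and NoInfs cannot be stored under
this scheme, which is exactly the trade-off the option describes.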
>
>
> My gut feeling is to go with (2).  It should be semantically
> invisible, have no impact on memory usage, and the ugly bit
> manipulation can be abstracted away.  What do you think?  Any other
> possibilities I missed?
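
A minimal sketch of how the bit manipulation in option (2) could be hidden
behind an accessor, assuming the pointees are aligned strictly enough that the
low bits of their addresses are always zero. TaggedPtr is a hypothetical
helper written for this illustration; LLVM's existing PointerIntPair follows
the same general idea.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: stores a small integer in the low Bits bits of a
// pointer whose pointee alignment guarantees those bits are otherwise zero.
template <typename T, unsigned Bits> class TaggedPtr {
  static_assert(alignof(T) >= (1u << Bits),
                "pointee alignment too small for the requested tag bits");
  static constexpr std::uintptr_t TagMask = (std::uintptr_t(1) << Bits) - 1;
  std::uintptr_t Raw = 0;

public:
  // Returns the pointer with the tag bits masked off.
  T *getPointer() const { return reinterpret_cast<T *>(Raw & ~TagMask); }

  // Returns the small integer stored in the low bits.
  unsigned getTag() const { return static_cast<unsigned>(Raw & TagMask); }

  // Replaces the pointer while preserving the current tag.
  void setPointer(T *P) {
    auto V = reinterpret_cast<std::uintptr_t>(P);
    assert((V & TagMask) == 0 && "pointer is not sufficiently aligned");
    Raw = V | (Raw & TagMask);
  }

  // Replaces the tag while preserving the current pointer.
  void setTag(unsigned Tag) {
    assert((Tag & ~TagMask) == 0 && "tag does not fit in the stolen bits");
    Raw = (Raw & ~TagMask) | Tag;
  }
};
```

Every read of the pointer then pays for one extra mask operation, which is the
price of keeping the object size unchanged.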
>
>
> Why I need an FTZ flag:  some ARM Neon vector instructions have FTZ
> semantics, which means we can't vectorize instructions when compiling
> for Neon unless we know the user is okay with FTZ.  Today we pretend
> that the "fast" variant of FastMathFlags implies FTZ
> (https://reviews.llvm.org/rL266363), which is not ideal.  Moreover
> (this is the immediate reason), for XLA CPU I'm trying to generate FP
> instructions without nonan and noinf, which breaks vectorization on
> ARM Neon for this reason.  An explicit bit for FTZ will let me
> generate FP operations tagged with FTZ and all fast math flags except
> nonan and noinf, and still have them vectorize on Neon.
>
> -- Sanjoy