<div dir="ltr"><div dir="ltr"><div dir="ltr">Can we move HasValueHandle out of the byte that holds SubclassOptionalData and into the flags at the bottom of Value, shrinking NumUserOperands to 27 bits to make room?</div><div dir="ltr"><br clear="all"><div><div dir="ltr" class="gmail_signature">~Craig</div></div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Mar 16, 2019 at 12:51 PM Sanjoy Das via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br>

<br>

I need to add a flush-denormals-to-zero (FTZ) flag to FastMathFlags,<br>

but we've already used up the 7 bits available in<br>

Value::SubclassOptionalData (the "backing storage" for<br>

FPMathOperator::getFastMathFlags()).  These are the possibilities I<br>

can think of:<br>

<br>

1. Increase the size of FPMathOperator.  This gives us some additional<br>

bits for FTZ and other fastmath flags we'd want to add in the future.<br>

Obvious downside is that it increases LLVM's memory footprint.<br>

<br>

2. Steal some low bits from pointers already present in Value and<br>

expose them as part of SubclassOptionalData.  We can at least steal 3<br>

bits from the first two words in Value which are both pointers.  The<br>

LSB of the first pointer needs to be 0, otherwise we could steal 4<br>

bits.<br>
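<br>
To make (2) concrete, here is a minimal sketch of borrowing the low bits of an aligned pointer.<br>
TaggedPtr is a hypothetical helper written for this mail, not an existing LLVM class<br>
(LLVM's own llvm::PointerIntPair covers the same idea); the number of usable bits is an<br>
assumption that depends on the pointee's alignment.<br>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM's API: stores NumBits flag bits in the low
// bits of a pointer. This only works if the pointee is aligned so that those
// bits are always zero in a valid address.
template <typename T, unsigned NumBits>
class TaggedPtr {
  static_assert(NumBits > 0 && NumBits < 8, "only a few low bits can be borrowed");
  static constexpr std::uintptr_t Mask = (std::uintptr_t(1) << NumBits) - 1;
  std::uintptr_t Bits = 0;

public:
  TaggedPtr() = default;
  explicit TaggedPtr(T *P) { setPointer(P); }

  // Mask the flag bits off to recover the original address.
  T *getPointer() const { return reinterpret_cast<T *>(Bits & ~Mask); }
  unsigned getFlags() const { return static_cast<unsigned>(Bits & Mask); }

  // Replace the pointer while keeping the current flag bits.
  void setPointer(T *P) {
    auto Raw = reinterpret_cast<std::uintptr_t>(P);
    assert((Raw & Mask) == 0 && "pointer is not sufficiently aligned");
    Bits = Raw | (Bits & Mask);
  }

  // Replace the flag bits while keeping the current pointer.
  void setFlags(unsigned F) {
    assert((F & ~Mask) == 0 && "flags do not fit in the borrowed bits");
    Bits = (Bits & ~Mask) | F;
  }
};
```

The wrapper is the same size as a raw pointer, which is why this option should not change memory usage.<br>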

<br>

3. Allow only specific combinations in FastMathFlags.  In practice, I<br>

don't think folks are equally interested in all the 2^N combinations<br>

present in FastMathFlags, so we could compromise and allow only the<br>

most "typical" 2^7 combinations (e.g. we could fold nonan and noinf into a<br>

single bit, under the assumption that users want to enable or disable<br>

them as a unit).  I'm unsure if establishing the most typical 2^7<br>

combinations will be straightforward though.<br>
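<br>
As a rough illustration of (3), the sketch below stores eight logical flags (seven plus FTZ) in 7 bits<br>
by letting one stored bit stand for both nonan and noinf. The enum names, bit positions, and the<br>
pack/unpack helpers are all hypothetical and chosen for this example only.<br>

```cpp
#include <cassert>
#include <cstdint>

// Logical flags as a client would see them (hypothetical names and values).
enum LogicalFlag : std::uint8_t {
  Reassoc = 1 << 0,
  NoNaNs = 1 << 1,
  NoInfs = 1 << 2,
  NoSignedZeros = 1 << 3,
  AllowReciprocal = 1 << 4,
  AllowContract = 1 << 5,
  ApproxFunc = 1 << 6,
  FTZ = 1 << 7
};

constexpr std::uint8_t FiniteOnly = NoNaNs | NoInfs;

// A combination is representable only if nonan and noinf are both set or
// both clear, since they share one stored bit.
constexpr bool isRepresentable(std::uint8_t Logical) {
  return (Logical & FiniteOnly) == 0 || (Logical & FiniteOnly) == FiniteOnly;
}

// Pack 8 logical bits into 7 stored bits.
inline std::uint8_t pack(std::uint8_t Logical) {
  assert(isRepresentable(Logical) && "nonan and noinf must be toggled together");
  // Bit 0 keeps Reassoc; bit 1 (NoNaNs) now stands for NoNaNs and NoInfs.
  unsigned Low = Logical & 0x03;
  // Logical bits 3..7 slide down into stored bits 2..6.
  unsigned High = (Logical >> 1) & 0x7C;
  return static_cast<std::uint8_t>(Low | High);
}

// Expand 7 stored bits back into 8 logical bits.
inline std::uint8_t unpack(std::uint8_t Packed) {
  unsigned Logical = Packed & 0x01;
  if (Packed & 0x02)
    Logical |= FiniteOnly;
  Logical |= (Packed & 0x7C) << 1;
  return static_cast<std::uint8_t>(Logical);
}
```

Any combination that sets only one of nonan or noinf is rejected, which is exactly the kind of restriction this option would impose on users.<br>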

<br>

4. Function level attributes.  Instead of wasting precious<br>

instruction-level space, we could move all FP math attributes onto the<br>

containing function.  I'm not sure if this will work for all frontends<br>

and it also raises annoying tradeoffs around inlining and other<br>

inter-procedural passes.<br>

<br>

<br>

My gut feeling is to go with (2).  It should be semantically<br>

invisible, have no impact on memory usage, and the ugly bit<br>

manipulation can be abstracted away.  What do you think?  Any other<br>

possibilities I missed?<br>

<br>

<br>

Why I need an FTZ flag:  some ARM Neon vector instructions have FTZ<br>

semantics, which means we can't vectorize instructions when compiling<br>

for Neon unless we know the user is okay with FTZ.  Today we pretend<br>

that the "fast" variant of FastMathFlags implies FTZ<br>

(<a href="https://reviews.llvm.org/rL266363" rel="noreferrer" target="_blank">https://reviews.llvm.org/rL266363</a>), which is not ideal.  Moreover<br>

(this is the immediate reason), for XLA CPU I'm trying to generate FP<br>

instructions without nonan and noinf, which breaks vectorization on<br>

ARM Neon for this reason.  An explicit bit for FTZ will let me<br>

generate FP operations tagged with FTZ and all fast math flags except<br>

nonan and noinf, and still have them vectorize on Neon.<br>
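<br>
For readers less familiar with FTZ, here is a small scalar model of what flushing denormals to zero means.<br>
It is only an illustration under the assumption that both inputs and results are flushed; it does not<br>
claim to match the exact behavior of any particular Neon instruction, and the function names are hypothetical.<br>

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Models FTZ on a single float: a subnormal value becomes a zero with the
// same sign; every other value passes through unchanged.
inline float flushDenormalToZero(float X) {
  return std::fpclassify(X) == FP_SUBNORMAL ? std::copysign(0.0f, X) : X;
}

// A multiply whose inputs and result are flushed. An IEEE-conforming
// multiply would instead keep the subnormal result.
inline float mulFTZ(float A, float B) {
  return flushDenormalToZero(flushDenormalToZero(A) * flushDenormalToZero(B));
}
```

A vectorized loop that uses FTZ arithmetic can therefore produce different results from the scalar loop whenever a subnormal shows up, which is why the transformation needs explicit permission.<br>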

<br>

-- Sanjoy<br>


</blockquote></div>
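<div dir="ltr">A rough sketch of the layout change asked about at the top of this mail. The two structs are simplified stand-ins, not the real llvm::Value; OtherFlags is a hypothetical placeholder for the existing flag bits, and setOperandCount is a hypothetical helper.</div>

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the current layout: HasValueHandle shares a byte
// with SubclassOptionalData, leaving 7 bits for optional data.
struct FlagsBefore {
  unsigned char SubclassID;
  unsigned char HasValueHandle : 1;
  unsigned char SubclassOptionalData : 7;
  unsigned short SubclassData;
  unsigned NumUserOperands : 28;
  unsigned OtherFlags : 4; // placeholder for the flags at the bottom
};

// Proposed layout: HasValueHandle joins the flags at the bottom, paid for
// by one bit of NumUserOperands, so SubclassOptionalData gets a full byte.
struct FlagsAfter {
  unsigned char SubclassID;
  unsigned char SubclassOptionalData;
  unsigned short SubclassData;
  unsigned NumUserOperands : 27;
  unsigned HasValueHandle : 1;
  unsigned OtherFlags : 4; // placeholder for the flags at the bottom
};

// The move should not change the size of the object on common ABIs.
static_assert(sizeof(FlagsBefore) == sizeof(FlagsAfter),
              "moving the bit should not grow the struct");

// The cost of the change: the maximum operand count is halved.
inline void setOperandCount(FlagsAfter &V, unsigned N) {
  assert(N < (1u << 27) && "operand count no longer fits");
  V.NumUserOperands = N;
}
```

<div dir="ltr">That would free one extra bit in SubclassOptionalData, enough for FTZ, at the price of halving the maximum number of operands.</div>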