<div dir="ltr"><div dir="ltr"><div>I don't have any objections to increasing the size of FPMathOperator, but I also don't know what perf impact that would have.</div><div>I made this comment in D39304:</div><div>"I don't think we can just add a field to FPMathOperator because Operator is not intended to be instantiated."</div></div><div>That could just be me not understanding the class hierarchy?<br></div><div><br></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 18, 2019 at 11:56 AM Sanjoy Das &lt;<a href="mailto:sanjoy@playingwithpointers.com">sanjoy@playingwithpointers.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Mon, Mar 18, 2019 at 9:31 AM Sanjay Patel &lt;<a href="mailto:spatel@rotateright.com" target="_blank">spatel@rotateright.com</a>&gt; wrote:<br>

><br>

> We knew the day when we needed another FMF bit was coming back in:<br>

> <a href="https://reviews.llvm.org/D39304" rel="noreferrer" target="_blank">https://reviews.llvm.org/D39304</a><br>

> ...it was just a question of 'when'. :)<br>

><br>

> I'm guessing that an FTZ bit won't be the last new bit needed if we consider permutations between strict FP and fast-math. Even without that, denormals-as-zero (DAZ) might also be useful?<br>

> So rather than continuing to carve these out bit-by-bit, it's worth considering a more general solution: instruction-level metadata.<br>

><br>

> IIUC, the main argument for making FMF part of the instruction was that per-instruction metadata gets expensive if we're applying it to a significant chunk of the instructions.<br>

> But let's think about that - even the most FP-heavy code tops out around 10% FP math ops out of the total instruction count. Typical FP benchmark code is only 2-5% FP ops. The rest is the same load/store/control-flow/ALU stuff found in integer code.<br>

<br>

If this is true, what do you think about option (1)?  It might be<br>

simpler to increase the size of FPMathOperator by a word, giving us 64<br>

more bits of fastmath flags.  We could also have this extra word in<br>

only those instances of FPMathOperator that have a non-zero<br>

FastMathFlags (this would force us to remove setFastMathFlags since<br>

we'd need to know the contents of FastMathFlags at Instruction<br>

construction time).<br>

<br>

-- Sanjoy<br>

</blockquote></div></div>
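<div>The storage concern in the D39304 comment can be sketched with toy classes. This is not LLVM code: <code>ToyValue</code>, <code>ToyFPView</code>, <code>ToyWideValue</code> and the seven-bit budget are hypothetical names and numbers chosen for illustration. The sketch assumes the flags live in a small bitfield on the underlying object and that the operator class is only a view over that object, so a new field on the view would not add storage to anything that is actually allocated. The real classes reach the view by casting an existing pointer; the sketch uses a reference wrapper instead to stay within defined behavior.</div>

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the allocated object (hypothetical, not an LLVM class).
// Assumption: all seven flag bits are already assigned.
struct ToyValue {
  uint8_t OptionalData : 7;
  ToyValue() : OptionalData(0) {}
};

// Toy stand-in for an operator-style view. It owns no flag storage of its
// own; every read and write goes to the ToyValue it refers to.
class ToyFPView {
  ToyValue &V;

public:
  explicit ToyFPView(ToyValue &Val) : V(Val) {}

  // Sets flag number Bit. Bits that do not fit in the 7-bit field are
  // dropped by the unsigned truncation on assignment.
  void setFlag(unsigned Bit) { V.OptionalData = V.OptionalData | (1u << Bit); }

  bool hasFlag(unsigned Bit) const { return (V.OptionalData >> Bit) & 1u; }
};

// Option (1) from the quoted mail, modelled as growing the allocated object
// by one 64-bit word of extra flags.
struct ToyWideValue : ToyValue {
  uint64_t ExtraFlags = 0;
};

// Returns whether an eighth flag (bit 7) can be stored through the view.
inline bool eighthFlagSurvives() {
  ToyValue V;
  ToyFPView View(V);
  View.setFlag(7);
  return View.hasFlag(7);
}
```

<div>In this toy model the eighth flag is silently lost, and the only way to get more bits is to grow the allocated object, which is where the size (and possibly perf) cost would show up.</div>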