[LLVMdev] [RFC] Extend LLVM IR to express "fast-math" at a per-instruction level

Tue Oct 30 22:50:33 PDT 2012

On Oct 30, 2012, at 3:11 PM, Evan Cheng <evan.cheng at apple.com> wrote:
>> Disadvantages of using subclass data bits:
>> 
>> - Can only represent flags.  Thus you might end up with a mix of flags and
>> metadata for floating point math, with the metadata holding the non-flag
>> info, and subclass data holding the flags.  In which case it might be better
>> to just have it all be metadata in the first place
>> - Only a limited number of bits (but hey)
>> 
>> Hopefully Chris will weigh in with his opinion.
> 
> FYI. We've already had extensive discussion with Chris on this. He has made it clear this *must* be implemented with subclass data bits, not with metadata.

More specifically, I reviewed the proposal and I agree with its general design: I think it makes sense to use subclass data for these bits even though fpprecision doesn't.  It follows the analogy of the NSW/NUW bits, which have worked well.  I also think it makes a lot of sense to separate out the "relaxing FP math" part of the FP problem from orthogonal issues like modeling rounding modes, trapping operations (SNANs), etc.
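
As a rough illustration of what "flags in subclass data" means, here is a minimal standalone C++ sketch.  It is not LLVM's actual API: ToyInst, FastMathBit, and the three flag names are hypothetical, and the real set of bits is exactly what is still being discussed.  The point is only that each relaxation is one bit in a small integer already carried by the instruction, the same way NSW/NUW are.

```cpp
#include <cstdint>

// Hypothetical flag names, chosen for illustration only.
enum FastMathBit : std::uint8_t {
  NoNaNs = 1u << 0,
  NoInfs = 1u << 1,
  NoSignedZeros = 1u << 2,
};

// Toy stand-in for an instruction with a few spare "subclass data" bits.
struct ToyInst {
  std::uint8_t SubclassData = 0;

  // Set one relaxation bit without disturbing the others.
  void setFlag(FastMathBit B) { SubclassData |= B; }

  // Clear one relaxation bit, e.g. when a transform can no longer prove it.
  void clearFlag(FastMathBit B) {
    SubclassData &= static_cast<std::uint8_t>(~B);
  }

  // Query a single bit; this is all an optimization needs to ask.
  bool hasFlag(FastMathBit B) const { return (SubclassData & B) != 0; }
};
```

This also shows the limitation raised earlier in the thread: a layout like this can only hold yes/no facts, and only as many as there are spare bits.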

That said, I agree that the individual proposed bits (e.g. "A") could use some refinement.  I think it is really important to accurately model the concepts that GCC exposes, but it may make sense to decompose them into finer-grained concepts than what GCC exposes.  Also, infer-ability is an important aspect of this: we already have stuff in LLVM that tries to figure out things like "this can never be negative zero".  I'd like it if we can separate the inference of this property from the clients of it.
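
To make the inference/client split concrete, here is a small hedged sketch in standalone C++.  Everything in it is hypothetical (ToyValue, inferCannotBeNegativeZero, canFoldAddPositiveZero); it does not correspond to existing LLVM code.  One function decides when the "never negative zero" fact is provable and records it; the transform only reads the recorded fact and does not care how it was derived.

```cpp
// Toy stand-in for an IR value; all names here are hypothetical.
struct ToyValue {
  bool IsConstant = false;
  double Constant = 0.0;
  bool CannotBeNegativeZero = false;  // recorded fact, set by inference
};

// Inference side: record the fact when it can be proven.  This sketch only
// handles the trivial case of a constant that compares unequal to zero.
void inferCannotBeNegativeZero(ToyValue &V) {
  if (V.IsConstant && V.Constant != 0.0)
    V.CannotBeNegativeZero = true;
}

// Client side: "x + 0.0" may be folded to "x" only if x is never -0.0,
// because (-0.0) + 0.0 evaluates to +0.0 under round-to-nearest.
bool canFoldAddPositiveZero(const ToyValue &X) {
  return X.CannotBeNegativeZero;
}
```

With this shape, a smarter inference (or a user-supplied flag) can be added later without touching any of the clients.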

At a (ridiculous) limit, we could take everything in "A" and see what optimizations we want to permit, and add a separate bit for every suboptimization that it would enable.  Hopefully from that list we can find natural clusters that would make sense to group together.

-Chris