[PATCH] D113107: Support of expression granularity for _Float16.

Wed Jun 29 11:52:00 PDT 2022

zahiraam added a comment.

In D113107#3615372 <https://reviews.llvm.org/D113107#3615372>, @zahiraam wrote:

> In D113107#3606106 <https://reviews.llvm.org/D113107#3606106>, @rjmccall wrote:
>
>> In D113107#3606094 <https://reviews.llvm.org/D113107#3606094>, @zahiraam wrote:
>>
>>> In D113107#3605797 <https://reviews.llvm.org/D113107#3605797>, @rjmccall wrote:
>>>
>>>> I think on balance the right thing to do is probably to add an alternative to `-fexcess-precision`, like `-fexcess-precision=none`.  We can default to `-fexcess-precision=standard` and treat `-fexcess-precision=fast` as an alias for `standard` for now.
>>>
>>> In https://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#index-ffloat-store-900 ,  it looks like when compiling C, the default is -fexcess-precision=standard which would align with this implementation and our default too. So I think we could use the same name for the option. 
>>> -fexcess-precision=none corresponds to the current behavior.
>>> -fexcess-precision=standard = -fexcess-precision=fast corresponds to this implementation.
>>> Agreed?
>>
>> Since you're not landing this option right now anyway, do you mind broaching this with the GCC folks, just to be good neighbors?  You can just say that (1) Clang is looking for a way to request operation-by-operation lowering, (2) it feels like `-fexcess-precision` is the right option to add that to, (3) we don't want to tread on toes by adding an alternative to "their" option without talking to them first, and (4) what do they think about "none"?
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106117

In GCC using no -fexcess-precision is the same than using -fexcess-precision=standard. The patch we are proposing here is implementing this part.  So for  a + b + c, we are generating (_Float16) [((float a) + (float b)) + (float c)]
In GCC using -fexcess-precision=16 is generating yet another flavor of the operation-by-operation emulation. For the same addition than above, gcc is generating (_Float16) [(float) (_Float16) ((float a) + (float b)) + (float c)]. That's different than our current implementation.
See https://godbolt.org/z/aM8fzTsj1
Am I understanding correctly? @pengfei you are interested in the -fexcess-precision=16 part of this right? @rjmccall what do yo think?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113107/new/

https://reviews.llvm.org/D113107