[PATCH] D136176: Implement support for option 'fexcess-precision'.

Mon Nov 21 08:20:37 PST 2022

zahiraam added inline comments.

================
Comment at: clang/docs/UsersManual.rst:1732
+.. option:: -fexcess-precision:
+
+   By default, Clang uses excess precision to calculate ``_Float16``
----------------
rjmccall wrote:
> 
>    The C and C++ standards allow floating-point expressions to be computed
>    as if intermediate results had more precision (and/or a wider range) than the
>    type of the expression strictly allows.  This is called excess precision arithmetic.
>    Excess precision arithmetic can improve the accuracy of results (although not
>    always), and it can make computation significantly faster if the target lacks
>    direct hardware support for arithmetic in a particular type.  However, it can
>    also undermine strict floating-point reproducibility.
> 
>    Under the standards, assignments and explicit casts force the operand to be
>    converted to its formal type, discarding any excess precision.  Because data
>    can only flow between statements via an assignment, this means that the
>    use of excess precision arithmetic is a reliable local property of a single
>    statement, and results do not change based on optimization.  However, when
>    excess precision arithmetic is in use, Clang does not guarantee strict
>    reproducibility, and future compiler releases may recognize more opportunities
>    to use excess precision arithmetic, e.g. with floating-point builtins.
> 
>    Clang does not use excess precision arithmetic for most types or on most targets.
>    For example, even on pre-SSE X86 targets where ``float`` and ``double``
>    computations must be performed in the 80-bit X87 format, Clang rounds
>    all intermediate results correctly for their type.  Clang currently uses excess
>    precision arithmetic by default only for the following types and targets:
> 
>    * ``_Float16`` on X86 targets without ``AVX512-FP16``
> 
>    The ``-fexcess-precision=<value>`` option can be used to control the use of excess
>    precision arithmetic.  Valid values are:
> 
>    * ``standard`` - The default.  Allow the use of excess precision arithmetic under
>      the constraints of the C and C++ standards. Has no effect except on the types
>      and targets listed above.
>    * ``fast`` - Accepted for GCC compatibility, but currently treated as an alias
>      for ``standard``.
>    * ``16`` - Forces ``_Float16`` operations to be emitted without using excess
>      precision arithmetic.
>    
Thanks.
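
To make the effect described in the proposed text concrete, here is a minimal sketch. It does not use ``_Float16``: it emulates excess precision by treating ``float`` as the formal type and ``double`` as the wider intermediate format. The function names are hypothetical, and the values in the comments assume a target that evaluates ``float`` arithmetic in ``float`` (e.g. x86-64 with SSE) and default floating-point options.

```cpp
// Strict evaluation: every intermediate result is rounded to float.
float sum_strict(float a, float b, float c) {
  float t = a + b; // the assignment forces the value to float
  return t + c;
}

// Emulated excess precision: intermediates are kept in double and the
// result is rounded to float once, at the end.
float sum_excess(float a, float b, float c) {
  double t = static_cast<double>(a) + static_cast<double>(b);
  return static_cast<float>(t + static_cast<double>(c));
}

// With a = 16777216.0f (2^24), b = 1.0f, c = 1.0f:
//   sum_strict yields 16777216.0f, because 2^24 + 1 is not representable
//   in float and rounds back to 2^24 after each addition;
//   sum_excess yields 16777218.0f, because the exact sum is formed in
//   double and is representable in float.
```

The two functions differ only in where rounding happens, which is the distinction the paragraph on assignments and explicit casts draws.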

================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:2995
+      StringRef Val = A->getValue();
+       if (TC.getTriple().getArch() == llvm::Triple::x86 && Val.equals("16"))
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
----------------
rjmccall wrote:
> zahiraam wrote:
> > andrew.w.kaylor wrote:
> > > Why is 16 only supported for x86? Is it only here for gcc compatibility?
> > Yes, for GCC compatibility (although here we are using the value "none" to disable excess precision instead of "16"), and also because we are dealing with excess precision for _Float16 types only, so we are sticking to X86.
> `llvm::Triple::x86` is just i386, and I think you want to include x86_64, right?
Yes. Thanks.
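
A standalone sketch of the distinction, assuming nothing about the final form of the patch: the ``Arch`` enum below is a hypothetical stand-in for ``llvm::Triple::ArchType``, and both function names are made up for illustration. In the real code I believe ``llvm::Triple::isX86()`` covers both architectures, but that should be checked against Triple.h.

```cpp
// Hypothetical stand-in for llvm::Triple::ArchType, for illustration only.
enum class Arch { x86, x86_64, aarch64 };

// Mirrors the comparison quoted above: matches 32-bit x86 (i386) only.
constexpr bool matchesI386Only(Arch A) { return A == Arch::x86; }

// Matches both 32-bit and 64-bit x86.
constexpr bool matchesX86Family(Arch A) {
  return A == Arch::x86 || A == Arch::x86_64;
}
```

With the first form, an x86_64 triple does not take the branch; with the second form it does.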

https://reviews.llvm.org/D136176