[cfe-dev] [RFC] implementation of _Float16
Hal Finkel via cfe-dev
cfe-dev at lists.llvm.org
Thu May 11 08:29:30 PDT 2017
On 05/11/2017 06:22 AM, Sjoerd Meijer wrote:
>
> Hi Hal,
>
> You mentioned “I'd be in favor of changing the current semantics”.
> Just checking: do you mean the semantics of __fp16?
>
> Because that is exactly what we are trying to avoid with introducing a
> new true half type; changing semantics of fp16 would break backward
> compatibility.
>
I don't mean changing the semantics of __fp16, the source-language type.
I mean changing the semantics of the IR-level half type. I suspect we
can do this along with an auto-upgrade feature that does not break
backwards compatibility, by inserting extend/truncate operations around
half arithmetic in old IR (as I described) to preserve the existing
semantics as you described them.
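
For example (just a sketch; the exact rewrite would be up to the
auto-upgrader), an operation in old bitcode such as

    %0 = fadd half %a, %b

could be upgraded to make the promotion explicit, preserving the old
behavior:

    %a.ext = fpext half %a to float
    %b.ext = fpext half %b to float
    %add = fadd float %a.ext, %b.ext
    %0 = fptrunc float %add to half
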
-Hal
> > By "when required", do you mean when the result would be the same as
> > if the operation had been performed in single precision? If so, then
> > no, we need different semantics.
> I think that is indeed the case, but I am double checking that.
>
> Cheers,
>
> Sjoerd.
>
> *From:*Hal Finkel [mailto:hfinkel at anl.gov]
> *Sent:* 10 May 2017 17:40
> *To:* Sjoerd Meijer; Martin J. O'Riordan
> *Cc:* 'clang developer list'; nd
> *Subject:* Re: [cfe-dev] [RFC] implementation of _Float16
>
> On 05/10/2017 11:15 AM, Sjoerd Meijer wrote:
>
> The thing that confused me again is that for simple
> expressions/examples like this:
>
>     __fp16 MyAdd(__fp16 a, __fp16 b) {
>       return a + b;
>     }
>
> The IR does not include the promotions/truncations you would
> expect (given that __fp16 arithmetic is performed in float):
>
>     define half @MyAdd(half %a, half %b) local_unnamed_addr #0 {
>     entry:
>       %0 = fadd half %a, %b
>       ret half %0
>     }
>
> But that is only because an optimisation omits these conversions when
> it can prove that the result is the same with or without them; in
> other cases the promotes/truncates are there as expected.
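>
> For example, in a chained addition the intermediate result has to stay
> in float (truncating after the first add could change the final value),
> so for 'return a + b + c;' the conversions are kept. A sketch of the
> kind of IR this produces (exact value names will differ):
>
>     define half @MyAdd3(half %a, half %b, half %c) local_unnamed_addr #0 {
>     entry:
>       %conv = fpext half %a to float
>       %conv1 = fpext half %b to float
>       %add = fadd float %conv, %conv1
>       %conv2 = fpext half %c to float
>       %add2 = fadd float %add, %conv2
>       %conv3 = fptrunc float %add2 to half
>       ret half %conv3
>     }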
>
> This means that Clang produces the necessary promotions when
> needed, and I think a new _Float16 type can also be mapped onto the
> LLVM IR half type (no changes needed). Yes, the approach could then
> indeed be to treat it as a native type, and only promote operands to
> float when required.
>
>
> By "when required", do you mean when the result would be the same as
> if the operation had been performed in single precision? If so, then
> no, we need different semantics. That having been said, I'd be in
> favor of changing the current semantics to require explicit
> promotions/truncations, changing the existing optimization to elide
> them when they're provably redundant (as we do with other such
> things), and then having only a single, true, half-precision type. I
> suspect that we'd need to figure out how to auto-upgrade, but that
> seems doable.
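>
> To illustrate the elision (a sketch): for a single +, -, * or /, float
> is wide enough that performing the operation in float and truncating
> gives the same half result, so the conversion pair is provably
> redundant and
>
>     %a.ext = fpext half %a to float
>     %b.ext = fpext half %b to float
>     %add = fadd float %a.ext, %b.ext
>     %res = fptrunc float %add to half
>
> could be folded back into
>
>     %res = fadd half %a, %b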
>
> -Hal
>
>
> Cheers,
>
> Sjoerd.
>
> *From:*Hal Finkel [mailto:hfinkel at anl.gov]
> *Sent:* 10 May 2017 16:00
> *To:* Martin J. O'Riordan; Sjoerd Meijer
> *Cc:* 'clang developer list'
> *Subject:* Re: [cfe-dev] [RFC] implementation of _Float16
>
> On 05/10/2017 09:01 AM, Martin J. O'Riordan wrote:
>
> Yes, I see how this would be an issue if it is necessary to keep
> the storage-only versus native types separate.
>
> At the moment I have ‘short float’ internally associated with
> OpenCL’s ‘half’, but I do not enable ‘half’ as a keyword.
> Independently, I have made ‘__fp16’, when used with our target, also
> be a synonym for ‘short float’/‘half’ (simply to avoid adding a new
> keyword). This in turn is bound to the IEEE FP16 format using
> ‘HalfFormat = &llvm::APFloat::IEEEhalf();’.
>
> In our case it is always a native type and never a storage only
> type, so coupling ‘__fp16’ to ‘half’ made sense. Certainly if the
> native versus storage-only variants were distinct, then this
> association I have made would have to be decoupled (not a big-deal).
>
> Another approach might be to always work with FP16 as if it were
> native, but to provide only load/store instructions in the TableGen
> descriptions for FP16, and to adapt lowering to always perform the
> arithmetic using FP32 if the selected target does not support
> native FP16 - would that be feasible in your case? In this way it
> is not really any different from how targets that have no FPU can
> use an alternative integer-based implementation (with the help of
> ‘compiler-rt’).
>
> I can certainly see how something like ‘ADD’ of ‘f16’ could be
> changed to use ‘Expand’ in lowering rather than ‘Legal’ as a
> function of the selected target (or some other target specific
> option) - we just marked it ‘Legal’ and provided the corresponding
> instructions in TableGen with very little custom lowering
> necessary. I have a mild concern that LLVM would have to have an
> ‘f16’ which is native and another kind-of ‘f16’ restricted to
> being only storage.
>
>
> Why? That should only be true if they have different semantics.
>
> -Hal
>
>
>
> Thanks,
>
> MartinO
>
> *From:*Sjoerd Meijer [mailto:Sjoerd.Meijer at arm.com]
> *Sent:* 10 May 2017 14:19
> *To:* Martin J. O'Riordan <martin.oriordan at movidius.com>
> <mailto:martin.oriordan at movidius.com>; 'Hal Finkel' <hfinkel at anl.gov>
> <mailto:hfinkel at anl.gov>
> *Cc:* 'clang developer list' <cfe-dev at lists.llvm.org>
> <mailto:cfe-dev at lists.llvm.org>
> *Subject:* RE: [cfe-dev] [RFC] implementation of _Float16
>
> Hi Hal, Martin,
>
> Thanks for the feedback.
>
> Yes, the issue indeed is that ‘__fp16’ is already used to implement a
> storage-only type. And earlier I wrote that I don’t expect LLVM IR
> changes, but now I am not so sure anymore that this holds if both
> types map onto the same LLVM IR half type. With two half-precision
> types, __fp16 and _Float16, where one is a storage-only type and the
> other a native type, the distinction between the two must somehow be
> made, I think.
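>
> For example (just a sketch, assuming _Float16 also lowers to the IR
> half type), the distinction would then show up in the operations the
> frontend emits rather than in the type itself:
>
>     ; __fp16: storage-only, arithmetic performed in float
>     %x = fpext half %a to float
>     %y = fpext half %b to float
>     %s = fadd float %x, %y
>     %r = fptrunc float %s to half
>
>     ; _Float16: native half arithmetic
>     %r2 = fadd half %a, %b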
>
> Cheers,
>
> Sjoerd.
>
> *From:*Martin J. O'Riordan [mailto:martin.oriordan at movidius.com]
> *Sent:* 10 May 2017 14:13
> *To:* 'Hal Finkel'; Sjoerd Meijer
> *Cc:* 'clang developer list'
> *Subject:* RE: [cfe-dev] [RFC] implementation of _Float16
>
> Our Out-of-Tree target implements fully native FP16 operations based
> on ‘__fp16’ (scalar and SIMD vector), so is the issue for ARM that
> ‘__fp16’ is already used to implement a storage-only type and that
> another type is needed to differentiate between a native and a
> storage-only type? Once the ‘f16’ type appears in the IR (and the
> vector variants) the code-generation is straightforward enough.
>
> Certainly we have had to make many changes to Clang and to LLVM to
> fully implement this, including suppressing the implicit conversion to
> ‘double’, but nothing scary or obscure. Many of these changes are
> simply to enable something that is already normal for OpenCL, but to
> do so for C and C++.
>
> More controversially we also added a “synonym” for this using ‘short
> float’ rather than ‘_Float16’ (or OpenCL’s ‘half’), and created a
> parallel set of the ISO C library functions using ‘s’ to suffix the
> usual names (e.g. ‘tan’, ‘tanf’, ‘tanl’ plus ‘tans’). The ‘s’ suffix
> was unambiguous (though we actually use the double-underscore prefix,
> e.g. ‘__tans’ to avoid conflict with the user’s names) and the type
> ‘short float’ was available too without breaking anything. Enabling
> the ‘h’ suffix for FP constants (again from OpenCL) makes the whole
> fit smoothly with the normal FP types.
>
> However, for variadic functions (such as ‘printf’) we do promote to
> ‘double’ because there are no formatting specifiers available for
> ‘half’ any more than there is support for ‘float’ - it is also
> consistent with ‘va_arg’ usage for ‘char’ and ‘short’ as ‘int’. My
> feeling is that the set of implementation-defined types ‘float’,
> ‘double’ and ‘long double’ can be extended to include ‘short float’
> without dictating that they have any particular bit-sizes (e.g. FP16
> for ‘half’).
>
> This solution has worked very well over the past few years and is
> symmetric with the other floating-point data types.
>
> There are some issues with C++ and overloading because conversion from
> ‘__fp16’ to the other FP types (and integer types) is not ranked in
> exactly the same way as, for example, ‘float’ is to the other FP
> types; but this is really only because it is not a first-class citizen
> of the type system, and the rules would need to be specified to make
> this valid. I have not tried to fix this as it works reasonably well
> as it is, and it would really be an issue for the C++ committee to
> decide if they ever choose to adopt another FP data type. I did add it
> to the type traits in the C++ library though, so that it is recognised
> as a floating-point type.
>
> I’d love to see this adopted as a formal type in a future version of
> ISO C and ISO C++.
>
> MartinO
>
> *From:*cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] *On Behalf Of
> *Hal Finkel via cfe-dev
> *Sent:* 10 May 2017 11:39
> *To:* Sjoerd Meijer <Sjoerd.Meijer at arm.com
> <mailto:Sjoerd.Meijer at arm.com>>; cfe-dev at lists.llvm.org
> <mailto:cfe-dev at lists.llvm.org>
> *Subject:* Re: [cfe-dev] [RFC] implementation of _Float16
>
> On 05/10/2017 05:18 AM, Sjoerd Meijer via cfe-dev wrote:
>
> Hi,
>
> ARMv8.2-A introduces, as an optional extension, half-precision
> data-processing instructions for Advanced SIMD and floating-point
> in both the AArch64 and AArch32 states [1], and we are looking into
> implementing C/C++ language support for these new ARMv8.2-A
> half-precision instructions.
>
> We would like to introduce a new Clang type. The reason is that we
> cannot, for example, use the type __fp16 (defined in the ARM C
> Language Extensions [2]) because it is a storage-only type. This means
> that, when standard C operators are used, values of __fp16 type
> promote to float in arithmetic operations, which we would like to
> avoid for the ARMv8.2-A half-precision instructions. Please note that
> the LLVM IR already has a half-precision type, onto which, for
> example, __fp16 is mapped, so there are no changes or additions
> required for the LLVM IR.
>
> As a new Clang type we would like to propose _Float16 as defined
> in a C11 extension, see [3]. Its arithmetic is well defined; it is not
> just a storage type like __fp16. Our question is whether a partial
> implementation, implementing just this type and not claiming
> (full) C11 conformance, is acceptable?
>
>
> I would very much like to see fp16 as a first-class floating-point
> type in Clang and LLVM (i.e. handled as more than just a storage
> type). Doing this in Clang in a way that is specified by C11 seems
> like the right approach. I don't see why implementing this would be
> predicated on implementing other parts of C11.
>
> -Hal
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory