[cfe-dev] [RFC] implementation of _Float16
Hal Finkel via cfe-dev
cfe-dev at lists.llvm.org
Thu May 11 08:29:30 PDT 2017
On 05/11/2017 06:22 AM, Sjoerd Meijer wrote:
>
> Hi Hal,
>
> You mentioned “I'd be in favor of changing the current semantics”.
> Just checking: do you mean the semantics of __fp16?
>
> Because that is exactly what we are trying to avoid with introducing a
> new true half type; changing semantics of fp16 would break backward
> compatibility.
>
I don't mean changing the semantics of __fp16, the source-language type.
I mean changing the semantics of the IR-level half type. I suspect we
can do this along with an auto-upgrade feature that does not break
backwards compatibility, by inserting extend/truncate operations around
half arithmetic in old IR (as I described) to preserve the existing
semantics as you described them.
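
For example (just a sketch; the exact rewrite would be up to the
auto-upgrader), an operation in old bitcode such as

    %0 = fadd half %a, %b

could be upgraded to make the promotion explicit, preserving the old
behavior:

    %a.ext = fpext half %a to float
    %b.ext = fpext half %b to float
    %add = fadd float %a.ext, %b.ext
    %0 = fptrunc float %add to half
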
-Hal
> > By "when required", do you mean when the result would be the same as
> > if the operation had been performed in single precision? If so, then
> > no, we need different semantics.
> I think that is indeed the case, but I am double checking that.
>
> Cheers,
>
> Sjoerd.
>
> *From:*Hal Finkel [mailto:hfinkel at anl.gov]
> *Sent:* 10 May 2017 17:40
> *To:* Sjoerd Meijer; Martin J. O'Riordan
> *Cc:* 'clang developer list'; nd
> *Subject:* Re: [cfe-dev] [RFC] implementation of _Float16
>
> On 05/10/2017 11:15 AM, Sjoerd Meijer wrote:
>
> The thing that confused me again is that for simple
> expressions/examples like this:
>
>     __fp16 MyAdd(__fp16 a, __fp16 b) {
>       return a + b;
>     }
>
> The IR does not include the promotions/truncations you would
> expect (given that __fp16 arithmetic is performed in float):
>
>     define half @MyAdd(half %a, half %b) local_unnamed_addr #0 {
>     entry:
>       %0 = fadd half %a, %b
>       ret half %0
>     }
>
> But that is only because an optimisation omits these conversions when
> it can prove that the result is the same with or without them; in
> other cases the promotes/truncates are there as expected.
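>
> For example, in a chained addition the intermediate result has to stay
> in float (truncating after the first add could change the final value),
> so for 'return a + b + c;' the conversions are kept. A sketch of the
> kind of IR this produces (exact value names will differ):
>
>     define half @MyAdd3(half %a, half %b, half %c) local_unnamed_addr #0 {
>     entry:
>       %conv = fpext half %a to float
>       %conv1 = fpext half %b to float
>       %add = fadd float %conv, %conv1
>       %conv2 = fpext half %c to float
>       %add2 = fadd float %add, %conv2
>       %conv3 = fptrunc float %add2 to half
>       ret half %conv3
>     }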
>
> This means that Clang produces the necessary promotions when
> needed, and I think a new _Float16 type can also be mapped onto the
> LLVM IR half type (no changes needed). Yes, the approach could then
> indeed be to treat it as a native type, and only promote operands to
> float when required.
>
>
> By "when required", do you mean when the result would be the same as
> if the operation had been performed in single precision? If so, then
> no, we need different semantics. That having been said, I'd be in
> favor of changing the current semantics to require explicit
> promotions/truncations, changing the existing optimization to elide
> them when they're provably redundant (as we do with other such
> things), and then having only a single, true, half-precision type. I
> suspect that we'd need to figure out how to auto-upgrade, but that
> seems doable.
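>
> To illustrate the elision (a sketch): for a single +, -, * or /, float
> is wide enough that performing the operation in float and truncating
> gives the same half result, so the conversion pair is provably
> redundant and
>
>     %a.ext = fpext half %a to float
>     %b.ext = fpext half %b to float
>     %add = fadd float %a.ext, %b.ext
>     %res = fptrunc float %add to half
>
> could be folded back into
>
>     %res = fadd half %a, %b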
>
> -Hal
>
>
> Cheers,
>
> Sjoerd.
>
> *From:*Hal Finkel [mailto:hfinkel at anl.gov]
> *Sent:* 10 May 2017 16:00
> *To:* Martin J. O'Riordan; Sjoerd Meijer
> *Cc:* 'clang developer list'
> *Subject:* Re: [cfe-dev] [RFC] implementation of _Float16
>
> On 05/10/2017 09:01 AM, Martin J. O'Riordan wrote:
>
> Yes, I see how this would be an issue if it is necessary to keep
> the storage-only versus native types separate.
>
> At the moment I have ‘short float’ internally associated with
> OpenCL’s ‘half’, but I do not enable ‘half’ as a keyword.
> Independently, I have made ‘__fp16’, when used with our target, also
> be a synonym for ‘short float’/‘half’ (simply to avoid adding a new
> keyword). This in turn is bound to the IEEE FP16 format using
> ‘HalfFormat = &llvm::APFloat::IEEEhalf();’.
>
> In our case it is always a native type and never a storage only
> type, so coupling ‘__fp16’ to ‘half’ made sense. Certainly if the
> native versus storage-only variants were distinct, then this
> association I have made would have to be decoupled (not a big-deal).
>
> Another approach might be to always work with FP16 as if it were
> native, but to provide only load/store instructions in the TableGen
> descriptions for FP16, and to adapt lowering to always perform the
> arithmetic using FP32 if the selected target does not support
> native FP16 - would that be feasible in your case? In this way it
> is not really any different from how targets that have no FPU can
> use an alternative integer-based implementation (with the help of
> ‘compiler-rt’).
>
> I can certainly see how something like ‘ADD’ of ‘f16’ could be
> changed to use ‘Expand’ in lowering rather than ‘Legal’ as a
> function of the selected target (or some other target specific
> option) - we just marked it ‘Legal’ and provided the corresponding
> instructions in TableGen with very little custom lowering
> necessary. I have a mild concern that LLVM would have to have an
> ‘f16’ which is native and another kind-of ‘f16’ restricted to
> being only storage.
>
>
> Why? That should only be true if they have different semantics.
>
> -Hal
>
>
>
> Thanks,
>
> MartinO
>
> *From:*Sjoerd Meijer [mailto:Sjoerd.Meijer at arm.com]
> *Sent:* 10 May 2017 14:19
> *To:* Martin J. O'Riordan <martin.oriordan at movidius.com>
> <mailto:martin.oriordan at movidius.com>; 'Hal Finkel' <hfinkel at anl.gov>
> <mailto:hfinkel at anl.gov>
> *Cc:* 'clang developer list' <cfe-dev at lists.llvm.org>
> <mailto:cfe-dev at lists.llvm.org>
> *Subject:* RE: [cfe-dev] [RFC] implementation of _Float16
>
> Hi Hal, Martin,
>
> Thanks for the feedback.
>
> Yes, the issue indeed is that ‘__fp16’ is already used to implement a
> storage-only type. And earlier I wrote that I don’t expect LLVM IR
> changes, but now I am not so sure anymore that this holds if both
> types map onto the same LLVM IR half type. With two half-precision
> types, __fp16 and _Float16, where one is a storage-only type and the
> other a native type, the distinction between the two must somehow be
> made, I think.
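>
> For example (just a sketch, assuming _Float16 also lowers to the IR
> half type), the distinction would then show up in the operations the
> frontend emits rather than in the type itself:
>
>     ; __fp16: storage-only, arithmetic performed in float
>     %x = fpext half %a to float
>     %y = fpext half %b to float
>     %s = fadd float %x, %y
>     %r = fptrunc float %s to half
>
>     ; _Float16: native half arithmetic
>     %r2 = fadd half %a, %b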
>
> Cheers,
>
> Sjoerd.
>
> *From:*Martin J. O'Riordan [mailto:martin.oriordan at movidius.com]
> *Sent:* 10 May 2017 14:13
> *To:* 'Hal Finkel'; Sjoerd Meijer
> *Cc:* 'clang developer list'
> *Subject:* RE: [cfe-dev] [RFC] implementation of _Float16
>
> Our Out-of-Tree target implements fully native FP16 operations based
> on ‘__fp16’ (scalar and SIMD vector), so is the issue for ARM that
> ‘__fp16’ is already used to implement a storage-only type and that
> another type is needed to differentiate between a native and a
> storage-only type? Once the ‘f16’ type appears in the IR (and the
> vector variants) the code-generation is straightforward enough.
>
> Certainly we have had to make many changes to Clang and to LLVM to
> fully implement this, including suppressing the implicit conversion to
> ‘double’, but nothing scary or obscure. Many of these changes are
> simply to enable something that is already normal for OpenCL, but to
> do so for C and C++.
>
> More controversially we also added a “synonym” for this using ‘short
> float’ rather than ‘_Float16’ (or OpenCL’s ‘half’), and created a
> parallel set of the ISO C library functions using ‘s’ to suffix the
> usual names (e.g. ‘tan’, ‘tanf’, ‘tanl’ plus ‘tans’). The ‘s’ suffix
> was unambiguous (though we actually use the double-underscore prefix,
> e.g. ‘__tans’ to avoid conflict with the user’s names) and the type
> ‘short float’ was available too without breaking anything. Enabling
> the ‘h’ suffix for FP constants (again from OpenCL) makes the whole
> fit smoothly with the normal FP types.
>
> However, for variadic functions (such as ‘printf’) we do promote to
> ‘double’ because there are no formatting specifiers available for
> ‘half’ any more than there is support for ‘float’ - it is also
> consistent with ‘va_arg’ usage for ‘char’ and ‘short’ as ‘int’. My
> feeling is that the set of implementation-defined types ‘float’,
> ‘double’ and ‘long double’ can be extended to include ‘short float’
> without dictating that they have any particular bit-sizes (e.g. FP16
> for ‘half’).
>
> This solution has worked very well over the past few years and is
> symmetric with the other floating-point data types.
>
> There are some issues with C++ and overloading because conversion from
> ‘__fp16’ to the other FP types (and integer types) is not ranked in
> exactly the same way as, for example, ‘float’ is to the other FP
> types; but this is really only because it is not a first-class citizen
> of the type system, and the rules would need to be specified to make
> this valid. I have not tried to fix this as it works reasonably well
> as it is, and it would really be an issue for the C++ committee to
> decide if they ever choose to adopt another FP data type. I did add it
> to the type traits in the C++ library though, so that it is recognised
> as a floating-point type.
>
> I’d love to see this adopted as a formal type in a future version of
> ISO C and ISO C++.
>
> MartinO
>
> *From:*cfe-dev [mailto:cfe-dev-bounces at lists.llvm.org] *On Behalf Of
> *Hal Finkel via cfe-dev
> *Sent:* 10 May 2017 11:39
> *To:* Sjoerd Meijer <Sjoerd.Meijer at arm.com
> <mailto:Sjoerd.Meijer at arm.com>>; cfe-dev at lists.llvm.org
> <mailto:cfe-dev at lists.llvm.org>
> *Subject:* Re: [cfe-dev] [RFC] implementation of _Float16
>
> On 05/10/2017 05:18 AM, Sjoerd Meijer via cfe-dev wrote:
>
> Hi,
>
> ARMv8.2-A introduces, as an optional extension, half-precision
> data-processing instructions for Advanced SIMD and floating-point
> in both the AArch64 and AArch32 states [1], and we are looking into
> implementing C/C++ language support for these new ARMv8.2-A
> half-precision instructions.
>
> We would like to introduce a new Clang type. The reason is that we
> cannot, for example, use the type __fp16 (defined in the ARM C
> Language Extensions [2]) because it is a storage-only type. This means
> that, when standard C operators are used, values of __fp16 type
> promote to float in arithmetic operations, which we would like to
> avoid for the ARMv8.2-A half-precision instructions. Please note that
> the LLVM IR already has a half-precision type, onto which, for
> example, __fp16 is mapped, so there are no changes or additions
> required for the LLVM IR.
>
> As a new Clang type we would like to propose _Float16 as defined
> in a C11 extension, see [3]. Its arithmetic is well defined; it is not
> just a storage type like __fp16. Our question is whether a partial
> implementation, implementing just this type and not claiming
> (full) C11 conformance, is acceptable?
>
>
> I would very much like to see fp16 as a first-class floating-point
> type in Clang and LLVM (i.e. handled as more than just a storage
> type). Doing this in Clang in a way that is specified by C11 seems
> like the right approach. I don't see why implementing this would be
> predicated on implementing other parts of C11.
>
> -Hal
>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory