[llvm-dev] [PATCH] Add optional _Float16 support
Wang, Pengfei via llvm-dev
llvm-dev at lists.llvm.org
Tue Jul 13 07:48:02 PDT 2021
Hi H.J.,
Our LLVM implementation currently uses %xmm0 for both the real part and the imaginary part of a _Complex _Float16 value. Is there a special reason to use two registers?
We use one register on x86-64. Considering performance, especially register pressure, would it be better to use one register for _Complex _Float16 on 32-bit targets as well?
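To illustrate the single-register case, here is a minimal sketch. It assumes the two 16-bit halves sit in the low 32 bits of the register with the real part in the lower half; that layout is an assumption for illustration, not a quote from the psABI. The names cf16_bits, cf16_pack and cf16_unpack are hypothetical, and raw binary16 bit patterns are used so the code builds without _Float16 support.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for _Complex _Float16: raw IEEE binary16 bit
   patterns, so the sketch builds without _Float16 compiler support. */
typedef struct {
    uint16_t re_bits; /* real part */
    uint16_t im_bits; /* imaginary part */
} cf16_bits;

/* Both halves together occupy 4 bytes, i.e. one 32-bit lane. */
static_assert(sizeof(cf16_bits) == 4, "two binary16 halves fit in 32 bits");

/* Assumed layout: real part in bits 0..15, imaginary part in bits 16..31. */
static uint32_t cf16_pack(cf16_bits z)
{
    return (uint32_t)z.re_bits | ((uint32_t)z.im_bits << 16);
}

/* Split a 32-bit lane back into its two binary16 halves. */
static cf16_bits cf16_unpack(uint32_t lane)
{
    cf16_bits z;
    z.re_bits = (uint16_t)(lane & 0xFFFFu);
    z.im_bits = (uint16_t)(lane >> 16);
    return z;
}
```

For example, packing 1.0 (0x3C00) as the real part and 2.0 (0x4000) as the imaginary part gives the 32-bit lane 0x40003C00, so the whole value needs only a quarter of one 128-bit XMM register.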
Thanks
Pengfei
-----Original Message-----
From: H.J. Lu <hjl.tools at gmail.com>
Sent: Tuesday, July 13, 2021 10:26 PM
To: Wang, Pengfei <pengfei.wang at intel.com>; llvm-dev at lists.llvm.org
Cc: Joseph Myers <joseph at codesourcery.com>; GCC Patches <gcc-patches at gcc.gnu.org>; GNU C Library <libc-alpha at sourceware.org>; IA32 System V Application Binary Interface <ia32-abi at googlegroups.com>
Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
>
> > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
>
> Can you please explain the behavior here? Is there a difference between
> _Float16 and _Complex _Float16 when returned? I.e.: 1. In which case will _Float16 values be returned in both %xmm0 and %xmm1?
> 2. For a single _Float16 value, are both the real part and the imaginary part returned in %xmm0? Or are they returned in %xmm0 and %xmm1 respectively?
Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
> Thanks
> Pengfei
>
> -----Original Message-----
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of H.J. Lu
> via llvm-dev
> Sent: Friday, July 2, 2021 6:28 AM
> To: Joseph Myers <joseph at codesourcery.com>
> Cc: llvm-dev at lists.llvm.org; GCC Patches <gcc-patches at gcc.gnu.org>;
> GNU C Library <libc-alpha at sourceware.org>; IA32 System V Application
> Binary Interface <ia32-abi at googlegroups.com>
> Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
>
> On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers <joseph at codesourcery.com> wrote:
> >
> > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> >
> > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> >
> > That restricts use of _Float16 to processors with SSE. Is that what
> > we want in the ABI, or should _Float16 be available with base 32-bit
> > x86 architecture features only, much like _Float128 and the decimal
> > FP types
>
> Yes, _Float16 requires XMM registers.
>
> > are? (If it is restricted to SSE, we can of course ensure relevant
> > libgcc functions are built with SSE enabled, and likewise in glibc
> > if that gains
> > _Float16 functions, though maybe with some extra complications to
> > get relevant testcases to run whenever possible.)
> >
>
> _Float16 functions in libgcc should be compiled with SSE enabled.
>
> BTW, _Float16 software emulation may require more than just SSE since we need to do _Float16 loads and stores with XMM registers.
> There is no 16-bit load/store for XMM registers without AVX512FP16.
>
> --
> H.J.
--
H.J.