[llvm-dev] [PATCH] Add optional _Float16 support

Tue Jul 13 08:04:36 PDT 2021

On Tue, Jul 13, 2021 at 7:48 AM Wang, Pengfei <pengfei.wang at intel.com> wrote:
>
> Hi H.J.,
>
> Our LLVM implementation currently uses %xmm0 for both the real part and the imaginary part of a _Complex _Float16. Is there a special reason to use two registers?
> We use one register on x86-64. Considering performance, especially register pressure, would it be better to use one register for _Complex _Float16 on 32-bit targets as well?

The x86-64 psABI is unrelated to the i386 psABI.  Using a pair of registers
is more natural for complex _Float16.  Since the pair is only used for the
function return value, I don't think there is a register pressure issue.
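
For illustration, the two return conventions discussed here can be modeled in plain C. This is only a sketch, not text from either psABI: raw 16-bit patterns stand in for _Float16 so that it builds without compiler support for the type, and the names pack_one_register, unpack_one_register, split_register_pair and struct xmm_pair are hypothetical. Placing the real part in bits 0-15 of the single register, and the real part in %xmm0 with the imaginary part in %xmm1 for the pair, are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw IEEE binary16 bit patterns stand in for _Float16, so the sketch builds
   even with compilers that lack _Float16 support. */
typedef uint16_t half_bits;

/* Hypothetical model of the one-register convention: both parts share the
   low 32 bits of %xmm0. Real part in bits 0-15 is an assumption. */
static uint32_t pack_one_register(half_bits re, half_bits im)
{
    return (uint32_t)re | ((uint32_t)im << 16);
}

/* Recover both parts from the packed 32-bit value. */
static void unpack_one_register(uint32_t packed, half_bits *re, half_bits *im)
{
    assert(re != NULL && im != NULL);
    *re = (half_bits)(packed & 0xFFFFu);
    *im = (half_bits)(packed >> 16);
}

/* Hypothetical model of the register-pair convention: each part occupies the
   low 16 bits of its own register (real in %xmm0, imaginary in %xmm1 is an
   assumption). */
struct xmm_pair {
    half_bits xmm0_low16; /* real part */
    half_bits xmm1_low16; /* imaginary part */
};

static struct xmm_pair split_register_pair(half_bits re, half_bits im)
{
    struct xmm_pair p = { re, im };
    return p;
}
```

With the binary16 patterns 0x3C00 (1.0) and 0x4000 (2.0), the one-register model yields the 32-bit value 0x40003C00, while the pair model keeps each part in its own register and the callee needs no shift or mask to form the result.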

> Thanks
> Pengfei
>
> -----Original Message-----
> From: H.J. Lu <hjl.tools at gmail.com>
> Sent: Tuesday, July 13, 2021 10:26 PM
> To: Wang, Pengfei <pengfei.wang at intel.com>; llvm-dev at lists.llvm.org
> Cc: Joseph Myers <joseph at codesourcery.com>; GCC Patches <gcc-patches at gcc.gnu.org>; GNU C Library <libc-alpha at sourceware.org>; IA32 System V Application Binary Interface <ia32-abi at googlegroups.com>
> Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
>
> On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
> >
> > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> >
> > > Can you please explain the behavior here? Is there a difference between
> > > _Float16 and _Complex _Float16 when they are returned? I.e.:
> > > 1. In which case will _Float16 values be returned in both %xmm0 and %xmm1?
> > > 2. For a single _Complex _Float16 value, are both the real part and the imaginary part returned in %xmm0, or are they returned in %xmm0 and %xmm1 respectively?
>
> Here is the v2 patch to add the missing _Float16 bits.   The PDF file is at
>
> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
>
> > Thanks
> > Pengfei
> >
> > -----Original Message-----
> > > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of H.J. Lu via llvm-dev
> > Sent: Friday, July 2, 2021 6:28 AM
> > To: Joseph Myers <joseph at codesourcery.com>
> > > Cc: llvm-dev at lists.llvm.org; GCC Patches <gcc-patches at gcc.gnu.org>; GNU C Library <libc-alpha at sourceware.org>; IA32 System V Application Binary Interface <ia32-abi at googlegroups.com>
> > Subject: Re: [llvm-dev] [PATCH] Add optional _Float16 support
> >
> > On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers <joseph at codesourcery.com> wrote:
> > >
> > > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> > >
> > > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> > >
> > > That restricts use of _Float16 to processors with SSE.  Is that what
> > > we want in the ABI, or should _Float16 be available with base 32-bit
> > > x86 architecture features only, much like _Float128 and the decimal
> > > FP types
> >
> > Yes, _Float16 requires XMM registers.
> >
> > > > are?  (If it is restricted to SSE, we can of course ensure relevant
> > > > libgcc functions are built with SSE enabled, and likewise in glibc
> > > > if that gains _Float16 functions, though maybe with some extra
> > > > complications to get relevant testcases to run whenever possible.)
> > >
> >
> > _Float16 functions in libgcc should be compiled with SSE enabled.
> >
> > > BTW, _Float16 software emulation may require more than just SSE since we need to do _Float16 loads and stores with XMM registers.
> > > There is no 16-bit load/store for XMM registers without AVX512FP16.
> >
> > --
> > H.J.
>
>
>
> --
> H.J.

-- 
H.J.
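
The quoted remark that there is no 16-bit load/store for XMM registers without AVX512FP16 can be illustrated with a short C sketch. It shows one possible workaround, moving the 16 bits through a general-purpose register with the SSE2 movd intrinsics; it is an assumption made for illustration, not necessarily what GCC, libgcc or LLVM do. The function name copy_half_via_xmm is hypothetical, and the portable fallback only exists so the sketch builds on targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Copy one binary16 value from src to dst, passing it through an XMM
   register when SSE2 is available. The 16-bit memory accesses are done as
   integer accesses; only 32-bit moves touch the vector register. */
static void copy_half_via_xmm(void *dst, const void *src)
{
    uint16_t bits;

    assert(dst != NULL && src != NULL);
    memcpy(&bits, src, sizeof bits);       /* 16-bit integer load */
#if defined(__SSE2__)
    __m128i v = _mm_cvtsi32_si128(bits);   /* movd: GPR -> low 32 bits of XMM */
    bits = (uint16_t)_mm_cvtsi128_si32(v); /* movd: low 32 bits of XMM -> GPR */
#endif
    memcpy(dst, &bits, sizeof bits);       /* 16-bit integer store */
}
```

The extra hop through a general-purpose register on every load and store is the kind of cost a native 16-bit XMM load/store would avoid.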