[llvm-dev] [PATCH] Add optional _Float16 support

Fri Jul 2 00:45:46 PDT 2021

On Fri, Jul 2, 2021 at 1:34 AM Jacob Lifshay via Gcc-patches
<gcc-patches at gcc.gnu.org> wrote:
>
> On Thu, Jul 1, 2021, 15:28 H.J. Lu via llvm-dev <llvm-dev at lists.llvm.org>
> wrote:
>
> > On Thu, Jul 1, 2021 at 3:10 PM Joseph Myers <joseph at codesourcery.com>
> > wrote:
> > >
> > > On Thu, 1 Jul 2021, H.J. Lu via Gcc-patches wrote:
> > >
> > > > 2. Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1
> > > > registers.
> > >
> > > That restricts use of _Float16 to processors with SSE.  Is that what we
> > > want in the ABI, or should _Float16 be available with base 32-bit x86
> > > architecture features only, much like _Float128 and the decimal FP types
> >
> > Yes, _Float16 requires XMM registers.
> >
> > > are?  (If it is restricted to SSE, we can of course ensure relevant
> > > libgcc functions are built with SSE enabled, and likewise in glibc if
> > > that gains _Float16 functions, though maybe with some extra
> > > complications to get relevant testcases to run whenever possible.)
> > >
> >
> > _Float16 functions in libgcc should be compiled with SSE enabled.
> >
> > BTW, _Float16 software emulation may require more than just SSE
> > since we need to do _Float16 load and store with XMM registers.
> > There is no 16bit load/store for XMM registers without AVX512FP16.
> >
>
> Umm, if you just need to load/store 16-bit scalars in XMM registers you can
> use pextrw and pinsrw which don't require AVX. f16x8 can use any of the
> standard full-register load/stores.

It looks like pinsrw/pextrw on XMM registers require SSE2; with SSE
alone, only the inserts/extracts to/from MMX regs are supported.  But of
course GPR half-word loads and GPR->XMM moves of full size would work.
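
To make the two options concrete, here is a rough sketch using the
standard intrinsics from <emmintrin.h>.  The helper names
(load_f16_bits, store_f16_bits, load_f16_bits_via_gpr) are made up for
illustration and are not taken from the patch.  Both paths move only the
raw 16 bits, with no format conversion.  Note that emmintrin.h is the
SSE2 header, so this does not demonstrate an SSE-only build.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: pinsrw/pextrw on XMM, movd */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: put the 16 raw bits at p into lane 0 of an
   XMM register via pinsrw; the remaining lanes are zeroed. */
static __m128i load_f16_bits(const void *p)
{
    uint16_t bits;
    memcpy(&bits, p, sizeof bits);          /* half-word load */
    return _mm_insert_epi16(_mm_setzero_si128(), bits, 0);
}

/* Hypothetical helper: write lane 0 of v back to memory via pextrw. */
static void store_f16_bits(void *p, __m128i v)
{
    uint16_t bits = (uint16_t)_mm_extract_epi16(v, 0);
    memcpy(p, &bits, sizeof bits);          /* half-word store */
}

/* Hypothetical helper: the GPR route.  Load the half-word into a
   general-purpose register, then do a full 32-bit GPR->XMM move
   (movd); the upper bits of the register end up zero. */
static __m128i load_f16_bits_via_gpr(const void *p)
{
    uint16_t bits;
    memcpy(&bits, p, sizeof bits);
    return _mm_cvtsi32_si128((int)bits);
}
```

Either load leaves the 16-bit payload in the low word of the register
with everything above it cleared, so the two are interchangeable for
this purpose.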

> https://gcc.godbolt.org/z/ncznr9TM1
>
> Jacob