[llvm-dev] [PATCH] Add optional _Float16 support

Jakub Jelinek via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 2 02:21:33 PDT 2021


On Fri, Jul 02, 2021 at 09:45:46AM +0200, Richard Biener via Gcc-patches wrote:
> > > > are?  (If it is restricted to SSE, we can of course ensure relevant
> > > > libgcc functions are built with SSE enabled, and likewise in glibc if
> > > > that gains _Float16 functions, though maybe with some extra
> > > > complications to get relevant testcases to run whenever possible.)
> > > >
> > >
> > > _Float16 functions in libgcc should be compiled with SSE enabled.
> > >
> > > BTW, _Float16 software emulation may require more than just SSE
> > > since we need to do _Float16 load and store with XMM registers.
> > > There is no 16bit load/store for XMM registers without AVX512FP16.
> > >
> >
> > Umm, if you just need to load/store 16-bit scalars in XMM registers you can
> > use pextrw and pinsrw which don't require AVX. f16x8 can use any of the
> > standard full-register load/stores.
> 
> It looks like that requires SSE2; with SSE only, inserts/extracts to/from
> MMX regs are supported.  But of course GPR half-word loads and GPR->XMM
> moves of full size would work.

Loads can be done in SSE2 directly with PINSRW, which supports a 16-bit load
from memory into an XMM reg.  But SSE2 PEXTRW only supports extraction into a
GPR, and one needs SSE4.1 for PEXTRW into memory.  So, for stores with only
SSE2 one needs a secondary reload...
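
A minimal sketch of that asymmetry using the SSE2 intrinsics from
<emmintrin.h>, assuming an x86 target with SSE2 enabled.  The helper names
load_f16_bits and store_f16_bits are hypothetical and only move the raw
16 bits; this is not the code from the patch.  The load maps to PINSRW, while
the store has to go through an integer register because SSE2 PEXTRW has no
memory destination.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: put 16 raw bits into lane 0 of an XMM register.
   _mm_insert_epi16 corresponds to PINSRW, which SSE2 provides for XMM
   registers and which can take its source from memory. */
static __m128i load_f16_bits(const void *p)
{
    uint16_t bits;
    memcpy(&bits, p, sizeof bits);
    return _mm_insert_epi16(_mm_setzero_si128(), bits, 0);
}

/* Hypothetical helper: write lane 0 back to memory.  With SSE2 only,
   PEXTRW can target just a GPR, so the value is extracted into an integer
   first and then stored as a half-word. */
static void store_f16_bits(void *p, __m128i v)
{
    uint16_t bits = (uint16_t)_mm_extract_epi16(v, 0);
    memcpy(p, &bits, sizeof bits);
}
```

The extra integer temporary in store_f16_bits plays the role of the secondary
reload: with SSE4.1 available, the compiler could use the memory form of
PEXTRW instead.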

	Jakub
