[llvm-dev] [PATCH] Add optional _Float16 support
John McCall via llvm-dev
llvm-dev at lists.llvm.org
Wed Aug 25 13:32:52 PDT 2021
On Wed, Aug 25, 2021 at 8:36 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> On Mon, Aug 23, 2021 at 10:55 PM John McCall <rjmccall at gmail.com> wrote:
> > On Thu, Jul 29, 2021 at 9:40 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> >> On Tue, Jul 13, 2021 at 9:24 AM H.J. Lu <hjl.tools at gmail.com> wrote:
> >> > On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <joseph at codesourcery.com> wrote:
> >> > > On Tue, 13 Jul 2021, H.J. Lu wrote:
> >> > > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <pengfei.wang at intel.com> wrote:
> >> > > > >
> >> > > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.
> >> > > > >
> >> > > > > Can you please explain the behavior here? Is there a difference
> >> > > > > between _Float16 and _Complex _Float16 when returned? I.e.,
> >> > > > > 1. In which case will _Float16 values be returned in both %xmm0
> >> > > > >    and %xmm1?
> >> > > > > 2. For a single _Float16 value, are both the real part and the
> >> > > > >    imaginary part returned in %xmm0, or are they returned in
> >> > > > >    %xmm0 and %xmm1 respectively?
> >> > > >
> >> > > > Here is the v2 patch to add the missing _Float16 bits. The PDF file is at
> >> > > >
> >> > > > https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
> >> > >
> >> > > This PDF shows _Complex _Float16 as having a size of 2 bytes (should
> >> > > be 4-byte size, 2-byte alignment).
> >> > >
> >> > > It also seems to change double from 4-byte to 8-byte alignment, which
> >> > > is wrong. And it's inconsistent about whether it covers the long
> >> > > double = double (Android) case - it shows that case for _Complex long
> >> > > double but not for long double itself.
> >> >
> >> > Here is the v3 patch with the fixes. I also updated the PDF file.
> >>
> >> Here is the final patch I checked in. _Complex _Float16 is changed to be
> >> returned in the XMM0 register. The new PDF file is at
> >>
> >> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI
> >
> >
> > This should be explicit that the real part is returned in bits 0..15 and
> > the imaginary part is returned in bits 16..31, or however we conventionally
> > designate subcomponents of a vector.
>
> How about this?
>
> diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
> index 860ff66..8f527c1 100644
> --- a/low-level-sys-info.tex
> +++ b/low-level-sys-info.tex
> @@ -457,6 +457,9 @@ and \texttt{unions}) are always returned in memory.
> & \texttt{__float128} & memory \\
> \hline
> & \texttt{_Complex _Float16} & \reg{xmm0} \\
> + & & The real part is returned in bits 0..15. The imaginary part is returned \\
> + & & in bits 16..31.\\
> \cline{2-3}
> Complex & \texttt{_Complex float} & \EDX:\EAX \\
> floating- & & The real part is returned in \EAX. The imaginary part is
>
>
> https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/uploads/89eb3e52c7e5eadd58f7597508e13f34/intel386-psABI-2021-08-25.pdf
Looks good to me, thanks.
John.
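
For readers who want to see the agreed wording concretely, here is a
minimal sketch in C of the layout the patch describes: the real part in
bits 0..15 and the imaginary part in bits 16..31 of the low 32 bits of
XMM0. It works on raw IEEE binary16 bit patterns stored in uint16_t so it
does not depend on compiler support for _Float16. The helper names
(pack_complex_half_bits, real_half_bits, imag_half_bits,
check_round_trip) are hypothetical and exist only for this illustration;
they are not part of the psABI or of any compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine the binary16 bit patterns of the real and
 * imaginary parts into the low 32 bits of the return register, with the
 * real part in bits 0..15 and the imaginary part in bits 16..31. */
static uint32_t pack_complex_half_bits(uint16_t real_bits, uint16_t imag_bits)
{
    return (uint32_t)real_bits | ((uint32_t)imag_bits << 16);
}

/* Hypothetical helper: extract bits 0..15 (the real part). */
static uint16_t real_half_bits(uint32_t low32)
{
    return (uint16_t)(low32 & 0xFFFFu);
}

/* Hypothetical helper: extract bits 16..31 (the imaginary part). */
static uint16_t imag_half_bits(uint32_t low32)
{
    return (uint16_t)(low32 >> 16);
}

/* Round-trip check for the value 1.0 + 2.0i.
 * In binary16, 1.0 is 0x3C00 and 2.0 is 0x4000. */
static void check_round_trip(void)
{
    uint32_t low32 = pack_complex_half_bits(0x3C00u, 0x4000u);
    assert(low32 == 0x40003C00u);
    assert(real_half_bits(low32) == 0x3C00u);
    assert(imag_half_bits(low32) == 0x4000u);
}
```

This is the same ordering the two parts would have if _Complex _Float16
were viewed as a two-element array stored in little-endian memory, which
is presumably why it was chosen, though the thread does not say so.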