<div dir="ltr"><div dir="ltr">On Wed, Aug 25, 2021 at 8:36 AM H.J. Lu <<a href="mailto:hjl.tools@gmail.com">hjl.tools@gmail.com</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">On Mon, Aug 23, 2021 at 10:55 PM John McCall <<a href="mailto:rjmccall@gmail.com" target="_blank">rjmccall@gmail.com</a>> wrote:<br>> On Thu, Jul 29, 2021 at 9:40 AM H.J. Lu <<a href="mailto:hjl.tools@gmail.com" target="_blank">hjl.tools@gmail.com</a>> wrote:<br>>> On Tue, Jul 13, 2021 at 9:24 AM H.J. Lu <<a href="mailto:hjl.tools@gmail.com" target="_blank">hjl.tools@gmail.com</a>> wrote:<br>>> > On Tue, Jul 13, 2021 at 8:41 AM Joseph Myers <<a href="mailto:joseph@codesourcery.com" target="_blank">joseph@codesourcery.com</a>> wrote:<br>>> > > On Tue, 13 Jul 2021, H.J. Lu wrote:<br>>> > > > On Mon, Jul 12, 2021 at 8:59 PM Wang, Pengfei <<a href="mailto:pengfei.wang@intel.com" target="_blank">pengfei.wang@intel.com</a>> wrote:<br>

>> > > > ><br>

>> > > > > > Return _Float16 and _Complex _Float16 values in %xmm0/%xmm1 registers.<br>

>> > > > ><br>

>> > > > > Can you please explain the behavior here? Is there a difference between _Float16 and _Complex _Float16 on return? That is:<br>

>> > > > > 1. In which case are _Float16 values returned in both %xmm0 and %xmm1?<br>

>> > > > > 2. For a single _Float16 value, are both the real part and the imaginary part returned in %xmm0, or are they returned in %xmm0 and %xmm1 respectively?<br>

>> > > ><br>

>> > > > Here is the v2 patch to add the missing _Float16 bits. The PDF file is at<br>

>> > > ><br>

>> > > > <a href="https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI" rel="noreferrer" target="_blank">https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI</a><br>

>> > ><br>

>> > > This PDF shows _Complex _Float16 as having a size of 2 bytes (should be<br>

>> > > 4-byte size, 2-byte alignment).<br>

>> > ><br>

>> > > It also seems to change double from 4-byte to 8-byte alignment, which is<br>

>> > > wrong.  And it's inconsistent about whether it covers the long double =<br>

>> > > double (Android) case - it shows that case for _Complex long double but<br>

>> > > not for long double itself.<br>

>> ><br>

>> > Here is the v3 patch with the fixes.  I also updated the PDF file.<br>

>><br>

>> Here is the final patch I checked in. _Complex _Float16 is changed to be returned<br>

>> in the XMM0 register. The new PDF file is at<br>

>><br>

>> <a href="https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI" rel="noreferrer" target="_blank">https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/Intel386-psABI</a><br>

><br>

><br>

> This should be explicit that the real part is returned in bits 0..15 and the imaginary part is returned in bits 16..31, or however we conventionally designate subcomponents of a vector.<br><br>

How about this?<br>

<br>

diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex<br>

index 860ff66..8f527c1 100644<br>

--- a/low-level-sys-info.tex<br>

+++ b/low-level-sys-info.tex<br>

@@ -457,6 +457,9 @@ and \texttt{unions}) are always returned in memory.<br>

     & \texttt{__float128} & memory \\<br>

     \hline<br>

     & \texttt{_Complex _Float16} & \reg{xmm0} \\<br>

+    & & The real part is returned in bits 0..15. The imaginary part is<br>

+        returned \\<br>

+    & & in bits 16..31.\\<br>

     \cline{2-3}<br>

     Complex & \texttt{_Complex float} & \EDX:\EAX \\<br>

     floating- & & The real part is returned in \EAX. The imaginary part is<br>

<br>

<a href="https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/uploads/89eb3e52c7e5eadd58f7597508e13f34/intel386-psABI-2021-08-25.pdf" rel="noreferrer" target="_blank">https://gitlab.com/x86-psABIs/i386-ABI/-/wikis/uploads/89eb3e52c7e5eadd58f7597508e13f34/intel386-psABI-2021-08-25.pdf</a></blockquote><div><br></div><div>Looks good to me, thanks.</div><div><br></div><div>John.</div></div></div>
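<div dir="ltr"><br>For anyone reading along, here is a minimal sketch of the layout described in the proposed text: real part in bits 0..15, imaginary part in bits 16..31. It assumes raw IEEE 754 binary16 bit patterns as stand-ins for _Float16, so it builds without compiler support for that type. The helper names are hypothetical, and the 32-bit word only models the low 32 bits of %xmm0; it does not read the register.<br></div>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the psABI): build the low 32 bits of
 * %xmm0 as the proposed text describes them for a _Complex _Float16
 * return value. Inputs are raw binary16 bit patterns. */
static uint32_t pack_complex_half_bits(uint16_t real_bits, uint16_t imag_bits)
{
    /* Real part occupies bits 0..15, imaginary part bits 16..31. */
    return (uint32_t)real_bits | ((uint32_t)imag_bits << 16);
}

/* Extract the real part's bit pattern (bits 0..15). */
static uint16_t real_half_bits(uint32_t packed)
{
    return (uint16_t)(packed & 0xFFFFu);
}

/* Extract the imaginary part's bit pattern (bits 16..31). */
static uint16_t imag_half_bits(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}

/* Round-trip check using 1.0 (0x3C00) and -2.0 (0xC000) in binary16. */
static void check_layout_example(void)
{
    uint32_t packed = pack_complex_half_bits(0x3C00u, 0xC000u);
    assert(real_half_bits(packed) == 0x3C00u);
    assert(imag_half_bits(packed) == 0xC000u);
}
```

<div dir="ltr">With these assumed inputs, 1.0 - 2.0i packs to 0xC0003C00, which matches the in-memory layout of a two-element _Float16 array on a little-endian target.<br></div>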