<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Helvetica;
panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
span.EmailStyle23
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">A while back, when this idea first came up, I argued in favor of complex integer support, since Hexagon has instructions that do complex arithmetic on integers. I don’t understand the argument that the integer
result has a different type from the inputs: standard arithmetic in C/C++ doesn’t change result types, but overflow can produce UB. Complex arithmetic could do the same. Are you talking about having to do full-precision arithmetic before producing
the final result (to avoid these overflows on intermediate values)?<o:p></o:p></p>
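<p class="MsoNormal">To make the two readings concrete, here is a minimal sketch in C. It is not based on the Hexagon instructions or on any proposed intrinsic; the type and function names (cint16, cwide, cint16_mul_wide, fits_cint16, narrow_checked) are hypothetical. It computes the product at full precision, which gives a result type wider than the inputs, and then checks whether narrowing back to the input type would be lossless.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { int16_t re, im; } cint16;  /* hypothetical narrow input type */
typedef struct { int64_t re, im; } cwide;   /* hypothetical full-precision result type */

/* (a + bi) * (c + di) with every intermediate held in int64_t, so neither
   the partial products nor their sum or difference can overflow. */
cwide cint16_mul_wide(cint16 x, cint16 y) {
    int64_t a = x.re, b = x.im, c = y.re, d = y.im;
    cwide p = { a * c - b * d, a * d + b * c };
    return p;
}

/* True when both parts of the product are representable in int16_t. */
bool fits_cint16(cwide p) {
    return p.re >= INT16_MIN && p.re <= INT16_MAX &&
           p.im >= INT16_MIN && p.im <= INT16_MAX;
}

/* Same-type result: only valid when the product fits, mirroring the
   "overflow is the caller's problem" reading. */
cint16 narrow_checked(cwide p) {
    assert(fits_cint16(p));
    cint16 r = { (int16_t)p.re, (int16_t)p.im };
    return r;
}
```

<p class="MsoNormal">For example, (300 + 200i) * (200 + 100i) is 40000 + 70000i, which does not fit back into 16-bit parts, while (3 + 2i) * (1 + 4i) is -5 + 14i and does.</p>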
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><span style="font-size:9.0pt;font-family:Consolas">-- </span>
<span style="font-size:9.0pt;font-family:Consolas"><o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:8.0pt;font-family:Consolas">Krzysztof Parzyszek
<a href="mailto:kparzysz@quicinc.com"><span style="color:#0563C1">kparzysz@quicinc.com</span></a> AI tools development<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> llvm-dev <llvm-dev-bounces@lists.llvm.org> <b>On Behalf Of
</b>Cranmer, Joshua via llvm-dev<br>
<b>Sent:</b> Tuesday, November 23, 2021 5:17 PM<br>
<b>To:</b> Chris Lattner <clattner@nondot.org><br>
<b>Cc:</b> llvm-dev@lists.llvm.org<br>
<b>Subject:</b> Re: [llvm-dev] Complex intrinsics proposal and roundtable<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">To answer your second question first: we briefly discussed this at the complex round table last week. Complex integers have not been on anyone’s roadmap. During the discussion, it was pointed out that complex integer arithmetic tends to
involve heterogeneous types (the result type of the arithmetic is not the same as its input type), which is not the case for complex floating-point types, and so handling them the same way may not make the most sense.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I’ve posted a full patch of most of the pieces I’ve implemented here:
<a href="https://reviews.llvm.org/D114398">https://reviews.llvm.org/D114398</a>. One of the goals in the path I have gone down is to enable more consistent optimization of complex types within the compiler itself, as the variety of complex representations in
the ABI means that they can arrive at passes such as vectorization in an inconsistent form. I think there is value in having them in the middle-end of the optimizer, even independent of their value as representing hardware instructions, although going in as
experimental for now sounds like the right way to approach this.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> Chris Lattner <<a href="mailto:clattner@nondot.org">clattner@nondot.org</a>>
<br>
<b>Sent:</b> Tuesday, November 23, 2021 2:51<br>
<b>To:</b> Cranmer, Joshua <<a href="mailto:joshua.cranmer@intel.com">joshua.cranmer@intel.com</a>><br>
<b>Cc:</b> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
<b>Subject:</b> Re: [llvm-dev] Complex intrinsics proposal and roundtable<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hi Joshua,<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">I think that using 2x elements is a much more promising direction. Question though: how much value is there in making these target-independent intrinsics? Are they actually general and portable enough (across architectures) to be worth
abstracting for a frontend? If the frontend has to handle all the complexity anyway (e.g. your point about multiply and divide is on point), there is little benefit to adding these.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Separately, do you plan to handle complex integers? Do you plan to support arbitrary bit width elements, and what is the legalization scheme for these?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">-Chris<o:p></o:p></p>
</div>
<div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><o:p> </o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">On Nov 18, 2021, at 11:03 AM, Cranmer, Joshua via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<o:p></o:p></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">This is another proposal about introducing complex types into LLVM. Following on from<span class="apple-converted-space"> </span><a href="https://lists.llvm.org/pipermail/llvm-dev/2019-October/136100.html"><span style="color:#0563C1">https://lists.llvm.org/pipermail/llvm-dev/2019-October/136100.html</span></a>,
this is different in that it doesn’t propose complex types directly but instead proposes representing complex numbers as vectors and using intrinsics. See also Florian’s proposal to do this (starting with complex multiply) here:<span class="apple-converted-space"> </span><a href="https://lists.llvm.org/pipermail/llvm-dev/2020-November/146568.html"><span style="color:#0563C1">https://lists.llvm.org/pipermail/llvm-dev/2020-November/146568.html</span></a>.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Representation of complex types<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">The proposal is to represent complex numbers as vectors of 2N floating-point elements. For example, <2 x float> would represent a scalar complex number. <4 x float> would represent a vector of 2 complex floating-point numbers, with the first
complex number living in lanes 0 and 1, and the second living in lanes 2 and 3. This representation of complex types matches the vector form in x86.<o:p></o:p></p>
</div>
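<p class="MsoNormal">As a rough illustration of this layout, the sketch below models the vector as a plain C array of floats. The helper names (cplx_real, cplx_imag, cplx_vec_add, cplx_vec_conj) are hypothetical and are not part of the proposal; the point is only that complex number k occupies lanes 2k and 2k+1, and that addition never needs to know about the pairing.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: complex number k of an interleaved buffer occupies
   lane 2k (real part) and lane 2k+1 (imaginary part). */
float cplx_real(const float *v, size_t k) { return v[2 * k]; }
float cplx_imag(const float *v, size_t k) { return v[2 * k + 1]; }

/* Addition is purely lane-wise, so an ordinary vector add is enough. */
void cplx_vec_add(const float *a, const float *b, float *out, size_t lanes) {
    assert(lanes % 2 == 0);  /* lanes come in (real, imaginary) pairs */
    for (size_t i = 0; i < lanes; ++i)
        out[i] = a[i] + b[i];
}

/* Conjugation keeps each even (real) lane and negates each odd (imaginary) lane. */
void cplx_vec_conj(const float *a, float *out, size_t lanes) {
    assert(lanes % 2 == 0);
    for (size_t i = 0; i < lanes; i += 2) {
        out[i] = a[i];
        out[i + 1] = -a[i + 1];
    }
}
```

<p class="MsoNormal">With a 4-lane buffer {1, 2, 3, 4}, the two complex values are 1 + 2i and 3 + 4i. Multiplication and division, by contrast, mix the two lanes of each pair, which is why they need dedicated operations.</p>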
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">The basic arithmetic operations are mapped as follows:<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">+ or -: fadd or fsub <2 x float> %a, %b<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">*: call <2 x float> @llvm.complex.multiply(<2 x float> %a, <2 x float> %b)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">/: call <2 x float> @llvm.complex.divide(<2 x float> %a, <2 x float> %b)<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt">Building complex values, creal, cimag: existing extractelement, insertelement, and shufflevector instructions as appropriate<br>
cabs: call float @llvm.complex.abs(<2 x float> %val)<br>
cconj: call <2 x float> @llvm.complex.conj(<2 x float> %val)<br>
<br>
<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">One complexity that hasn’t been covered in prior proposals is what complex multiplication actually means. Among our major source languages (C/C++/Fortran), there is some variance as to the definition of multiplication, division, and complex
absolute values. This variation is most acute when looking at division. The naïve expansion of computing (a + bi)/(c + di) is<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">denom = c * c + d * d<br>
real = (a * c + b * d) / denom<br>
imag = (b * c - a * d) / denom<o:p></o:p></p>
</div>
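<p class="MsoNormal">For reference, the naive expansion above can be written as a small C function. This is only a sketch: the cfloat struct is a hypothetical stand-in for <2 x float>, and the assertion excluding a zero divisor is a simplification of this example, not part of the formula.</p>

```c
#include <assert.h>

typedef struct { float re, im; } cfloat;  /* hypothetical stand-in for <2 x float> */

/* Naive (a + bi) / (c + di): no scaling and no NaN/infinity fix-up. */
cfloat naive_cdiv(cfloat x, cfloat y) {
    float a = x.re, b = x.im, c = y.re, d = y.im;
    assert(c != 0.0f || d != 0.0f);  /* sketch only: zero divisors are left out */
    float denom = c * c + d * d;     /* can overflow even when the quotient is modest */
    cfloat q = { (a * c + b * d) / denom, (b * c - a * d) / denom };
    return q;
}
```

<p class="MsoNormal">The weak spot is denom: with components around 1e30, c * c + d * d is far outside the range of float, even though a quotient such as (1e30 + 1e30i) / (1e30 + 1e30i) is mathematically just 1.</p>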
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">If you use Fortran, there is a requirement that the division operation be scaled to prevent overflow in computing denom (at the very least, this is how I’ve seen existing Fortran compilers implement it). If you use C, there is an additional
requirement that the resulting complex number be recomputed to infinity for certain cases where real and imaginary are both NaN (see Annex G of the C standard). Using the CX_LIMITED_RANGE pragma, or equivalent command-line option, lifts both of these requirements.
Additionally, gcc provides a -fcx-fortran-rules option that lifts only the latter requirement. My understanding is that all hardware implementations of complex multiply implement CX_LIMITED_RANGE rules.<o:p></o:p></p>
</div>
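<p class="MsoNormal">One well-known way to scale the division is Smith’s method, sketched below in C. This is an assumption about what a scaled division could look like, not a description of what any particular Fortran compiler or compiler-rt does, and it deliberately omits the Annex G recomputation of NaN results. The cfloat struct and the names abs_f and smith_cdiv are hypothetical.</p>

```c
#include <assert.h>

typedef struct { float re, im; } cfloat;  /* hypothetical stand-in for <2 x float> */

static float abs_f(float v) { return v < 0.0f ? -v : v; }

/* Smith's method for (a + bi) / (c + di): divide through by the larger of
   |c| and |d| so that c*c + d*d is never formed directly. */
cfloat smith_cdiv(cfloat x, cfloat y) {
    float a = x.re, b = x.im, c = y.re, d = y.im;
    assert(c != 0.0f || d != 0.0f);  /* sketch only: zero divisors are left out */
    cfloat q;
    if (abs_f(c) >= abs_f(d)) {
        float r = d / c;             /* |r| <= 1 */
        float denom = c + d * r;     /* equals (c*c + d*d) / c */
        q.re = (a + b * r) / denom;
        q.im = (b - a * r) / denom;
    } else {
        float r = c / d;             /* |r| < 1 */
        float denom = c * r + d;     /* equals (c*c + d*d) / d */
        q.re = (a * r + b) / denom;
        q.im = (b * r - a) / denom;
    }
    return q;
}
```

<p class="MsoNormal">With this form, (1e30 + 1e30i) / (1e30 + 1e30i) evaluates to 1 + 0i without any intermediate leaving the float range, at the cost of an extra division and a branch.</p>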
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">My proposal is to distinguish between these situations using a mixture of existing fast-math flags and call-site attributes. Without any flags or call-site attributes, these intrinsics would expand to their compiler-rt equivalents of __mulsc3,
__divsc3, etc., which is to say they would have full C requirements (both NaN checking and scaling). The “complex-limited-range” call-site attribute would disable both of these requirements. The “complex-no-scale” call-site attribute would disable the specific
scaling requirement but retain the NaN-checking behavior. Additionally, fast-math flags can be used to select behavior: nnan or ninf would drop just the NaN-checking code.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Implementation experience<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">I have been able to implement patches that pattern-match for complex multiply and divide (in the CX_LIMITED_RANGE cases) early in InstCombine, and haven’t seen issues with that. Doing codegen for the non-CX_LIMITED_RANGE case, requiring
a call to __mulsc3, is difficult because that function returns a C _Complex number, and the C ABIs for complex numbers tend to be inconsistent even among different floating-point types within the same architecture. The truly evil case is the i386 ABI for _Complex float,
which is returned in edx:eax (or as i64, as generated by clang).<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">If you want to talk more about this, I have a roundtable tomorrow, Friday, at 14:45 Eastern or 11:45 Pacific.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.0pt"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.0pt"> </span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.0pt">--<span class="apple-converted-space"> </span></span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><i><span style="font-size:9.0pt">Joshua Cranmer</span></i><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
</div>
</blockquote>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</div>
</div>
</body>
</html>