<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
span.hoenzb
{mso-style-name:hoenzb;}
span.EmailStyle18
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:796065964;
mso-list-type:hybrid;
mso-list-template-ids:-539884430 67698705 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Yes, it helps a lot and we are working on it.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>A few questions,<o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><span style='mso-list:Ignore'>1)<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>What will be your use model of this library? Will you run optimization phases after linking with the library? If so, what are they?<o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><span style='mso-list:Ignore'>2)<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Do you care if the names of functions differ from those in libm? For example, it would be gpusin() instead of sin(). <o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><span style='mso-list:Ignore'>3)<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Do you need a different library for different host platforms? 
Why?<o:p></o:p></span></p><p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><span style='mso-list:Ignore'>4)<span style='font:7.0pt "Times New Roman"'> </span></span></span><![endif]><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Any other functions (besides math) you want to see in this library?<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Thanks.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Yuan<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Dmitry Mikushin [mailto:dmitry@kernelgen.org] <br><b>Sent:</b> Thursday, February 07, 2013 2:09 PM<br><b>To:</b> Justin Holewinski; LLVM Developers Mailing List<br><b>Cc:</b> Yuan Lin<br><b>Subject:</b> [NVPTX] We need an LLVM CUDA math library, after all<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>Hi Justin, gentlemen,<br><br>I'm afraid I have to escalate this issue at this point. 
Since it was first discussed last summer, it was sufficient for us for a while to disable lowering of math calls into intrinsics at the DragonEgg level and link them against CUDA math functions at the LLVM IR level. Now I can say: this is not sufficient any longer, and we need the NVPTX backend to deal with GPU math.<br><br>> There also is no standard libm for PTX.<br><br>Yes, that's right, but there is an interesting idea to codegen CUDA math headers into LLVM IR and link it with the user module at the IR level. This method gives a perfect degree of flexibility with respect to high-level languages: the user no longer needs to deal with headers and can have math right in the IR, regardless of the language it was lowered from. I can confirm this method works for us very well with C and Fortran, but in order to make accurate replacements of unsupported intrinsic calls, it needs to be aware of the NVPTX backend's capabilities in the form of:<br><br><o:p></o:p></p><div id=":29y"><p class=MsoNormal>bool NVPTXTargetMachine::isIntrinsicSupported(Function& intrinsic) and<br>string NVPTXTargetMachine::whichMathCallReplacesIntrinsic(Function& intrinsic)<br><br>> I would prefer not to lower such things in the back-end since different compilers may want to implement such functions differently based on speed vs. accuracy trade-offs.<br><br>Who are those different compilers? We are LLVM, the complete compiler stack, which should handle these things according to its own preferences. Derived compilers may certainly think differently, and it's their own business to change anything they want and never contribute back. We should not forget there are a lot of derived projects that use LLVM directly, like KernelGen or many of those embedded DSLs that have recently started flourishing. Their completeness and future rely on LLVM. 
For these reasons, I would strongly prefer that LLVM/NVPTX supply a reference GPU math implementation, and I invite you and everyone else to form a joint roadmap to deliver it.<br><br>Before we start: IANAL, but something tells me there could be a licensing issue with releasing the LLVM IR emitted from CUDA headers.<br>Could you please check this with NVIDIA?<br><br>Many thanks,<br>- D.<br><br>2012/9/6 Justin Holewinski <<a href="mailto:justin.holewinski@gmail.com">justin.holewinski@gmail.com</a>>:<br>> On 09/06/2012 10:02 AM, Dmitry N. Mikushin wrote:<br>>><br>>> Dear all,<br>>><br>>> During app compilation we have a crash in NVPTX backend:<br>>><br>>> LLVM ERROR: Cannot select: 0x732b270: i64 = ExternalSymbol'__powisf2'<br>>> [ID=18]<br>>><br>>> As I understand LLVM tries to lower the following call<br>>><br>>> %28 = call ptx_device float @llvm.powi.f32(float 2.000000e+00, i32 %8)<br>>> nounwind readonly<br>>><br>>> to device intrinsic. The table llvm/IntrinsicsNVVM.td does not contain<br>>> such intrinsic, however it should be builtin, according to<br>>> cuda/include/math_functions.h<br>><br>><br>> It actually gets lowered into an external function call.<br>><br>><br>>><br>>> Is my understanding correct, and we need simply add the corresponding<br>>> definition to llvm/IntrinsicsNVVM.td ? How to do that, what are the<br>>> rules?<br>><br>><br>> PTX does not have an instruction (or simple series of instructions) that<br>> implements pow, so this will not be handled. I would prefer not to lower<br>> such things in the back-end since different compilers may want to implement<br>> such functions differently based on speed vs. accuracy trade-offs.<br>><br>> There also is no standard libm for PTX. 
It is up to the higher-level<br>> compiler to link against a run-time library that provides functions like pow<br>> (see include/math_functions.h in a CUDA distribution).<br>><br>>><br>>> Thanks,<br>>> - D.<br>>> _______________________________________________<br>>> LLVM Developers mailing list<br>>> <a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>><o:p></o:p></p><p class=MsoNormal><span class=hoenzb><span style='color:#888888'>></span></span><span style='color:#888888'><br><span class=hoenzb>> --</span><br><span class=hoenzb>> Thanks,</span><br><span class=hoenzb>></span><br><span class=hoenzb>> Justin Holewinski</span><br><span class=hoenzb>></span></span><o:p></o:p></p></div></div>
</body></html>