<html>


<head>


<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


</head>


<body bgcolor="#FFFFFF" text="#000000">


<p><br>


</p>


<div class="moz-cite-prefix">On 10/7/19 4:57 PM, Artem Belevich wrote:<br>


</div>


<blockquote type="cite" cite="mid:CA+wKYkOzeEdWmUxVW9CDa9OUELYTQ_9z7d2TyL3yoGY-8rTyFw@mail.gmail.com">


<div dir="ltr">


<div dir="ltr">


<div class="gmail_default" style="font-family:verdana,sans-serif"><br>


</div>


</div>


<br>


<div class="gmail_quote">


<div dir="ltr" class="gmail_attr">On Mon, Oct 7, 2019 at 1:22 PM Siva Chandra &lt;<a href="mailto:sivachandra@google.com" moz-do-not-send="true">sivachandra@google.com</a>&gt; wrote:<br>


</div>


<blockquote class="gmail_quote" style="margin:0px 0px 0px


            0.8ex;border-left:1px solid


            rgb(204,204,204);padding-left:1ex">


Hello Hal,<br>


<br>


You had asked me a question about nvptx on<br>


<a href="https://reviews.llvm.org/D67867" rel="noreferrer" target="_blank" moz-do-not-send="true">https://reviews.llvm.org/D67867</a>. I did some homework on that and below<br>


is what I have learnt.<br>


<br>


For CUDA/nvptx, a libc in general might be irrelevant. However, I<br>


learned from Art (copied in this email) that there is a desire to have<br>


a single library of math functions that clang can rely on for the GPU.<br>


So, even if a libc in general could be irrelevant, a subset of the<br>


libc might indeed become relevant for GPUs.<br>


<br>


We want llvm-libc to expose a thin layer of C symbols over the<br>


underlying C++ implementation library. My patch<br>


(<a href="https://reviews.llvm.org/rL373764" rel="noreferrer" target="_blank" moz-do-not-send="true">https://reviews.llvm.org/rL373764</a>) showcased one way of doing this<br>


for ELF using the section attribute followed by a post-processing<br>


step. We might have to take a different approach for nvptx because<br>


ELF-like sections and tooling might not be feasible/available (as there is<br>


no linking phase during GPU-side compilation for NVIDIA GPUs). Art<br>


explained to me that device code undergoes whole program analysis by<br>


LLVM. Hence, we can provide an explicit C wrapper layer over the C++<br>


implementation library. If source level wrappers are not desirable, we<br>


can consider using IR level aliases (will we have to deal with mangled<br>


names?). This gives the benefit that, while it looks like a normal C<br>


function call from the user's point of view, the whole program<br>


analysis performed by LLVM will eliminate the additional wrapper call,<br>


preventing any performance hit.<br>


</blockquote>
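<div class="gmail_default" style="font-family:verdana,sans-serif">A minimal sketch of the explicit source-level wrapper idea from the quoted message, not the actual llvm-libc code: the names <code>sketch_libc</code>, <code>fabs_impl</code>, and <code>sketch_fabs</code> are hypothetical, and <code>__builtin_fabs</code> (a Clang/GCC builtin) merely stands in for a real implementation.</div>
<pre>
```cpp
// Sketch only: `sketch_libc`, `fabs_impl`, and `sketch_fabs` are hypothetical
// names chosen for illustration; they are not llvm-libc identifiers.
namespace sketch_libc {

// C++ implementation. Marked inline so the optimizer is free to fold it
// into its callers. __builtin_fabs is a Clang/GCC builtin used as a stand-in.
inline double fabs_impl(double x) { return __builtin_fabs(x); }

} // namespace sketch_libc

// Thin wrapper with C linkage: callers see an unmangled C symbol, while
// the body simply forwards to the C++ implementation.
extern "C" double sketch_fabs(double x) { return sketch_libc::fabs_impl(x); }
```
</pre>
<div class="gmail_default" style="font-family:verdana,sans-serif">Because the wrapper and the implementation are visible to the optimizer together, the forwarding call can be inlined away, which is the effect the whole-program analysis is expected to have for device code.</div>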


<div><br>


</div>


<div>


<div class="gmail_default" style="font-family:verdana,sans-serif">We're currently using the


<a href="https://github.com/llvm-mirror/clang/blob/master/lib/Headers/__clang_cuda_device_functions.h" moz-do-not-send="true">


wrappers in Clang headers</a>, so this proposal should not make things worse.</div>


<div class="gmail_default" style="font-family:verdana,sans-serif"><br>


</div>


<div class="gmail_default" style="font-family:verdana,sans-serif">The #1 item on my wish list for the standard library is to have libm available to clang/llvm as a bitcode library. That would make it possible to re-enable lowering to various library calls in LLVM


 when we target NVPTX and, possibly, to avoid the rather precarious dependency on the binary libdevice bitcode blob that comes with the CUDA SDK.</div>


</div>


<div> </div>


<div>


<div class="gmail_default" style="font-family:verdana,sans-serif">AMDGPU folks are also using


<a href="https://github.com/RadeonOpenCompute/ROCm-Device-Libs/tree/master/ocml/src" moz-do-not-send="true">


bitcode libraries</a>, so providing a standard math library as bitcode may benefit them, too.</div>


</div>


</div>


</div>


</blockquote>


<p><br>


</p>


<p>Thanks, Siva, Art. +1 to this. I think that this will be very valuable across GPUs from several vendors and other such platforms. It's also not just math functions: although the math functions are likely the performance-sensitive cases, there are a


 lot of libc functions that we would like to have available to ease transitioning code to accelerators, for example snprintf and qsort.</p>


<p> -Hal<br>


</p>


<p><br>


</p>


<blockquote type="cite" cite="mid:CA+wKYkOzeEdWmUxVW9CDa9OUELYTQ_9z7d2TyL3yoGY-8rTyFw@mail.gmail.com">


<div dir="ltr">


<div class="gmail_quote">


<div><br>


</div>


<div>


<div class="gmail_default" style="font-family:verdana,sans-serif">--Artem</div>


<br>


</div>


<div><br>


</div>


<blockquote class="gmail_quote" style="margin:0px 0px 0px


            0.8ex;border-left:1px solid


            rgb(204,204,204);padding-left:1ex">


<br>


Thanks,<br>


Siva Chandra<br>


</blockquote>


</div>


<br>


</div>


</blockquote>


<pre class="moz-signature" cols="72">-- 


Hal Finkel


Lead, Compiler Technology and Programming Languages


Leadership Computing Facility


Argonne National Laboratory</pre>


</body>


</html>