<div class="gmail_quote">On Wed, Dec 14, 2011 at 10:45 AM, Carlos Sánchez de La Lama <span dir="ltr"><<a href="mailto:carlos.delalama@urjc.es">carlos.delalama@urjc.es</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi all,<br>

<div class="im"><br>

>>> I would favor calling conventions over metadata for the simple<br>

>>> reason that this maps more cleanly to the device model.  Device and<br>

>>> kernel functions are represented differently in PTX, including<br>

>>> (sometimes) the way parameters are passed.<br>

<br>

>> For the record, marking the kernels with "calling conventions"<br>

>> instead of metadata is fine also for the pocl use case. It's enough<br>

>> if there is a way to differentiate OpenCL C kernels from the "device<br>

>> functions" for the reason I discussed in the previous email. That is,<br>

>> in the pocl point of view we just need a way to pick the<br>

>> "host-callable" kernel functions as they need the special treatment<br>

>> before they can be called (like a C function).<br>

<br>

</div>Remember that OpenCL kernels are also callable from inside other<br>

kernels. It is not a big deal though, as calling conventions in LLVM<br>

IR are just markers for code generation; they do not have any<br>

effect before that (AFAIK).<br>

<br>

What is needed is a way to differentiate at the LLVM IR level between:<br>

1) Normal functions<br>

2) Functions callable from outside and inside (OpenCL kernels would fall<br>

   in this category).<br>

3) Functions callable only from outside (if there is such a case; I am<br>

   not so familiar with CUDA, so I do not know whether such functions exist in<br>

   CUDA).<br>

<br>

At least 1 and 2 are needed for OpenCL. Whether this is done with calling<br>

conventions, metadata, or attributes does not make much<br>

difference in practical terms. Code generation can apply different<br>

calling conventions based on metadata/attributes, and can also detect<br>

the kernels based on calling conventions, so the options are<br>

interchangeable.<br>
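<br>
A minimal sketch of that interchangeability, using Python purely as a<br>
modelling language. The Function record and the marker names "kernel" and<br>
"entry_only" are hypothetical illustrations, not LLVM identifiers; the three<br>
kinds mirror categories 1 to 3 above.<br>

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    NORMAL = 1      # category 1: ordinary function
    KERNEL = 2      # category 2: callable from outside and inside
    ENTRY_ONLY = 3  # category 3: callable only from outside (may not exist)


@dataclass
class Function:
    # Hypothetical stand-in for an IR function; not an LLVM class.
    name: str
    calling_conv: str = "default"
    metadata: dict = field(default_factory=dict)


# Hypothetical calling-convention names, chosen only for this sketch.
_CC_TO_KIND = {
    "default": Kind.NORMAL,
    "kernel": Kind.KERNEL,
    "entry_only": Kind.ENTRY_ONLY,
}


def kind_from_calling_conv(fn):
    """Classify a function by its calling-convention marker."""
    return _CC_TO_KIND[fn.calling_conv]


def kind_from_metadata(fn):
    """Classify a function by a metadata entry, defaulting to NORMAL."""
    return Kind[fn.metadata.get("kind", "NORMAL")]


def to_metadata(fn):
    """Re-express a calling-convention marker as metadata on a copy."""
    return Function(fn.name, "default", {"kind": kind_from_calling_conv(fn).name})


helper = Function("helper")
scale = Function("scale", calling_conv="kernel")
```

Either encoding yields the same classification, so a backend could pick<br>
whichever fits its device model.<br>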

<div class="im"><br>

>> BTW what about the other OpenCL data like required_wg_size<br>

</div><div class="im">>> affect the possible "kernel treatment" of pocl and can be converted<br>

>> to some special instructions (I suppose) for the SIMT targets?<br>

>> Currently only the TCE target in Clang adds metadata for the<br>

>> required_wg_size kernel attribute (as we need it in "offline<br>

>> compilation") but IMHO that could be useful in general, as a default<br>

>> metadata (to enable its support in pocl for all targets, for<br>

>> example).<br>

<br>

> Ideally, we would need some standard way of representing this in<br>

> Clang.  The back-end would then need to convert it to whatever form<br>

> the target OpenCL run-time expects.<br>

<br>

</div>This is an interesting point. And there might be more information<br>

present in .cl files that needs to be carried into LLVM IR. While<br>

it has been argued that OpenCL "is C", so clang should<br>

not need to generate extra stuff for OpenCL input files, the fact is<br>

that it is not plain C. Basically there are two ways to go:<br>

<br>

a) OpenCL is a C-based language (C plus additions) and clang can parse<br>

   it, so *all* the information in the .cl file has to be present in<br>

   LLVM IR.<br>

b) OpenCL is just C, so clang does not need to care about extra things<br>

   and implementations should parse .cl files to get the extra<br>

   information, and potentially preprocess to transform the non-C<br>

   constructs into valid C code.<br>

<br>

Just staying in between is good for nothing. And given that clang already has a CL<br>

mode (-x cl) that recognizes the keywords and supports the non-C constructs in<br>

OpenCL (like vector swizzle), I think (b) can be discarded right away.<br>

But then all the info should get into LLVM IR in a generic way.<br></blockquote><div><br></div><div>(b) can also be discarded because the original OpenCL source is not always available.  It is perfectly valid to compile OpenCL to a binary form (PTX in the case of NVIDIA GPUs) and then load the binary as an OpenCL program.  In this case, the original .cl file may not even be available.</div>
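<div><br></div><div>To make the problem with (b) concrete, here is a rough sketch of what source scraping might look like, written in Python only for brevity. It assumes the attribute is spelled reqd_work_group_size(X, Y, Z) with literal integers, as in OpenCL C; the helper name and the sample kernel are made up for illustration, and a real implementation would need a proper parser rather than a regular expression.</div>

```python
import re

# Matches reqd_work_group_size(X, Y, Z) with literal integer arguments only.
_REQD_WG = re.compile(
    r"reqd_work_group_size\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)"
)


def reqd_work_group_size(cl_source):
    """Return (x, y, z) scraped from OpenCL C source text, or None.

    cl_source is None when the program was loaded from a binary and no
    .cl text is available, which is exactly where this approach fails.
    """
    if cl_source is None:
        return None
    match = _REQD_WG.search(cl_source)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


# Made-up sample kernel, used only to exercise the helper.
source = (
    "__kernel __attribute__((reqd_work_group_size(8, 8, 1))) "
    "void k(__global float *p) {}"
)
```

<div>With the source text the helper recovers the sizes; with a program loaded from a binary (modelled here as None) it has nothing to read, so the information has to travel with the compiled form instead.</div>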

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div class="im"><br>

> This is a question for cfe-dev.<br>

<br>

</div>So adding cfe-dev in copy.<br></blockquote><div><br></div><div>Thanks.  I forgot to add that. :)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

BR<br>

<span class="HOEnZb"><font color="#888888"><br>

Carlos<br>

<br>

</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br><br><div>Thanks,</div><div><br></div><div>Justin Holewinski</div><br>