<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif"><br></div><br><div class="gmail_quote"><div dir="ltr">On Mon, Jun 4, 2018 at 8:02 AM Alexander Matz via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org">cfe-dev@lists.llvm.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi everybody!<br>

<br>

I'm working on a pass that instruments CUDA kernels, which then require some additional "arguments" in order to write out their results.<br>

To my knowledge, changes to the signature of device functions must be made in the frontend so host and device are in sync.<br>

In order to avoid having to hack the front-end, we implemented this kind of stuff using global device variables which can be accessed using cudaMemcpyToSymbol.<br>

Unfortunately that function does not allow passing the symbol name of the variable.<br>

<br>

So here is the actual two part question:<br>

- How does clang link global device variables on the device to its "host version" so that the CUDA runtime can use it?<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:verdana,sans-serif">Clang creates "shadow" variable on the host side and registers {address, name} tuple with CUDA runtime. Whenever you need to pass a pointer to the device side via CUDA runtime, it automatically translates the address of the host-side shadow variable to the device-side address of the variable when it passes it to the kernel.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

- Can I duplicate the functionality in IR passes to create global variables after the frontend finishes?<br></blockquote><div><br></div><div><div class="gmail_default" style="font-family:verdana,sans-serif">Probably. You can take a look at the glue IR we generate on the host side.</div><div class="gmail_default" style="font-family:verdana,sans-serif">E.g. compile a simple cuda source with --keep-temps and you should find number of calls to __cudaRegister...() calls.</div></div><div><br></div><div><div class="gmail_default" style="font-family:verdana,sans-serif">Note that you need to have full CUDA compilation as clang will not generate this glue code if it has no-device-side object to include. I.e. --cuda-host-only will not work.</div><br></div><div><div class="gmail_default" style="font-family:verdana,sans-serif">Here's an example (look for __cuda_register_globals) : <a href="https://godbolt.org/g/ib3Hyk">https://godbolt.org/g/ib3Hyk</a></div></div><div><br></div><div><div class="gmail_default" style="font-family:verdana,sans-serif">--Artem</div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Looking at the IR I can only see regular global variables in both host and device IR.<br>

I tried just inserting global variables in IR (externally initialized, external linkage) but get an "invalid device symbol" error if I try to use it at runtime.<br>

I was unable to find the interesting bits in clang itself since I'm not at all familiar with the codebase.<br>

<br>

My apologies if this is not the right mailing this.<br>

I chose this one instead of llvm-dev because I figured clang is responsible for creating and linking __device__ variables.<br>

<br>

Thanks,<br>

Alex<br>

_______________________________________________<br>

cfe-dev mailing list<br>

<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

<a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">--Artem Belevich</div></div></div>