<div dir="ltr"><div>
<p class=""><span class="">Hi Roel, Chris,</span></p>
<p class=""><span class="">This is a summary of how you can add support for a different offloading device on top of what we have on GitHub for OpenMP:</span></p>
<p class="">a) Download and install llvm (<a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_clang-2Domp_llvm-5Ftrunk&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=oiO3Hnn_JYQgkFj0X1QDla0Y5brxxh5y40CcfXG-D24&e="><span class="">https://github.com/clang-omp/llvm_trunk</span></a>) and clang (<a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_clang-2Domp_clang-5Ftrunk&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=miK1ELBndN7c46na-W3GGQjTtWIYhC7ZKGZzosm4naA&e="><span class="">https://github.com/clang-omp/clang_trunk</span></a>) as usual.</p>
<p class="">b) Install the official llvm OpenMP runtime library from <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__openmp.llvm.org&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=7SplUt9g-zIm36XLiTEhlESH6k15rjBB6Uv0TGIyRDk&e=">openmp.llvm.org</a>. Clang expects it to be present in your library path in order to compile OpenMP code (even if you do not need any OpenMP feature other than offloading).</p>
<p class=""><span class="">c) Install <a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_clang-2Domp_libomptarget&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=nDsrs4m4EOY4B3stJI8vZ_B6XoULY4nFgirG6iqtSAI&e="><span class="">https://github.com/clang-omp/libomptarget</span></a> (running 'make' should do it). This library implements the API to control offloading. It also contains, in ./RTLs, a set of plugins for the targets we are testing this with: x86_64, powerpc64 and NVPTX. You will need to implement a plugin for your target as well. The interface used for these plugins is detailed in the document proposed in <a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084986.html"><span class="">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084986.html</span></a>. You can look at the existing plugins for a hint. In a nutshell, you would have to implement code that allocates and moves data to your device, returns a table of entry points and global variables given a device library, and launches execution of a given entry point with the provided list of arguments.</span></p>
<p class="">d) The current implementation expects the device library to use the ELF format. The only reason for that is that the platforms we have tested this with so far all use ELF. If your device does not use ELF, __tgt_register_lib() (src/omptarget.cpp) would have to be extended to understand your desired format. Otherwise you may just update src/targets_info.cpp with your ELF ID and plugin name.</p>
<p class="">e) Offloading is driven by clang, so it has to be aware of the toolchain required by your device. If your device toolchain is not implemented in clang, you would have to do that in lib/Driver/ToolChains.cpp.</p>
<p class="">f) Once everything is in place, you can compile your code by running something like “clang -fopenmp -omptargets=your-target-triple app.c”. If you do separate compilation, you will see that two different files are generated for a given source file (the target file has the suffix tgt-your-target-triple).</p>
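<p class="">To make the plugin responsibilities from step c) more concrete, here is a minimal sketch of the three duties (data allocation and movement, returning an entry-point table, launching an entry point). Every name in it (ToyPlugin, EntryPoint, dataAlloc, dataSubmit, dataRetrieve, loadBinary, runTargetRegion, offloadDouble) is hypothetical and chosen only for illustration; it is not the libomptarget interface, which is described in the design document linked above. The "device" here is plain host memory, and the table only carries entry points, not global variables.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// All names below are hypothetical and for illustration only; they are not
// the libomptarget plugin API.

// One row of the table a plugin returns for a loaded device library.
struct EntryPoint {
  std::string name;         // symbol name the host uses to identify the region
  void (*fn)(void **args);  // address of the entry point on the "device"
};

// Device-side code for one target region: doubles every element of an array.
// args[0] points to the int array, args[1] points to its length.
static void doubleKernel(void **args) {
  int *data = static_cast<int *>(args[0]);
  const int n = *static_cast<int *>(args[1]);
  for (int i = 0; i < n; ++i)
    data[i] *= 2;
}

// A toy plugin whose "device" is ordinary host memory.
class ToyPlugin {
public:
  // Duty 1: allocate device memory and move data in both directions.
  void *dataAlloc(std::size_t size) { return std::malloc(size); }
  void dataFree(void *dev) { std::free(dev); }
  void dataSubmit(void *dev, const void *host, std::size_t size) {
    std::memcpy(dev, host, size);
  }
  void dataRetrieve(void *host, const void *dev, std::size_t size) {
    std::memcpy(host, dev, size);
  }

  // Duty 2: given a device library, return its table of entry points.
  // A real plugin would parse the library image; this one is hard-coded.
  std::vector<EntryPoint> loadBinary() const {
    return std::vector<EntryPoint>{EntryPoint{"double_kernel", doubleKernel}};
  }

  // Duty 3: launch an entry point with the provided list of arguments.
  void runTargetRegion(const EntryPoint &entry, std::vector<void *> &args) {
    entry.fn(args.data());
  }
};

// Host-side driver showing the order in which the duties are exercised.
// Assumes a non-empty input.
std::vector<int> offloadDouble(std::vector<int> host) {
  assert(!host.empty());
  ToyPlugin plugin;
  int n = static_cast<int>(host.size());
  const std::size_t bytes = host.size() * sizeof(int);

  std::vector<EntryPoint> table = plugin.loadBinary();
  assert(!table.empty() && table[0].name == "double_kernel");

  void *devData = plugin.dataAlloc(bytes);
  void *devN = plugin.dataAlloc(sizeof(int));
  plugin.dataSubmit(devData, host.data(), bytes);
  plugin.dataSubmit(devN, &n, sizeof(int));

  std::vector<void *> args = {devData, devN};
  plugin.runTargetRegion(table[0], args);

  plugin.dataRetrieve(host.data(), devData, bytes);
  plugin.dataFree(devData);
  plugin.dataFree(devN);
  return host;
}
```

<p class="">Calling offloadDouble with a vector of ints goes through the load, allocate, submit, run and retrieve steps in that order. A real plugin would replace the memcpy calls and the direct function call with whatever your device driver provides.</p>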
<p class=""><span class="">I should say that, in general, OpenMP requires a runtime library for the device as well. However, if you do not use any OpenMP pragmas inside your target code, you won’t need that.</span></p>
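<p class="">As a starting point for testing, here is a small sketch of a target region that contains no nested OpenMP pragmas, so it should not need a device-side OpenMP runtime. The function name vectorAdd is just an example. It only uses the standard OpenMP 4.0 target construct with map clauses; if no offloading device is configured (or OpenMP is disabled), the loop simply runs on the host.</p>

```cpp
#include <cassert>
#include <vector>

// Adds two vectors element-wise inside an OpenMP target region.
// The region contains no other OpenMP pragmas.
std::vector<float> vectorAdd(const std::vector<float> &a,
                             const std::vector<float> &b) {
  assert(a.size() == b.size());
  const int n = static_cast<int>(a.size());
  std::vector<float> c(n);

  // Map clauses work on raw pointers plus an array section.
  const float *pa = a.data();
  const float *pb = b.data();
  float *pc = c.data();

  // Inputs are copied to the device, the result is copied back.
#pragma omp target map(to : pa[0:n], pb[0:n]) map(from : pc[0:n])
  for (int i = 0; i < n; ++i)
    pc[i] = pa[i] + pb[i];

  return c;
}
```

<p class="">Compiled with the command from step f), the loop body becomes the entry point that your plugin is asked to launch, and the three mapped arrays are what it is asked to allocate and move.</p>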
<p class=""><span class="">We have started porting the offloading code currently on GitHub to upstream clang. The driver support is currently under review in <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D9888&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=r0ZuZzPUTPNynsRRuWcClRdDzdCnksOAfrNjgwriLew&e="><span class="">http://reviews.llvm.org/D9888</span></a>. We are about to send our first offloading codegen patches as well.</span></p>
<p class=""><span class="">I understand that what Chris is proposing is somewhat different from what we have in place, given that the transformations are intended to be in LLVM IR. However, the goal seems to be the same. I hope the summary above gives you some hints on whether your use cases can be accommodated.</span></p><p class=""><span class="">Feel free to ask any questions you may have.</span></p>
<p class=""><span class="">Thanks!</span></p>
<p class=""><span class="">Samuel</span></p></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2015-06-08 16:46 GMT-04:00 Sergey Ostanevich <span dir="ltr"><<a href="mailto:sergos.gnu@gmail.com" target="_blank">sergos.gnu@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Roel,<br>
<br>
You have to check out and build llvm/clang as usual.<br>
For runtime support you'll have to build libomptarget and make a<br>
plugin for your target. Samuel can help you some more.<br>
As for OpenMP examples, I can recommend<br>
<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__openmp.org_mp-2Ddocuments_OpenMP4.0.0.Examples.pdf&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=D3bL-re-OszQEKiKU9jJe-7FBaYNODLZs2wLKOwlv5M&e=" target="_blank">http://openmp.org/mp-documents/OpenMP4.0.0.Examples.pdf</a><br>
(look into the target constructs).<br>
<br>
Sergos<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
On Mon, Jun 8, 2015 at 6:13 PM, Roel Jordans <<a href="mailto:r.jordans@tue.nl">r.jordans@tue.nl</a>> wrote:<br>
> Hi Sergos,<br>
><br>
> I'd like to try this on our hardware. Is there some example code that I<br>
> could use to get started?<br>
><br>
> Cheers,<br>
> Roel<br>
><br>
><br>
> On 08/06/15 13:27, Sergey Ostanevich wrote:<br>
>><br>
>> Chris,<br>
>><br>
>> Have you seen an offloading infrastructure design proposal at<br>
>> <a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084986.html" target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084986.html</a> ?<br>
>> It relies on the long-standing OpenMP standard with recent updates to<br>
>> support heterogeneous computation.<br>
>> Could you please review it and comment on how it fits to your needs?<br>
>><br>
>> It's not quite clear from your proposal which source language standard<br>
>> you plan to support - you just mention that OpenCL will be one of<br>
>> your backends, as far as I understand. What's your plan for sources -<br>
>> C/C++/FORTRAN?<br>
>> How would you control the offloading, data transfer, scheduling and so<br>
>> on? Will it be new language constructs, similar to parallel_for<br>
>> in Cilk Plus, or will it be pragma-based like in OpenMP or OpenACC?<br>
>><br>
>> The design I mentioned above has an operable implementation for the NVIDIA<br>
>> target at<br>
>><br>
>> <a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_clang-2Domp_llvm-5Ftrunk&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=oiO3Hnn_JYQgkFj0X1QDla0Y5brxxh5y40CcfXG-D24&e=" target="_blank">https://github.com/clang-omp/llvm_trunk</a><br>
>> <a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_clang-2Domp_clang-5Ftrunk&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=miK1ELBndN7c46na-W3GGQjTtWIYhC7ZKGZzosm4naA&e=" target="_blank">https://github.com/clang-omp/clang_trunk</a><br>
>><br>
>> with runtime implemented at<br>
>><br>
>> <a href="https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_clang-2Domp_libomptarget&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=nDsrs4m4EOY4B3stJI8vZ_B6XoULY4nFgirG6iqtSAI&e=" target="_blank">https://github.com/clang-omp/libomptarget</a><br>
>><br>
>> you're welcome to try it out, if you have an appropriate device.<br>
>><br>
>> Regards,<br>
>> Sergos<br>
>><br>
>> On Sat, Jun 6, 2015 at 2:24 PM, Christos Margiolas<br>
>> <<a href="mailto:chrmargiolas@gmail.com">chrmargiolas@gmail.com</a>> wrote:<br>
>>><br>
>>> Hello,<br>
>>><br>
>>> Thank you a lot for the feedback. I believe that the heterogeneous engine<br>
>>> should be strongly connected with parallelization and vectorization<br>
>>> efforts.<br>
>>> Most of the accelerators are parallel architectures where having<br>
>>> efficient<br>
>>> parallelization and vectorization can be critical for performance.<br>
>>><br>
>>> I am interested in these efforts and I hope that my code can help you<br>
>>> manage the offloading operations. Your LLVM instruction set extensions<br>
>>> may<br>
>>> require some changes in the analysis code but I think that is going to be<br>
>>> straightforward.<br>
>>><br>
>>> I am planning to push my code on phabricator in the next days.<br>
>>><br>
>>> thanks,<br>
>>> Chris<br>
>>><br>
>>><br>
>>> On Fri, Jun 5, 2015 at 3:45 AM, Adve, Vikram Sadanand<br>
>>> <<a href="mailto:vadve@illinois.edu">vadve@illinois.edu</a>><br>
>>> wrote:<br>
>>>><br>
>>>><br>
>>>> Christos,<br>
>>>><br>
>>>> We would be very interested in learning more about this.<br>
>>>><br>
>>>> In my group, we (Prakalp Srivastava, Maria Kotsifakou and I) have been<br>
>>>> working on LLVM extensions to make it easier to target a wide range of<br>
>>>> accelerators in a heterogeneous mobile device, such as Qualcomm's<br>
>>>> Snapdragon<br>
>>>> and other APUs. Our approach has been to (a) add better abstractions of<br>
>>>> parallelism to the LLVM instruction set that can be mapped down to a<br>
>>>> wide<br>
>>>> range of parallel hardware accelerators; and (b) to develop optimizing<br>
>>>> "back-end" translators to generate efficient code for the accelerators<br>
>>>> from<br>
>>>> the extended IR.<br>
>>>><br>
>>>> So far, we have been targeting GPUs and vector hardware, but semi-custom<br>
>>>> (programmable) accelerators are our next goal. We have discussed DSPs<br>
>>>> as a<br>
>>>> valuable potential goal as well.<br>
>>>><br>
>>>> Judging from the brief information here, I'm guessing that our projects<br>
>>>> have been quite complementary. We have not worked on the extraction<br>
>>>> passes,<br>
>>>> scheduling, or other run-time components you mention and would be happy<br>
>>>> to<br>
>>>> use an existing solution for those. Our hope is that the IR extensions<br>
>>>> and<br>
>>>> translators will give your schedulers greater flexibility to retarget<br>
>>>> the<br>
>>>> extracted code components to different accelerators.<br>
>>>><br>
>>>> --Vikram S. Adve<br>
>>>> Visiting Professor, School of Computer and Communication Sciences, EPFL<br>
>>>> Professor, Department of Computer Science<br>
>>>> University of Illinois at Urbana-Champaign<br>
>>>> <a href="mailto:vadve@illinois.edu">vadve@illinois.edu</a><br>
>>>> <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__llvm.org&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=Mfk2qtn1LTDThVkh6-oGglNfMADXfJdty4_bhmuhMHA&m=wsI4T2YcNsRmsmAEYJJ-w-Z9sdcHqB2N_ByaRLOPcRE&s=fPhiWq07W8gbfZe_0YrFvvCJ_TyrKVG8lTkpeea2yuY&e=" target="_blank">http://llvm.org</a><br>
>>>><br>
>>>><br>
>>>><br>
>>>><br>
>>>> On Jun 5, 2015, at 3:18 AM, <a href="mailto:llvmdev-request@cs.uiuc.edu">llvmdev-request@cs.uiuc.edu</a> wrote:<br>
>>>><br>
>>>>> Date: Thu, 4 Jun 2015 17:35:25 -0700<br>
>>>>> From: Christos Margiolas <<a href="mailto:chrmargiolas@gmail.com">chrmargiolas@gmail.com</a>><br>
>>>>> To: LLVM Developers Mailing List <<a href="mailto:llvmdev@cs.uiuc.edu">llvmdev@cs.uiuc.edu</a>><br>
>>>>> Subject: [LLVMdev] Supporting heterogeneous computing in llvm.<br>
>>>>> Message-ID:<br>
>>>>><br>
>>>>> <<a href="mailto:CAC3KUCx0mpBrnrGjDVxQzxtBpnJXtw3herZ_E2pQoSqSyMNsKA@mail.gmail.com">CAC3KUCx0mpBrnrGjDVxQzxtBpnJXtw3herZ_E2pQoSqSyMNsKA@mail.gmail.com</a>><br>
>>>>> Content-Type: text/plain; charset="utf-8"<br>
>>>>><br>
>>>>> Hello All,<br>
>>>>><br>
>>>>> For the last two months I have been working on the design and<br>
>>>>> implementation<br>
>>>>> of<br>
>>>>> a heterogeneous execution engine for LLVM. I started this project as an<br>
>>>>> intern at the Qualcomm Innovation Center and I believe it can be useful<br>
>>>>> to<br>
>>>>> different people and use cases. I am planning to share more details and<br>
>>>>> a<br>
>>>>> set of patches in the next<br>
>>>>> days. However, I would first like to see if there is an interest for<br>
>>>>> this.<br>
>>>>><br>
>>>>> The project is about providing compiler and runtime support for the<br>
>>>>> automatic and transparent offloading of loop or function workloads to<br>
>>>>> accelerators.<br>
>>>>><br>
>>>>> It is composed of the following:<br>
>>>>> a) Compiler and Transformation Passes for extracting loops or functions<br>
>>>>> for<br>
>>>>> offloading.<br>
>>>>> b) A runtime library that handles scheduling, data sharing and<br>
>>>>> coherency<br>
>>>>> between the<br>
>>>>> host and accelerator sides.<br>
>>>>> c) A modular codebase and design. Adaptors specialize the code<br>
>>>>> transformations for the target accelerators. Runtime plugins manage the<br>
>>>>> interaction with the different accelerator environments.<br>
>>>>><br>
>>>>> So far, this work supports the Qualcomm DSP accelerator but I<br>
>>>>> am<br>
>>>>> planning to extend it to support OpenCL accelerators. I have also<br>
>>>>> developed<br>
>>>>> a debug port where I can test the passes and the runtime without<br>
>>>>> requiring<br>
>>>>> an accelerator.<br>
>>>>><br>
>>>>><br>
>>>>> The project is still at an early R&D stage and I am looking forward to<br>
>>>>> feedback and to gauging the interest level. I am willing to continue<br>
>>>>> working<br>
>>>>> on this as an open source project and bring it to the right shape so it<br>
>>>>> can<br>
>>>>> be merged with the LLVM tree.<br>
>>>>><br>
>>>>><br>
>>>>> Regards,<br>
>>>>> Chris<br>
>>>>><br>
>>>>> P.S. I intend to join the LLVM social in the Bay Area tonight and I will be<br>
>>>>> more than happy to talk about it.<br>
>>>><br>
>>>><br>
>>>><br>
>>>> _______________________________________________<br>
>>>> LLVM Developers mailing list<br>
>>>> <a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">http://llvm.cs.uiuc.edu</a><br>
>>>> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br>
>>><br>
>>><br>
>>><br>
>>><br>
>>><br>
>><br>
</div></div></blockquote></div><br></div>