<div dir="ltr">Hi folks,<br><br>The sixth issue of LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella, is now available at: <a href="https://llvm-gpu-news.github.io/2021/02/19/issue-6.html">https://llvm-gpu-news.github.io/2021/02/19/issue-6.html</a><div><br>I'm also pasting the content below, in case you prefer to read in your email client.<br><div><br>-Jakub<br><br>======================================================================<br><br># LLVM GPU News Issue #6, February 19 2021<br>Authors: Jakub Kuderski, Lei Zhang, Johannes Doerfert</div></div><div><br></div><div>Welcome to the sixth issue of LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella.<br>This issue covers the period from February 5 to February 18 2021.<br><br>We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.<br><br><br>## Industry News and Conference Talks<br><br>*  Vulkan, a cross-platform graphics API, is [five years old now](<a href="https://www.phoronix.com/scan.php?page=news_item&px=Vulkan-Turns-Five-Years-Old">https://www.phoronix.com/scan.php?page=news_item&px=Vulkan-Turns-Five-Years-Old</a>).<br>*  In another Apple M1 GPU tinkering effort, Dougall Johnson published an in-progress doc attempting to [explain the M1 GPU architecture](<a href="https://dougallj.github.io/applegpu/docs.html">https://dougallj.github.io/applegpu/docs.html</a>). 
The project [repository contains various tools](<a href="https://github.com/dougallj/applegpu">https://github.com/dougallj/applegpu</a>), including an assembler, disassembler, emulator, and a test suite.<br><br><br>##  LLVM and Clang<br><br>### Discussions<br><br>*  David Blaikie is [looking for volunteers](<a href="https://lists.llvm.org/pipermail/llvm-dev/2021-February/148467.html">https://lists.llvm.org/pipermail/llvm-dev/2021-February/148467.html</a>) with GPU and/or LLVM middle-end background to help review the ["Abstracting over SSA form IRs to implement generic analyses"](<a href="https://lists.llvm.org/pipermail/llvm-dev/2020-December/147433.html">https://lists.llvm.org/pipermail/llvm-dev/2020-December/147433.html</a>) proposal. One of the main uses of the proposed abstractions is supposed to be the Divergence Analysis.<br>*  Sameer Sahasrabuddhe continues the attempts to [enable Divergence Analysis](<a href="https://reviews.llvm.org/D96615">https://reviews.llvm.org/D96615</a>) with the New Pass Manager. Alina Sbirlea [pointed out that there are two feasible ways](<a href="https://lists.llvm.org/pipermail/llvm-dev/2021-February/148600.html">https://lists.llvm.org/pipermail/llvm-dev/2021-February/148600.html</a>) to make the `SimpleLoopUnswitch` pass work: either disable non-trivial unswitching for targets with divergence, or compute Divergence Analysis results within the pass.<br><br>### Commits <br><br>*  Fixes to [AMDGPU maximum memory scope for scratch, LDS, and GDS](<a href="https://reviews.llvm.org/D96643">https://reviews.llvm.org/D96643</a>) address spaces.<br>*  [Support for the AMDGPU `gfx90a` target](<a href="https://reviews.llvm.org/D96906">https://reviews.llvm.org/D96906</a>) was posted, but may have been committed prematurely.<br>*  CUDA/HIP [option for specifying compilation unit ID](<a href="https://reviews.llvm.org/D95007">https://reviews.llvm.org/D95007</a>), `-fuse-cuid`. 
<br>*  (In-review) HIP option to enable [sanitizer support for the AMDGPU target](<a href="https://reviews.llvm.org/D96835">https://reviews.llvm.org/D96835</a>), `-fgpu-sanitize`. This is experimental and off by default.<br>*  (In-review) A new [clspv target for libclc](<a href="https://reviews.llvm.org/D94013">https://reviews.llvm.org/D94013</a>). [clspv](<a href="https://github.com/google/clspv">https://github.com/google/clspv</a>) is an open-source OpenCL C to Vulkan SPIR-V compiler.<br><br><br>## MLIR<br><br>### Discussions<br><br>### Commits<br><br>*  NVVM/ROCDL kernel function conversions [now rely on](<a href="https://reviews.llvm.org/D96591">https://reviews.llvm.org/D96591</a>) target-specific attributes for better control.<br>*  NVVM/ROCDL to LLVM IR conversions [now adopt](<a href="https://reviews.llvm.org/D96592">https://reviews.llvm.org/D96592</a>) the interface-based LLVM translation.<br>*  In the SPIR-V dialect, more [types](<a href="https://reviews.llvm.org/D96169">https://reviews.llvm.org/D96169</a>) and [ops](<a href="https://reviews.llvm.org/D96527">https://reviews.llvm.org/D96527</a>) were defined to support graphics use cases.<br>*  More [patterns](<a href="https://reviews.llvm.org/D96042">https://reviews.llvm.org/D96042</a>) were added to convert vector ops to SPIR-V ops.<br><br><br>## OpenMP (Target Offloading)<br><br>### Discussions<br><br>*  Konstantin Sidorov is interested in Google Summer of Code [project ideas related to Machine Learning-assisted compiler optimizations](<a href="https://lists.llvm.org/pipermail/llvm-dev/2021-January/147908.html">https://lists.llvm.org/pipermail/llvm-dev/2021-January/147908.html</a>). 
Johannes Doerfert [suggested a predictor](<a href="https://lists.llvm.org/pipermail/llvm-dev/2021-February/148386.html">https://lists.llvm.org/pipermail/llvm-dev/2021-February/148386.html</a>) for grid/block/thread block size for OpenMP GPU kernels.<br><br>### Commits<br><br>*  NVIDIA devices will from now on [require CUDA 9.0 or higher](<a href="https://reviews.llvm.org/D97003">https://reviews.llvm.org/D97003</a>).<br>*  We will natively [support CUDA 11.1 and 11.2](<a href="https://reviews.llvm.org/D97004">https://reviews.llvm.org/D97004</a>).<br>*  All target directives, not only target regions, will now utilize [asynchronous actions](<a href="https://reviews.llvm.org/D96379">https://reviews.llvm.org/D96379</a>) if the plugin supports them (which includes the CUDA plugin).<br>*  The [NVIDIA device runtime](<a href="https://reviews.llvm.org/D94745">https://reviews.llvm.org/D94745</a>) and the [AMDGPU device runtime](<a href="https://reviews.llvm.org/D96533">https://reviews.llvm.org/D96533</a>) are now built as C++ with OpenMP code, not as CUDA/HIP anymore.<br>*  The CUDA plugin can be built [without having CUDA installed](<a href="https://reviews.llvm.org/D95155">https://reviews.llvm.org/D95155</a>) on a system (or known to clang). This should allow us to distribute LLVM with OpenMP offload support more easily.<br>*  Various bugs have been fixed, including but not limited to:<br>    -  [PR49158](<a href="https://llvm.org/PR49158">https://llvm.org/PR49158</a>) fixed by allowing unused functions in declare target regions if they are [not emitted](<a href="https://reviews.llvm.org/D95928">https://reviews.llvm.org/D95928</a>),<br>    -  [PR49207](<a href="https://llvm.org/PR49207">https://llvm.org/PR49207</a>) fixed by avoiding stack locations in [asynchronous actions](<a href="https://reviews.llvm.org/D96667">https://reviews.llvm.org/D96667</a>).<br><br><br>## External Compilers<br><br>### LLPC<br><br>### Mesa<br><br>*  llvmpipe, a CPU OpenGL implementation, landed 
[support for more SPIR-V extensions](<a href="https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8972">https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8972</a>), bringing it closer to full GL4.6 support.<br><br>### SYCL<br><br>*  Khronos released the final version of the [SYCL 2020 spec](<a href="https://www.khronos.org/blog/sycl-2020-what-do-you-need-to-know">https://www.khronos.org/blog/sycl-2020-what-do-you-need-to-know</a>). SYCL 2020 is based on C++17 and contains [over 40 new features](<a href="https://www.khronos.org/news/press/khronos-releases-sycl-2020-final-specification">https://www.khronos.org/news/press/khronos-releases-sycl-2020-final-specification</a>), including Unified Shared Memory, built-in parallel reduction operations, and atomic operations with C++ atomics semantics.<br></div><div><br></div><div><br></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div>Jakub Kuderski</div></div></div>