[llvm-dev] LLVM GPU News Issue #1, December 11 2020
Jakub (Kuba) Kuderski via llvm-dev
llvm-dev at lists.llvm.org
Fri Dec 11 09:26:43 PST 2020
I'm starting a bi-weekly newsletter on all the GPU things under the LLVM
umbrella: GPU backends in LLVM, GPU dialects in MLIR, middle-end work
related to GPU compilation, external LLVM- and MLIR-based GPU compilers,
relevant conference talks, etc. I'm going to publish new issues every other
Friday on llvm-dev and a dedicated website: https://llvm-gpu-news.github.io
The first issue is available at:
I'm also pasting the content below, in case you prefer to read in your
The high-level goals are to surface common themes in GPU compilation across
different hardware and to raise awareness within the general LLVM
community of important aspects of GPU compilation.
# LLVM GPU News Issue #1, December 11 2020
Welcome to the first issue of LLVM GPU News, a bi-weekly newsletter on all
GPU things under the LLVM umbrella. This issue covers the period from
November 27 to December 10 2020.
We welcome your feedback and suggestions. Let us know if we missed anything
want us to bring attention to your (sub)project, revisions under review, or
Please see the bottom of the page for details on how to submit suggestions
## Industry News and Conference Talks
* [AMD published the RDNA 2 Instruction Set Architecture manual.](
Some notable changes from the previous GCN ISA are:
* ray tracing support,
* new dot product ALU operations for accelerated inference and deep
* VGPR and LDS allocation-unit sizes were doubled,
* legacy multiply-add instructions were removed (superseded by
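
To make the dot product item above concrete, here is a minimal scalar sketch of what a 4-lane packed 8-bit dot product with accumulate computes. It assumes signed 8-bit lanes packed into a 32-bit word and a wrapping 32-bit accumulator. The function name `dot4_i8_reference` is hypothetical, and the exact RDNA 2 semantics (signedness variants, saturation) should be checked against the ISA manual.

```cpp
#include <cstdint>

// Hypothetical reference model of a 4-lane packed dot product with
// accumulate. This is an illustration only, not the ISA definition.
inline std::int32_t dot4_i8_reference(std::uint32_t a, std::uint32_t b,
                                      std::int32_t acc) {
    for (int lane = 0; lane < 4; ++lane) {
        // Extract one byte per lane and reinterpret it as signed 8-bit.
        const auto ai = static_cast<std::int8_t>(
            static_cast<std::uint8_t>((a >> (8 * lane)) & 0xFFu));
        const auto bi = static_cast<std::int8_t>(
            static_cast<std::uint8_t>((b >> (8 * lane)) & 0xFFu));
        // Widen to 32 bits before multiplying, then accumulate.
        acc += static_cast<std::int32_t>(ai) * static_cast<std::int32_t>(bi);
    }
    return acc;
}
```

A single hardware instruction of this kind replaces the four multiplies and four adds in the loop, which is why it matters for quantized inference workloads.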
* Jay Foad ran into [issues with preserved and required transitive
analyses in the Legacy Pass Manager](
in AMDGPU. Jay proposes to add a new pass preservation rule, but some
existing passes currently violate it.
There are no replies as of writing.
* Arthur Eubanks is working [towards enabling the New Pass Manager](
Arthur looked into AMDGPU support for the NPM and pointed out that
[passes that depended on `TargetMachine::adjustPassManager` need to be
tweaked to work with the NPM](
* João Paulo L. de Carvalho asked about
[modeling address space casts in the Scalar Evolution analysis](
This prevents simple SYCL loops from being vectorized. There are no
replies as of writing.
* Nichols A. Romero proposed to add Fortran tests to the LLVM Test Suite.
[The tests will focus on language features, high-performance proxy
programs, and OpenMP multi-threading and GPU offloading.](
The response seems overwhelmingly positive so far.
* (In-review) Ongoing work and discussion on
[Adding convergence control operand bundle and intrinsics](
https://reviews.llvm.org/D85603) to LLVM IR.
* [Clang Offload Bundler gained AMDGPU code object V4 ABI documentation.](
* Various fixes to AMDGPU assembler diagnostics: [\[1\]](
* (In-review) [Don't sink ptrtoint/inttoptr sequences into non-noop
address space casts.](https://reviews.llvm.org/D92210)
This resolves an [illegal memory access with atomic shared memory
* [CUDA/HIP hostness function overloading fixes.](
A new `-fgpu-exclude-wrong-side-overloads` Clang flag controls the
* [`gpu.allocate` and `gpu.deallocate` ops were added to runtime function
* The `GpuAsyncRegionPass` learned to
[move `gpu.wait` ops from `async.execute` regions to their dependencies](
This prevents unnecessary host synchronization.
## External Compilers
Please submit pointers to your mailing lists, forums, or newsletters if you
want your LLVM- or MLIR-based GPU compiler project to be covered in future
LLVM GPU News